
AdaBoost applies a weak classifier ${\displaystyle h_{t}}$ to each training sample in turn, for ${\displaystyle t=1}$ through ${\displaystyle t=T}$, and records whether each sample was classified correctly. The weight ${\displaystyle D_{t}(i)}$ of each misclassified sample is then increased (and, conversely, the weight of each correctly classified sample is decreased). Because of this re-weighting, the search in round ${\displaystyle t+1}$ is steered toward classifiers that handle the samples the earlier rounds got wrong.

## Algorithm for binary classification

Given: ${\displaystyle (x_{1},y_{1}),\ldots ,(x_{m},y_{m})}$ where ${\displaystyle x_{i}\in X,\,y_{i}\in Y=\{-1,+1\}}$

Initialize ${\displaystyle D_{1}(i)={\frac {1}{m}},i=1,\ldots ,m.}$

For ${\displaystyle t=1,\ldots ,T}$:

• Find the classifier ${\displaystyle h_{t}:X\to \{-1,+1\}}$ that minimizes the error with respect to the distribution ${\displaystyle D_{t}}$:
${\displaystyle h_{t}={\underset {h_{j}\in {\mathcal {H}}}{\operatorname {argmin} }}\;\epsilon _{j}}$, where ${\displaystyle \epsilon _{j}=\sum _{i=1}^{m}D_{t}(i)[y_{i}\neq h_{j}(x_{i})]}$
• if ${\displaystyle \epsilon _{t}>0.5}$ then stop.
• Choose ${\displaystyle \alpha _{t}\in \mathbf {R} }$, typically ${\displaystyle \alpha _{t}={\frac {1}{2}}{\textrm {ln}}{\frac {1-\epsilon _{t}}{\epsilon _{t}}}}$ where ${\displaystyle \epsilon _{t}}$ is the weighted error rate of classifier ${\displaystyle h_{t}}$.
• Update:
${\displaystyle D_{t+1}(i)={\frac {D_{t}(i)\exp(-\alpha _{t}y_{i}h_{t}(x_{i}))}{Z_{t}}}}$

where ${\displaystyle Z_{t}}$ is a normalization factor (chosen so that ${\displaystyle D_{t+1}}$ is a probability distribution, i.e. it sums to one over all ${\displaystyle i}$).

Output the final classifier:

${\displaystyle H(x)={\textrm {sign}}\left(\sum _{t=1}^{T}\alpha _{t}h_{t}(x)\right)}$
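The pseudocode above can be sketched as a minimal Python implementation. Decision stumps are used here as the weak-learner pool ${\displaystyle {\mathcal {H}}}$ — an assumption for illustration, since the algorithm itself does not fix the choice of weak learner; the function names are likewise hypothetical:

```python
import numpy as np

def train_adaboost(X, y, T):
    """Train AdaBoost with decision stumps as the weak learners.

    X : (m, d) array of real-valued features
    y : (m,) array of labels in {-1, +1}
    Returns a list of (alpha, feature, threshold, polarity) stumps.
    """
    m, d = X.shape
    D = np.full(m, 1.0 / m)                # D_1(i) = 1/m
    ensemble = []
    for t in range(T):
        # Find the stump h_t minimizing the weighted error under D_t.
        best, best_err = None, np.inf
        for j in range(d):
            for thr in np.unique(X[:, j]):
                for pol in (+1, -1):
                    pred = pol * np.where(X[:, j] <= thr, 1, -1)
                    err = D[pred != y].sum()
                    if err < best_err:
                        best_err, best = err, (j, thr, pol)
        if best_err >= 0.5:                # weak learner no better than chance: stop
            break
        eps = max(best_err, 1e-12)         # clamp to avoid log(0) below
        alpha = 0.5 * np.log((1 - eps) / eps)
        j, thr, pol = best
        pred = pol * np.where(X[:, j] <= thr, 1, -1)
        # Re-weight: misclassified samples gain weight, correct ones lose it.
        D = D * np.exp(-alpha * y * pred)
        D /= D.sum()                       # normalize by Z_t
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def predict_adaboost(ensemble, X):
    """H(x) = sign(sum_t alpha_t h_t(x))."""
    score = np.zeros(X.shape[0])
    for alpha, j, thr, pol in ensemble:
        score += alpha * pol * np.where(X[:, j] <= thr, 1, -1)
    return np.where(score >= 0, 1, -1)
```

The exhaustive threshold search is quadratic in the sample count per feature; real implementations sort each feature once, but the loop above keeps the correspondence with the pseudocode explicit.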

The equation to update the distribution ${\displaystyle D_{t}}$ is constructed so that:

${\displaystyle -\alpha _{t}y_{i}h_{t}(x_{i}){\begin{cases}<0,&y_{i}=h_{t}(x_{i})\\>0,&y_{i}\neq h_{t}(x_{i})\end{cases}}}$

Thus, after an optimal classifier ${\displaystyle h_{t}\,}$ has been selected for the distribution ${\displaystyle D_{t}\,}$, the examples ${\displaystyle x_{i}\,}$ that ${\displaystyle h_{t}\,}$ identified correctly are weighted less and those it identified incorrectly are weighted more. Therefore, when the algorithm evaluates classifiers on the distribution ${\displaystyle D_{t+1}\,}$, it will select a classifier that better identifies the examples the previous classifier missed.
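As a concrete illustration with an assumed error rate (the value 0.2 is made up, not taken from the text): if ${\displaystyle \epsilon _{t}=0.2}$, then ${\displaystyle \alpha _{t}={\tfrac {1}{2}}\ln 4=\ln 2}$, so before normalization every correctly classified weight is halved and every misclassified weight is doubled:

```python
import math

eps = 0.2                                  # assumed weighted error of h_t
alpha = 0.5 * math.log((1 - eps) / eps)    # = 0.5 * ln 4 = ln 2

# Multiplicative factor applied to each weight before dividing by Z_t:
factor_correct = math.exp(-alpha)   # y_i == h_t(x_i): weight shrinks
factor_wrong = math.exp(alpha)      # y_i != h_t(x_i): weight grows

print(round(factor_correct, 3), round(factor_wrong, 3))  # 0.5 2.0
```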

## Statistical understanding of boosting

Boosting can be viewed as the minimization of a convex loss function over a convex set of functions[2]. Specifically, AdaBoost minimizes the exponential loss

${\displaystyle \sum _{i}e^{-y_{i}f(x_{i})}}$

searching over functions of the form

${\displaystyle f=\sum _{t}\alpha _{t}h_{t}.}$
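The weight ${\displaystyle \alpha _{t}={\tfrac {1}{2}}\ln {\tfrac {1-\epsilon _{t}}{\epsilon _{t}}}}$ used in the algorithm above is exactly the per-round minimizer of this exponential loss. Splitting the weighted loss at round ${\displaystyle t}$ by whether each example is classified correctly:

${\displaystyle \sum _{i}D_{t}(i)e^{-\alpha y_{i}h_{t}(x_{i})}=(1-\epsilon _{t})e^{-\alpha }+\epsilon _{t}e^{\alpha }}$

Setting the derivative with respect to ${\displaystyle \alpha }$ to zero gives

${\displaystyle \epsilon _{t}e^{\alpha }-(1-\epsilon _{t})e^{-\alpha }=0\quad \Rightarrow \quad e^{2\alpha }={\frac {1-\epsilon _{t}}{\epsilon _{t}}}\quad \Rightarrow \quad \alpha _{t}={\frac {1}{2}}\ln {\frac {1-\epsilon _{t}}{\epsilon _{t}}}.}$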

## Footnotes

1. ^ Yoav Freund, Robert E. Schapire (1995). “A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting”. Retrieved July 9, 2010.
2. ^ T. Zhang, "Convex Risk Minimization", Annals of Statistics, 2004.

## External links

• icsiboost, an open source implementation of Boostexter
• NPatternRecognizer, a fast machine learning algorithm library written in C#. It includes support vector machines, neural networks, Bayes classifiers, boosting, k-nearest neighbors, decision trees, etc.