Outlier Detection by Boosting Regression Trees - Université Paris Nanterre
Journal article. Journal of Statistical Research of Iran (JSRI). Year: 2006

Outlier Detection by Boosting Regression Trees

Abstract

A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the observation that is most frequently resampled along the boosting iterations, remove it, and reiterate. The selection criterion is based on Tchebychev's inequality applied to the maximum, over the boosting iterations, of the average number of appearances in bootstrap samples, so the procedure makes no assumption about the noise distribution. It selects outliers as observations that are particularly hard to predict. Several well-known benchmark data sets are considered, and a comparative study against two well-known competitors shows the value of the method.
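
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the weak learner is a one-split regression stump on one-dimensional inputs, the weight update follows the general pattern of resampling-based boosting for regression (AdaBoost.R2 style), weights are reset to uniform when the average loss leaves (0, 0.5), and the threshold uses the median and the rescaled median absolute deviation as stand-ins for the mean and standard deviation in a Tchebychev-type bound. The criterion is also simplified to the per-observation average over all iterations rather than the exact statistic used in the paper. All function names are hypothetical.

```python
import random
import statistics


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) on 1-D inputs by least squares."""
    mean_all = statistics.fmean(ys)
    best = (float("inf"), None, mean_all, mean_all)
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        ml, mr = statistics.fmean(left), statistics.fmean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda x: ml if (t is None or x < t) else mr


def average_appearances(xs, ys, n_iter=50, seed=0):
    """Run a resampling-based boosting loop and return, for each observation,
    its average number of appearances per bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    weights = [1.0 / n] * n
    counts = [0] * n
    for _ in range(n_iter):
        # Draw a bootstrap sample of size n according to the current weights.
        sample = rng.choices(range(n), weights=weights, k=n)
        for i in sample:
            counts[i] += 1
        predict = fit_stump([xs[i] for i in sample], [ys[i] for i in sample])
        errors = [abs(ys[i] - predict(xs[i])) for i in range(n)]
        max_err = max(errors)
        losses = [e / max_err for e in errors] if max_err > 0 else [0.0] * n
        avg_loss = sum(w * l for w, l in zip(weights, losses))
        if not 0.0 < avg_loss < 0.5:
            weights = [1.0 / n] * n  # assumption: restart from uniform weights
            continue
        beta = avg_loss / (1.0 - avg_loss)
        # Well-predicted points (small loss) are down-weighted the most,
        # so hard-to-predict points are resampled more often.
        weights = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return [c / n_iter for c in counts]


def flag_most_resampled(avg_counts, alpha=0.05):
    """Return the index of the most resampled observation if its average
    count exceeds a Tchebychev-style threshold, else None."""
    center = statistics.median(avg_counts)
    # Assumption: robust scale (rescaled median absolute deviation).
    scale = 1.4826 * statistics.median(abs(c - center) for c in avg_counts)
    # P(|X - m| >= s / sqrt(alpha)) <= alpha by Tchebychev's inequality.
    threshold = center + scale / alpha ** 0.5
    top = max(range(len(avg_counts)), key=avg_counts.__getitem__)
    return top if avg_counts[top] > threshold else None


def detect_outliers(xs, ys, alpha=0.05, max_outliers=5, **kwargs):
    """Repeatedly flag the most resampled observation, remove it, and rerun."""
    remaining = list(range(len(xs)))
    outliers = []
    while len(outliers) < max_outliers and len(remaining) > 2:
        avg = average_appearances([xs[i] for i in remaining],
                                  [ys[i] for i in remaining], **kwargs)
        hit = flag_most_resampled(avg, alpha)
        if hit is None:
            break
        outliers.append(remaining.pop(hit))
    return outliers
```

A call such as `detect_outliers(xs, ys, n_iter=50, seed=0)` returns the indices flagged in order of removal. Because the threshold depends only on the first two moments (here, their robust stand-ins) of the appearance counts, no distribution needs to be specified for the noise, which is the point the abstract makes.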

Dates and versions

hal-01633703, version 1 (13-11-2017)

Identifiers

HAL Id: hal-01633703
DOI: 10.18869/acadpub.jsri.3.1.1

Cite

Nathalie Chèze, Jean-Michel Poggi. Outlier Detection by Boosting Regression Trees. Journal of Statistical Research of Iran JSRI, 2006, 3 (1), pp. 1-22. ⟨10.18869/acadpub.jsri.3.1.1⟩. ⟨hal-01633703⟩