
Outlier Detection by Boosting Regression Trees

Abstract : A procedure for detecting outliers in regression problems is proposed, based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation over the boosting iterations and to reiterate after removing it. The selection criterion applies Tchebychev's inequality to the maximum, over the boosting iterations, of the average number of appearances in bootstrap samples, so the procedure is free of any assumption on the noise distribution. It selects outliers as observations that are particularly hard to predict. Many well-known benchmark data sets are considered, and a comparative study against two well-known competitors demonstrates the value of the method.
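The procedure outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm: it assumes an AdaBoost.R2-style weight update with weighted bootstrap resampling of regression trees, and the function names (`boosted_appearance_counts`, `chebyshev_outlier`) and the `mean + k*std` threshold form are hypothetical choices for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boosted_appearance_counts(X, y, n_iter=50, max_depth=3, seed=0):
    """Boost regression trees via weighted bootstrap resampling and record
    how often each observation appears in the bootstrap samples.
    (AdaBoost.R2-style update assumed; a sketch, not the paper's exact scheme.)"""
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)          # resampling weights
    counts = np.zeros(n)             # appearance counts per observation
    t = 0
    for _ in range(n_iter):
        idx = rng.choice(n, size=n, replace=True, p=w)
        counts += np.bincount(idx, minlength=n)
        t += 1
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X[idx], y[idx])
        err = np.abs(tree.predict(X) - y)
        if err.max() == 0:           # perfect fit: weights would degenerate
            break
        L = err / err.max()          # normalized loss in [0, 1]
        ebar = float(np.sum(w * L))
        if ebar <= 0 or ebar >= 0.5: # AdaBoost.R2 stopping condition
            break
        beta = ebar / (1.0 - ebar)
        w = w * beta ** (1.0 - L)    # hard-to-predict points keep high weight
        w = w / w.sum()
    return counts / t                # average appearances per iteration

def chebyshev_outlier(counts, k=3.0):
    """Flag the most-resampled observation if its average count exceeds
    mean + k*std; by Tchebychev's inequality P(X - mu >= k*sigma) <= 1/k^2,
    so the rule is distribution free."""
    mu, sigma = counts.mean(), counts.std()
    i = int(np.argmax(counts))
    return i if counts[i] > mu + k * sigma else None
```

In the full procedure described by the abstract, a flagged observation would be removed and the boosting rerun, iterating until `chebyshev_outlier` returns `None`.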
Document type :
Journal articles

https://hal-univ-paris10.archives-ouvertes.fr/hal-01633703
Contributor : Administrateur Hal Nanterre
Submitted on : Monday, November 13, 2017 - 11:58:36 AM
Last modification on : Wednesday, September 16, 2020 - 4:05:36 PM



Citation

Nathalie Chèze, Jean-Michel Poggi. Outlier Detection by Boosting Regression Trees. Journal of Statistical Research of Iran JSRI, 2006, 3 (1), pp.1--22. ⟨10.18869/acadpub.jsri.3.1.1⟩. ⟨hal-01633703⟩
