Random Forest

Random Forest is a relatively young but powerful statistical method that has proved applicable to QSAR tasks. It has several advantages that allow large datasets to be analyzed efficiently:

  • a fast learning algorithm,
  • an internal procedure for estimating predictive ability (the out-of-bag estimate),
  • robustness to over-fitting and to “noise” in the data.
A more detailed description of the Random Forest method, with examples of QSAR tasks solved by it, can be found in a short presentation here.
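
The internal estimate of predictive ability mentioned above can be illustrated with a small sketch. It is not the CF program's code: it is a deliberately tiny forest of one-split regression trees (stumps) written in plain Python, and the names fit_stump, predict_stump and oob_r2, the toy data, and all parameter values are assumptions made for this illustration. Each tree is trained on a bootstrap sample and a random subset of descriptors; every compound is then predicted only by the trees that did not see it, which gives an out-of-bag R2 without a separate test set.

```python
import random
from statistics import mean


def fit_stump(X, y, features):
    """Fit the best one-split regression tree using only the given feature indices."""
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            ml, mr = mean(left), mean(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, ml, mr)
    if best is None:
        # No split possible (all rows identical on these features): predict the mean.
        m = mean(y)
        return (features[0], float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    f, thr, ml, mr = stump
    return ml if row[f] <= thr else mr


def oob_r2(X, y, n_trees=200, n_features=1, seed=0):
    """Out-of-bag R2 of a toy forest of stumps (illustrative, not the CF algorithm)."""
    rng = random.Random(seed)
    n = len(X)
    oob_preds = [[] for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]           # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)     # random descriptor subset
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        for i in set(range(n)) - set(idx):                   # rows this tree never saw
            oob_preds[i].append(predict_stump(stump, X[i]))
    # Average the out-of-bag predictions; skip rows that were never out of bag.
    pairs = [(mean(p), t) for p, t in zip(oob_preds, y) if p]
    y_bar = mean([t for _, t in pairs])
    ss_res = sum((t - p) ** 2 for p, t in pairs)
    ss_tot = sum((t - y_bar) ** 2 for _, t in pairs)
    return 1 - ss_res / ss_tot


# Toy data (made up): the response depends on descriptor 0 only; descriptor 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(60)]
y = [(1.0 if a > 0.5 else 0.0) + data_rng.gauss(0, 0.05) for a, _ in X]
print("out-of-bag R2:", round(oob_r2(X, y, n_features=2), 3))
```

A real Random Forest grows full trees and selects a random descriptor subset at every split rather than once per tree; stumps are used here only to keep the sketch short.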

We developed Random Forest software (CF) suited to QSAR applications. The main features of the CF program are:

  • single and multi-task learning algorithms,
  • applicability domain,
  • Y scrambling procedure,
  • virtual screening,
  • multithreading.
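
As an illustration of what a Y scrambling procedure checks, here is a minimal sketch in plain Python. It does not reproduce the CF program: the functions loo_q2 and y_scrambling, the toy data, and the number of rounds are assumptions, and a k-nearest-neighbour regressor with leave-one-out validation stands in for the Random Forest model to keep the example short. The response values are shuffled repeatedly, the model is re-evaluated each time, and the score on the real data is compared with the scores obtained on the scrambled data.

```python
import random
from statistics import mean


def loo_q2(X, y, k=3):
    """Leave-one-out cross-validated R2 of a k-nearest-neighbour regressor (stand-in model)."""
    preds = []
    for i, row in enumerate(X):
        # Squared Euclidean distance from row i to every other row, paired with its response.
        others = [(sum((a - b) ** 2 for a, b in zip(row, X[j])), y[j])
                  for j in range(len(X)) if j != i]
        others.sort(key=lambda d: d[0])
        preds.append(mean([t for _, t in others[:k]]))
    y_bar = mean(y)
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, y))
    ss_tot = sum((t - y_bar) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def y_scrambling(X, y, score, n_rounds=50, seed=0):
    """Compare the score on the real responses with scores on shuffled responses."""
    rng = random.Random(seed)
    true_score = score(X, y)
    scrambled = []
    for _ in range(n_rounds):
        y_perm = list(y)
        rng.shuffle(y_perm)            # break the link between descriptors and response
        scrambled.append(score(X, y_perm))
    # Empirical p-value: share of scrambled runs that score at least as well as the real one.
    p_value = (1 + sum(s >= true_score for s in scrambled)) / (n_rounds + 1)
    return true_score, scrambled, p_value


# Toy data (made up): the response is a noisy linear function of a single descriptor.
data_rng = random.Random(2)
X = [[data_rng.uniform(0, 1)] for _ in range(40)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]

true_score, scrambled, p_value = y_scrambling(X, y, loo_q2)
print("real data score:", round(true_score, 3))
print("best scrambled score:", round(max(scrambled), 3))
print("empirical p-value:", round(p_value, 3))
```

If the scrambled scores come close to the real one, the apparent predictive ability is likely a chance correlation. Any scoring function with the same (X, y) signature could be passed in place of loo_q2.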
The latest version of the CF program (fully functional, ver. 2.13, updated 25-12-2012) can be downloaded here.
OS: Windows XP/Vista/7

The latest version of the manual for the CF program can be downloaded here.

Original article on the Random Forest method:
Breiman L., Random Forests. Machine Learning 2001, 45, (1), 5-32. - link

Our paper "Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity"
J. Chem. Inf. Model., 2009, 49 (11), pp 2481–2488 - link

© Pavel Polishchuk 2010-2017