AMBIT applicability domain estimation examples
The ambit-model package implements the methods described in:
- Jaworska, J., Nikolova-Jeliazkova, N., & Aldenberg, T. (2005). QSAR applicability domain estimation by projection of the training set descriptor space: a review. Alternatives to Laboratory Animals ATLA, 33(5), 445–459.
- Nikolova-Jeliazkova, N., & Jaworska, J. (2005). An approach to determining applicability domains for QSAR group contribution models: an analysis of SRC KOWWIN. Alternatives to Laboratory Animals ATLA, 33(5), 461–470.
- Netzeva, T. I., Worth, A., Aldenberg, T., Benigni, R., Cronin, M. T. D., Gramatica, P., … Yang, C. (2005). Current status of methods for defining the applicability domain of (quantitative) structure-activity relationships. The report and recommendations of ECVAM Workshop 52. Alternatives to Laboratory Animals ATLA.
- Jaworska, J., & Nikolova-Jeliazkova, N. (2007). How can structural similarity analysis help in category formation? SAR and QSAR in Environmental Research, 18(3-4), 195–207. doi:10.1080/10629360701306050
The appdomain project is a command line application that demonstrates how to use the ambit-model package. The applicability domain algorithms are also available in the Ambit Discovery desktop application and as REST web services in the Ambit web application.
The applicability domain is estimated based on the data in the training set only (independent of the model). The applicability domain estimation is reported for the test set. You may specify one and the same file as both test and training set. The input file formats are recognised by extension (e.g. .csv, .sdf, .cml).
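The "fit on the training set, report on the test set" workflow can be illustrated with the simplest domain definition, a per-descriptor range check. This is only a sketch of the idea, not the ambit-model code: the _modeRANGE method listed in the help output below works on PCA-transformed descriptors, which this example skips. The descriptor names are borrowed from the examples in this README; the values and function names are made up for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]

def fit_ranges(training: List[Row], descriptors: List[str]) -> Dict[str, Tuple[float, float]]:
    """Learn per-descriptor (min, max) bounds from the training set only."""
    return {
        name: (min(row[name] for row in training), max(row[name] for row in training))
        for name in descriptors
    }

def out_of_domain(row: Row, ranges: Dict[str, Tuple[float, float]]) -> int:
    """Return 0 if every descriptor lies inside the training range, else 1."""
    inside = all(lo <= row[name] <= hi for name, (lo, hi) in ranges.items())
    return 0 if inside else 1

# Made-up descriptor values, for illustration only.
training = [
    {"log_P": 1.2, "eLumo": -0.8},
    {"log_P": 3.4, "eLumo": -1.5},
    {"log_P": 2.1, "eLumo": -1.1},
]
test = [
    {"log_P": 2.0, "eLumo": -1.0},  # inside both training ranges
    {"log_P": 5.0, "eLumo": -1.0},  # log_P above the training maximum
]

ranges = fit_ranges(training, ["log_P", "eLumo"])
flags = [out_of_domain(row, ranges) for row in test]
print(flags)
```

Passing the training rows as the test set flags every compound as in domain, which mirrors using the same file for both inputs.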
The result file contains all the properties from the test set, the metric computed by the applicability domain method, and a flag indicating whether the molecule is out of domain (0 - in domain, 1 - out of domain). The output file type is recognised by extension (e.g. .csv, .sdf, .cml).
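A CSV result file can be post-processed with a few lines of Python. The sketch below splits rows by the domain flag. The column names (AD_metric, AD_flag) and the sample rows are hypothetical, since the actual header names are not listed here; check the header of your own output file and pass the right column name.

```python
import csv
import io

# Hypothetical result layout; the real column names may differ.
SAMPLE_RESULT = """\
SMILES,log_P,AD_metric,AD_flag
c1ccccc1N,0.9,0.12,0
CCO,-0.3,0.95,1
c1ccccc1,2.1,0.08,0
"""

def split_by_domain(handle, flag_column="AD_flag"):
    """Split result rows into (in_domain, out_of_domain) lists."""
    in_domain, out_domain = [], []
    for row in csv.DictReader(handle):
        # 0 = in domain, 1 = out of domain
        is_out = int(float(row[flag_column])) == 1
        (out_domain if is_out else in_domain).append(row)
    return in_domain, out_domain

in_domain, out_domain = split_by_domain(io.StringIO(SAMPLE_RESULT))
print(len(in_domain), "in domain;", len(out_domain), "out of domain")
```

For a real file, replace the io.StringIO object with open("result.csv", newline="").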
>java -jar ambit-appdomain-jar-with-dependencies.jar -h
Ambit applicability domain estimation by ambit-models package
usage: net.idea.example.ambit.appdomain.MainApp
 -d,--demo                  Training and test CSV files from PubMed:1732103
 -f,--descriptors <list>    Comma delimited list of field names (as in the
                            input files) to be used as descriptors.
                            Example -f log_P,eLumo,eHomo,IL
 -h,--help                  Ambit applicability domain estimation by
                            ambit-models package
 -m,--method <method>       Applicability domain estimation method:
                            _modeRANGE (PCARanges)
                            _modeLEVERAGE (Leverage)
                            _modeEUCLIDEAN (Euclidean distance)
                            _modeCITYBLOCK (City-block distance)
                            _modeMAHALANOBIS (Mahalanobis distance)
                            _modeDENSITY (Probability density)
                            _modeFINGERPRINTS_CONSENSUS
                            (Tanimoto Fingerprints (consensus))
                            _modeFINGERPRINT_MISSINGFRAGMENTS
                            (Tanimoto Fingerprints (consensus))
                            Example:
                            -m _modeFINGERPRINTS Default value: _modeFINGERPRINTS_CONSENSUS
 -o,--output <output>       Output file (CSV,SDF)
 -r,--threshold <value>     1.0 : all compounds from training set
                            considered in the applicability domain (default);
                            0.9 : 90% of compounds from training set
 -s,--test <file>           Test file (CSV,SDF)
 -t,--training <file>       Training file (CSV,SDF)
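The fingerprint-based methods are labelled "Tanimoto" in the help output. As a reminder of what that coefficient measures, here is a minimal sketch that computes it for two fingerprints represented as sets of on-bit indices. How ambit-model builds its consensus fingerprint is not described here, so this shows only the standard coefficient; the fingerprints below are invented.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by the union of on-bits."""
    if not a and not b:
        return 1.0  # convention chosen in this sketch for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

# Invented fingerprints, given as indices of the bits that are set.
fp_query = {1, 4, 7, 9}
fp_reference = {1, 4, 8, 9, 12}

similarity = tanimoto(fp_query, fp_reference)
print(similarity)
```

A value of 1.0 means identical bit patterns and 0.0 means no bits in common.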
Read the demo files and estimate the applicability domain with the Tanimoto consensus fingerprint method, assuming all of the training set compounds are in the applicability domain. The results are saved to result.csv:
java -jar example-ambit-appdomain-jar-with-dependencies.jar -m _modeFINGERPRINTS_CONSENSUS -d mutagenicity -o result.csv
Read the demo files and estimate the applicability domain by probability density estimation, assuming all of the training set compounds are in the applicability domain:
java -jar example-ambit-appdomain-jar-with-dependencies.jar -m _modeDENSITY -d mutagenicity
Read the training and test CSV files and estimate the applicability domain by probability density estimation, using the listed descriptor fields and assuming 90% of the training set compounds are in the applicability domain:
java -jar example-ambit-appdomain-jar-with-dependencies.jar -m _modeDENSITY -t Debnath_smiles.csv -s Glende_smiles.csv -f log_P,eLumo,eHomo,IL -r 0.9
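The effect of the -r threshold can be pictured as choosing a cutoff so that the given fraction of training compounds falls inside the domain. The sketch below does this for Euclidean distance to the training set centroid. It is an assumption about how to read the threshold, not the ambit-model implementation, and the data points and function names are invented.

```python
import math

def centroid(points):
    """Mean of each coordinate over all points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fit_cutoff(training, threshold=1.0):
    """Return (centre, cutoff) such that the given fraction of training
    points lies within the cutoff distance of the centre."""
    centre = centroid(training)
    distances = sorted(euclidean(p, centre) for p in training)
    # Number of training points that must fall inside the domain.
    keep = max(1, math.ceil(round(threshold * len(distances), 9)))
    return centre, distances[keep - 1]

def flag(point, centre, cutoff):
    """0 = in domain, 1 = out of domain."""
    return 0 if euclidean(point, centre) <= cutoff else 1

# Invented 2-D descriptor values; the last point is an obvious outlier.
training = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5),
            (0.4, 0.6), (0.6, 0.4), (0.5, 0.4), (0.4, 0.5), (5, 5)]

centre_all, cutoff_all = fit_cutoff(training, 1.0)
centre_90, cutoff_90 = fit_cutoff(training, 0.9)
flags_all = [flag(p, centre_all, cutoff_all) for p in training]
flags_90 = [flag(p, centre_90, cutoff_90) for p in training]
print(sum(flags_all), sum(flags_90))
```

With a threshold of 1.0 the outlier stretches the domain to include itself; with 0.9 the cutoff tightens and the outlier is flagged as out of domain.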
- Mutagenicity pubmed:17514565 (structures and descriptors): training set, test set
- EPI Suite KOWWIN files (structures only): training set, test set
- Visualisation: http://ideaconsult.github.io/examples-ambit/appdomain/
- Please use the issue tracker to report bugs: https://github.com/ideaconsult/examples-ambit/issues
- Announcements and discussions at Google+ page
- Download 2.0.0 release
- Download from Maven repository
<dependency>
  <groupId>net.idea.examples.ambit</groupId>
  <artifactId>ambit-appdomain</artifactId>
  <version>2.0.0</version>
</dependency>
<repository>
  <id>nexus-idea-releases</id>
  <url>https://nexus.ideaconsult.net/content/repositories/releases</url>
</repository>
>mvn clean package
After the build, the executable JAR is at target/ambit-appdomain-jar-with-dependencies.jar
>java -jar target/ambit-appdomain-jar-with-dependencies.jar -h