MaxTox
Contact: Indira Ghosh
Categories: Prediction, Descriptor calculation
| Exposed methods: | descriptor calculation, predict |
|---|---|
| Input: | SDF files containing structure + activity (toxicity) |
| Output: | MCS Score |
| Input format: | SDF(MDL) |
| Output format: | Comma separated values and SDF |
| User-specified parameters: | Minimum number of matching atoms and bonds; minimum number of ring atoms and hetero-atoms |
| Reporting information: | MCS, similarity matrices for building the model |
Description:
MaxTox compares the query molecule to each cluster of compounds (clusters are based on toxicological
endpoints, EP) and computes an MCS score with respect to the molecules of each cluster [JWR02].
The MCS score(s) are then used as input to a machine learning algorithm to generate predictive models.
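The two steps above can be illustrated with a minimal Java sketch. It assumes the maximum common substructure (MCS) between two molecules has already been found, and turns its size into a similarity using the normalisation described in [JWR02]. Whether MaxTox uses exactly this normalisation is an assumption, and the class name, the method names and the "best match per cluster" aggregation are hypothetical, chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class McsScoreSketch {

    /**
     * Graph similarity from the size of the common substructure, following [JWR02]:
     * (|V(G12)| + |E(G12)|)^2 / ((|V(G1)| + |E(G1)|) * (|V(G2)| + |E(G2)|)).
     * Assumption: MaxTox may normalise its MCS score differently.
     */
    static double similarity(int mcsAtoms, int mcsBonds,
                             int atomsA, int bondsA,
                             int atomsB, int bondsB) {
        double shared = mcsAtoms + mcsBonds;
        double sizeA = atomsA + bondsA;
        double sizeB = atomsB + bondsB;
        return (shared * shared) / (sizeA * sizeB);
    }

    /** Hypothetical aggregation: best similarity of the query to any cluster member. */
    static double clusterScore(double[] similaritiesToMembers) {
        double best = 0.0;
        for (double s : similaritiesToMembers) {
            best = Math.max(best, s);
        }
        return best;
    }

    /** One comma-separated row: query id followed by one score per cluster. */
    static String toCsvRow(String queryId, Map<String, double[]> similaritiesByCluster) {
        StringBuilder row = new StringBuilder(queryId);
        for (double[] sims : similaritiesByCluster.values()) {
            row.append(',').append(String.format(Locale.ROOT, "%.3f", clusterScore(sims)));
        }
        return row.toString();
    }

    public static void main(String[] args) {
        // Benzene (6 atoms, 6 bonds) vs. toluene (7 heavy atoms, 7 bonds);
        // the common substructure is the benzene ring (6 atoms, 6 bonds).
        double sim = similarity(6, 6, 6, 6, 7, 7);

        // Cluster names and the remaining similarity values are made-up inputs.
        Map<String, double[]> byCluster = new LinkedHashMap<>();
        byCluster.put("endpointA", new double[] {sim, 0.40});
        byCluster.put("endpointB", new double[] {0.25, 0.10});
        System.out.println(toCsvRow("query1", byCluster)); // prints query1,0.857,0.250
    }
}
```

Each row produced this way is a feature vector with one MCS-based score per endpoint cluster, which is the kind of input a machine learning algorithm could be trained on.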
The software is implemented primarily in Java and will be developed for Linux-based systems. MaxTox
depends on the open-source Chemistry Development Kit (CDK)
(http://sourceforge.net/projects/cdk) and OpenBabel (http://openbabel.org). MaxTox may provide a basic
graphical user interface (GUI) in the future; currently it is executed from the command line.
The input format accepted by MaxTox is the widely used MDL file format
(http://www.mdl.com/downloads/public/ctfile/ctfile.jsp). MaxTox output formats are program-specific plain
text files, with the MCS written in SDF format.
Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
Published in 2006, [PRA06] elaborates the hypothesis that it may be possible to
find a set of common scaffold(s) in a diverse compound set that contribute
significantly (positively or negatively) to the biological activity. The present
algorithm extends this hypothesis to derive a predictive toxicity score. This score
will be based on the MCS (Maximum Common Substructure) score with respect to
clusters of compounds (based on toxicological endpoints).
Type of Descriptor:
MCS and similarities
Interfaces:
Priority: Low
Development status:
Homepage: http://www.maxtox.org
Dependencies:
External components: Chemistry Development Kit (CDK), OpenBabel, R
Technical details
Data: No
Software: Yes
Programming language(s): Java, C++
Operating system(s):
Input format: SDF(MDL)
Output format: Comma separated values and SDF
License: GPL
References
[BRO73] Bron, C.; Kerbosch, J. Finding All Cliques of an Undirected Graph. Commun. ACM, 1973, 16, 575-577.
[JWR02] Raymond, J. W.; Gardiner, E. J.; Willett, P. RASCAL: Calculation of Graph Similarity using Maximum Common Edge Subgraphs. The Computer Journal, 2002, 45 (6), 631-644. ISSN 1460-2067
[PRA06] Prakash, O.; Ghosh, I. Developing an Antituberculosis Compounds Database and Data Mining in the Search of a Motif Responsible for the Activity of a Diverse Class of Antituberculosis Agents. J. Chem. Inf. Model., 2006, 46, 17-23.

