Contact: Indira Ghosh
Categories: Prediction, Descriptor calculation
|Input:||SDF files containing structure + activity (toxicity)|
|Output format:||Comma separated values and SDF|
|User-specified parameters:||minimum number of matching atoms and bonds, minimum number of ring atoms, hetero-atoms|
|Reporting information:||MCS, similarity matrices for building the model|
Comparing the query molecule to each cluster (clusters are based on toxicological endpoints, EP) and computing an MCS score
with respect to the molecules of each cluster [JWR02].
Using the MCS score(s) as input to a machine learning algorithm to generate predictive models.
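The two steps above can be sketched roughly as follows. This is a minimal illustration, not MaxTox code: `cluster_score_vector`, `toy_pair_score`, the cluster names, and the choice of aggregating by maximum (or mean) are all assumptions. The toy score is a Jaccard overlap of bond-label sets, standing in for a real MCS score such as the one described in [JWR02].

```python
from statistics import mean


def cluster_score_vector(query, clusters, pair_score, aggregate=max):
    """Return one aggregated score per cluster, in sorted cluster-name order.

    query      -- the query molecule (any representation pair_score accepts)
    clusters   -- dict mapping cluster name -> list of member molecules
    pair_score -- function(query, member) -> float; a real MCS score would go here
    aggregate  -- how member scores are combined per cluster (assumption: max)
    """
    vector = []
    for name in sorted(clusters):
        scores = [pair_score(query, member) for member in clusters[name]]
        # An empty cluster contributes a score of 0.0.
        vector.append(aggregate(scores) if scores else 0.0)
    return vector


def toy_pair_score(a, b):
    """Toy stand-in for an MCS score: Jaccard overlap of bond-label sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical endpoint-based clusters; molecules are sets of bond labels.
clusters = {
    "endpoint_A": [frozenset({"C-C", "C=O", "C-N"}), frozenset({"C-C", "C-N"})],
    "endpoint_B": [frozenset({"C-Cl", "C-C"})],
}
query = frozenset({"C-C", "C=O"})

# One feature per cluster; these vectors would be the input to a learner.
features = cluster_score_vector(query, clusters, toy_pair_score)
features_mean = cluster_score_vector(query, clusters, toy_pair_score, aggregate=mean)
```

Each query molecule yields a fixed-length vector, one entry per cluster, which any standard machine learning method can take as input.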
The software is implemented primarily in Java and will be developed for Linux-based systems. MaxTox
depends on the open-source Chemistry Development Kit (CDK)
(http://sourceforge.net/projects/cdk) and OpenBabel (http://openbabel.org). MaxTox may provide a basic
graphical user interface (GUI) in the future. Currently it is run from the command line.
The input format accepted by MaxTox is the widely used MDL file format
(http://www.mdl.com/downloads/public/ctfile/ctfile.jsp). MaxTox writes program-specific plain
text files, and the MCS in SDF format.
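To illustrate the kind of input involved, here is a minimal sketch of reading SDF records (a molblock followed by data items, each record terminated by `$$$$`) and writing one field out as comma-separated values. It is not part of MaxTox: the function `parse_sdf`, the field name `ACTIVITY`, and the sample record with its value are made up for the example.

```python
import csv
import io


def parse_sdf(text):
    """Split SDF text into (molblock, fields) tuples, one per record."""
    records, mol_lines, fields = [], [], {}
    in_data, name = False, None
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record terminator
            records.append(("\n".join(mol_lines), fields))
            mol_lines, fields, in_data, name = [], {}, False, None
        elif not in_data:
            mol_lines.append(line)
            if line.startswith("M  END"):  # end of the molblock
                in_data = True
        elif line.startswith(">") and "<" in line:
            # Data header such as "> <ACTIVITY>"
            name = line[line.index("<") + 1 : line.rindex(">")]
            fields[name] = ""
        elif name is not None and line.strip():
            fields[name] = (fields[name] + "\n" + line) if fields[name] else line
        else:
            name = None  # a blank line ends the current data item
    return records


# Made-up sample record; the activity value is illustrative only.
SAMPLE = """\
methanol
  hypothetical

  2  1  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 C   0  0
    1.4000    0.0000    0.0000 O   0  0
  1  2  1  0
M  END
> <ACTIVITY>
0.42

$$$$
"""

records = parse_sdf(SAMPLE)

# Write the title line and activity of each record as CSV.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["title", "ACTIVITY"])
for molblock, fields in records:
    writer.writerow([molblock.splitlines()[0], fields.get("ACTIVITY", "")])
```

In practice a toolkit such as CDK or OpenBabel would handle the parsing; the sketch only shows where structure and activity sit within a record.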
Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
Published in 2006, [PRA06] explores the hypothesis that
it may be possible to find, within a diverse compound set, a set of common scaffold(s)
that contribute significantly (positively or negatively) to the biological activity. In
the present algorithm, we propose to extend this hypothesis to derive a predictive
toxicity score. This score will be based on the MCS (Maximum Common Substructure)
score with respect to clusters of compounds (based on toxicological endpoints).
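As a point of reference, [JWR02] defines the similarity of two molecular graphs $G_1$ and $G_2$ from their maximum common edge subgraph $G_{12}$, where $V(\cdot)$ and $E(\cdot)$ denote the vertex (atom) and edge (bond) sets. That the MaxTox MCS score uses exactly this normalisation is an assumption.

```latex
\mathrm{sim}(G_1, G_2) =
  \frac{\bigl(|V(G_{12})| + |E(G_{12})|\bigr)^{2}}
       {\bigl(|V(G_1)| + |E(G_1)|\bigr)\,\bigl(|V(G_2)| + |E(G_2)|\bigr)}
```

The value is 1 when the two graphs are identical and approaches 0 as the common substructure shrinks.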
Type of Descriptor:
MCS and similarities
Programming language(s): Java, C++
Input format: SDF (MDL)
Output format: Comma separated values and SDF
[BRO73] Bron, C.; Kerbosch, J. Finding All Cliques of an Undirected Graph. Commun. ACM 1973, 16, 575-577.
[JWR02] Raymond, J. W.; Gardiner, E. J.; Willett, P. RASCAL: Calculation of Graph Similarity using Maximum Common Edge Subgraphs. The Computer Journal 2002, 45 (6), 631-644. ISSN 1460-2067
[PRA06] Prakash, O.; Ghosh, I. Developing an Antituberculosis Compounds Database and Data Mining in the Search of a Motif Responsible for the Activity of a Diverse Class of Antituberculosis Agents. J. Chem. Inf. Model. 2006, 46, 17-23.