Products & Demos
CICC Chemical Informatics Data Portal
Databases
Quantum Mechanical Database (Varuna)
An integrated system that combines a repository for computational chemistry data with a modeling environment, including automated execution of calculations, computational resource management, and visualization. It currently holds about 500 compounds.
- Web client for a structure search of the database (returns the input and 3D coordinate output file data)
- Demonstration of a .NET workflow
- Site for web services
Local NIH Developmental Therapeutics Program (DTP) Database
This is a local database containing the NIH Developmental Therapeutics Program (DTP) data that can be used for data mining. The database will need to support similarity searching and the extraction of biological fingerprints and gene expression data.
Web Services Link(s):
Local PubChem Database
This is a local copy of PubChem that can be used for data mining. The database will need to handle the complex data in PubChem, and it will be used to prototype new architectures.
Web Services Link(s):
These services are essentially wrapped queries. Naturally, there may be queries that you would like to see that are not present. If so, let Rajarshi Guha know.
- Structure (Usage) Provides methods to get PubChem compound information.
- Synonyms (Usage) Provides methods to get synonyms given a compound or substance CID (PubChem's Chemical Identifier). (SMILES support coming)
- Derived properties (Usage) Gets calculated properties (SLogP and SMRef) given a compound CID. Can also search via exact values and ranges (but this is very slow at the moment).
- Docking results (Usage) Provides methods to get the ligand and target structures for PubChem compounds based on CID, sorted score values, or SMARTS patterns. Ligands are returned in SDF format. (Currently only the ligand structures are accessible; the actual score values are coming soon.)
- 3D structures (Usage) Provides access to MMFF94 optimized 3D structures for PubChem compounds. Structures are returned in SD format and can be accessed by CID or by SMARTS patterns.
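As a rough illustration of what "wrapped queries" means here, the sketch below wraps two SQL queries as functions: a lookup by CID and a range search on a calculated property. The table name, column names, and values are all hypothetical, and an in-memory SQLite database stands in for the real backend, whose schema and service interface are not described on this page.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table (hypothetical schema, made-up values)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE derived_properties (cid INTEGER PRIMARY KEY, slogp REAL)"
    )
    conn.executemany(
        "INSERT INTO derived_properties VALUES (?, ?)",
        [(1, 0.5), (2, 2.1), (3, 4.7)],
    )
    return conn


def get_slogp(conn, cid):
    """Wrapped query: return the stored SLogP for one CID, or None if absent."""
    row = conn.execute(
        "SELECT slogp FROM derived_properties WHERE cid = ?", (cid,)
    ).fetchone()
    return None if row is None else row[0]


def cids_in_slogp_range(conn, low, high):
    """Wrapped query: return CIDs whose SLogP lies in [low, high], ordered by CID."""
    rows = conn.execute(
        "SELECT cid FROM derived_properties "
        "WHERE slogp BETWEEN ? AND ? ORDER BY cid",
        (low, high),
    )
    return [cid for (cid,) in rows]


conn = make_demo_db()
```

A caller never sees the SQL, only the function signature. Range searches over a large table tend to be slow without an index on the property column, which may be one reason a service of this kind performs poorly on ranges.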
Local PubChem Dock Database
The PubChem Dock database aims to store the results of large-scale docking calculations. The stored results include the PDB structures of the targets, the 3D structures of the docked ligands, and the docking scores. We currently evaluate four (arbitrarily chosen) scoring functions provided by OpenEye's FRED, namely:
- chemgauss3
- shapegauss
- oechemscore
- plp
For each scoring function we save the total score as well as the component scores. The database currently has docking results against six proteins (1YC4, 1R1P, 1YC3, 1YC1, 1XP6, 1QKT). We plan on populating it with docking results for families of proteins. One possible use is to screen ligands over families using a similarity approach.
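One way to picture a stored docking record, and the kind of per-target lookup it allows, is sketched below. This is not the database's actual schema: the field names, the component names ("steric", "hbond"), the CIDs, and every score value are invented for illustration. The sketch also assumes that a lower total score is better.

```python
from dataclasses import dataclass, field
from typing import Dict

# The four scoring functions named in the text above.
SCORING_FUNCTIONS = ("chemgauss3", "shapegauss", "oechemscore", "plp")


@dataclass
class DockingResult:
    cid: int          # ligand identifier (hypothetical values below)
    target: str       # PDB code of the protein
    function: str     # one of SCORING_FUNCTIONS
    total: float      # total score
    components: Dict[str, float] = field(default_factory=dict)  # component scores

    def __post_init__(self):
        if self.function not in SCORING_FUNCTIONS:
            raise ValueError(f"unknown scoring function: {self.function}")


def best_per_target(results, function):
    """Return {target: DockingResult} holding the lowest total score per target.

    Assumption: lower totals indicate better poses.
    """
    best = {}
    for r in results:
        if r.function != function:
            continue
        if r.target not in best or r.total < best[r.target].total:
            best[r.target] = r
    return best


# Illustrative records only; these are not real docking results.
results = [
    DockingResult(101, "1YC4", "chemgauss3", -42.0, {"steric": -30.0, "hbond": -12.0}),
    DockingResult(102, "1YC4", "chemgauss3", -55.5, {"steric": -40.0, "hbond": -15.5}),
    DockingResult(101, "1R1P", "chemgauss3", -38.2),
    DockingResult(102, "1YC4", "plp", -61.0),
]
best = best_per_target(results, "chemgauss3")
```

Keeping the component scores next to the total makes it possible to re-rank ligands on a single component later without rerunning the docking.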
Web Services Link(s):
The database can be accessed via a web form as well as with web services. (Usage)
Workflows
We are developing computational workflows using our web service infrastructure and the open-source Taverna workflow tool. The emphasis is on developing workflows that encapsulate important processes in chemoinformatics and drug design, combine diverse kinds of information in novel ways, and are of demonstrated scientific merit.
Below are descriptions of some of the workflows that we have developed, along with example output.
Workflow 1 - Finding relationships between compounds and proteins
NIH SIM SEARCH -> FILTER -> OMEGA -> FRED -> JMOL/HTML
Download Taverna Workflow File
This workflow performs a similarity search on the NIH DTP Human Tumor data, filters the results based on pharmacokinetic properties (FILTER), converts the structures to 3D (OMEGA), docks them into a pre-defined protein (FRED), and visualizes the results (JMOL). This workflow opens up various possibilities, including:
- Finding structures in the DTP that are similar to existing ligands for tumor-related proteins from the PDB, and correlating docking scores with cell-line assay results. This can lead to hypotheses about which proteins are involved in which tumors.
- Testing the possible effectiveness of DTP compounds in other areas (e.g. Alzheimer's disease - see Alzheimer's Workflow) by docking structures to PDB proteins from that therapeutic area.
- Integration of this workflow with other tools such as Sentient Desktop - see the example of using Workflow 1 with Alzheimer's disease in Sentient.
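The first two stages of Workflow 1 can be sketched as chained functions. The sketch assumes a Tanimoto coefficient over fingerprints stored as sets of on-bit indices, and a molecular-weight cutoff as the property filter. Neither choice is confirmed by this page, and the compounds are invented. The OMEGA, FRED, and JMOL stages are external tools and are left out.

```python
from functools import reduce


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query_fp, compounds, threshold):
    """Keep compounds whose fingerprint is at least `threshold` similar to the query."""
    return [c for c in compounds if tanimoto(query_fp, c["fp"]) >= threshold]


def property_filter(compounds, max_mw=500.0):
    """Stand-in for the FILTER stage: drop compounds above a weight cutoff."""
    return [c for c in compounds if c["mw"] <= max_mw]


def run_pipeline(data, stages):
    """Feed the output of each stage into the next, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, data)


# Hypothetical library: ids, fingerprints, and weights are made up.
library = [
    {"id": "A", "fp": {1, 2, 3, 4}, "mw": 320.0},
    {"id": "B", "fp": {1, 2, 3, 9}, "mw": 610.0},
    {"id": "C", "fp": {7, 8}, "mw": 150.0},
]
query = {1, 2, 3, 5}

hits = run_pipeline(
    library,
    [lambda cs: similarity_search(query, cs, 0.5), property_filter],
)
```

Because each stage takes and returns a list of compounds, further stages such as 3D conversion or docking could be appended to the list without changing `run_pipeline`.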
Workflow 2 - HTS data organization and flagging
NIH SCREEN RETRIEVE -> FILTER -> TOXICITY FLAG -> SERIES GENERATION (Divkm) -> VISUALIZATION (VOPlot, 2Dviewer)
This workflow demonstrates how screening data can be flagged and organized for human analysis. The compounds and data values for a particular screen are retrieved, and then are filtered to remove compounds with reactive groups, etc. ToxTree is used to flag the potential toxicities of compounds. Divkmeans is used to add a column of cluster numbers. Finally, the results are visualized using VOPlot and the 2D viewer applet.
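The filter, flag, and cluster steps amount to removing rows from a table and then appending annotation columns. A minimal sketch follows, with three caller-supplied functions standing in for the reactive-group filter, ToxTree, and Divkmeans. The row fields and the toy stand-ins are hypothetical and do not reproduce the behavior of those tools.

```python
def annotate(rows, column, fn):
    """Return new rows with `column` set to fn(row); input rows are not modified."""
    return [{**row, column: fn(row)} for row in rows]


def organize_screen(rows, is_reactive, tox_flag, assign_cluster):
    """Drop reactive compounds, add 'tox_flag' and 'cluster' columns, sort by cluster."""
    kept = [r for r in rows if not is_reactive(r)]
    kept = annotate(kept, "tox_flag", tox_flag)
    kept = annotate(kept, "cluster", assign_cluster)
    return sorted(kept, key=lambda r: (r["cluster"], r["id"]))


# Made-up screening rows; 'reactive' and 'alerts' are hypothetical fields.
screen = [
    {"id": "c1", "activity": 7.2, "reactive": False, "alerts": 0},
    {"id": "c2", "activity": 3.1, "reactive": True, "alerts": 2},
    {"id": "c3", "activity": 2.4, "reactive": False, "alerts": 1},
    {"id": "c4", "activity": 6.8, "reactive": False, "alerts": 0},
]

table = organize_screen(
    screen,
    is_reactive=lambda r: r["reactive"],
    tox_flag=lambda r: r["alerts"] > 0,                     # toy stand-in for ToxTree
    assign_cluster=lambda r: 0 if r["activity"] < 5 else 1,  # toy stand-in for Divkmeans
)
```

The resulting table, one row per compound with its flag and cluster number, is the kind of input a plotting or 2D viewing tool would then display.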


