Molecular informatics

A graphical example of protein-ligand docking using informatics tools

Molecular informatics is an integrative field of science that examines "chemical and biological data on both the molecular and systemic level" using a wide variety of information technologies.[1]

Molecular informatics is related to pharmacoinformatics insofar as it is often used in drug design and discovery for "lead compound identification, drug target identification, and hit-to-lead optimization."[1][2][3] Other applications include protein-ligand and protein-protein docking as well as biomolecular design.


The field of molecular informatics arguably grew out of molecular modeling[4][5][6], a tool of chemoinformatics that uses theoretical methods and computational techniques to replicate the behavior of molecules, often as a three-dimensional representation. Gradually, the fields of biology, chemistry, and informatics began to integrate, as the editors of the journal Molecular Informatics noted in a 2012 editorial:

"Later on, when the few protein structures available could be analysed with the first graphical molecular modelling packages, the automated docking of ligands into the binding cavities of proteins offered a means to generate hypotheses of protein-ligand interactions at the atomic level. These were exciting times for some of the chemists and biologists that envisaged the wealth of opportunities that integrating informatics into those traditional disciplines could offer for gaining a deeper understanding, but also widening the scope, of how small molecules interact with macromolecules."[6]

As informatics and molecular modeling began playing a more important role in chemical and biological research, the fields of bioinformatics and chemoinformatics emerged, with the concept of molecular informatics in turn forming around them.[6][2][7]


Molecular informatics can help tackle problems and tasks such as the following[1][6][8][9]:

  • documenting and studying protein-ligand and protein-protein docking
  • designing, studying, and identifying pharmaceutical solutions for health problems
  • identifying "genetic and epigenetic signals related to diseases"
  • calculating chemical properties of molecular and sub-molecular interactions
  • mining small molecule and drug target databases
  • using machine learning techniques to generate models of candidate molecule properties
  • grouping molecules into practical divisions by biological and physicochemical properties
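Two of the tasks above, calculating a chemical property and grouping molecules by it, can be sketched in a few lines. The following is a minimal illustration, not a method taken from the cited sources: the function names `molecular_weight` and `group_by_weight` are hypothetical, the atomic-mass table covers only a handful of elements, the parser handles only flat formulas without brackets, and the 500 g/mol cutoff is an arbitrary illustrative threshold.

```python
import re
from collections import defaultdict

# Rounded standard atomic weights in g/mol; deliberately incomplete (sketch only).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a flat formula such as 'C9H8O4' (no brackets)."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means one atom of that element.
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


def group_by_weight(formulas: dict, cutoff: float = 500.0) -> dict:
    """Split named formulas into 'small' and 'large' by an assumed mass cutoff."""
    groups = defaultdict(list)
    for name, formula in formulas.items():
        label = "small" if molecular_weight(formula) <= cutoff else "large"
        groups[label].append(name)
    return dict(groups)


if __name__ == "__main__":
    library = {
        "aspirin": "C9H8O4",
        "caffeine": "C8H10N4O2",
        "ciclosporin": "C62H111N11O12",
    }
    for name, formula in library.items():
        print(f"{name}: {molecular_weight(formula):.3f} g/mol")
    print(group_by_weight(library))
```

In practice such properties would come from a cheminformatics toolkit and a curated structure database rather than a hand-written table; the sketch only shows the shape of the computation.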


Structural databases, modeling software, sequence analysis tools, and data mining tools make up a significant portion of the information technology used in molecular informatics. Major databases that map proteins to the small molecules they interact with include the European Bioinformatics Institute's ChEMBL, the Canadian Institutes of Health Research's DrugBank and Human Metabolome Database, and the National University of Singapore's Therapeutic Target Database.[10] Major molecular design and modeling software includes Dassault Systèmes' Discovery Studio, Fujitsu's Scigress, and Schrödinger's Maestro. Data mining tools and techniques include neural networks that classify, model, and auto-associate data; cluster-based compound analyses that use clustering algorithms to find and group potential compounds; and decision trees that use partitioning algorithms to sort through heterogeneous data types.[11][12]
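As a rough illustration of the cluster-based compound analysis mentioned above, the sketch below groups compounds by fingerprint similarity. It makes several assumptions not found in the cited sources: each compound is represented as a set of integer feature identifiers, similarity is measured with the Tanimoto coefficient, and grouping uses a simple leader algorithm with an arbitrary 0.5 threshold. The compound names and fingerprints are made-up toy values, and `tanimoto` and `leader_cluster` are hypothetical helper names.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def leader_cluster(fingerprints: dict, threshold: float = 0.5) -> list:
    """Assign each compound to the first cluster whose leader is similar enough.

    A compound that matches no existing leader starts a new cluster and
    becomes its leader. The result depends on input order.
    """
    clusters = []  # list of (leader_name, member_names)
    for name, fp in fingerprints.items():
        for leader, members in clusters:
            if tanimoto(fp, fingerprints[leader]) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((name, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    # Toy fingerprints: sets of hypothetical substructure feature IDs.
    toy = {
        "cmpd_a": frozenset({1, 2, 3, 4}),
        "cmpd_b": frozenset({1, 2, 3, 5}),
        "cmpd_c": frozenset({7, 8, 9}),
        "cmpd_d": frozenset({7, 8, 10}),
    }
    print(leader_cluster(toy))
```

The leader approach was chosen here only because it is short and needs no external libraries; production workflows typically use fingerprints and clustering routines from an established cheminformatics or machine-learning package.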

Learning and research centers such as the Icahn School of Medicine at Mount Sinai, the University of Cambridge, and Washington University in St. Louis host molecular informatics departments whose informatics portals give access to databases, software tools, and even advanced IT support. These include Mount Sinai's Molecular Informatics Core (MIC), the Cambridge Crystallographic Data Centre (CCDC), and Washington University's Molecular Informatics Portal (MIP).[13][14][15]

More recently, with the advent of "Big Data," molecular informatics is expanding its scope, turning to machine-learning tools for predictive modeling, improved 3-D visualization tools for analyzing molecular structure and function, and system-wide approaches to experimental procedures.[9]


  1. Baumann, Knut; Ecker, Gerhard F.; Mestres, Jordi; Schneider, Gisbert (January 2011). "Molecular Informatics - The First Year". Molecular Informatics 30 (1): 3. doi:10.1002/minf.201190001. 
  2. Flower, Darren R. (2002). "Molecular Informatics: Sharpening Drug Design's Cutting Edge". Drug Design: Cutting Edge Approaches. Royal Society of Chemistry. pp. 1–52. ISBN 9780854048168. Retrieved 06 January 2022. 
  3. Bender, Andreas; Glen, Robert C. (October 2004). "Molecular similarity: a key technique in molecular informatics". Organic & Biomolecular Chemistry 2 (22): 3204–3218. doi:10.1039/B409813G. 
  4. Glen, Robert (December 2002). Developing tools and standards in molecular informatics - Interview by Susan Aldridge. pp. 2745–2747. doi:10.1039/B207793K. 
  5. Korkin, Dmitry (2003). A New Model for Molecular Representation and Classification: Formal Approach Based on the ETS Framework. University of New Brunswick Fredericton. pp. 652. ISBN 0612988686. Retrieved 06 January 2022. 
  6. Baumann, Knut; Ecker, Gerhard F.; Mestres, Jordi; Schneider, Gisbert (January 2012). "Molecular Informatics - A Leading Discipline in a Complex Emerging Field". Molecular Informatics 31 (1): 3. doi:10.1002/minf.201290001. 
  7. Eriksson, L.; Johansson, E.; Kettaneh-Wold, N.; Trygg, J.; Wikström, C.; Wold, S. (2006). "Chapter 21: Chem- and Bioinformatics". Multi- and Megavariate Data Analysis, Part 2, Advanced Applications and Method Extensions. MKS Umetrics AB. pp. 85–97. ISBN 9789197373036. 
  8. Bender, Andreas; Glen, Robert C. (2004). "Molecular similarity: a key technique in molecular informatics". Organic & Biomolecular Chemistry 2 (22): 3204–3218. doi:10.1039/B409813G. PMID 15534697. 
  9. Baumann, Knut; Ecker, Gerhard F.; Mestres, Jordi; Schneider, Gisbert (January 2015). "Systems Approaches and Big Data in Molecular Informatics". Molecular Informatics 34 (1): 2. doi:10.1002/minf.201580131. 
  10. Southan, Christopher; Sitzmann, Markus; Muresan, Sorel (2013). "Comparing the Chemical Structure and Protein Content of ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database". Molecular Informatics 32 (11–12): 881–897. doi:10.1002/minf.201300103. 
  11. Ertl, Peter. "Cheminformatics and its Role in the Modern Drug Discovery Process" (PDF). Novartis. Retrieved 17 June 2015. 
  12. Pahwa, Payal; Papreja, Manju (March 2014). "Validation of SOMFA using Data Mining Technique" (PDF). International Journal of Soft Computing and Engineering 4 (ICCIN-2K14). ISSN 2231-2307. Archived from the original on 23 May 2015. Retrieved 06 January 2022. 
  13. "Experimental Therapeutics Institute > Molecular Informatics Core". Icahn School of Medicine at Mount Sinai. 12 March 2012. Archived from the original on 28 February 2015. Retrieved 06 January 2022. 
  14. "The CCDC Profile". University of Cambridge. Retrieved 06 January 2022. 
  15. "Institute for Informatics - Projects". Washington University in St. Louis. Retrieved 06 January 2022.