Current status
Chemical entities of biological interest, such as small-molecule metabolites, are essential building blocks of biological systems and play crucial roles in human health and disease. Structural classification of these entities is the first step towards understanding those roles, but the number and complexity of known and possible chemical structures make manual classification infeasible. The cheminformatic characterization of small molecules (e.g., prediction of their key properties and calculation of their descriptors) makes them easier to use in other fields of molecular biology.
To use large databases of chemical entities effectively in biology, users need well-designed interfaces that guarantee the interoperability of services and the seamless processing of chemical data. This is especially important for analyses such as drug design and ligand docking, which should rely on robust, reproducible and well-benchmarked screening methods.
Key domains and services in ELIXIR CZ are:
- Support and development of federated services at a global scale. ELIXIR CZ has a history of building interoperability tools that make major chemical databases (including ChEBI, ChEMBL, PubChem and PDBChem) usable in a larger context. The ELIXIR CZ development team has recently focused on fast and user-friendly tools for the similarity- and substructure-based retrieval of chemical entities, and on building SPARQL endpoints for federated use by other services.
- Classification of chemical entities. In collaboration with ELIXIR CH, ELIXIR CZ is currently developing open-source tools for the automated structural classification of chemical entities using the reference ontology ChEBI. ChEBI is an expert-curated ontology of small molecules of biological interest, mainly small-molecule metabolites and sugar polymers.
- Integration of chemical biology resources. ELIXIR CZ is cooperating with EMBL-EBI to develop services for the chemoinformatic characterization of small molecules using methods derived from QSAR/QSPR descriptors. ELIXIR CZ also integrates selected large databases of small molecules into available computational tools and workflows to make these databases more convenient to use.
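As an illustration of the similarity-based retrieval mentioned in the list above, the following minimal sketch ranks a toy library by the Tanimoto coefficient, a measure commonly used to compare molecular fingerprints. It is not the ELIXIR CZ implementation: the function names, the `LIBRARY` contents and the representation of a fingerprint as a set of integer feature identifiers are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # assumed representation: set of "on" feature ids


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features / features present in either set."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def rank_by_similarity(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in library.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy data; real fingerprints would come from a cheminformatics toolkit.
LIBRARY: Dict[str, Fingerprint] = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({1, 2, 5}),
    "mol_c": frozenset({7, 8}),
}
QUERY: Fingerprint = frozenset({1, 2, 3})

if __name__ == "__main__":
    for name, score in rank_by_similarity(QUERY, LIBRARY):
        print(f"{name}\t{score:.2f}")
```

A production service would typically replace the linear scan with an index so that large databases can be searched quickly; the sketch only shows the scoring and ranking step.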
Challenges and goals of ELIXIR CZ
In a pan-European context and in close collaboration with Swiss, UK and EMBL-EBI partners, ELIXIR CZ is ready to address new challenges in the development of specialized workflows and to focus on creating tools that map chemical structures to arbitrary classifications and characterizations. Such tools are necessary for the successful integration of the chemical space into ELIXIR core data resources and other globally available resources.
Most of the effort will be focused on semantic interoperability, which defines current trends in chemical biology and provides an essential bridge between the purely biological disciplines and the chemical space. Several labs of ELIXIR CZ are participating in this trend, with IOCB playing a leading role: it will be responsible for supporting the smooth cross-discipline use of the various tools developed for structural bioinformatics, proteomics, metagenomics and human genomics.
ELIXIR CZ strategy goals in Chemical Biology are as follows:
- Goal 1: Integration of chemical datasets with any resource using the ChEBI ontology
- Goal 2: Interoperability with other ELIXIR Core Data Resources via SPARQL endpoints, making federated cross-database queries possible
- Goal 3: Application of the new results to workflows for automated drug design utilizing chemical information; the interoperability will supply relevant information that can be used automatically for the in silico screening of large collections of chemical data
- Goal 4: Seamless extraction of useful information for computational methods at the interface of proteomics and cheminformatics, supported by specialized software tools for proteome profiling
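To make Goal 2 more concrete, the sketch below assembles a federated query using the `SERVICE` keyword defined by SPARQL 1.1 Federated Query. It only builds the query text and does not contact any server. The endpoint URL, the ChEBI class IRI and the assumed data layout (class membership stored locally, labels stored remotely) are placeholders for illustration, not a description of the actual ELIXIR CZ endpoints.

```python
def build_federated_query(chebi_class_iri: str, remote_endpoint: str, limit: int = 10) -> str:
    """Build a SPARQL 1.1 query that joins local triples with a remote endpoint.

    Assumed (hypothetical) layout: the local store knows which compounds are
    subclasses of a ChEBI class, and the remote endpoint holds their labels.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"""PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?compound ?label WHERE {{
  ?compound rdfs:subClassOf <{chebi_class_iri}> .
  SERVICE <{remote_endpoint}> {{
    ?compound rdfs:label ?label .
  }}
}}
LIMIT {limit}"""


if __name__ == "__main__":
    # Both arguments are placeholders, not real identifiers or endpoints.
    print(
        build_federated_query(
            "http://purl.obolibrary.org/obo/CHEBI_0000000",
            "https://example.org/sparql",
            limit=5,
        )
    )
```

The resulting string could be submitted to any SPARQL 1.1 endpoint that permits federation; the pattern inside `SERVICE` is evaluated by the remote endpoint and joined with the local results on `?compound`.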