Beyond Words: The Expanding Role of Language Models in Biology
White paper
It is human nature to find new uses for recently discovered tools, and scientists are especially quick to adapt them to the practical problems that spark their curiosity.
Transformer-based pre-trained language models (PLMs) such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) models have reshaped the contemporary landscape of artificial intelligence (AI), and their real-world applications continue to expand. Although originally designed for natural languages, these models have also proven effective at handling the languages of nature, such as nucleotide and amino acid sequences.
In this brief article, we illustrate how these language models are reshaping biomedical text mining, functional genomics, and protein engineering, opening fresh avenues for research and for the development of innovative biologics-based therapies.
Omar Kantidze is an experienced team leader and scientist with a PhD in Molecular Biology and a 15-year track record in functional genomics and molecular cell biology.
At Quantori, Omar plays a key role in developing internal scientific research, establishing collaborations with academic partners, and consulting on complex life science-related projects. His vast academic background makes Omar an integral part of Quantori's scientific achievements and reputation.
Previously, Omar served as Head of the Cellular Genomics Department at an academic research organization and as Scientific Director at a major medical research center.
Omar received his PhD and Dr.Sc. degrees from the Institute of Gene Biology and a Master's degree from Lomonosov State University.