GCP Expertise
Case studies

Article Clustering Tool to Improve the Bibliography Search

Challenge

The customer approached Quantori to enhance their bibliography search by developing a tool for article clustering. Although they had a basic prototype on Streamlit, they needed a high-performance production system, as no existing bibliographic tools provided clustering of publications. The goal was to build a complex clustering feature, optimize how graph data is recorded, and reduce the time and memory usage of the search program, which initially handled only one request at a time.

Solution

The Quantori team built a high-performance bibliographic tool using the optimized datasets provided by the customer. We developed a front-end platform with enhanced UX and built the back-end infrastructure on GCP.

Outcome

The article clustering functionality helps identify the most relevant references and core articles, with metrics for ranking authors and publications within clusters. It supports exporting search results to CSV, including separate clusters, 'seed papers,' and search parameters. The system features a user-friendly interface and provides more precise bibliographic search and filtering, delivering unique data quickly.


Calculation of 15,000 Human Genomes using GCP

Challenge

The R&D company needed experts in GCP and genotyping to process terabytes of human genome data and compare it with reference sequences. This would help speed up the discovery phase. 

Solution

To support the project, the Quantori team introduced the GCP infrastructure and the Terra Baer portal environment. Next, we evaluated cutting-edge bioinformatics technologies, including DRAGEN, DeepVariant, and GATK. We established a pipeline that retrieves data from Answer ALS and the 1000 Genomes Project (1KGP) and feeds it into GATK in GVCF format, streamlining the data processing workflow.

Outcome

We implemented a high-performance joint genotyping pipeline, resulting in the successful genotyping of approximately 5,000 genomes from Answer ALS and the 1000 Genomes Project (1KGP).