Drug Discovery Industry

Elucidata helps R&D teams by structuring biomedical data at an unprecedented scale with human-level accuracy to support the growing data needs in the drug discovery domain.

The Problem

According to RBC Capital Markets, approximately 30% of the world's data volume is now generated by the healthcare industry. But data being available is not the same as data being usable. Data cleaning, or data wrangling, remains a challenge in every industry and organization, and it requires extensive knowledge and expertise. Instead of focusing on more valuable scientific work, scientists are often compelled to devote approximately 80% of their time to monotonous data curation tasks. This work demands substantial time and resources, which makes it difficult to scale.

The Solution


To change this situation, Elucidata has developed Polly, a biomedical data platform powered by BioNLP (Biomedical Natural Language Processing). It curates both public and proprietary biomedical data into a FAIR (Findable, Accessible, Interoperable, Reusable) resource with rich metadata annotations. Polly helps bioinformaticians and biologists:

  • Save at least 2000 hours of data wrangling time annually. Offload routine data sourcing and cleaning tasks to Polly while your discovery team focuses on higher-value science.
  • Save 80% on the total cost of sourcing and curating public data. Polly provides your team with a curated and searchable collection of transcriptomics datasets that are ready to be modeled and analyzed.
  • Hit critical milestones 75% faster. Spend more time evaluating target genes or biomarkers on Polly instead of wrangling data.

The Technology

Polly employs a suite of BERT-like models called Polly-BERT, trained on extensively curated biomedical data, to establish connections between metadata and relevant ontologies. This advanced approach achieves accuracy levels comparable to human performance for various Named Entity Recognition (NER) tasks. Polly has cleaned and linked around 125 TB of biomedical data originating from approximately 30 diverse sources.

Traction in the Industry

Polly has enabled the detection of multiple validated drug targets across immunology, oncology, and metabolic disorders using ML-ready data and a scalable data infrastructure that enables easy downstream analysis. The platform's remarkable data curation capabilities, coupled with its collection of meticulously curated RNA-Seq datasets, have positioned it as a catalyst for enhancing productivity in pharmaceutical research.

At present, numerous leading pharmaceutical companies, as well as several smaller biotech firms, are leveraging Elucidata's technology and services to expedite their discovery programs. Elucidata also offers bespoke services tailored to each client's research needs. This helps R&D teams access the best-quality data with rich metadata annotations at the dataset, sample, and feature levels.


Find out how Elucidata helped an early-stage, oncology-focused pharmaceutical company that studies the effect of gene perturbation on cell fate conversion to fast-track gene target identification and validation within months: https://bit.ly/42aria1

Reach out to Elucidata at https://bit.ly/3MzlWPY

About Drug Discovery Innovation Programme 2023

The Drug Discovery Innovation Programme is an invitation-only event and one of the best platforms for learning the latest insights and developing lasting business relationships.

This year’s Drug Discovery Innovation Programme will highlight the challenges discovery pipelines have faced due to COVID-19 and will put a spotlight on the adoption of technology to find solutions.

With over 100 attendees, learn how modernization in R&D processes is fundamentally changing what drug discovery research will look like in the next two to five years.

So, join us in 2023 for an in-person experience and two days of top-level strategic content, current scientific insights, networking, and discussions with leading global pharmaceutical R&D executives.

Companies in attendance for 2023 will include Servier Pharmaceuticals, Monte Rosa Therapeutics, University of Oxford, WPD Pharmaceuticals, AISA Therapeutics, Anima Biotech, PDC*line Pharma, Eli Lilly and Company, Symphogen, IRB Barcelona, Axonis Therapeutics, Genentech, Arakis Therapeutics, Johnson & Johnson, Amgen, Revitale Pharma, Progenra Inc, CERo Therapeutics, Merck, and many more.

So, why wait? Register with us to learn more!