by Ravindra Warang

15 min read

GPT-Rosalind, Explained: What OpenAI's First Life Sciences Model Actually Changes for Pharma R&D

An explainer on OpenAI's first life sciences model: its benchmarks, its gated access, and its implications for global pharma R&D teams.

A British chemist's data, taken without her credit, helped reveal the structure of DNA in 1953. Seventy-three years later, the model OpenAI named after her has just been released into the world's pharma labs, and it has been built behind the kind of gates Rosalind Franklin spent her career being kept outside of.

A model named after a woman who was kept out

In May 1952, in a basement laboratory at King's College London, a 31-year-old chemist named Rosalind Franklin took a single X-ray photograph of a fibre of DNA. She labelled it Photo 51. The image of a cross of dark spots against a pale background was the most precise picture of DNA's structure that had ever been captured.

A few months later, without her permission, the photograph was shown to two researchers at Cambridge: James Watson and Francis Crick. They saw it once. From that single viewing, they reverse-engineered the double helix. The Nobel Prize that followed in 1962 went to Watson, Crick, and Franklin's colleague Maurice Wilkins. Franklin had died of ovarian cancer four years earlier, at thirty-seven, and the prize is not awarded posthumously. Her name was barely mentioned in their acceptance speeches.

On 16 April 2026, OpenAI released its first life sciences reasoning model and named it after her. The choice was deliberate. The irony, less so.

GPT-Rosalind is, by any reading, one of the most carefully gated AI launches the industry has seen. It is available only in the United States, only to vetted enterprise customers, only through what OpenAI calls a Trusted Access Program. The launch partners are Amgen, Moderna, Novo Nordisk, the Allen Institute, Thermo Fisher Scientific, Dyno Therapeutics. The model is named after a woman who, in her own time, was kept outside almost every gate that mattered.

For pharma leaders watching from India, from Europe, from any laboratory not on that partner list — the question this launch poses is not what GPT-Rosalind can do. It is what it means that OpenAI has decided, this carefully, who gets to use it.

What the model actually is

Strip the marketing away, and GPT-Rosalind is a reasoning model fine-tuned for biology.

That distinction matters more than it sounds. GPT-5.4, OpenAI's flagship generalist, is strong across coding, writing, analysis, and broad scientific reasoning. It can explain CRISPR. It cannot, with any confidence, evaluate a guide RNA design and reason about its predicted off-target activity. GPT-Rosalind is built to do the latter.

OpenAI has been precise about where the model is useful. Early-stage discovery: target identification, target validation, mechanism understanding, literature synthesis, and "omics" interpretation — the analysis of genomic, transcriptomic and proteomic datasets. Not formulation. Not commercial. Not patient-facing. The model lives in the first 24 to 36 months of a drug's life, when a hypothesis is still a question.

It is delivered through three surfaces: ChatGPT Enterprise, Codex (OpenAI's coding environment, now extended with a Life Sciences research plugin), and the OpenAI API for internal research workflows. SOC 2 Type 2, HIPAA-aligned, BAAs available, role-based access controls, no training on customer data. The compliance scaffolding was built before the model went out, not after.

One technical clarification, because this is where most early coverage got it wrong: GPT-Rosalind does not fold proteins. It is an orchestrator. When a 3D structure is needed, it calls AlphaFold 3 or Chai-1. When a sequence search is needed, it calls a database. The model's contribution is not raw computation; it is a judgment about which specialist to ask, and how to read the answer.

Think of it less as a new microscope and more as a senior research associate who knows which microscope to use.
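The orchestration pattern described above can be sketched in a few lines. This is a purely illustrative sketch: the tool functions, routing keys, and return values here are invented assumptions, not OpenAI's actual implementation.

```python
# Minimal sketch of an orchestrator: a reasoning layer decides which
# specialist tool to call, then frames the specialist's answer.
# All tool names and routing keys below are illustrative assumptions.

def fold_structure(sequence: str) -> str:
    # Stand-in for a structure predictor such as AlphaFold 3 or Chai-1.
    return f"predicted 3D structure for {sequence[:10]}..."

def search_sequences(query: str) -> str:
    # Stand-in for a BLAST-style sequence database search.
    return f"homologous sequences matching '{query}'"

# Registry mapping a task type to the specialist that handles it.
TOOLS = {
    "structure": fold_structure,
    "homology": search_sequences,
}

def orchestrate(task_type: str, payload: str) -> str:
    """Route a request to the right specialist and label the answer."""
    tool = TOOLS.get(task_type)
    if tool is None:
        raise ValueError(f"no specialist registered for task '{task_type}'")
    return f"[{task_type}] {tool(payload)}"

print(orchestrate("structure", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```

The design point is the registry: the orchestrator's value is in choosing the specialist and interpreting its output, not in doing the specialist's computation itself.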

GPT-Rosalind raises governance questions regulated pharma is still learning to answer. 

The epistemic contract for AI in GxP goes deeper than most labs expect.

→ Read: AI In GxP Pharma: The New Epistemic Contract | Dr. Ajaz S. Hussain


What the benchmarks really say

OpenAI published numbers on three evaluations. They are worth reading carefully and with the calibration that experienced research leaders bring to any vendor-published score.

[Infographic: GPT-Rosalind performance vs GPT-5.4, GPT-5, Grok 4.2 and Gemini 3.1 Pro across the BixBench, LABBench2 and Dyno Therapeutics benchmarks]

BixBench is a bioinformatics benchmark from Edison Scientific. It places an AI agent in front of an empty Jupyter notebook and asks it to do real computational biology, across 53 analysis scenarios and 296 questions. GPT-Rosalind scored Pass@1 of 0.751. GPT-5.4 scored 0.732. GPT-5 scored 0.728. Grok 4.2 scored 0.698. Gemini 3.1 Pro scored 0.550.
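Pass@1 is a standard agent-evaluation metric: the probability that a model's first attempt at a scenario succeeds. BixBench's exact scoring protocol is not public, so treat the following as an illustrative sketch of the widely used unbiased pass@k estimator, with invented per-scenario counts.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of
    k samples drawn from n total attempts (c of them correct) passes."""
    if n - c < k:
        return 1.0  # too few failures left for all k draws to miss
    return 1.0 - comb(n - c, k) / comb(n, k)

# Invented example: each tuple is (attempts, correct attempts) for one scenario.
scenarios = [(10, 8), (10, 7), (10, 9), (10, 6)]
score = sum(pass_at_k(n, c, 1) for n, c in scenarios) / len(scenarios)
print(round(score, 3))  # mean pass@1 across the scenarios: 0.75
```

With k=1 the estimator reduces to the fraction of correct attempts per scenario, averaged over scenarios, which is how a headline figure like 0.751 is read.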

A margin of two points over the best generalist is meaningful. It is not a generational leap.

LABBench2 is the broader test, roughly 1,900 tasks across eleven research-task families covering literature retrieval, sequence manipulation, protocol design and more. GPT-Rosalind beat GPT-5.4 on six of eleven families. The biggest gap appeared in CloningQA, the end-to-end design of DNA and enzyme reagents for molecular cloning protocols.

GPT-5.4 still won on five of eleven. Honest reading: this is a useful tool, not a finished one.

The Dyno Therapeutics evaluation is the one to take seriously. Dyno used unpublished, previously unseen RNA sequences, what OpenAI called "uncontaminated" data: sequences the model had no chance of having seen during training. On sequence-to-function prediction, the model's best-of-ten outputs ranked above the 95th percentile of human experts. On sequence generation, it reached roughly the 84th percentile.

For a model evaluated only on data it had never seen, ranking above 95% of human experts is the kind of result that, if it holds across larger and more varied evaluations, changes how an early-discovery team allocates its time. That is not a benchmark win. That is a workflow signal.
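The percentile claims reduce to simple arithmetic: score each of the model's ten outputs, keep the best, and rank it against the distribution of expert scores. A sketch under assumed data; the scoring scale and all numbers below are invented for illustration.

```python
def percentile_rank(score: float, expert_scores: list[float]) -> float:
    """Percentage of expert scores strictly below the given score."""
    below = sum(1 for e in expert_scores if e < score)
    return 100.0 * below / len(expert_scores)

def best_of_n(model_scores: list[float]) -> float:
    """Best-of-n selection: keep the single strongest output."""
    return max(model_scores)

# Invented example scores on a 0-1 fitness scale, purely for illustration.
experts = [0.42, 0.51, 0.55, 0.58, 0.61, 0.63, 0.66, 0.70, 0.74, 0.81]
model_outputs = [0.44, 0.52, 0.60, 0.67, 0.71, 0.73, 0.75, 0.76, 0.78, 0.79]

best = best_of_n(model_outputs)
print(percentile_rank(best, experts))  # 0.79 beats 9 of 10 experts: 90.0
```

Note the caveat built into best-of-ten: the model gets to discard its nine weaker attempts, while each expert is scored on a single answer. That asymmetry is worth keeping in mind when reading the 95th-percentile claim.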

What it actually does in a lab

The capabilities cluster around the parts of early discovery most strangled by manual data work.

A target-discovery team asks: Is Protein X a viable therapeutic target in disease Y? The model ingests the question and reasons across the published literature, public genomic and proteomic databases, and the team's internal experimental data. It surfaces supporting evidence. It surfaces contradictions. It flags the gaps. It generates competing hypotheses, evaluates them, and proposes follow-up experiments. It interprets multi-omics outputs. It writes the kind of structured synthesis that, until now, took a senior biologist three days.

What separates this from earlier AI assistants comes down to two things: tool use, and skepticism.

Tool use, because GPT-Rosalind is built to call other systems — structure predictors, sequence search tools, statistical environments — and integrate their outputs into its reasoning. It is the orchestrator, not the bottleneck.

Skepticism, because GPT-Rosalind is the first OpenAI model trained to reject weak hypotheses. Not to soften them. Not to help the user find supporting evidence. Reject them. For pharma teams who have lost months chasing AI-suggested targets that did not survive lab validation, this calibration may turn out to be the single most valuable change of all.

There is a separate piece of this launch that most coverage missed, and it matters more outside the United States than inside it. The Codex Life Sciences research plugin is freely available on GitHub. It works with OpenAI's mainline models (GPT-5.4 and others), not just GPT-Rosalind. It connects to more than fifty scientific tools and data sources spanning human genetics, functional genomics, protein structure, biochemistry, clinical evidence, and public study discovery. It includes a research router that synthesises evidence-backed answers, and subagents that work in parallel where data lanes are independent.

For a research team that does not qualify for Trusted Access, which is most of the world, the plugin alone covers a substantial portion of the practical workflow gains. Not all of them. Most of them.


The Trusted Access Program — and why it is the real story

The most distinctive thing about GPT-Rosalind is not what the model does. It is who is allowed to use it.

To qualify for Trusted Access, an organisation must satisfy three conditions: legitimate scientific research with clear public benefit; governance, compliance and abuse-prevention controls already in place; access limited to approved users in secure, well-managed environments. At launch, eligibility is restricted to qualified enterprise customers in the United States. During the research preview, usage does not draw down on a customer's existing OpenAI credits — subject to abuse guardrails.

OpenAI is not selling a model. It is operating a vetted programme. The launch partner list is short and deliberate: Amgen, Moderna, Novo Nordisk, Thermo Fisher Scientific, the Allen Institute, Dyno Therapeutics, Oracle, NVIDIA, Benchling, the UCSF School of Pharmacy. These are not just customers. They are the early evidence base for whether this access model can scale.

Read carefully, this is a meaningful break from how foundation models have been distributed before. A vetted programme — with enterprise governance, no training on customer data, and named institutional users — is materially easier for a CISO or compliance committee to defend than a public API. For pharma buyers in the US, the gating reduces risk. For the rest of the global pharma industry — including India's largest generic manufacturers and most of Europe's research base — it introduces a different kind of risk: structural exclusion at launch.

The competitive subtext, as the trade publication Implicator put it in its launch coverage, is that controlled access may itself be the product. By restricting who gets in and on what terms, OpenAI is doing something foundation model providers have not done before: turning governance into a moat.

The biosecurity question is real

The reason for the gating is not only commercial. It is also genuinely about misuse.

Biology is what biosecurity researchers call a dual-use domain. A model that can interpret genomic data, reason about pathogen biology, and propose molecular modifications to improve binding affinity is, by definition, one that could, in the wrong hands, assist in the design of dangerous pathogens. This is not a hypothetical concern. Biosecurity researchers have flagged it consistently as biology-tuned AI capability has improved.

OpenAI's response combines institutional vetting with technical refusals. Both GPT-Rosalind and the Codex Life Sciences plugin contain hard-coded refusals around biological weapons precursors, gain-of-function pathogen work, and synthesis of controlled substances. Legitimate research on dangerous pathogens remains possible but only through institutional agreements, not the open plugin.

For pharma leaders globally, the practical signal is this: as life sciences AI matures, the regulatory frame around it will tighten, not loosen. The European Union's AI Act, in force since August 2024 and being phased in, classifies AI systems used in medical and life sciences applications as high-risk, triggering mandatory conformity assessments, transparency obligations, and human oversight requirements. Indian regulatory frameworks have not yet caught up, but the direction is set.

Organisations that begin building governance maturity around AI use in R&D today will be substantially better positioned than those that wait for the rules to arrive.

What the partner labs are actually doing

The named launch partners offer a useful preview of where GPT-Rosalind will be tested most intensively.

Amgen has invested in data-driven R&D for over a decade. Sean Bruich, its Senior Vice President of Artificial Intelligence and Data, has framed the collaboration as a way to apply advanced capabilities and tools toward accelerating how Amgen delivers medicines to patients. The key detail is whether Amgen integrates the model into target validation and early biology workflows, or limits it to literature synthesis. The first is transformative. The second is incremental.

Moderna is the most public about its AI ambitions. Stéphane Bancel, its Chief Executive Officer, has stated that Moderna is already using GPT-Rosalind to synthesise complex data and translate insights into experimental workflows. For an mRNA-platform company, the direct applications are rapid antigen identification, codon optimisation, and mRNA sequence design, all of which sit squarely within GPT-Rosalind's claimed strengths.

The Allen Institute offers a different lens. Andy Hickl, its Chief Technology Officer, has highlighted what GPT-Rosalind brings to consistency and repeatability in steps that have always been manual, such as finding data, aligning datasets, structuring queries, and more, inside an agentic workflow. This is the unglamorous middle of the research stack. It is where time savings actually compound.

Novo Nordisk signed a separate, broader strategic alliance with OpenAI on 14 April 2026, just two days before the GPT-Rosalind announcement, covering applications "from drug discovery to commercial." This is the more revealing signal of the two: a major pharmaceutical company committing to embed AI not in a single workflow, but across discovery, manufacturing, supply chain and commercial operations. GPT-Rosalind is one component of that architecture, not the whole of it.

The non-pharma partners are equally telling. Thermo Fisher Scientific operates the laboratory infrastructure layer. Benchling runs the electronic lab notebook used by much of the biotech industry. Oracle and NVIDIA bring data and compute. Together, these are the substrate into which GPT-Rosalind is being woven. The picture, in aggregate, is not a single model being deployed but an emerging stack: model, tools, enterprise infrastructure and vetted access, being assembled around the early-discovery problem.

What this means for Indian pharma and global generic manufacturers

For pharma readers in India and across the world's generic and biosimilar manufacturers, GPT-Rosalind raises a strategic question with two distinct edges.

The first is access asymmetry. The Trusted Access Program, at launch, is United States only. The launch cohort is dominated by branded innovators like Amgen, Moderna and Novo Nordisk. India's major generic manufacturers, which supply a substantial share of generic medicines consumed in the US market, do not currently appear on any partner list. If the geographic and governance restrictions persist as the programme expands, US-based manufacturers who qualify will accumulate AI capability advantages in target identification, formulation design and prior-art analysis that their non-US counterparts cannot match through the same channel.

DrugPatentWatch noted in its launch analysis that between 2026 and 2030, more than $236 billion in branded pharmaceutical revenue is at risk from generic and biosimilar competition. The same model that helps an innovator find new targets can, in different hands, accelerate the design-around work that erodes their patent estate. That dynamic is not new. The speed at which it now operates may be.

The second edge is more immediately actionable. The Codex Life Sciences research plugin is free, on GitHub, and works with OpenAI's mainline models. No Trusted Access required. For Indian R&D, formulation, and regulatory teams, the plugin alone covers a meaningful share of the practical workflow gains: connecting to 50+ scientific databases, automating evidence synthesis, and reducing manual handoffs between literature review, database search, and analysis. The plugin will not give Indian teams everything Amgen has. It will give them substantially more than they had two months ago.

The harder question is what this trajectory means for the structure of the Indian pharma industry over the next five years. India's manufacturing scale, regulatory experience and cost base remain formidable advantages. AI capability (model access, internal infrastructure, governance maturity, and scientific talent that is AI-fluent) is an emerging axis on which competitive position will be redrawn. The window to begin building that capability is now. Not when GPT-Rosalind, or its successors, become generally available.

The AI drug discovery race is already redrawing competitive position globally.

A region-wise breakdown of who is leading and what their platforms actually do.

→ Read: Top AI Drug Discovery Companies In 2026


What GPT-Rosalind is not

A clear-eyed reading of the launch requires saying what the model does not do.

[Infographic: five key limitations of GPT-Rosalind, including not a lab replacement, not a structure predictor, and not a finished product]

It is not a finished product. OpenAI has called it a research preview. The model trailed GPT-5.4 on five of eleven LABBench2 task families. The composition of its training data has not been disclosed in detail. That is a significant gap for a system aimed at regulated R&D, where reproducibility and provenance are not optional.

It is not a general-purpose upgrade. On broad reasoning, coding, or non-biology work, GPT-5.4 remains the stronger model. GPT-Rosalind has traded breadth for depth.

It is not a replacement for laboratory validation. The model can sharpen a hypothesis. It cannot test one. Domain-specific hallucinations, which are confident but incorrect claims about a drug-target interaction or a sequence behaviour, can waste months of bench work or, worse, push flawed reasoning into clinical planning. The Trusted Access framing exists in part because expert partners are needed to audit and validate outputs.

It is not a structure predictor. Folding remains the work of AlphaFold 3, Chai-1 and other specialists.

And on its own, it is not evidence that AI will compress drug development timelines. Implicator, among others, has put this plainly: the real test of GPT-Rosalind will not be benchmark scores. It will be the Investigational New Drug (IND) filings that emerge over the next two to three years from the partner programmes. The pharmaceutical industry has seen many AI launches and few approved molecules. That pattern will hold or break on data, not announcements.

What pharma leaders should track over 24 to 36 months

For senior R&D, regulatory and business leaders, the watchlist is short and concrete.

IND filings from the partner programmes. Amgen, Moderna, the Allen Institute and Novo Nordisk are the early evidence base. Watch for molecules whose origin stories explicitly cite GPT-Rosalind in target selection, hypothesis generation or experimental design. Press releases will not be the signal. Regulatory filings will.

Programme expansion. Watch when and whether OpenAI extends Trusted Access beyond the United States and on what terms. The handling of European applicants, in particular, will reveal how the EU AI Act is being interpreted in practice. Indian access will lag. How it lags will matter.

The next models in the series. OpenAI has been explicit: GPT-Rosalind is the first release in a life sciences series. Subsequent models will likely push deeper into specific subdomains like clinical trial design, regulatory submissions and manufacturing-relevant chemistry. Each release will tell us more about where the largest productivity gains are emerging.

The competitive response. Anthropic has its own life sciences positioning. Isomorphic Labs continues to publish. Amazon Bio Discovery launched the same week as GPT-Rosalind. NVIDIA is building specialist infrastructure. Three years from now, the more interesting question may not be whether GPT-Rosalind worked, but which approach — domain-tuned reasoning models, specialist generative biology models, or vertically integrated discovery platforms — produced the first AI-originated molecules to clear Phase 3.

The Pharma Now read

GPT-Rosalind is neither the breakthrough some early coverage suggested nor the gimmick that AI-fatigued readers might dismiss. It is a serious, deliberately gated, carefully partnered first move from the most influential foundation model company in the world into a domain that has resisted AI's productivity gains for twenty years.

The immediate question for pharma R&D leaders is not whether to be excited. It is whether the organisation has the infrastructure, governance maturity, and AI-fluent scientific talent to use tools like this when access opens. For Indian and other non-US manufacturers, the question is sharper: what gets built in the next eighteen months using the freely available Codex plugin and mainline models will determine how quickly Trusted Access, or its successors, translate into competitive position when the geographic gates ease.

The model is named after a woman whose data was taken without her credit, whose name was barely spoken when the prize was given, and whose contribution to the discovery of DNA's structure was, for too long, undercredited because the institutional gates of mid-twentieth-century science were closed to her.

The naming is, in this sense, more pointed than it first appears.

The next twenty-four months will tell us whether the gates around AI in biology are being constructed responsibly or whether, again, they are being drawn around the wrong people.

Pharma Now will continue to track GPT-Rosalind's deployment, the partner programmes, and the global access trajectory as the research preview evolves. Signals from regulatory filings, not press releases, will tell the real story.

Sources and notes

Primary: OpenAI, "Introducing GPT-Rosalind for life sciences research" (16 April 2026); OpenAI Help Center documentation on GPT-Rosalind and the Trusted Access Program.

Cross-verified secondary: FierceBiotech (17 April 2026); VentureBeat; Pharmaphorum; TechTarget; Lab Manager; Euronews; Implicator; DrugPatentWatch (analysis pieces on GPT-Rosalind, generic drugs, and drug development); Nerd Level Tech; AnthemCreation; PitchBook (January 2026 report on AI drug discovery investment).

Industry commentary: Sean Bruich (Amgen), Stéphane Bancel (Moderna), Andy Hickl (Allen Institute), Kimberly Powell (NVIDIA), Sam Altman (OpenAI).

Benchmarks cited — BixBench by Edison Scientific, LABBench2, the Dyno Therapeutics evaluation. Figures as published by OpenAI on launch and confirmed by FierceBiotech and VentureBeat. Date-stamped to the April 2026 launch window; all figures should be re-verified against any updated OpenAI publications before republication.

Note on Rosalind Franklin's biography — Photo 51 was captured in May 1952 at King's College London; the Watson–Crick paper was published in Nature, April 1953; the Nobel Prize was awarded in 1962, four years after Franklin's death from ovarian cancer in 1958 at age 37. Standard biographical sources.

Author Profile

Ravindra Warang

Editor in Chief
