by Simantini Singh Deo

Can Regulators Trust AI-Generated Scientific Data?

FDA has 500+ AI submissions and rising. Hallucination rates hit 55% on niche topics. Here's the regulatory trust framework pharma must understand.

For decades, the relationship between a regulatory agency and the scientific data submitted to it rested on a simple assumption: a human conducted the experiment, a human recorded the result, and a human could be held accountable if that result turned out to be wrong. 

Artificial intelligence has quietly dismantled that assumption. AI models now generate, analyze, summarize, and in some cases directly produce the scientific data that pharmaceutical companies submit in support of drug approvals, clinical trial designs, and manufacturing quality decisions. 

The question this raises is no longer theoretical or hypothetical. It is rapidly becoming the central regulatory challenge of the current decade: can an agency genuinely trust a scientific result that no human directly produced, verified, or witnessed?

The scale of this shift is already substantial:

  1. The FDA reports over 500 regulatory submissions with AI components since 2016, with usage described internally as rising "exponentially"
  2. Nearly 950 AI- or machine learning-enabled medical devices had been approved by the FDA as of August 2024, up from just 520 by January 2023
  3. A 2025 cross-industry survey found 44% of organizations experienced negative consequences from generative AI use, with average financial losses of $4.4 million per incident
  4. Independent 2025–2026 research found that even advanced large language models exhibit 15–20% hallucination rates on factual citation tasks, rising to 35–55% on niche or recent topics

These numbers frame a genuine and growing tension within the scientific establishment: AI is becoming structurally embedded in the regulatory data pipeline at the exact moment that its propensity for confident, plausible-sounding fabrication is becoming better documented and harder to ignore.


The FDA's Answer: A Risk-Based Credibility Framework

Rather than ignore the problem or attempt to ban AI from regulatory submissions outright, the FDA has moved deliberately to build a formal evaluative structure for it. In January 2025, the agency published draft guidance titled “Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products” (docket FDA-2024-D-4689), the agency's first guidance document specifically addressing AI used to generate or analyze data submitted in support of a regulatory decision.

The central concept introduced in this guidance is "model credibility," defined as trust in the performance of an AI model for a specific Context of Use (COU). Critically, the FDA does not treat AI credibility as a single yes-or-no judgment. Instead, credibility is evaluated relative to exactly how the model's output will be used and what risk that specific use carries.

Flowchart showing the FDA seven-step risk-based credibility assessment process for AI submissions.

The guidance proposes a seven-step risk-based credibility assessment process:

  1. Define the question of interest that the AI model is being used to address
  2. Define the context of use, the specific role and scope of the model within the regulatory submission
  3. Assess the AI model risk, considering both the model's influence on the regulatory decision and the consequence of the model being wrong
  4. Develop a credibility assessment plan, tailored to the level of risk identified
  5. Execute the credibility assessment plan, generating the evidence needed to support trust in the model
  6. Document the results in a transparent, auditable format suitable for regulatory review
  7. Determine adequacy, whether the credibility evidence sufficiently supports the model's intended use within the submission

This framework represents a fundamentally different regulatory philosophy from blanket validation requirements applied uniformly. A model used to triage adverse event reports for human review carries a different level of risk than a model used to generate the primary efficacy endpoint in a pivotal trial, and the FDA's framework is explicitly designed to demand proportionally more evidence as the stakes rise.
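
To make the proportionality idea concrete, here is a minimal Python sketch of step 3, combining model influence and decision consequence into a single risk tier that then drives how much evidence is expected. The three-level scale, the additive scoring rule, and the evidence lists are illustrative assumptions, not taken from the FDA guidance.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def model_risk(influence: Level, consequence: Level) -> Level:
    """Combine model influence and decision consequence into one risk tier.

    Hypothetical rule: add the two scores and bucket the total.
    """
    score = influence + consequence  # ranges from 2 to 6
    if score <= 3:
        return Level.LOW
    if score == 4:
        return Level.MEDIUM
    return Level.HIGH


# Hypothetical evidence expectations per tier (illustrative only).
EVIDENCE_PLAN = {
    Level.LOW: ["performance summary"],
    Level.MEDIUM: ["performance summary", "independent test set results"],
    Level.HIGH: [
        "performance summary",
        "independent test set results",
        "uncertainty analysis",
        "ongoing monitoring plan",
    ],
}

# A triage model whose output is always reviewed by a human.
triage = model_risk(Level.LOW, Level.MEDIUM)
# A model that produces the primary efficacy endpoint of a pivotal trial.
endpoint = model_risk(Level.HIGH, Level.HIGH)

print(triage.name, EVIDENCE_PLAN[triage])
print(endpoint.name, EVIDENCE_PLAN[endpoint])
```

Under these assumed rules the triage model lands in the lowest tier and the endpoint model in the highest, so the second case carries a longer evidence list. A real credibility assessment plan would be negotiated with the agency rather than read from a lookup table.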



Why Hallucination Is The Central Trust Problem

The most consequential threat to regulatory trust in AI-generated data is not bias or technical malfunction in the conventional sense. It is hallucination: the tendency of generative AI models to produce fabricated information that is stated with the same confidence as accurate information. This is not a rare edge case; it is a structural feature of how large language models generate output.

The evidence of how serious this problem has become spans multiple scientific domains:

  1. A January 2026 analysis by GPTZero of 4,841 papers accepted to NeurIPS 2025, one of the most competitive, rigorously peer-reviewed AI conferences in the world, found at least 100 confirmed hallucinated citations spanning 53 papers, despite review by three to five expert reviewers per submission
  2. Hallucination manifests not only as fabricated citations but as fabricated data summaries, invented experimental outcomes, and entire paper-mill-style manuscripts
  3. In medical and legal domains specifically, hallucination rates can exceed 28% without proper grounding techniques, even with retrieval-augmented generation systems designed to reduce the problem
  4. A pharmaceutical industry analysis found that 44% of organizations using generative AI reported negative consequences, with hallucinated or fabricated outputs presented with unwarranted confidence identified as the central barrier to clinical trust

What makes hallucination particularly dangerous in a regulatory context is that it does not look like an error. A hallucinated clinical data summary, a fabricated citation supporting a safety claim, or an invented experimental result can pass a cursory review precisely because it is generated to be plausible, well-formatted, and internally consistent: the exact qualities that make legitimate scientific data trustworthy.
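
One way to make this kind of fabrication visible is to check every cited identifier against a source the reviewer already trusts, instead of judging citations by how plausible they look. The sketch below assumes citations carry DOIs; the `TRUSTED_REFERENCES` registry, its entries, and the function name are hypothetical, and a real workflow would query a bibliographic database rather than a hard-coded dictionary.

```python
import re

# Hypothetical registry of verified references, keyed by DOI.
# All DOIs and titles here are invented for illustration.
TRUSTED_REFERENCES = {
    "10.1000/example.001": "Dose-response modeling in phase II trials",
    "10.1000/example.002": "Adverse event triage with machine learning",
}

# Loose DOI shape: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s,;)\]]+")


def find_unverified_dois(text: str) -> list[str]:
    """Return DOIs cited in `text` that are missing from the trusted registry."""
    unverified = []
    for match in DOI_PATTERN.findall(text):
        doi = match.rstrip(".").lower()  # drop sentence-ending periods
        if doi not in TRUSTED_REFERENCES and doi not in unverified:
            unverified.append(doi)
    return unverified


summary = (
    "Efficacy was consistent with prior work (doi:10.1000/example.001). "
    "Safety findings matched an earlier cohort (doi:10.1000/example.999)."
)
print(find_unverified_dois(summary))  # ['10.1000/example.999']
```

A check like this only confirms that a reference exists. It does not confirm that the reference supports the claim attached to it, which still requires human review.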



Synthetic Data: A Different Category Of Trust Problem

A distinct but related challenge involves synthetic data — datasets generated by AI models (often Generative Adversarial Networks or diffusion models) to augment limited or privacy-restricted real-world data, particularly in clinical trial design and medical imaging. 

Synthetic data is not the same problem as hallucination; it is intentionally generated, not spontaneously fabricated. But it raises its own distinct trust questions for regulators.

Mind map illustrating the four core regulatory challenges of using synthetic data in healthcare.

The core concerns regulators are currently grappling with include:

1. Distribution Shift & Unverifiable Realism — synthetic patient records or trial data may statistically resemble real populations without actually capturing the rare pathologies, edge cases, or demographic nuances present in genuine clinical data

2. Bias Amplification — research in breast imaging found that synthetic data can mitigate or, if misapplied, actively amplify bias across patient subgroups including age, tissue density, and ethnicity

3. Provenance & Auditability — regulators need to be able to trace exactly how synthetic data was generated, what real data it was trained on, and whether that training data itself carries embedded biases or quality issues

4. Fidelity Scoring — emerging best practice calls for synthetic data to carry documented fidelity scores, with AI systems explicitly flagging or auto-rejecting predictions that rely on low-fidelity synthetic inputs rather than presenting them with unwarranted confidence

A 2025 review of synthetic data in healthcare AI concluded that responsible use demands a fundamental shift from quantity to verifiable quality — a principle increasingly echoed across both FDA and EMA guidance as both agencies converge toward shared expectations for AI in regulatory science.
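
The fidelity-scoring idea from the list above can be sketched as a simple gate in front of any prediction that relies on synthetic inputs. This is an illustration only: the `Prediction` fields, the 0-to-1 score scale, and the `MIN_FIDELITY` cutoff are assumptions, since no specific metric or threshold is given here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff; the right value would depend on the context of use.
MIN_FIDELITY = 0.8


@dataclass
class Prediction:
    value: float
    input_is_synthetic: bool
    # Documented fidelity of the synthetic input, assumed to lie in [0, 1].
    fidelity_score: Optional[float] = None


def review_status(pred: Prediction, min_fidelity: float = MIN_FIDELITY) -> str:
    """Classify a prediction as 'accepted', 'flagged', or 'rejected'."""
    if not pred.input_is_synthetic:
        return "accepted"
    if pred.fidelity_score is None:
        # Synthetic input with no documented fidelity cannot be audited.
        return "rejected"
    if pred.fidelity_score < min_fidelity:
        # Low-fidelity input: route to a human instead of reporting as-is.
        return "flagged"
    return "accepted"


predictions = [
    Prediction(value=0.42, input_is_synthetic=False),
    Prediction(value=0.57, input_is_synthetic=True, fidelity_score=0.93),
    Prediction(value=0.61, input_is_synthetic=True, fidelity_score=0.55),
    Prediction(value=0.38, input_is_synthetic=True),
]
for pred in predictions:
    print(pred.value, review_status(pred))
```

The design choice worth noting is that a missing score is treated more severely than a low one: an undocumented synthetic input fails the provenance requirement outright, while a low-fidelity one can still be sent for human review.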


The Regulatory Infrastructure Being Built To Answer The Trust Question

Recognizing that no single guidance document can fully resolve this challenge, the FDA has built out dedicated organizational infrastructure to manage AI oversight on an ongoing basis. This includes:

  1. The Digital Health Center of Excellence (DHCoE), housed within the Center for Devices and Radiological Health, which coordinates AI and digital health policy across the agency
  2. The Digital Health Advisory Committee (DHAC), an external expert committee that held its inaugural meeting in November 2024 to provide guidance on fast-moving AI issues
  3. Two new cross-agency councils established in 2025: an External Policy Council setting principles for AI in regulated products, and an Internal Use Council overseeing how the FDA itself uses AI in its own review processes
  4. Joint FDA-EMA Guiding Principles, expected to align international regulatory expectations for AI credibility assessment by early 2026

The public comment period on the FDA's January 2025 draft guidance closed in April 2025, and the agency has signaled that final guidance, incorporating extensive stakeholder feedback on how to handle generative AI and large language models specifically, is expected in 2026.


Conclusion: Conditional Trust, Not Blind Trust

The honest answer to whether regulators can trust AI-generated scientific data is neither a simple yes nor a simple no. It is conditional, evidence-based, and carefully proportional to risk. 

Regulators are not building frameworks to exclude AI from scientific data generation entirely — the technology's value in accelerating discovery and meaningfully improving data quality is already well established across multiple therapeutic areas. 

Instead, they are building frameworks that demand the same rigor and accountability from an AI model that they have always demanded from human-generated science: clearly defined claims, transparent methodology, and evidence proportional to the consequences of being wrong.

The agencies that ultimately succeed at this will not be the ones that trust AI data unconditionally, nor the ones that reject it reflexively out of caution. 

They will be the ones that build the lasting infrastructure (credibility frameworks, fidelity scoring, hallucination detection, and proportional risk assessment) needed to reliably tell the difference between AI-generated science that genuinely deserves trust and AI-generated science that merely sounds like it does.


FAQs

1) Can Regulators Trust AI-Generated Scientific Data?

Regulators can trust AI-generated scientific data only when there is sufficient evidence demonstrating that the AI system is reliable for its intended use. Modern regulatory frameworks focus on validating the credibility, transparency, and performance of AI models rather than accepting their outputs at face value. The level of scrutiny typically increases as the potential impact of the AI-generated data on patient safety or regulatory decisions grows. This risk-based approach helps ensure that trust is earned through evidence rather than assumed.


2) What Is AI Hallucination And Why Is It A Concern For Regulators?

AI hallucination occurs when an artificial intelligence system generates information that appears accurate and convincing but is actually incorrect or entirely fabricated. In scientific and regulatory settings, hallucinations can include invented citations, inaccurate data summaries, or unsupported conclusions. Because these outputs are often presented confidently and in a professional format, they can be difficult to identify during routine reviews. This makes hallucination one of the most significant challenges when evaluating AI-generated scientific content.


3) What Is Synthetic Data And How Is It Used In Healthcare Research?

Synthetic data is artificially generated information created by AI models to replicate the characteristics of real-world datasets without exposing actual patient information. It is often used to support research, train machine learning models, and supplement limited clinical data. While synthetic data can improve privacy protection and data availability, regulators must ensure that it accurately reflects real-world conditions. Proper validation is essential to confirm that synthetic datasets do not introduce hidden biases or distort scientific findings.

Author Profile

Simantini Singh Deo

Senior Content Writer
