Whether in the boardroom, across operational teams, or in daily task execution, unchecked assumptions are one of the greatest threats to value creation, and they operate at every level of a life sciences company. Strategically, they can divert multi-billion-dollar investments into the wrong solutions or markets [1][2]. Operationally, they can degrade quality and safety, sometimes impacting regulatory standing or patient well-being [2][3]. Tactically, they can cause knowledgeable people to follow flawed processes simply because “that’s the way it’s done” [3].

When assumptions go unchallenged, the risks are profound:

  • Strategic drift, lost market opportunities, and regulatory setbacks [2]

  • Operational errors, inefficiencies, and compliance risk [3][4]

  • Misallocated resources and loss of credibility [1][2]

Elsa: What it is, what was expected, and what happened

What is Elsa? Elsa (“Expert Language Synthesis Algorithm”) is the FDA’s generative AI agent, launched to accelerate the regulatory review of drug and device submissions. Elsa was intended to automate document review and data synthesis and to provide recommendations to FDA scientists and reviewers, making the approval process more efficient and potentially more consistent [3][4][5][6].

Expected outcomes:

  • Accelerated document processing and regulatory approvals [3][7]

  • Reduced administrative burden on human reviewers [3][4]

  • Higher consistency, transparency, and perceived analytic rigor in reviews [5]

  • A model for AI transformation in public health processes [5][7]

Key stakeholders:

  • FDA regulatory reviewers and scientific staff: Users of Elsa’s outputs, responsible for interpreting AI-generated recommendations and making approval decisions [3][4].

  • Sponsor companies (pharma, biotech, medtech): Submit dossiers and rely on FDA decisions; their operational timelines and commercial strategies are impacted by Elsa’s effectiveness [8].

  • Patients and the broader public: Beneficiaries of timely and safe drug/device approvals, whose health depends on the rigor of the review process [5].

  • Policymakers and technology vendors: Monitor and influence AI’s evolving role and governance in public health regulatory frameworks [5][8].

Current state & public statements: After launch, major flaws emerged. Elsa produced convincing but hallucinated study data, sometimes inventing entire studies or fabricating clinical findings [3][4][6]. Despite FDA officials’ reassurance that “many people reviewed the outputs” and that “no single person or tool made the final call,” collective involvement was too often conflated with genuine scrutiny [3][6]. That conflation appears to have fostered the critical assumption that the presence of a human in the loop equated to critical review and validation of the outputs [3][4][6]. Amid these issues and mounting public scrutiny, the FDA paused Elsa’s deployment, initiated internal and external audits, and promised enhanced governance and transparency before any future rollout [3][6].

The missteps

The primary reported challenges with the Elsa deployment include:

  • Hallucinated and inaccurate outputs: Elsa frequently produced fabricated (“hallucinated”) studies, incorrect citations, and inaccurate or incomplete summaries of regulatory and clinical documents. FDA reviewers and internal reports found that some outputs appeared credible but were not factually correct, introducing significant risk into regulatory processes [9][10][11][12][13][14][15].

  • Lack of reliable human oversight: FDA leadership assumed that “many eyes” reviewing Elsa’s outputs would catch errors. In practice, this led to a diffusion of responsibility: human review was often superficial, and reviewers sometimes accepted AI-supplied information without thorough scrutiny [10][11][12][16].

  • Poor benchmarking and transparency: The FDA did not establish clear benchmarks or performance targets for the AI’s outputs. It remains unclear how Elsa’s results are evaluated, when and how humans intervene, and what “success” means for the tool within FDA workflows [9][12][17] (a minimal sketch of what such a benchmark could look like in practice follows this list).

  • Process and governance gaps: The implementation was widely described as rushed, with internal and external experts noting the lack of robust validation, operational guardrails, and clearly defined roles for AI oversight. Concerns about patient safety, data integrity, and accountability in regulatory decision-making arose as a result [12][17][18][16].

  • Systemic overreliance on AI: Elsa was expected to be a productivity breakthrough; instead, it often required extra time for reviewers to spot and correct errors, undermining efficiency and increasing the risk of regulatory missteps [10][11][12][15].
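
To make the benchmarking gap concrete, here is a minimal sketch of the kind of automated acceptance gate the reporting suggests was missing: every study an AI output cites must resolve against a trusted registry before the output reaches a human reviewer. The registry, identifiers, function names, and zero-tolerance threshold below are illustrative assumptions, not a description of any actual FDA or Elsa mechanism.

```python
# Hypothetical sketch of an automated acceptance gate for AI-generated
# summaries. The registry, identifiers, and zero-tolerance threshold are
# illustrative assumptions, not a description of any actual FDA mechanism.

TRUSTED_REGISTRY = {
    "NCT00000102": "Registered trial A",
    "NCT00000104": "Registered trial B",
}

def unverified_citations(cited_ids):
    """Return every cited study identifier that does not resolve in the
    trusted registry; anything unresolved is treated as a possible
    hallucination rather than passed through to the reviewer."""
    return [cid for cid in cited_ids if cid not in TRUSTED_REGISTRY]

def passes_gate(cited_ids, max_unverified=0):
    """A benchmark with teeth: the output is blocked, not merely annotated,
    if more than `max_unverified` citations fail to resolve."""
    return len(unverified_citations(cited_ids)) <= max_unverified

# A plausible-looking but nonexistent study fails the gate and is routed
# back for human investigation instead of silently entering the review file.
print(passes_gate(["NCT00000102"]))                 # True
print(passes_gate(["NCT00000102", "NCT99999999"]))  # False -> escalate
```

The design point is that “success” becomes measurable: a defined threshold, a defined intervention trigger, and a defined escalation path, rather than a hope that reviewers will notice.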

One significant lesson from Elsa’s story? Process alone, even a process promising collective review, is no substitute for genuine challenge and ongoing assumption validation. Oversight too often becomes a box-checking exercise. Assumptions appear to have been made about accountability (that roles and responsibilities were clearly defined) and about process (that keeping humans in the loop would mitigate the risk of flawed outputs). The presence of a “human in the loop” was conflated with scrutiny, feeding the critical assumption that the people reviewing the material were examining it critically and validating the information [3][4][6].

As I analyzed in Strategy’s Silent Saboteur: How Unchecked Assumptions Can Cost Life Science Companies Billions:

“The [lack of] disciplined questioning of underlying assumptions continues to plague even the most sophisticated organizations… When teams accept foundational assumptions without rigorous validation, even the most sophisticated strategic frameworks produce suboptimal outcomes that can destroy years of investment and organizational credibility.” [1]

The ripple effect is clear: whether in the executive suite or at the AI interface, organizations too often fall victim to unvalidated assumptions at every layer of engagement, from strategy through operations to implementation.

What could have prevented Elsa's stumbles over unquestioned assumptions?

To avoid operational failures rooted in assumption blindness, organizations should embed:

  • Assumption audits: Explicit cataloguing and rigorous testing of the premises behind tools, systems, and frameworks. This process helps identify hidden or implicit assumptions that might otherwise go unchallenged, preventing strategies and technologies from being built on faulty foundations. Rigorous assumption audits can mitigate risks such as overlooked vulnerabilities and systemic blind spots.

  • Red teaming and pre-mortem analysis: Assigning dedicated "devil’s advocate" roles or teams to deliberately challenge expected outcomes and proposed processes before launch. This approach sparks critical debates that uncover weaknesses and potential failure points early. Red teaming mitigates the risks of groupthink and confirmation bias, improving the robustness of decisions.

  • Scenario testing: Proactively exploring "what if" situations in which systems produce unexpected results or fail under certain conditions. This anticipates operational contingencies and fosters resilience by preparing plans to detect and correct errors rapidly. Scenario testing mitigates risks linked to rare but impactful failures that standard validation might not capture (see the test-harness sketch after this list).
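
As a concrete illustration of scenario testing, the sketch below seeds a simplified citation-verification gate with known failure modes, including a fabricated study, and checks that each one is caught before reaching a reviewer. The gate, registry, and scenarios are assumptions invented for illustration; in practice such a suite would run against the real pipeline.

```python
# Hypothetical scenario-testing sketch: seed the pipeline with known failure
# modes and assert that each one is flagged before it reaches a reviewer.
# The gate, registry, and scenarios below are illustrative assumptions,
# not a real validation suite.

REGISTRY = {"NCT00000102", "NCT00000104"}  # trusted study identifiers

def gate(cited_ids):
    """Pass only if every cited study identifier resolves in the registry."""
    return all(cid in REGISTRY for cid in cited_ids)

SCENARIOS = [
    # (description, identifiers cited by the AI output, expected gate result)
    ("all citations resolve",       ["NCT00000102"],                True),
    ("one fabricated study",        ["NCT00000102", "NCT99999999"], False),
    ("entirely invented citations", ["NCT99999999"],                False),
]

def run_scenarios():
    """Return the scenarios where the gate's behavior diverges from expectation."""
    return [desc for desc, cited, expected in SCENARIOS if gate(cited) != expected]

# An empty result means every seeded hallucination was caught; each entry
# would be a concrete, testable hole in the "humans will catch it" assumption.
assert not run_scenarios(), f"uncaught failure modes: {run_scenarios()}"
```

The value of this framing is that assumption validation becomes executable: if "humans will catch hallucinations" is the operative premise, a failing scenario is direct evidence that the premise does not hold.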

Building a culture of assumption validation

Elsa’s case underscores that success in digital transformation and AI adoption requires not just technical capability or procedural adherence but a deep organizational commitment to assumption validation. It is about building frameworks and mindsets that demand that every foundational premise, be it strategic or operational, be rigorously challenged and thoroughly validated before it is adopted. Validation reduces the chance of costly failures and strengthens confidence in decision-making at every level.

Elsa’s lessons reach well beyond the FDA. I invite every life sciences executive, innovator, and operator to reflect: Are we truly challenging and validating our underlying assumptions so that our strategic and operational roadmaps rest on solid foundations? Validated assumptions reduce risk and enable stronger, more resilient decision-making. The goal is not to challenge assumptions for their own sake, but to validate them thoughtfully and rigorously, mitigating the risks that arise from flawed or untested premises.

For a detailed framework on assumption validation for strategy and execution, see: Strategy’s Silent Saboteur: How Unchecked Assumptions Can Cost Life Science Companies Billions

References