FDA's AI Clinical Trials Pilot Is a Documentation Problem

The timing

Why this matters now

FDA announced its real-time clinical trials initiative on April 28, 2026, including proof-of-concept work with sponsors and a broader RFI on AI-enabled early-phase trials. The original Federal Register notice was published April 29, 2026. FDA later extended the comment deadline to June 29, 2026.

The RFI is not a binding rule. It is not a final guidance. But it is a very strong signal about where the conversation is going — FDA is asking how AI could support early-phase decisions while maintaining scientific rigor and trustworthy oversight.

The teams that will be best positioned are not the teams using the most AI. They're the teams that can show what the AI did, what it did not do, how humans reviewed it, and how the final decision was justified.

For anyone building or using AI in clinical development, this is the right moment to tighten the documentation habits around AI-assisted work — before a pilot, an inspection, or an Information Request forces the question.

The problem

Most AI use today is invisible

Someone asks a model to summarize a protocol. Someone uses AI to compare an investigator brochure against a clinical protocol. Someone asks a model to draft inclusion and exclusion criteria, rewrite a safety rationale, or hunt for inconsistencies across IND documents.

That can be genuinely useful. But if the output influences regulated work and there's no record of the prompt, the model, the source documents, the assumptions, the limitations, the review, or the final human disposition — the team has quietly created a gap.

The issue is not that AI was used.

The issue is whether the team can reconstruct how AI was used when it mattered.

None of this is a new principle. Regulated industries already have a name for it: ALCOA+, meaning attributable, legible, contemporaneous, original, and accurate, plus complete, consistent, enduring, and available. And there is a regulation that enforces it for electronic records: 21 CFR Part 11. AI doesn't get its own framework. It gets held to the one you already follow. The only genuinely new thing is that some of the input is now produced by a model, which means the model, the prompt, and the human review become part of what "attributable" and "original" have to cover.

For regulatory documents, reconstruction is the whole game. You want to be able to answer:

What information was provided to the AI system?
What version of the AI system was used?
What prompt or workflow produced the output?
What did the system actually produce?
What did the human reviewer accept, reject, or modify?
What source evidence supports the final document text?
Was protected, confidential, or proprietary information handled appropriately?
Was the AI used for brainstorming, drafting, review, or actual decision support?

The four categories in that last question matter. Brainstorming, drafting, review, and decision support carry different levels of risk. Treating them all the same leaves you either too loose on the high-risk uses or too burdensome on the low-risk ones. The work is figuring out which is which.

The checklist

What regulatory teams should document

Here's the practical checklist I'd start using now. None of it requires you to become an AI engineer. It asks you to treat AI-assisted work as part of the document lifecycle — which is exactly what it already is.

Document the AI use case

Don't write "used AI." Write what the AI was used for.

"AI used to identify inconsistencies between the protocol synopsis and the full protocol."
"AI used to summarize nonclinical toxicology findings for drafting support."
"AI used to compare draft inclusion criteria against study objectives."
"AI used to support dose-selection rationale review — not to make the final dose-selection decision."

Separate drafting support from decision support. Helping you write clearer prose is one thing. Influencing a clinical or regulatory decision is another — document it more rigorously.

Save the prompt or workflow version

If the same system gives different results depending on the instructions, then the instructions are part of the method. For repeated workflows, version your prompts the way you version templates, checklists, and SOP-driven work instructions. At minimum: prompt name, version, date, owner, intended use, input document types, output format, known limitations, and required human review steps.

"Protocol Consistency Review Prompt v1.3, used May 31, 2026, to compare the protocol synopsis, schedule of activities, and inclusion/exclusion criteria. Output required human review before any document edits."

Don't rely on one-off prompts buried in chat history. Create approved prompt templates for common workflows.
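
To make the minimum fields concrete, here is a small sketch of a versioned prompt record in Python. It is one possible shape, not a required schema: the class name, field names, and every example value other than the prompt name and version are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PromptTemplate:
    """One approved, versioned prompt. Field names are illustrative."""
    name: str
    version: str
    dated: date
    owner: str
    intended_use: str
    input_document_types: tuple[str, ...]
    output_format: str
    known_limitations: tuple[str, ...]
    required_review_steps: tuple[str, ...]

    def label(self) -> str:
        # Short citation string to paste into an AI Use Summary.
        return f"{self.name} v{self.version}"


# Example values below are hypothetical, apart from the name and version.
protocol_consistency = PromptTemplate(
    name="Protocol Consistency Review Prompt",
    version="1.3",
    dated=date(2026, 5, 1),
    owner="Regulatory Writing (hypothetical owner)",
    intended_use="Compare protocol sections for internal inconsistencies",
    input_document_types=(
        "protocol synopsis",
        "schedule of activities",
        "inclusion/exclusion criteria",
    ),
    output_format="Table of suspected inconsistencies with section references",
    known_limitations=("May miss inconsistencies that span documents not provided",),
    required_review_steps=("Human review before any document edits",),
)

print(protocol_consistency.label())
```

Freezing the record is a deliberate choice in this sketch: a change to the prompt should produce a new version, not a silent edit to the old one.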

Record the model and system version

Where available: model name, vendor or system, version or release date, temperature or variability settings, retrieval settings, and whether the model had access to uploaded files, connected drives, or web search.

If the exact backend version isn't exposed, document what is. "ChatGPT Enterprise workspace, model selected: GPT-5, exact backend version not exposed" beats no record at all.

Save the source documents used as inputs

AI output is only as reliable as its inputs and retrieval. For each AI-assisted review, record which documents were used — title, version, date, and system location.

Protocol v0.4, dated May 20, 2026
Investigator Brochure v2.1, dated April 12, 2026
Nonclinical summary table, exported from the controlled repository on May 30, 2026

If the system retrieved across a repository, save the retrieval scope — "limited to the approved protocol folder," or "searched the draft IND package folder."
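
If you want the input record to prove which exact file was used, not just which version label it carried, one option is to store a checksum alongside the title, version, date, and location. The sketch below uses SHA-256 from Python's standard library. The record shape, the function names, and the example location are assumptions for illustration, and the checksum itself goes beyond the minimum described above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class InputDocument:
    """One document supplied to an AI-assisted review. Illustrative fields."""
    title: str
    version: str
    dated: str
    location: str
    sha256: str


def describe_input(title: str, version: str, dated: str,
                   location: str, content: bytes) -> InputDocument:
    # Fingerprint the exact bytes that were given to the AI system.
    digest = hashlib.sha256(content).hexdigest()
    return InputDocument(title, version, dated, location, digest)


def unchanged(record: InputDocument, content: bytes) -> bool:
    # True only if the file on hand is byte-for-byte the one that was recorded.
    return hashlib.sha256(content).hexdigest() == record.sha256


# Placeholder bytes stand in for the real file contents.
protocol_bytes = b"placeholder protocol text"
protocol_input = describe_input(
    title="Protocol",
    version="0.4",
    dated="2026-05-20",
    location="controlled repository (hypothetical path)",
    content=protocol_bytes,
)
```

In practice you would read the bytes from the controlled copy of the file. The point of the check is that a later reviewer can confirm the document they are looking at is the one the AI actually saw.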

Retain conversations when they influence regulated work

This is the big question: which conversations do you actually need to keep? My answer is not every conversation, but definitely the meaningful ones. Casual brainstorming that goes nowhere doesn't need to be preserved turn-by-turn. But when a conversation materially influences regulated content, such as a clinical rationale, a submission document, a response strategy, or a quality decision, preserve enough to reconstruct the work. Save the conversation, or an audit summary, when AI was used to:

Generate or revise regulatory document text
Identify discrepancies across source documents
Support a clinical, nonclinical, CMC, or safety rationale
Recommend changes to protocol design
Draft a response to FDA or another regulator
Support go/no-go, dose selection, cohort expansion, or patient selection

If exporting full conversations is too messy, write an "AI Use Summary" instead: prompt, inputs, output, reviewer, disposition, and the final source-backed conclusion.

Capture human review and disposition

AI should not be the final author of a regulatory judgment. For each material output, record what the human reviewer did with it. The categories can be simple — Accepted, Accepted with edits, Rejected, Needs follow-up, Escalated — plus reviewer name, role, date, and rationale.

"AI flagged a mismatch between exclusion criterion 7 and the renal-impairment language in Section 5.2. Clinical lead confirmed the mismatch. Protocol updated in v0.5. Accepted with edits."

Don't only save AI outputs. Save the human decision about the output. That decision is the record that actually protects you.
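
The disposition categories are simple enough to pin down in a few lines. Here is a minimal Python sketch of a review record that refuses to exist without a reviewer and a rationale. The class and field names are my own assumptions, and the reviewer name and review date in the example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Disposition(Enum):
    ACCEPTED = "Accepted"
    ACCEPTED_WITH_EDITS = "Accepted with edits"
    REJECTED = "Rejected"
    NEEDS_FOLLOW_UP = "Needs follow-up"
    ESCALATED = "Escalated"


@dataclass(frozen=True)
class ReviewRecord:
    """The human decision about one AI output. Illustrative fields."""
    finding: str
    disposition: Disposition
    reviewer: str
    role: str
    reviewed_on: date
    rationale: str

    def __post_init__(self):
        # A disposition without a named reviewer and a reason is not a record.
        for field_name in ("finding", "reviewer", "role", "rationale"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be blank")


example_review = ReviewRecord(
    finding=("AI flagged a mismatch between exclusion criterion 7 and the "
             "renal-impairment language in Section 5.2."),
    disposition=Disposition.ACCEPTED_WITH_EDITS,
    reviewer="Reviewer Name (hypothetical)",
    role="Clinical lead",
    reviewed_on=date(2026, 6, 1),  # hypothetical review date
    rationale="Clinical lead confirmed the mismatch. Protocol updated in v0.5.",
)
```

Using a fixed set of dispositions rather than free text is what makes the decision filterable later.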

Link final claims back to source evidence

For regulatory writing, the final document should never rely on AI as the source. AI can help find, compare, summarize, or draft — but the claim has to trace back to controlled evidence: a study report, a protocol section, the investigator brochure, a literature reference, the SAP, a nonclinical data table, a safety database extract, or prior agency correspondence.

Use AI to generate a "claim-to-source" table, then have a human verify it. That's one of the highest-value uses of AI in regulatory writing — and one of the easiest to audit.
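
A claim-to-source table is easy to audit because the failure cases are mechanical: a claim with no source, or a claim nobody has verified. The Python sketch below shows one way to surface those rows. The row structure and function name are assumptions, and the second and third example rows are placeholders, not real findings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClaimRow:
    claim: str
    source: Optional[str]        # controlled document and section, or None
    verified_by: Optional[str]   # human who checked the claim against the source


def unresolved(rows: list[ClaimRow]) -> list[ClaimRow]:
    # A row is resolved only when it has both a source and a human verifier.
    return [row for row in rows if not row.source or not row.verified_by]


claim_table = [
    ClaimRow(
        claim="Exclusion criterion 7 conflicts with the renal-impairment language",
        source="Protocol v0.4, Section 5.2",
        verified_by="Clinical lead",
    ),
    ClaimRow(
        claim="Placeholder claim drafted by AI, source located but not yet checked",
        source="Investigator Brochure v2.1",
        verified_by=None,
    ),
    ClaimRow(
        claim="Placeholder claim drafted by AI, no source found",
        source=None,
        verified_by=None,
    ),
]

for row in unresolved(claim_table):
    print("Needs attention:", row.claim)
```

Note that the check never treats the model as a source. An empty source field stays empty until a human points it at controlled evidence.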

Define what AI is not allowed to do

Document the boundaries, not just the capabilities.

AI may suggest inconsistencies, but cannot approve protocol changes.
AI may summarize safety data, but cannot determine causality.
AI may draft response language, but cannot submit or approve agency responses.
AI may identify missing evidence, but cannot decide that evidence is unnecessary.

Put these boundaries directly into SOPs, work instructions, and prompt templates — where people will actually see them.

Protect confidential and patient information

Before using AI with clinical or regulatory documents, know what data is being shared, where it's processed, whether it's retained, and whether the system is approved for that data type. Ask: Is PHI included? Is confidential sponsor information included? Is the environment approved by IT, legal, privacy, and security? Are uploads retained for training? Can data be deleted or exported? Are access controls and audit logs available?

Don't paste sensitive protocol, patient, safety, or CMC information into unapproved tools. The convenience is never worth the confidentiality and recordkeeping risk.

Build an AI documentation packet for high-impact workflows

For higher-risk uses, assemble a small packet that travels with the document history: the AI use summary, the prompt or workflow version, model/system information, the input document list, the output or exported conversation, the human review record, accepted/rejected findings, source-evidence links, the final document version affected, and any open discrepancies.

Keep it light enough that people will actually do it. A good one-page record beats a perfect system nobody uses.
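
If the packet is tracked as structured data rather than as loose files, a completeness check takes a few lines. This Python sketch lists the packet items named above and reports which ones are absent. The key names and the dictionary representation are assumptions for illustration.

```python
# Packet items as listed in the text. Key names are illustrative.
PACKET_ITEMS = (
    "ai_use_summary",
    "prompt_or_workflow_version",
    "model_system_information",
    "input_document_list",
    "output_or_exported_conversation",
    "human_review_record",
    "accepted_rejected_findings",
    "source_evidence_links",
    "final_document_version_affected",
    "open_discrepancies",
)


def missing_items(packet: dict) -> list[str]:
    # An item counts as missing if the key is absent, None, or an empty string.
    # An empty list is allowed, so "no open discrepancies" can be recorded as [].
    return [item for item in PACKET_ITEMS if packet.get(item) in (None, "")]


draft_packet = {
    "ai_use_summary": "see AI Use Summary record",
    "prompt_or_workflow_version": "Protocol Consistency Review Prompt v1.3",
    "input_document_list": ["Protocol v0.4", "Investigator Brochure v2.1"],
    "human_review_record": "Accepted with edits",
    "final_document_version_affected": "Protocol v0.5",
    "open_discrepancies": [],
}

print(missing_items(draft_packet))
```

The design choice worth copying is the distinction between "nothing to report" and "not recorded". An explicit empty list says someone looked, while a missing key says nobody did.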

A worked example

If you're a Word and SharePoint shop

Most of the small and mid-cap biotechs I talk to aren't going to stand up a new system for this. They write in Word and they store files in SharePoint. So here's how I'd actually do it there — no new tooling, no migration, just discipline applied to where the work already lives.

The principle: the record lives next to the document it describes.

Store the AI Use Summary as a one-page .docx in the same SharePoint library as the document it supports, named to mirror that document and the version it tracks:

/IND-2026/Module-2/
    6.6-nonclinical-overview_v0.4.docx
    6.6-nonclinical-overview_AI-USE_v0.4.docx   <- the record
    6.6-nonclinical-overview_v0.5.docx          (after review)
    6.6-nonclinical-overview_AI-USE_v0.5.docx

That naming convention does real work. Anyone opening the folder six months from now can see, at a glance, that an AI-assisted step happened, which document version it touched, and where to read the details. The record didn't get separated from the thing it explains.
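
A convention like this is only useful if it's applied consistently, so it can help to derive the record name rather than type it. The Python sketch below builds the AI-USE filename from a document filename and flags any record whose matching document is no longer in the folder. It assumes filenames end in `_v<major>.<minor>.docx` as in the listing above, and the function names are hypothetical.

```python
import re

# Matches names like "6.6-nonclinical-overview_v0.4.docx".
_VERSIONED = re.compile(r"^(?P<stem>.+?)_v(?P<version>\d+(?:\.\d+)*)\.docx$")
_RECORD_TAG = "_AI-USE_v"


def ai_use_record_name(document_name: str) -> str:
    """Return the AI Use Summary filename that mirrors a document filename."""
    match = _VERSIONED.match(document_name)
    if match is None or _RECORD_TAG in document_name:
        raise ValueError(f"not a versioned document name: {document_name!r}")
    return f"{match['stem']}{_RECORD_TAG}{match['version']}.docx"


def orphaned_records(filenames: list[str]) -> list[str]:
    """List AI-USE records whose matching document is not in the same folder."""
    names = set(filenames)
    orphans = []
    for name in sorted(names):
        if _RECORD_TAG in name and name.replace(_RECORD_TAG, "_v", 1) not in names:
            orphans.append(name)
    return orphans


folder = [
    "6.6-nonclinical-overview_v0.4.docx",
    "6.6-nonclinical-overview_AI-USE_v0.4.docx",
    "6.6-nonclinical-overview_v0.5.docx",
    "6.6-nonclinical-overview_AI-USE_v0.5.docx",
]

print(ai_use_record_name("6.6-nonclinical-overview_v0.4.docx"))
print(orphaned_records(folder))
```

The check runs in one direction on purpose. Not every document version involves AI, so a document without a record is not necessarily a problem, but a record without its document always is.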

Three SharePoint settings turn that folder into something an auditor can trust:

Turn on Versioning for the library (Library settings → Versioning settings → major versions). Version history gives you a record of who changed what and when, in a tool you already own.
Add a "Disposition" column to the library (a Choice column: Accepted / Accepted with edits / Rejected / Needs follow-up / Escalated). Now the human decision is filterable metadata, not a sentence buried in a file.
Add "Reviewer" and "Review date" columns. Accountability becomes a property of the document, visible in the list view, instead of something you reconstruct later from memory.

Be honest with yourself about one thing here.

SharePoint version history is a good habit. It is not a 21 CFR Part 11 audit trail.

A site administrator can edit metadata, prune versions, or turn versioning off, and SharePoint's native history doesn't capture the old value, the new value, and the reason for the change the way Part 11 expects. So this recipe is the right floor for a small team starting out — it is genuinely better than a chat tab — but if an AI-assisted step feeds something that has to withstand an inspection, the record needs to live in a system with a real, tamper-evident, append-only audit trail. That gap is exactly the line where a purpose-built validated system stops being optional. Know which side of it you're on.

The one-page .docx itself is just the checklist above, collapsed onto a single page: use case, prompt name and version, model and system, input documents with versions, the output or a link to the exported conversation, the reviewer's disposition and rationale, and the source-backed conclusion. Make it a Word template so it's one click to start, and so every contributor produces the same shape of record.

One discipline that pays for itself: keep the reviewer and the disposition in SharePoint and in the document — not in a chat tab. A chat history is not a record. The moment the conversation lives only inside a vendor's web app, it's outside your controlled environment, outside your retention policy, and outside the folder the next person will look in.

This is the same logic that shapes how we built Chrona Bio: documents stay in your tenant, every finding anchors to a specific document version with a checksum, and dismissals require a rationale that's written down. You don't need our product to adopt the habit. You do need the habit.

The shift

Document the document's reasoning environment

The old habit was: write the document, then QC the document. The new habit may need to be: document the document's reasoning environment — source documents, AI tools, prompts, assumptions, human review, and final disposition.

This doesn't mean regulatory writers become AI engineers. It means regulatory teams treat AI-assisted work as part of the document lifecycle. If AI helps find inconsistencies across IND documents, that's useful. If it helps identify missing evidence, that's useful. If it helps a team move faster, that's useful too.

But the value only holds up if the team can explain what changed, why it changed, and who decided.

Where to start

Three small changes you can make this week

If your team is using AI in regulatory writing today, you don't need a governance program to begin. Start here:

Create approved prompt templates for common workflows.

Stop relying on one-off prompts buried in chat history. Version them like you version everything else.

Require an AI Use Summary when AI output affects regulated content.

One page, stored next to the document it supports. Prompt, inputs, output, reviewer, disposition, source-backed conclusion.

Link every accepted AI-assisted finding back to a controlled source.

The claim traces to evidence — not to the model. AI helped you find it; the source is why it's true.

That's not a complete AI governance program. It's a practical foundation — and it's the difference between explaining your work in an afternoon and reconstructing it under pressure.

FDA's RFI is a reminder that AI in clinical development is moving from experimentation toward operational reality. The teams that prepare now will have an easier time explaining their work later.
