Do I have to review AI scribe notes before they go in the chart?

Yes — and it's not just good practice, it's where the legal and clinical responsibility lives. Guidance the Texas Medical Liability Trust published in July 2025, drawing on Federation of State Medical Boards positions, states that the clinician remains fully responsible for the content of all medical documentation regardless of how it was generated. That guidance is physician-facing, but the underlying principle — that the licensed professional, not the software, is accountable for the record — generalizes to all licensed clinicians. Review the draft against your own memory of the session, edit anything that's off, and sign only what you can stand behind. Automatic signatures are specifically discouraged.

Who is legally responsible for an AI-generated clinical note?

You are — the licensed clinician who signs it. AI can't hold a license, can't be sued, and can't testify. Per the FSMB-based guidance summarized by the Texas Medical Liability Trust (July 2025), the clinician remains fully responsible for the content of all medical documentation regardless of how it was generated; that responsibility doesn't transfer to the software just because a tool drafted the text. When you sign, you are attesting that the note reflects your professional judgment of what happened. That attestation is the whole point of the signature, and it can only land on a person.

Can AI transcription tools really make up things that were never said?

Yes, and it's documented. A study led by researchers at Cornell, the University of Washington, NYU, and the University of Virginia, presented at the ACM FAccT conference in June 2024, ran more than 13,000 clear audio clips through OpenAI's Whisper. The Cornell release reports that roughly 1% of transcriptions contained entirely hallucinated phrases: text the model generated that no speaker had said. In one case the tool fabricated violent content that wasn't in the audio. Associated Press coverage of the same study counted 187 hallucinations across those clips. Hallucinations clustered around pauses and silences, so they disproportionately affected people with aphasia or other speech differences. This is a known failure mode of AI speech-to-text, not a hypothetical, which is exactly why a human who was in the room needs to check the draft.

What makes an AI scribe a 'black box,' and why is that a problem?

A black-box scribe is one where the AI summarizes or transcribes the session, the text drops into the chart, and no human ever compares it to what was actually said. The most extreme version deletes the original audio, so there's no ground truth left to check against. As reported by the Associated Press in October 2024, one widely used Whisper-based medical scribe deletes the source recording 'for data safety reasons.' A former OpenAI engineer put the problem plainly: 'You can't catch errors if you take away the ground truth.' The fix isn't a better model — it's a human in the loop and a record you can verify.

What should I ask before choosing an AI scribe for my practice?

Ask five things, and get the answers in writing. (1) Does a licensed clinician review and sign before anything enters the record? (2) Can I see and edit the draft against the source, and is the original audio or transcript retained so I can verify it? (3) Is my session data used to train any shared or foundation model? (4) Does the vendor sign a Business Associate Agreement? (5) Who is the author of record — and is the answer 'me'? If a vendor auto-signs notes, won't sign a BAA, or can't tell you who the author of record is, walk away.

Why a Clinician Must Sign Every AI-Drafted Note

The Note Is Yours, Even When AI Drafts It

An AI scribe can hand you back the evening you used to spend charting. That's a real gain, and you don't have to feel guilty about wanting it. Being more present in the room, finishing your day on time — those are good outcomes for you and for the people you treat.

But the moment an AI draft becomes part of a client's record, the question changes. It's no longer "did this save me time?" It's "who is accountable for what this note says?" That record might one day be read by the client, a licensing board, an auditor, or a court. You need to be able to stand behind every line of it.

Here's the thesis of this post, stated plainly: a black-box scribe whose output nobody reads is a liability, not a convenience. The safeguard that holds isn't a smarter model or a lower error rate. It's a licensed human who reads the draft against their own memory of the session and signs it. That principle, that the clinician is always the author of record, is the fourth plank of our No AI Therapist Pledge, and this post is the case for why that plank matters more than almost anything else an AI EHR can promise.

What a "Black-Box Scribe" Actually Means

Let's define the failure mode precisely, because the danger is specific. An AI scribe listens to (or transcribes) a session, generates a summary or a note, and that text drops into the chart. In a black-box setup, no human ever compares the generated text to what was actually said. The draft becomes the record by default.

That would be fine if AI transcription only ever misheard — a wrong word here, a garbled name there, the kind of error you'd catch on a quick read. But the documented risk is stranger and more serious than mishearing. AI transcription doesn't just get words wrong. Sometimes it invents words, sentences, and events that were never spoken at all.

One important boundary before we go further: this post is about AI scribes — tools that document a session. It is not about AI "therapist" chatbots that try to deliver care. Those are a different technology with a different set of harms, and conflating the two muddies the argument. The evidence below is specifically about speech-to-text and note generation.

The Evidence: AI Transcription Invents Words No One Said

The clearest study on this comes from researchers at Cornell, the University of Washington, NYU, and the University of Virginia. Titled "Careless Whisper: Speech-to-Text Hallucination Harms," it was presented at the ACM Conference on Fairness, Accountability, and Transparency (FAccT) in June 2024. The team examined OpenAI's Whisper, one of the most widely used open speech-to-text models.

The Cornell release reports that roughly 1% of transcriptions contained entirely hallucinated phrases — passages of text the model generated that no one had said. In one example, the tool correctly transcribed a single sentence and then invented several more sentences of violent content, with words like "terror," "knife," and "killed" that were nowhere in the original audio. (Coverage of the same study by the Associated Press put a count on it: 187 hallucinations across more than 13,000 clear audio snippets.) Hallucinations clustered around pauses and silences, which means they disproportionately affected speakers with aphasia or other speech differences — exactly the kind of clients a behavioral-health practice often serves.
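The two figures above can be reconciled with one line of arithmetic. The sketch below is only a sanity check under stated assumptions: it uses exactly 13,000 clips as the denominator (the reporting says "more than 13,000") and treats each counted hallucination as falling in a separate clip, so the result is an approximate upper bound rather than the study's own rate.

```python
# Reconcile the AP count with the "roughly 1%" figure from the Cornell release.
# Assumptions (not from the study itself): the denominator is exactly 13,000,
# and each of the 187 hallucinations is in a different clip.
hallucinations = 187
clips_lower_bound = 13_000

share = hallucinations / clips_lower_bound
print(f"{share:.2%}")  # prints 1.44%
```

Because the real clip count is somewhat above 13,000, the true share sits at or slightly below this value, which is consistent with "roughly 1%".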

The pattern shows up beyond the lab, too. An Associated Press investigation by Garance Burke and Hilke Schellmann, published October 26, 2024, gathered accounts from people working with Whisper in the field. A University of Michigan researcher found hallucinations in 8 of every 10 public-meeting transcriptions he reviewed. A machine-learning engineer found them in about half of more than 100 hours of audio. A developer found them in nearly all of 26,000 transcripts he created. These figures come from different corpora measured in different ways, so they do not add up to a single clinical hallucination rate. Together, though, they show this is a recurring, real-world behavior, not a one-off.

The healthcare-specific kicker is the part every clinician should sit with. As the AP reported, a Whisper-based medical scribe from the company Nabla has been used by more than 30,000 clinicians and 40 health systems across roughly 7 million medical visits. And that tool deletes the original audio recording — for stated data-safety reasons — which means there is no ground truth left to check the transcript against. A former OpenAI engineer summed it up: "You can't catch errors if you take away the ground truth." OpenAI itself has warned that Whisper should not be used in "high-risk domains." A clinical record is the definition of a high-risk domain, and a tool that destroys the only thing you could check the note against is the definition of a black box.

A note that no human read, generated by a tool that deleted the audio, is not documentation. It's an unverifiable claim about what happened in the room.

To be clear about scope: the Nabla and Whisper details above are an industry example of the black-box pattern, drawn from public reporting. They describe how some tools in the market behave, not how CoralEHR's own pipeline works. How CoralEHR produces drafts is covered separately below.

Why "The Model Will Get Better" Isn't the Answer

The natural reply is: sure, but these models improve every year, and the error rate keeps dropping. True — and still beside the point.

A lower hallucination rate is not a zero hallucination rate. And in a clinical record, the cost of the rare invented line is asymmetric. A medication that was never prescribed. A symptom the client never reported. A risk statement whose meaning flips from "denied suicidal ideation" to something else entirely. One of those errors, sitting unread in a chart, can do more damage than a hundred saved hours can repay.
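To see why a small rate still matters at the scale of a caseload, here is a minimal sketch of the underlying probability. The per-note rates and the 1,000-note caseload are illustrative assumptions, not measured clinical figures, and the formula assumes each draft fails independently of the others.

```python
def chance_of_at_least_one(per_note_rate: float, notes: int) -> float:
    """Probability that at least one of `notes` drafts contains an invented line.

    Assumes every draft has the same, independent chance `per_note_rate`
    of containing a hallucination (a simplification).
    """
    if not 0.0 <= per_note_rate <= 1.0:
        raise ValueError("per_note_rate must be between 0 and 1")
    if notes < 0:
        raise ValueError("notes must be non-negative")
    return 1.0 - (1.0 - per_note_rate) ** notes


# Hypothetical caseload of 1,000 notes at two illustrative error rates.
for rate in (0.01, 0.001):
    print(f"rate={rate:.3f} -> {chance_of_at_least_one(rate, 1000):.3f}")
```

Under these assumptions, even a tenfold improvement in the per-note rate leaves a better-than-even chance that at least one unread note in the batch contains an invented line. Only a rate of exactly zero brings that chance to zero.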

You also cannot quality-control your way out of this if no human is in the loop and the source audio is gone. There is nothing to QA against. The only reliable check on a draft is a person who was actually in the room, reading the text against their own memory of the session. No model improvement removes the need for that person — it just changes how often that person catches something. The check itself is non-negotiable.

The Signature Is the Safeguard

Here is the reframe at the center of this whole argument. A clinician's signature on a note is not a formality or a piece of bureaucratic friction. It's the moment a block of text becomes a clinical and legal attestation. When you sign, you are saying: this reflects my professional judgment of what happened in this session. That sentence cannot be true unless a professional actually read it.

What the Law Already Assumes

The legal framing lines up with the clinical one. In July 2025, the Texas Medical Liability Trust published risk-management guidance on AI medical scribes, written by Wayne Wenske and drawing on positions from the Federation of State Medical Boards. The core point: physicians remain fully responsible for the content of all medical documentation, regardless of how it was generated. Clinicians must review, edit, and sign AI scribe output before it enters the record. Automatic signatures on AI-generated content are discouraged. The responsibility for the record stays with the clinician. It isn't something the software can carry, no matter how much of the draft it produced.

One honest caveat: that guidance is physician-facing, and it speaks to medical boards rather than to a therapist-specific statute. So treat it as the principle it is, not as a regulation written for your license. But the principle generalizes cleanly to every licensed clinician: the human professional, not the software, owns the record.

The reason it generalizes is simple and hard to argue with. AI can't hold a license. It can't be sued. It can't testify. When something in a record is challenged, accountability has to land somewhere — and there is nowhere for it to land except the human who signed. Sign-off isn't a checkbox bolted on top of AI. It's the only place professional responsibility can actually live.

How This Shows Up in Good Design

A well-designed AI EHR treats sign-off as the spine of the workflow, not an afterthought. Our pledge states the commitment directly:

Our AI drafts, transcribes, summarizes, and suggests. It does not diagnose, it does not decide, and nothing it produces enters the record until a licensed clinician has read it and signed it. Every suggestion is reviewable and traceable.

Two things follow from taking that seriously, and they map onto the two claims we think any honest AI EHR should be able to make.

The first is that a good system makes the draft easy to review against the truth — traceable, editable, reviewable — which is the exact opposite of deleting the audio. The whole failure of the black-box scribe is that it removes your ability to check. A care-first design preserves it.
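As a rough illustration of what "checkable against the source" can mean at the transcript stage, the sketch below compares a draft to a retained source transcript word by word, using Python's standard `difflib`, and lists the runs of words that appear only in the draft. The function name `unsupported_spans` and the sample sentences are hypothetical. Real notes paraphrase rather than quote, so this is a toy aid for a reviewer, not a substitute for the clinician reading the draft.

```python
import difflib
from typing import Optional


def unsupported_spans(source_transcript: Optional[str], draft: str) -> list[str]:
    """Return runs of words present in the draft but absent from the source.

    If the source was not retained, there is nothing to compare against,
    so the function refuses to answer instead of guessing.
    """
    if source_transcript is None:
        raise ValueError("source was not retained; the draft cannot be verified")

    source_words = source_transcript.lower().split()
    draft_words = draft.lower().split()
    matcher = difflib.SequenceMatcher(a=source_words, b=draft_words, autojunk=False)

    spans = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" both mean the draft has words the source lacks here.
        if tag in ("insert", "replace"):
            spans.append(" ".join(draft_words[j1:j2]))
    return spans


# Hypothetical example: the draft appends a claim that was never spoken.
source = "client reports sleeping poorly this week"
draft = "client reports sleeping poorly this week and has stopped taking medication"
print(unsupported_spans(source, draft))
```

The `None` branch is the point of the example: once the source is deleted, no comparison of any kind is possible, automated or human.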

The second is that your data trains nothing. The session is not pooled into a shared training set. Our pledge's second plank puts it this way: the AI works on that clinician's behalf, within that clinician's boundary, and we will not feed your patients' data to a model that outlives the relationship it came from. In the pledge's own words, "today, the simple answer is: we don't." That's a present-tense commitment with an honest opt-in softener, not a grand absolute, and we represent it that way on purpose.

Here is how that works in CoralEHR, specifically:

No recording required. A note draft is built from the notes you type during the visit, so by default there is no ambient session audio to retain. And because every draft — however it is produced — is reviewed and signed by a licensed clinician before it enters the record, you remain the check on anything the AI generates.
First-party model, no training on your data. The AI runs on Anthropic's first-party Claude API — never AWS Bedrock or a third-party aggregator — through a single audited code path under a Business Associate Agreement; your patients' data is not used to train models, and access is scoped to the clinician-patient relationship.
Preliminary until you sign. An AI note is saved as a preliminary draft and does not become part of the clinical record until you sign it in a separate, deliberate signing step. There is no auto-sign.
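The "preliminary until you sign" rule can be pictured as a small state machine. The sketch below is a hypothetical illustration, not CoralEHR's actual code: the `Note`, `Clinician`, and `NoteStatus` names, the license check, and the review-before-sign rule are all assumptions chosen to mirror the behavior described above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class NoteStatus(Enum):
    PRELIMINARY = "preliminary"  # AI draft, not yet part of the record
    SIGNED = "signed"            # attested by a licensed clinician


@dataclass
class Clinician:
    name: str
    license_number: Optional[str]  # None models "no active license on file"


@dataclass
class Note:
    text: str
    status: NoteStatus = NoteStatus.PRELIMINARY
    reviewed_by: Optional[str] = None
    signed_by: Optional[str] = None
    signed_at: Optional[datetime] = None

    def review(self, clinician: Clinician, edited_text: Optional[str] = None) -> None:
        """Record that a clinician read the draft, optionally replacing its text."""
        if self.status is NoteStatus.SIGNED:
            raise RuntimeError("signed notes cannot be edited in this sketch")
        if edited_text is not None:
            self.text = edited_text
        self.reviewed_by = clinician.name

    def sign(self, clinician: Clinician) -> None:
        """Separate, deliberate step: only a licensed reviewer may sign."""
        if clinician.license_number is None:
            raise PermissionError("only a licensed clinician can sign")
        if self.reviewed_by != clinician.name:
            raise RuntimeError("the signer must review the draft first")
        self.status = NoteStatus.SIGNED
        self.signed_by = clinician.name
        self.signed_at = datetime.now(timezone.utc)

    @property
    def in_clinical_record(self) -> bool:
        # There is no other path to SIGNED, so there is no auto-sign.
        return self.status is NoteStatus.SIGNED


# Hypothetical usage: the draft stays out of the record until review and signature.
note = Note(text="AI-drafted summary of the session")
clinician = Clinician(name="Dr. Lee", license_number="HYPOTHETICAL-123")
note.review(clinician, edited_text="Summary corrected by the clinician")
note.sign(clinician)
```

The design choice worth noticing is that `sign` is the only method that changes the status, and it fails unless the same licensed person has already reviewed the text.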

These describe how the product is built today, not promises for later. (Separately, our pledge notes that we are pursuing independent third-party security attestation appropriate to healthcare — we don't claim those attestations as already held.)

What to Ask Any AI Scribe Vendor

You don't need to take anyone's word for this — including ours. Turn the argument into a short due-diligence checklist and put it to every vendor you consider. Get the answers in writing, in a Business Associate Agreement or your subscription terms, not on a marketing page.

Does a licensed clinician review and sign before anything enters the record? If the tool auto-signs notes, walk away.
Can I see and edit the draft against the source? Is the original audio or transcript retained so I can verify what the AI produced? A tool that deletes the source has removed your ability to catch its errors.
Is my session data used to train any shared or foundation model? Get the answer in writing — and look for the difference between "we won't use identifiable data" and "we won't use your data at all."
Does the vendor sign a BAA? This is the HIPAA floor. (Our deeper guide to vendor due diligence lives in the HIPAA guide for private-practice therapists, and you can pressure-test your whole stack with the free HIPAA Compliance Checklist.)
Who is the author of record? The only acceptable answer is: you.
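If you are comparing several vendors, the five questions above can be recorded as structured answers and screened the same way each time. This is a minimal sketch: the `VendorAnswers` fields, the `red_flags` and `walk_away` helpers, and the convention of writing "clinician" for the author of record are all hypothetical names introduced here, and the deal-breakers encoded are the three this post names (auto-sign, no BAA, no clear author of record).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorAnswers:
    clinician_signs_before_record: bool  # question 1
    source_retained_for_review: bool     # question 2
    trains_on_session_data: bool         # question 3
    signs_baa: bool                      # question 4
    author_of_record: str                # question 5; "" if the vendor can't say


def red_flags(answers: VendorAnswers) -> list[str]:
    """List every answer that falls short of the checklist."""
    flags = []
    if not answers.clinician_signs_before_record:
        flags.append("notes can enter the record without a clinician's signature")
    if not answers.source_retained_for_review:
        flags.append("source audio or transcript is not retained for verification")
    if answers.trains_on_session_data:
        flags.append("session data is used to train a shared or foundation model")
    if not answers.signs_baa:
        flags.append("vendor will not sign a BAA")
    if answers.author_of_record.strip().lower() != "clinician":
        flags.append("author of record is not clearly the clinician")
    return flags


def walk_away(answers: VendorAnswers) -> bool:
    """True when any of the three named deal-breakers is present."""
    return (
        not answers.clinician_signs_before_record
        or not answers.signs_baa
        or answers.author_of_record.strip().lower() != "clinician"
    )


# Hypothetical vendor that auto-signs and deletes the source recording.
vendor = VendorAnswers(
    clinician_signs_before_record=False,
    source_retained_for_review=False,
    trains_on_session_data=False,
    signs_baa=True,
    author_of_record="clinician",
)
print(walk_away(vendor), red_flags(vendor))
```

Only answers a vendor has given in writing belong in a record like this; anything taken from a marketing page is better left unfilled.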

Care You Can Defend

Strip everything else away and the argument is short. AI can carry the paperwork. It cannot carry the accountability.

The safeguard that actually holds isn't a promise that the model will never hallucinate — no honest vendor can make that promise. It's a licensed human who reads the draft and signs it, working inside a system designed to make that review honest: the source stays checkable, the edits stay traceable, and the data trains nothing. That combination is what lets you get your evenings back and keep a record you can stand behind to a client, a board, or a court.

That's the trade we think is worth insisting on. Not "trust the AI." Not "fear the AI." Just: let the AI do the drafting, and keep the signature — and the responsibility it represents — exactly where it belongs.

Ready to see how this works in practice?

Read our No AI Therapist Pledge — including plank four: the clinician is always the author of record.
Telling clients you use AI for notes — a care-first guide to disclosure and consent.
HIPAA Compliance Checklist — a 35-item self-assessment to pressure-test your whole vendor stack.
Explore CoralEHR — an AI-first, HIPAA-compliant EHR built for private-pay behavioral-health practice.

Additional Resources

Internal:

The No AI Therapist Pledge — our contractual commitments on AI and your data
HIPAA Made Simple: A Private Practice Therapist's Guide — BAAs and vendor framing
How to Tell Clients You Use AI for Notes — disclosure and recording consent
HIPAA Compliance Checklist — interactive vendor and safeguards self-assessment

External:

Cornell: AI speech-to-text can hallucinate violent language — the FAccT 2024 "Careless Whisper" study
Associated Press (via TechCrunch): Whisper transcription tool has hallucination issues — the October 2024 investigation, including the Nabla deployment
Texas Medical Liability Trust: Using AI medical scribes — risk management considerations — July 2025 guidance drawing on the FSMB
