De-Identified Is a Door, Not a Wall: What SimplePractice's June 16 Transcript Change Means for Your Clients

On June 16, 2026, SimplePractice begins retaining de-identified session transcripts to improve its AI. Here's what 'de-identified' really protects — and why 'trains nothing' is a stronger promise than 'trains only de-identified.'

CoralEHR Team


On June 3, 2026, SimplePractice published a blog post explaining how it protects privacy through de-identification. Thirteen days later, on June 16, that explanation acquires a job: it becomes the safeguard standing behind a new policy. As of that date, SimplePractice begins retaining session transcripts from its Note Taker AI scribe — de-identified and de-coupled — to improve its AI.

We want to read this carefully and fairly, because the market leader sets the norm, and because the loudest version of this story ("they sell your data") is the wrong one. SimplePractice says plainly that it does not. The real question is quieter, and it is the one that should keep a thoughtful clinician up at night: what does "de-identified" actually protect when the data is a therapy session?

What SimplePractice's own pages say

We'll quote only SimplePractice's two dated pages, and we'll keep the two sources separate so you can check each claim against its origin.

From the Transcript Retention FAQs (SimplePractice Support, June 2026), the mechanics:

  • Beginning June 16, 2026, SimplePractice retains "a de-identified and de-coupled version of a Note Taker session transcript, to help improve Note Taker and other AI-powered features, such as those included in Care Aide."
  • Retention is optional and managed "at the clinician, client, or session level."
  • Defaults split by date: clinicians who used Note Taker before June 16, 2026 are "opted out of transcript retention by default," with "no transcripts retained unless you choose to opt in"; clinicians who enable Note Taker on or after June 16, 2026 are "opted in to transcript retention by default" and "can opt out at any time."

And, importantly, the same FAQ sets a clear limit we want to represent fairly: SimplePractice states it will never sell transcript content, and that retained transcripts "will be used only to improve SimplePractice's AI features and will never be shared with third parties for their commercial purposes." We take that at face value. This post is not an accusation that anyone is selling therapy data.
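
The date-based defaults quoted above reduce to a simple rule. The sketch below is one reading of the FAQ text, not SimplePractice's code: `first_enabled` and `explicit_choice` are hypothetical names, and a clinician's explicit opt-in or opt-out is assumed to override the default.

```python
from datetime import date
from typing import Optional

# Cutover date quoted from SimplePractice's Transcript Retention FAQs.
CUTOVER = date(2026, 6, 16)


def default_retention(first_enabled: date) -> bool:
    """Return the default setting (True = opted in to transcript retention).

    `first_enabled` is a hypothetical field: the date the clinician first
    used or enabled Note Taker.
    """
    # Before the cutover: opted out by default. On or after: opted in.
    return first_enabled >= CUTOVER


def effective_retention(first_enabled: date,
                        explicit_choice: Optional[bool] = None) -> bool:
    """Apply an explicit opt-in/opt-out if one exists, else the default."""
    if explicit_choice is not None:
        return explicit_choice
    return default_retention(first_enabled)
```

Note the boundary: a clinician who enables Note Taker on June 16 itself lands in the opted-in group, so the default only protects accounts that were already using the feature before that date.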

From the separate de-identification blog ("How SimplePractice protects your privacy through de-identification," June 3, 2026), the method:

  • De-identification is "the process of removing personal details from health-related data so that the information can no longer be linked back to any specific person."
  • The standard is HIPAA's Safe Harbor method: "all 18 of the following personal identifiers must be removed before data is considered de-identified."
  • De-coupling replaces removed identifiers "with generic placeholders that are not derived from the original values, so they cannot be reversed or traced back to a specific person."
  • The purpose: "De-identified transcripts will be used to improve Note Taker and related AI features."
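
As a rough illustration of the de-coupling idea quoted above, the sketch below contrasts a generic placeholder with a token derived from the original value. It is not SimplePractice's implementation: the function names, the `found` mapping, and the sample sentence are all invented for illustration, and detecting the identifiers is assumed to have happened elsewhere.

```python
import hashlib


def decouple(text: str, found: dict[str, str]) -> str:
    """Replace each detected identifier with a generic placeholder.

    `found` maps identifier text to a type label, e.g. {"Dana": "NAME"}.
    The placeholder depends only on the type, never on the value.
    (Toy version: assumes no identifier is a substring of another.)
    """
    for value, kind in found.items():
        text = text.replace(value, f"[{kind}]")
    return text


def hashed_pseudonym(value: str) -> str:
    """Counter-example: a token that IS derived from the original value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


# Invented sample sentence and detections.
sample = "Dana called from Tulsa."
found = {"Dana": "NAME", "Tulsa": "CITY"}
decoupled = decouple(sample, found)
```

Here `decoupled` is `"[NAME] called from [CITY]."`. The difference matters because anyone who guesses "Dana" can recompute `hashed_pseudonym("Dana")` and confirm the match, whereas `[NAME]` carries no signal about which name was removed.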

So the promise, stated precisely, is this: we will not train on your clients' identifiable data; we reserve the right to train on a de-identified version of it. That is an honest description of a real safeguard. It is also where the whole argument lives.

Why "de-identified" is a door, not a wall

Safe Harbor was designed for structured health data — a lab value, a billing code, a diagnosis line. Strip the name, the address, the dates, the record number, and a row in a billing table genuinely does stop pointing at a person. For that kind of data, de-identification is close to a wall.

A psychotherapy transcript is not that kind of data. It is a narrative — the events of a person's life, the people in it, the turns of phrase they used in a room whose entire premise was that no one else was listening. You can remove all 18 Safe Harbor identifiers and still be holding something unmistakably someone's. The particular shape of a loss, the sequence of a relationship, the specific fear named on a specific afternoon: that is not metadata you can swap for a placeholder. It is the content itself, and the content is what makes a session recognizable.
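
A toy example makes the point concrete. The sketch below is not a Safe Harbor implementation: it covers only three pattern-shaped identifiers, and the patterns, the `redact` helper, and the sample text are invented and fictional.

```python
import re

# Three pattern-shaped identifiers; a real pipeline would cover far more.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}


def redact(text: str) -> str:
    """Swap every pattern match for a generic placeholder of its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Fictional sample: identifiers at the front, narrative at the back.
sample = (
    "You can reach me at 555-010-0199 or kay@example.com. "
    "Since 03/14/2025 I keep replaying the night my twin brother's "
    "bakery burned down, the week before our mother's retirement party."
)
redacted = redact(sample)
```

After redaction the phone number, email address, and date are gone, but the sentence about the bakery and the retirement party is untouched, because no pattern describes it. That remainder is the part of a session that a placeholder cannot stand in for.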

This is what we mean when we say de-identified is a door, not a wall. A wall is a thing you cannot get through. A door is a thing that is closed today and can be opened later — by re-identification research, by a future model that learns the gait of a narrative the way it learns the gait of a sentence, or simply by the fact that the de-identified material is now sitting in a training corpus that outlives the relationship it came from. "Validated to meet Safe Harbor" is a legal threshold. It is not a promise that the story is unrecognizable, and it is not the same thing as the story never being used to build an AI at all.

None of this requires bad intent on anyone's part. SimplePractice can keep every promise it made on those two pages (never sell, never share with third parties for their commercial purposes, de-identify to Safe Harbor), and the underlying material is still the transcript of a therapy session, sitting in a corpus, being used to improve an AI. The adjective is doing a lot of work. Watch the adjective.

"Trains nothing" beats "trains only de-identified"

Here is the cleaner promise, and it's the one we make. We do not promise merely to spare your clients' identifiable data. We promise never to use your patients' data (identifiable, de-identified, or aggregated) to train, fine-tune, or evaluate AI at all. That is plank #1 of our No AI Therapist Pledge: we will never use clinical data to train a model meant to deliver therapy, "not the raw data, not a 'de-identified' or aggregated version of it, not through a vendor."

Plank #2 is the structural version of the same line: we will never pool your patients' data into a shared or foundation model. Our AI "works for the clinician whose data it is, on that clinician's behalf, within that clinician's boundary — and nowhere else." There is no de-identified back door, because there is no training pipeline for the door to open into.

Two promises, side by side:

  • Trains only de-identified data. A door. It scopes which version of the data gets used, and trusts the de-identification to hold for content it was never designed for.
  • Trains nothing. A wall. There is no version of your clients' words that becomes training material, because none of it is used for training in the first place.

The second is not a marketing flourish; it is a categorically smaller surface area. You cannot re-identify a transcript that was never retained for training. You cannot leak a corpus that doesn't exist.

What this looks like in CoralEHR

A promise this plain has to be backed, so here is how it actually works in the product, described honestly and without overstating it.

Note help needs no recording. Our AI drafts notes from the text a clinician types into a scratchpad, plus structured chart fields — not from an audio recording of the session. The note-help path is built to work with zero recording. We think "no recording required" answers the three things clinicians dislike most about scribes at once: it doesn't change the room, it removes the per-session consent burden, and a session that was never recorded can't be subpoenaed as a recording. (To be precise: an optional ambient path exists as early scaffolding in the interface, gated behind a feature flag and a dual-consent step, and it is not wired up on our backend today — note help does not depend on it.)

The clinician is the author of record. AI-drafted notes are saved as preliminary; they do not enter the chart as final until a licensed clinician reads and signs them. There is no auto-sign and no auto-accept anywhere in the product — the transition to a signed, final note is always an explicit human action. That is why a clinician must sign every AI-drafted note, and it is plank #4 of the pledge: the AI drafts, transcribes, and suggests; it does not decide.
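
The rule can be pictured as a tiny state machine. This is a hypothetical sketch, not CoralEHR's actual code: the `Note` class, the `sign` function, and the status strings are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Note:
    body: str
    status: str = "preliminary"      # every AI draft starts here
    signed_by: Optional[str] = None  # set only by an explicit sign action


def sign(note: Note, clinician_id: str, is_licensed: bool) -> Note:
    """The only path from 'preliminary' to 'signed' in this sketch."""
    if not is_licensed:
        raise PermissionError("only a licensed clinician can sign a note")
    if note.status != "preliminary":
        raise ValueError("note is already signed")
    note.status = "signed"
    note.signed_by = clinician_id
    return note
```

The design choice worth noticing is what is absent: there is no timer, no batch job, and no default that moves a note to `signed`. The transition exists only as a function a person has to call.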

Inference, not training. Our AI uses the Anthropic Claude API to generate drafts and suggestions, under a Business Associate Agreement whose commercial terms commit that customer data is not used to train Anthropic's models. The data is used to produce this clinician's draft, in the moment, and not exported into any training or fine-tuning process — because we don't run one. This part rests on contract and architecture, not on a magic switch, and we'd rather describe it that way than imply more.

A couple of honest limits, since the whole point is honesty: treatment-plan AI returns suggestions for clinician review, never a finished plan, and validated instruments like the PHQ-9 and GAD-7 are attached verbatim from a fixed catalog, not authored by AI. We are HIPAA-compliant and sign BAAs; independent attestations like SOC 2 and HITRUST are something we are pursuing, not something we hold today. We'd rather you know exactly where the line is than discover it later.

What to do before June 16

If you use SimplePractice's Note Taker, the practical steps are straightforward, and they're worth taking on your clients' behalf:

  1. Check your default. If you enable Note Taker on or after June 16, 2026, retention is on by default; if you used it before that date, you are opted out unless you choose to opt in. Per SimplePractice's FAQ, you can opt out "at any time," and you can manage it "at the clinician, client, or session level."
  2. Decide per client, not just globally. Some clients may be fine with it; some absolutely will not be. The session-level control exists — use it deliberately rather than accepting the default.
  3. Ask the one question that matters of any vendor. Not "do you sell my data" (most will say no, truthfully). Ask: do you use my clients' data — de-identified or not — to train AI? The honest answers sort the field quickly.

This isn't a reason to distrust everyone building AI for therapy. It's a reason to read the adjective. "De-identified" is a real protection for the data it was designed for. A session transcript is not that data — and "we train nothing" is the only answer that closes the door instead of describing how carefully it's left ajar.

If you want to see exactly where we've drawn the line, in our own words and scoped to what we control, read the full No AI Therapist Pledge. And if you're weighing the switch, our honest breakdown of SimplePractice's 2026 pricing lays out the rest of the cost stack.

Related reading: Ethical AI for therapists — the bigger frame · AI trust & data — where your clients' data goes · Why CoralEHR doesn't require recording your sessions — no transcript to retain in the first place · Therapists don't hate AI — they hate AI that pretends to be a therapist.


This article is for general informational purposes and is not legal, compliance, or clinical advice. It summarizes SimplePractice's own published statements as of June 2026 and our own product as built; policies and features on both sides may change. Verify any vendor's current terms against their official documentation, and consult qualified counsel for your own compliance obligations. Any clinical scenario described here is illustrative and fictional.
