There is no single “off” switch
There are plenty of ways to opt out of AI training, but they’re scattered, each covering one slice:
| Mechanism | Controls | Catch |
|---|---|---|
| robots.txt (AI UAs) | Which crawler may fetch which paths | Voluntary; small crawlers ignore it |
| AIPREF Content-Usage | Standardized “train vs search” preference | Delivery draft expired May 2026; no RFC yet |
| noai / noimageai meta | Page-level “don’t train on this” | Non-standard; major vendors mostly honor it |
| W3C TDMRep | /.well-known training reservation, EU legal basis | Still a CG report |
| trust.txt datatrainingallowed | Site-level training reservation | Not enforced by mainstream AI |
| IPTC best practices | Bundles the above into a publisher kit | Guidance, not new power |
| CC Signals | Reciprocity / attribution ask | Pilot, no teeth |
The shared catch: it’s all voluntary
None of these has legal or technical enforcement; they all rely on crawlers choosing to honor them. Big vendors usually respect robots.txt and noai; small crawlers scrape anyway. Even AIPREF, the one trying hardest to become a real standard, saw its deployable draft expire by May 2026. This layer is far from settled.
The biggest trap: over-blocking backfires
“Block training” and “keep being cited” are two different things. Many people who set out to block training also block the search and citation bots, and vanish from AI answers without realizing it (see Block the wrong bot). Before you act, decide: are you blocking “training,” or “everything”?
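One way to catch over-blocking before it bites is to run your robots.txt through a parser and ask, per bot, whether it may still fetch the site. The sketch below uses Python's standard `urllib.robotparser`. The user-agent tokens in the two lists are assumptions for illustration (they are not named in this article), and vendors rename or add crawlers, so check each vendor's current documentation before relying on them. `audit_robots` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed user-agent tokens; verify against each vendor's current docs.
TRAINING_BOTS = ["GPTBot", "Google-Extended", "CCBot"]
SEARCH_BOTS = ["OAI-SearchBot", "PerplexityBot"]


def audit_robots(robots_txt: str, path: str = "/") -> dict[str, list[str]]:
    """Report training bots that can still fetch `path` and search bots that cannot."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "training_still_allowed": [
            ua for ua in TRAINING_BOTS if parser.can_fetch(ua, path)
        ],
        "search_blocked": [
            ua for ua in SEARCH_BOTS if not parser.can_fetch(ua, path)
        ],
    }


# Blanket rule: blocks training, but also every search/citation bot.
OVER_BLOCKED = """\
User-agent: *
Disallow: /
"""

# Targeted rule: only the (assumed) training crawlers are disallowed.
TARGETED = """\
User-agent: GPTBot
User-agent: Google-Extended
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    print("over-blocked:", audit_robots(OVER_BLOCKED))
    print("targeted:    ", audit_robots(TARGETED))
```

A non-empty `search_blocked` list is the warning sign: the file is blocking “everything,” not just “training.” Note that this only checks what the file says; it cannot tell you whether a given crawler actually obeys it.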
Pragmatic advice
If you really want to block training, do it well with the mechanisms big vendors honor (robots.txt rules for AI user agents, plus noai) and don’t pile on the rest; always allow the search bots. Treat the newer standards (TDMRep, CC Signals, IPTC) as good to know, not a reason to rebuild. Over-engineered opt-out is costly to maintain and easy to get wrong, which is exactly why it’s worth checking regularly rather than setting once and forgetting.
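For the “check regularly” part, the noai side can be spot-checked the same way as robots.txt. The sketch below scans a page's HTML for noai / noimageai using Python's standard `html.parser`. Because noai is non-standard, where it lives is an assumption: this version looks only at `<meta name="robots" content="...">`, which is the common convention, and ignores HTTP headers. `RobotsMetaParser` and `training_directives` are hypothetical names.

```python
from html.parser import HTMLParser

AI_DIRECTIVES = {"noai", "noimageai"}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def training_directives(html: str) -> set[str]:
    """Return which of noai / noimageai the page declares (assumed placement)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives & AI_DIRECTIVES


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
    print(sorted(training_directives(page)))
```

Run against a handful of key templates after each site release; an empty result on a page you meant to protect means the tag was dropped somewhere along the way.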
we actually started seeing our brand mentioned in answers after we cleaned up our about page and got a couple of decent writeups, so theres something to this even if its hard to measure. didnt do anything fancy
i mean isnt the answer just write good clear pages that explain things. feels like we reinvented that wheel every few years with a new acronym
this is the third article ive read this week saying basically the same stuff and none of them tell you what to actually DO on monday morning
ok but how do you even measure if youre getting cited? like is there a tool or are people just typing questions into chatgpt all day and screenshotting
Honestly there's no single magic tool — what matters is the method: test across multiple mainstream AI engines, with several real-world phrasings, repeatedly over time. One-shot from one engine is mostly noise. We run it as continuous monitoring because doing it by hand weekly will burn you out, but the principle's the same if you DIY.
honest question for the author, does this change month to month? like do you get picked up and then quietly dropped when the model updates, or is it sticky once youre in
Both happen. Tactics-driven gains get wiped when a model updates — that's the churn you're describing. But citations that come from genuinely being a clear, trustworthy source tend to be sticky across updates. So we tell people: don't chase this month's behaviour, build the part that survives the update.
saved this. been trying to figure out why we show up on google fine but the AI answers never mention us. makes more sense now
the part about schema markup is slightly off. the engines aren't 'reading' json-ld the way you imply, most of them rely on the rendered text + retrieval from an index. structured data helps disambiguate entities but it's not the primary signal. worth clarifying so people don't go spend a week on schema thinking it's the magic switch
saving this
wait so do i need to pay for chatgpt to get my shop to show up in it?? sorry if dumb question
not a dumb q — no, paying for chatgpt does nothing for that. it's about your site/info being clear and trustworthy enough that the model picks you when someone asks. paying just gets YOU the fancier model, doesn't make it mention you.
the citations thing is huge and nobody talks about it. being the source the AI quotes vs just being on page 1 are completely different games
Exactly the distinction we care about most. Ranking #1 and being the source the model actually quotes are two different games now — you can win one and lose the other. Kept the article short so I didn't go deep on the why, but you nailed it.
doesnt work for me
@dev_marcus this is the thing i kept rambling about in standup lol