Engineering citations: how to become the source ChatGPT quotes

A repeatable framework for citation-worthy content, drawn from 18 months of GEO engagements across SaaS and DTC.

There is a question I get on every discovery call: how do we get cited by ChatGPT?

The honest, short answer is: you make your content the kind of source a careful researcher would cite — and then you make sure the model can find it. The longer answer, which is what this piece is about, is a repeatable framework I've been refining across roughly thirty GEO engagements over the last eighteen months.

It's not magic. It's just disciplined.

The four-part anatomy of a citation-worthy page

Almost every page that earns LLM citations shares the same skeleton. When something doesn't get cited, it's usually missing one of these.

1. A defensible thesis

Every citable page makes one specific claim that is non-trivial, falsifiable, and authored. Not "SEO is changing" — "the share of B2B SaaS demand mediated by an LLM at any point in the buying journey crossed 30% in Q1 2026, up from 11% a year ago".

Models cite assertions, not vibes. If your content has no thesis a human could disagree with, it has nothing for a model to attribute.

2. Primary numbers

A citable page contains numbers the LLM cannot easily get elsewhere. Original survey data. Internal benchmarks. A reproducible methodology section. Even one good number, well-sourced, is worth more than ten paragraphs of summary.

If you don't run primary research, partner with someone who does, or aggregate public data in an unusual way and show your work.

3. Named authority

Models weight content by the authority of its named author. "By the editorial team" is invisible. "By Predrag Petrović, AI SEO consultant, Belgrade — 18 years in search" is a citable surface, especially when the author also has a Person schema, a Wikidata entry and a coherent footprint elsewhere on the web.

This is one of the highest-leverage changes you can make to existing content. Add the author. Link the author to a real entity. Update the schema. Done in an afternoon, paid out for years.

4. Date and methodology

LLMs increasingly prefer fresh sources. A page with a visible publication date, last-reviewed date and a clear "how we did this" section beats a timeless evergreen page on the same topic, because the model can be confident the content is current and traceable.

This isn't about updating the year in the meta title. It's about earning the right to say "as of June 2026".

The mechanical layer: making the page legible

Even a perfectly authored, defensible piece won't get cited if the model can't parse it. The mechanical work is mostly invisible to humans, and decisive for retrievers.

  • Summary block at the top. A 2–3 sentence abstract that compresses the thesis. Retrievers grab this first.
  • Predictable structure. H2-driven sections. One claim per paragraph. Lists where lists make sense.
  • Schema markup. Article with author (linked to a Person), datePublished, dateModified, mainEntityOfPage and — critically — about linked to the entity the article is about.
  • Stable URLs. Don't shuffle slugs for SEO experiments. Citations decay when URLs change.
  • No content trapped in JS. Serve the prose as HTML. Use the SPA for the rest of the experience if you must.

If you fix nothing else this quarter, fix the schema and the summary block. They are unreasonably effective.
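The schema bullet above is concrete enough to sketch. A minimal Python example that emits the Article JSON-LD with a linked Person author and an `about` entity — every name, URL and Wikidata ID below is an illustrative placeholder, not a recommendation:

```python
import json

def article_schema(headline, author_name, author_url, page_url,
                   about_entity, published, modified):
    """Build Article JSON-LD with a Person author and an `about` entity.

    All names and URLs passed in are illustrative placeholders.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {
            "@type": "Person",
            "name": author_name,
            "url": author_url,  # link the author to a real entity page
        },
        "datePublished": published,
        "dateModified": modified,
        "mainEntityOfPage": page_url,
        "about": {"@id": about_entity},  # the entity the article is about
    }

schema = article_schema(
    headline="Engineering citations",
    author_name="Jane Doe",  # hypothetical author
    author_url="https://example.com/about/jane",
    page_url="https://example.com/engineering-citations",
    about_entity="https://www.wikidata.org/entity/Q0",  # placeholder ID
    published="2026-06-01",
    modified="2026-06-15",
)
print(json.dumps(schema, indent=2))
```

Embed the printed JSON inside a `<script type="application/ld+json">` tag in the page head; the point is that `author`, `dateModified` and `about` are machine-readable, not merely visible in the prose.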

The distribution layer: where citations actually accrue

Authoring great content is necessary but not sufficient. The corpora LLMs rely on for citations are remarkably narrow:

  • Wikipedia / Wikidata
  • A small number of news outlets (Reuters, AP, BBC, FT, The Verge in tech)
  • Stack Overflow, GitHub READMEs and high-vote Reddit threads in their respective verticals
  • Top-tier industry publications and academic papers

Earning a presence in this corpus is the real GEO link-building. It is slower than classical link buying, and dramatically more durable. One Wikipedia citation outperforms fifty paid placements; one well-upvoted Reddit thread outperforms a year of guest posts.

The measurement layer

You can't improve citations you don't measure. The minimum viable dashboard:

  1. Tracked prompts: 30–80 prompts your buyers actually ask, run weekly in ChatGPT, Perplexity, Gemini and Claude.
  2. Citation share: of those runs, what percentage cite you, and where in the answer.
  3. Brand prompt accuracy: when someone asks "what does {your brand} do", does the answer match what you'd want a salesperson to say?
  4. AI Overview presence: for tracked queries on Google, are you in the Overview, in the carousel, or absent?

Most teams I work with had none of this six months ago. All of them have it now. You can build a workable version in a spreadsheet plus three afternoons of scripting.
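The spreadsheet-plus-scripting version is genuinely small. A sketch of the citation-share metric from the list above, assuming each weekly run is logged as a dict with `engine`, `prompt` and `cited_domains` fields — an assumed shape, not a standard format:

```python
from collections import defaultdict

def citation_share(runs, brand_domain):
    """Per-engine citation share: the fraction of tracked-prompt runs
    in which the brand's domain appears among the cited sources.

    `runs` is a list of dicts with an assumed shape:
    {"engine": str, "prompt": str, "cited_domains": [str, ...]}
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for run in runs:
        totals[run["engine"]] += 1
        if brand_domain in run["cited_domains"]:
            hits[run["engine"]] += 1
    return {engine: hits[engine] / totals[engine] for engine in totals}

# Toy data standing in for a week of logged runs.
runs = [
    {"engine": "chatgpt", "prompt": "best crm for smb",
     "cited_domains": ["example.com", "g2.com"]},
    {"engine": "chatgpt", "prompt": "crm pricing",
     "cited_domains": ["capterra.com"]},
    {"engine": "perplexity", "prompt": "best crm for smb",
     "cited_domains": ["example.com"]},
]
print(citation_share(runs, "example.com"))
# → {'chatgpt': 0.5, 'perplexity': 1.0}
```

The same loop extends naturally to the other three metrics: answer position for citation placement, string matching against an approved description for brand-prompt accuracy, and a presence flag per tracked query for AI Overviews.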

The mistake almost everyone makes

Treating GEO as a content campaign. It isn't. It's a slow, compounding entity programme: claim, source, cite, structure, measure, iterate. The teams that try to "do GEO" in a quarter and move on get nothing. The teams that ship one improvement a week for two quarters end up cited everywhere.

That's the unfair advantage. Patience and process beat hot takes and pivots.