Your data has no opinions: the judgment layer AI reporting actually needs
An agent wired to your semantic model can answer 'what is the value?' perfectly and still be useless on 'is that good?' The missing piece isn't model quality - it's a judgment layer nobody's schema contains.
Wire a capable model to a well-built semantic model and ask it for a number. It will get the number right. Ask it whether the number is good and you'll get fluent hedging - because nothing in the data says which direction is healthy, what threshold matters, or which of two disagreeing systems to believe.
That was the wall I hit making an analytics agent useful to non-technical users. The failures weren't retrieval failures or reasoning failures. The agent was missing what every analyst in the building carries in their head and nobody had written down.
What the schema can't tell you
A semantic model is rich in structure - tables, relationships, measures, lineage. It is empty of judgment. Is a higher engagement score better, always, or only inside a band? At what value does anyone act? When the warehouse and the operational system disagree, which one is authoritative for this metric? Why does this number exist at all?
No amount of model capability recovers that, because it isn't latent in the data. It lives in people. An agent without it can describe your business numerically while understanding none of it.
The glossary nobody owns
The fix is unglamorous: a metric glossary that states, for every measure - what it means in business terms, which direction is good, what thresholds or benchmarks apply, which source is authoritative, and what caveats travel with it. Add a short description of each source system: what it is, what it is the system of record for, where its authority ends.
This document rarely exists, because it has no natural owner. It's beneath the data team's tooling and beside the business's day job. But it is the single highest-leverage artifact in an AI reporting project - more than model choice, more than orchestration.
Draft mechanically, confirm humanly
The encouraging discovery: you don't need weeks of stakeholder workshops to create it. Eighty to ninety percent of a glossary can be drafted mechanically - measure names, DAX structure, and system descriptions imply most definitions. What's left for humans is the judgment itself: the "is higher always better here?" pass, a quick confirmation of thresholds and authority calls.
A model can read every number you have and still not know which direction is good. That knowledge is yours - and it has to be written down.
Draft the eighty percent, schedule one review for the twenty, and version the result next to the semantic model it describes.
The short version
AI reporting projects stall on a missing document, not a missing capability. The judgment layer - meaning, direction, thresholds, authority - exists only in your people until someone writes it down. Write it down mostly mechanically, confirm the judgment calls, and feed it to the agent. Skip it, and the smartest model you can buy will confidently narrate numbers it doesn't understand.