Do I need a semantic layer if my warehouse already has one?

For single-platform agents, a warehouse-native layer (Snowflake Cortex Analyst, Databricks Genie) can ground that platform's own AI. For agents whose reach exceeds one platform, add a decoupled execution layer.

Comparisons & Evaluations·25 Jun 2026·Updated 11 Jul 2026·By Mayank Mudgal

Why Data Catalogs (Alation, Atlan, Collibra) Can't Execute AI Agents

Alation, Atlan, and Collibra are all Gartner-recognized leaders that excel at discovery, lineage, stewardship, and now AI-context delivery - but none of the three compiles agent intent into deterministic, dialect-perfect SQL or injects RBAC/ABAC at compile time. They document, ground, and govern metadata; they do not execute queries. The emerging enterprise pattern is catalog + semantic execution layer.


The two jobs: discovery vs deterministic execution

A data catalog answers "What data do we have, what does it mean, who owns it, and where did it come from?" That is the discovery-and-governance job, and Alation, Atlan, and Collibra are genuinely excellent at it. A semantic execution layer answers a different question: "When an AI agent forms an intent in natural language, how do we compile that intent into governed, deterministic, dialect-perfect SQL that is guaranteed to respect access policy before it runs?" That is the execution job.

The distinction matters because AI agents break the central assumption a catalog was built on: that a human is doing the browsing. As McKnight Consulting Group's William McKnight told TechTarget, Alation "provides a passive repository for data rather than an active intelligence platform." Agents don't browse; they generate intent and need that intent executed safely, repeatedly, and identically.

Why LLM-generated SQL is not deterministic execution

When an agent (or a catalog's agent feature) outputs SQL directly, nothing separates the probabilistic planner from the deterministic execution step. RudderStack frames it this way: "When an agent outputs SQL directly, there is no meaningful contract boundary between a probabilistic planner and deterministic execution. The output can vary from run to run... A semantic layer can define metrics. A catalog can describe lineage. A policy system can label a field as sensitive. But if the agent still outputs SQL, those constraints are often advisory until the moment the warehouse blocks something. Governance happens too late and too indirectly."

This is the crux of the catalog gap. Metadata coverage, lineage, and certified glossary terms all improve the odds an LLM generates correct SQL - but improving the odds is not a guarantee, and "advisory until the warehouse blocks something" is not compile-time governance.
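One way to picture a contract boundary is to have the agent emit a small structured intent instead of SQL text, and to validate that intent against an approved vocabulary before any SQL exists. The sketch below is illustrative only: the `Intent` type, the metric and dimension names, and `validate_intent` are hypothetical and do not describe any vendor's API.

```python
from dataclasses import dataclass

# Hypothetical vocabulary of approved metrics and dimensions (illustrative names).
ALLOWED_METRICS = {"net_revenue", "order_count"}
ALLOWED_DIMENSIONS = {"region", "order_month"}


@dataclass(frozen=True)
class Intent:
    """What the agent may emit: names from the vocabulary, never SQL text."""
    metric: str
    dimensions: tuple[str, ...] = ()


def validate_intent(intent: Intent) -> Intent:
    """Reject anything outside the approved vocabulary before SQL exists."""
    if intent.metric not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {intent.metric!r}")
    unknown = [d for d in intent.dimensions if d not in ALLOWED_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    return intent


# A well-formed intent passes through unchanged.
ok = validate_intent(Intent("net_revenue", ("region",)))

# Free-form text smuggled into the metric slot is rejected at the boundary.
try:
    validate_intent(Intent("net_revenue; DROP TABLE orders", ()))
    rejected = False
except ValueError:
    rejected = True
```

Because the planner can only pick names from a closed set, the constraint is enforced rather than advisory: an out-of-vocabulary request never reaches the stage where SQL is produced.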

Alation: Agent Builder, Semantic Model Mastering, and the OSI bet

Alation (founded 2012, Redwood City; ~$100M+ revenue; $1.7B valuation after its 2022 Series E led by Salesforce Ventures and Sapphire Ventures; 650+ organizations including 40% of the Fortune 100) has moved aggressively into agentic AI. Agent Builder (announced at revAlation Chicago, private beta in 2025, GA targeted Q1 2026) lets customers build agents grounded in an "Agentic Knowledge Layer." Alation claims built-in evaluation workflows deliver "90%+ accuracy" - a number that came from its Numbers Station acquisition and refers to evaluation-framework accuracy in moving prototypes to production, not a determinism guarantee.

Semantic Model Mastering (available via YAML upload, with an expanded Snowflake connector and sync-back targeting June 2, 2026) positions Alation as "MDM for the semantic layer" - a central place to catalog, govern, and sync semantic models across platforms, aligned to the Open Semantic Interchange (OSI) standard. This is genuinely valuable governance plumbing. But note what it is: a mastering and governance layer for semantic definitions; execution and governance enforcement still happen in the downstream platform, not in a compile step Alation controls.

Alation is a five-time Leader in the Gartner Magic Quadrant for Metadata Management (most recently November 2025) and a Leader in the 2025/2026 Gartner Magic Quadrant for Data and Analytics Governance Platforms. On lineage, it scores 7.3/10 on G2 and relies on the third-party Manta integration (now IBM) for column-level lineage, which carries a separate fee. Pricing is quote-only; third-party estimates put platform access around $60K/year, entry deployments around $198K/year (25 Creator seats), and a mid-market deployment around $413,660/year per GigaOm. G2 data shows ~5-6 month implementations.

Atlan: the most aggressive "context layer for AI" narrative

Atlan (founded 2019; $105M Series C in May 2024 led by GIC and Meritech, valuing it at ~$750M; total funding ~$206M; Salesforce Ventures is an investor; 400+ enterprises including Mastercard, Workday, General Motors, Nasdaq, CME Group, HubSpot) has won the marketing war for the "context layer" framing. Its architecture comprises four products: the Enterprise Data Graph, Context Agents (AI teammates that auto-generate descriptions, metrics, and ontologies), Context Engineering Studio, and the Context Lakehouse (an Iceberg-native, BYOC context store supporting MCP, A2A, SQL, and REST/Graph APIs).

Atlan is a Leader in the 2025 Gartner MQ for Metadata Management and the 2026 Gartner MQ for Data & Analytics Governance (cited specifically for its Iceberg-native architecture). Workday is co-building its semantic layer on Atlan, and Atlan reports handling 8 billion context reads every 90 days.

The 87% claim, contextualized: Per Atlan AI Labs (April 2026): "In April 2026, Context Agents generated 690K+ descriptions across 50+ enterprise customers - 87% rated on par or better than human writing." This is a documentation-quality metric (self-reported), not a SQL-generation or agent-execution accuracy figure. It tells you Atlan's agents write good descriptions; it tells you nothing about whether an agent querying through Atlan produces the same SQL twice. Atlan's own marketing aspires to "Same question. Same answer. Every agent," but the mechanism is context bootstrapping plus human certification, not a compile-time determinism guarantee.

Collibra: governance-first, now an AI control plane

Collibra (founded 2008, Brussels/New York; $250M Series G in November 2021 valuing it at $5.25B; total funding ~$596M from Sequoia Capital Global Equities, Sofina, ICONIQ, Index Ventures, CapitalG, Battery; ~€149M/$210M revenue; ~1,095 employees; ~1,000 customers) is the governance-depth specialist. Contrary to common assumption, Collibra is older than Alation and is not primarily Salesforce-backed - its lead investors are Sequoia Global Equities and Sofina, with CapitalG (Alphabet) among earlier backers.

Collibra's agentic posture is governance and oversight, not execution. The AI Command Center (launched May 6, 2026 - "more than 40 leading enterprises participated in the private preview") is "a unified control plane to see, monitor, and control AI systems and agents across the entire AI lifecycle." Co-founder and CEO Felix Van de Maele framed the launch: "We are entering the era of agentic AI, where systems don't just suggest answers, they take actions... every enterprise scaling AI today is paying a hallucination tax."

The AI agent registry tracks agent metadata, lineage, and usage. The Collibra MCP Server (available in the Databricks Marketplace) "enables context-aware AI by providing copilots and agents with a single, standards-based path to centrally governed metadata and business context." Crucially, this is runtime/platform-level governance - agents query Collibra "before acting" to "get real-time answers to critical governance questions." That is governance as a context check before action, not policy injection into compiled SQL.

Where all three fall short for agent execution

The common thread is a passive metadata / documentation / oversight posture. An arXiv survey of enterprise AI governance platforms puts the structural gap precisely: such platforms "aggregate policies, signals, and risk assessments... Execution enforcement remains delegated to downstream systems, tools, or manual processes. These platforms do not introduce a mandatory execution-time mediation point... Execution-time authorization is assumed, not enforced."

None of the three:

Guarantees determinism (same intent → identical SQL across runs).
Injects RBAC/ABAC/row- and column-level predicates at compile time, before SQL touches the warehouse.
Publishes an agent-execution accuracy benchmark (Atlan's 87% is documentation quality; Alation's 90%+ is an eval-framework figure).
Compiles dialect-perfect SQL across many warehouses from a single graph; they federate or document, but execution lives in the platform.
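The first item in this list is something a buyer can test directly. Below is a minimal sketch of a determinism check: ask the same question several times and compare fingerprints of the SQL that comes back. The two generators are stand-ins written for this example (a pure function and a generator that cycles through variants to mimic sampling); no real product or model is called.

```python
import hashlib
import itertools
from typing import Callable


def sql_fingerprint(sql: str) -> str:
    """Hash SQL after collapsing whitespace, so formatting noise is ignored."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_deterministic(generate: Callable[[str], str], question: str, runs: int = 5) -> bool:
    """True only if every run yields the same SQL fingerprint."""
    fingerprints = {sql_fingerprint(generate(question)) for _ in range(runs)}
    return len(fingerprints) == 1


# Stand-in for a compiler: a pure function of its input.
def compiled(question: str) -> str:
    return "SELECT region, SUM(amount) AS net_revenue FROM orders GROUP BY region"


# Stand-in for a sampled LLM: cycles through equivalent-looking variants.
_variants = itertools.cycle([
    "SELECT region, SUM(amount) FROM orders GROUP BY region",
    "SELECT o.region, SUM(o.amount) FROM orders o GROUP BY 1",
])


def sampled(question: str) -> str:
    return next(_variants)


compiler_stable = is_deterministic(compiled, "net revenue by region")
llm_stable = is_deterministic(sampled, "net revenue by region")
```

The check compares SQL text, not result sets, so two semantically equivalent queries still count as a mismatch. That is the stricter reading of "same intent, identical SQL" used in this post. The whitespace normalization is deliberately naive and would also collapse spacing inside string literals.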

Alation vs Atlan vs Collibra (10 dimensions)

Dimension	Alation	Atlan	Collibra
Core identity	Discovery-first → "Agentic Data Intelligence"	Active metadata → "Context layer for AI"	Governance-first data intelligence
Founded / HQ	2012, Redwood City	2019, San Francisco	2008, Brussels/New York
Funding / valuation	~$340M+; $1.7B (2022); Salesforce Ventures	~$206M; ~$750M (2024); GIC/Meritech	~$596M; $5.25B (2021); Sequoia GE/Sofina
Customers	650+ (40% Fortune 100)	400+ (Mastercard, GM, Workday)	~1,000
Flagship agentic feature	Agent Builder (GA Q1 2026); Semantic Model Mastering	Context Agents, Context Lakehouse	AI Command Center, agent registry, MCP Server
Governance timing	Documented/inherited; syncs to source	Runtime; pushes policies into Snowflake masking	Runtime "before it acts" context check
Determinism guarantee	No	No	No
Lineage (G2)	7.3 (Manta/IBM add-on)	9.1 (native column-level)	8.0 (Lineage Harvester)
Pricing (quote-only)	~$60K base; ~$198K entry; ~$413K mid-market	Custom; often lower entry	~$170K-$510K+; most expensive
Implementation	~5-6 months	~3 months (DIY)	~6-12 months

Where each catalog genuinely excels

Alation: behavioral/usage-driven discovery from SQL query logs, data literacy, stewardship, search. Best where data democratization matters as much as compliance.
Atlan: modern UX (9.0 G2), automated column-level lineage (9.1 G2), deepest modern-stack (Snowflake/dbt/Tableau) integration, fastest time-to-value, the strongest AI-context narrative.
Collibra: deepest governance workflows, stewardship, policy management, privacy, and compliance for heavily regulated industries; strong lineage visualization.

The semantic execution layer: what the catalogs are missing

A semantic execution layer compiles agent intent into governed SQL. Colrows describes the model directly: it "autonomously builds a typed semantic graph, then compiles every agent intent into governed, deterministic SQL... Governance predicates (RBAC, ABAC, row/column-level filters) are injected at compile time, before any SQL touches the warehouse. The result is deterministic: the same intent produces the same SQL, every time. The same query against Snowflake and against Databricks produces dialect-perfect SQL for each." It enforces governance such that "unauthorized intent fails compilation, data is never read," and produces an audit record per query across 16+ engines. Agents reach that compiler over a governed MCP tool surface - the enforcement described above applies to every agent-issued tool call, not just human queries.
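To make "injected at compile time" concrete, here is a minimal sketch of the general technique, not Colrows's implementation. The `Principal` type, the grant tables, and `compile_query` are hypothetical names invented for this example. The point it illustrates is ordering: the authorization check and the row-level predicate are applied while the statement is being built, so an unauthorized request produces an error instead of SQL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Hypothetical caller identity; field names are illustrative."""
    role: str
    region: str


# Hypothetical policy: columns each role may read, and which roles are row-scoped.
COLUMN_GRANTS = {
    "analyst": {"region", "amount"},
    "finance_admin": {"region", "amount", "customer_email"},
}
ROW_SCOPED_ROLES = {"analyst"}


def compile_query(columns: list[str], principal: Principal) -> tuple[str, list[str]]:
    """Return (sql, params). Raises before emitting SQL if a column is not granted."""
    granted = COLUMN_GRANTS.get(principal.role, set())
    denied = sorted(set(columns) - granted)
    if denied:
        raise PermissionError(f"role {principal.role!r} may not read {denied}")

    # Sort for stable output: the same request always yields the same SQL text.
    select_list = ", ".join(sorted(set(columns)))
    sql = f"SELECT {select_list} FROM orders"
    params: list[str] = []
    if principal.role in ROW_SCOPED_ROLES:
        # The row-level predicate is part of the compiled statement, bound as a parameter.
        sql += " WHERE region = ?"
        params.append(principal.region)
    return sql, params


analyst = Principal(role="analyst", region="EMEA")
sql, params = compile_query(["region", "amount"], analyst)

# Asking for an ungranted column fails compilation; no SQL is produced or run.
try:
    compile_query(["customer_email"], analyst)
    blocked = False
except PermissionError:
    blocked = True
```

In this sketch the analyst's request compiles to a statement that already carries the region filter, and the request for `customer_email` never becomes a query at all. A production compiler would also resolve joins, metrics, and dialects; those are omitted here.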

Colrows vs each catalog (head-to-head on agent execution)

Compile-time deterministic execution vs metadata-grounded LLM generation. Colrows resolves intent against a typed graph, proves join paths, and emits SQL deterministically. Alation/Atlan/Collibra ground an LLM with better context - improving but not guaranteeing the output.
Compile-time policy injection vs inherited/runtime governance. Colrows injects RBAC/ABAC/row-column predicates before SQL runs. The catalogs inherit platform RBAC, push tags/masking, or check context "before acting" - enforcement is downstream or advisory.
Dialect-perfect multi-warehouse SQL vs source-agnostic metadata. Colrows compiles to Snowflake, Databricks, Redshift, BigQuery, Postgres, MySQL, and more from one graph. Catalogs document across sources but don't compile execution.
Autonomous semantic maintenance vs human curation. Colrows builds and maintains its graph autonomously (drift detection, conflict resolution). Catalogs rely on stewardship or AI-assisted enrichment plus human certification.
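The dialect point in the list above can be illustrated with a small sketch: one logical query rendered differently per engine. This is not Colrows's compiler. `MonthlyTotal`, `render`, and the table and column names are hypothetical, and the example assumes the date column has a DATE type. The month-truncation syntax shown for Snowflake, BigQuery, and MySQL follows each engine's documented functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyTotal:
    """One logical query: sum a measure per calendar month."""
    table: str
    date_column: str
    measure: str


def _month_expr(dialect: str, column: str) -> str:
    """Render 'truncate to month' in the syntax each engine expects."""
    if dialect == "snowflake":
        return f"DATE_TRUNC('MONTH', {column})"
    if dialect == "bigquery":
        return f"DATE_TRUNC({column}, MONTH)"
    if dialect == "mysql":
        # MySQL has no DATE_TRUNC; this yields the first of the month as a string.
        return f"DATE_FORMAT({column}, '%Y-%m-01')"
    raise ValueError(f"unsupported dialect: {dialect}")


def render(query: MonthlyTotal, dialect: str) -> str:
    """Emit SQL for one dialect from the shared logical definition."""
    month = _month_expr(dialect, query.date_column)
    return (
        f"SELECT {month} AS order_month, SUM({query.measure}) AS total "
        f"FROM {query.table} GROUP BY 1 ORDER BY 1"
    )


q = MonthlyTotal(table="orders", date_column="order_date", measure="amount")
by_dialect = {d: render(q, d) for d in ("snowflake", "bigquery", "mysql")}
```

The logical definition is written once, and only the rendering step knows about engines. A catalog that documents `order_date` across three warehouses carries the metadata but not this rendering step.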

The honest gaps - where catalogs are stronger. Colrows does not do data discovery, end-to-end lineage visualization, business glossary stewardship workflows, privacy/compliance tooling, or AI-model governance/registries. Those are exactly the jobs Alation, Atlan, and Collibra do well. Colrows is not a catalog replacement; it is the execution layer below the agent and above the warehouse.

Alternative semantic execution layers (5 platforms)

dbt Semantic Layer / MetricFlow - code-first, vendor-neutral metric definitions. MetricFlow was open-sourced (Apache 2.0) in October 2025. Limit: serving requires dbt Cloud's paid Semantic Layer API. Choose it if dbt is the unambiguous center of your stack.
Cube - headless, MCP-native semantic layer with row-level, multi-tenant security applied at compile time. Brex chose Cube over the dbt Semantic Layer and LookML to build an embedded AI financial analyst.
Snowflake Cortex Analyst - warehouse-native, semantic-model-driven text-to-SQL claiming 90%+ SQL accuracy on real-world use cases. All SQL runs inside Snowflake's RBAC framework. Limit: single-platform (Snowflake). See our Cortex Analyst alternatives.
Databricks Genie - lakehouse-native; Genie Ontology (announced June 2026) grounds agents using an "OntoRank" (PageRank-style) authority method. Single-platform (Databricks).
ThoughtSpot Spotter - search-driven and explicitly deterministic. In ThoughtSpot's words, it uses patented search tokens, "not text-to-SQL powered by LLMs, which is why we can guarantee the most consistent, trustworthy insights."

The unifying signal: Open Semantic Interchange (OSI), launched by Snowflake, Salesforce, dbt Labs, BlackRock, and RelationalAI on September 23, 2025 (spec finalized in early 2026), now includes Alation, Atlan, Collibra, Cube, ThoughtSpot, and others - the industry admitting that semantics must be defined once and be portable. OSI is the interchange format; it is not itself an execution engine.

Staged buyer guidance

Stage 1 - Discovery, lineage, compliance for humans. Keep (or buy) a catalog. Choose Alation for usage-driven discovery and literacy; Atlan for modern-stack integration; Collibra for deep governance in regulated industries.
Stage 2 - Agents in pilot. Add a semantic/execution layer scoped to one domain. Single-platform: warehouse-native (Cortex Analyst on Snowflake, Genie on Databricks). Multi-platform or embedding: evaluate Cube or a dedicated semantic execution layer.
Stage 3 - Agents in production across multiple warehouses, regulated data, or multi-tenant. Adopt a compile-time semantic execution layer (Colrows) that injects RBAC/ABAC at compile time and emits dialect-perfect SQL with a full audit trail - running alongside your catalog.

Frequently asked questions

Do Alation, Atlan, or Collibra guarantee the same SQL for the same question every time?

No. None publishes a determinism guarantee. Deterministic compilation is claimed by semantic execution layers (Colrows; ThoughtSpot Spotter's token-based, non-LLM approach).

Is Atlan's "87%" an agent SQL-accuracy number?

No. It measures agent-generated description quality ("87% rated on par or better than human writing," April 2026, self-reported by Atlan AI Labs) - documentation, not execution.

Where do these catalogs enforce data access policy?

At runtime or via inherited platform access (Snowflake RBAC, tag-based masking) - not by injecting predicates into compiled SQL before execution.

Which is most expensive?

Collibra (~$170K-$510K+/yr, often 2-3x Atlan), then Alation (~$198K entry, ~$413K mid-market per GigaOm). Atlan is typically lower entry; all three are quote-only.

Are analysts recommending "catalog + semantic layer"?

Yes - the layered "catalog + semantic layer + context layer" architecture is the emerging consensus, and Gartner's Market Guide for Agentic Analytics (Feb 2026) projects 60% of MCP-only agentic-analytics projects will fail by 2028 without a consistent semantic layer.

A note on the claims

Treat the following vendor-internal benchmarks with caution: Atlan's 87% (documentation quality, self-reported by Atlan AI Labs); Alation's "90%+" (eval-framework accuracy from its Numbers Station acquisition); Snowflake Cortex Analyst's "90%+ SQL accuracy" (internal 150-question benchmark); and Databricks Genie's 84.5% first-attempt accuracy (internal 28-question benchmark). All are vendor-reported and not independently audited. Capabilities and positioning move fast - Alation Agent Builder GA Q1 2026, Collibra AI Command Center May 2026, Atlan Context Lakehouse 2026 - so re-verify before procurement.