The runtime architecture, in one line:
Below, those four phases are unpacked into the seven concrete steps the platform performs on every request.
-
Connect
Connect data sources and ingest schema
Colrows introspects warehouses, lakes, operational databases, and metadata catalogues. It pulls schemas, columns, foreign keys, primary keys, and any existing metric definitions into a single representation. The connector layer supports 16+ engines out of the box - Snowflake, Databricks, BigQuery, Redshift, Postgres, MySQL, ClickHouse, Trino, and more - alongside structured documentation surfaces like Confluence and Salesforce notes.
Connection takes minutes per source. There is no upfront migration: the data stays where it lives.
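As a rough illustration of what schema introspection involves, the sketch below reads tables, columns, primary keys, and foreign keys from an in-memory SQLite database using only the Python standard library. SQLite is a stand-in chosen so the example is self-contained; the `introspect` function and the shape of the dictionary it returns are assumptions for illustration, not the Colrows connector API.

```python
import sqlite3


def introspect(conn):
    """Return {table: {"columns", "primary_key", "foreign_keys"}} for a SQLite connection."""
    schema = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = {
            "columns": [c[1] for c in cols],
            "primary_key": [c[1] for c in cols if c[5] > 0],
            # (local column, referenced table, referenced column)
            "foreign_keys": [(fk[3], fk[2], fk[4]) for fk in fks],
        }
    return schema


# Hypothetical two-table source used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        total REAL
    );
    """
)
schema = introspect(conn)
```

The data itself is never copied here; only metadata is read, which matches the "no upfront migration" point above.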
-
Build
Build the typed semantic graph
On top of the mapped structure, Colrows autonomously constructs a typed, versioned semantic graph - entities (Customer, Subscription, Order), metrics (Revenue, Churn, Margin), relationships (ownership, dependency, causality), constraints (valid transformations, thresholds), and governance predicates (who sees what, under which conditions).
The graph is multi-scope: the same concept can resolve differently at global, datastore, persona, and user scope. Multi-vector embeddings (one for the formal definition, one for observed usage, one combined) ground each concept against language as it is actually used in the business.
-
Resolve
Resolve intent in context
When intent arrives - whether a natural-language question, a dashboard refresh, or an AI agent's tool call - Colrows resolves every term through the graph using the requesting identity, role, persona, and scope. The string "revenue" is not just a token; it resolves to a specific entity with a specific formula, source, and governance scope.
The same question asked by a finance controller and an account executive resolves to different concepts. The same question asked from EU and US scopes resolves to different policies. Resolution is deterministic: identical intent + identical context produces identical resolved terms.
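The context-dependent resolution described above can be sketched as a lookup that walks scopes from most to least specific. Everything in this example is hypothetical: the precedence order (user, then persona, then datastore, then global), the `Context` fields, and the sample formulas are assumptions chosen for illustration, not Colrows's actual resolution rules.

```python
from dataclasses import dataclass

# Assumed precedence: most specific scope wins.
SCOPE_ORDER = ("user", "persona", "datastore", "global")


@dataclass(frozen=True)
class Context:
    user: str
    persona: str
    datastore: str


# Hypothetical definitions keyed by (term, scope, scope value).
DEFINITIONS = {
    ("revenue", "global", "*"): "SUM(orders.total)",
    ("revenue", "persona", "finance_controller"): "SUM(invoices.recognised_amount)",
    ("revenue", "persona", "account_executive"): "SUM(opportunities.booked_amount)",
}


def resolve(term: str, ctx: Context) -> str:
    """Return the definition of `term` from the most specific matching scope."""
    scope_values = {
        "user": ctx.user,
        "persona": ctx.persona,
        "datastore": ctx.datastore,
        "global": "*",
    }
    for scope in SCOPE_ORDER:
        hit = DEFINITIONS.get((term.lower(), scope, scope_values[scope]))
        if hit is not None:
            return hit
    raise KeyError(f"no definition for {term!r} in any scope")


controller = Context(user="alice", persona="finance_controller", datastore="snowflake_prod")
executive = Context(user="bob", persona="account_executive", datastore="snowflake_prod")
analyst = Context(user="carol", persona="analyst", datastore="snowflake_prod")
```

Because `resolve` is a pure function of the term and the context, calling it twice with the same inputs returns the same definition, which is the determinism property in miniature. A persona with no override, such as `analyst` here, falls through to the global definition.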
-
Plan
Plan and prove the join paths
The constrained planner searches the graph for valid join paths between the resolved entities. Every join path is formally proven against the graph's typed relationships before any SQL is generated. If no proven path exists, planning fails - Colrows refuses to fabricate a join that the graph does not support.
This is the structural difference between a semantic execution layer and a generic text-to-SQL agent. An LLM can hallucinate a join; a constrained planner can only emit joins the graph supports.
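To make the "fail rather than fabricate" behaviour concrete, here is a minimal sketch that searches a small set of declared relationships for a join path and raises when none exists. Breadth-first search over declared edges is a simplified stand-in for whatever proof procedure Colrows actually uses; the `RELATIONSHIPS` table, the `plan_joins` function, and the `NoProvenPath` exception are hypothetical names.

```python
from collections import deque

# Hypothetical typed relationships: (entity, entity) -> join condition.
RELATIONSHIPS = {
    ("Customer", "Subscription"): "customer.id = subscription.customer_id",
    ("Subscription", "Invoice"): "subscription.id = invoice.subscription_id",
}


class NoProvenPath(Exception):
    """Raised when the declared relationships do not connect two entities."""


def plan_joins(source, target):
    """Return the join conditions along the shortest declared path, or raise."""
    adjacency = {}
    for (a, b), condition in RELATIONSHIPS.items():
        adjacency.setdefault(a, []).append((b, condition))
        adjacency.setdefault(b, []).append((a, condition))

    queue = deque([(source, [])])
    seen = {source}
    while queue:
        entity, path = queue.popleft()
        if entity == target:
            return path
        for neighbour, condition in adjacency.get(entity, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [condition]))
    # No declared path: refuse instead of guessing a join.
    raise NoProvenPath(f"no declared relationship path from {source} to {target}")
```

With these sample relationships, `plan_joins("Customer", "Invoice")` returns the two conditions that route through `Subscription`, while `plan_joins("Customer", "Product")` raises `NoProvenPath` because no declared edge reaches `Product`.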
-
Compile
Compile to dialect-perfect, governed SQL
The compiler emits dialect-perfect SQL for the target engine. The same intent against Snowflake produces different SQL than the same intent against Databricks, Postgres, or BigQuery - each is optimised for its dialect.
Crucially, RBAC, ABAC, and row/column-level predicates are injected at compile time, before any SQL leaves the planner. Unauthorised queries fail compilation; their data is never read. Governance is structural, not advisory.
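The idea of governance applied at compile time can be sketched as a compiler function that either returns SQL with the policy predicate already embedded or raises before any SQL exists. The policy table, role names, and `compile_query` signature below are assumptions made for illustration; a real compiler would also handle dialects, parameter binding, and far richer policies.

```python
class CompilationError(Exception):
    """Raised when a query is not permitted for the requesting role."""


# Hypothetical policies: role -> (allowed columns, row-level predicate or None).
POLICIES = {
    "finance_controller": ({"region", "amount"}, None),
    "account_executive": ({"region", "amount"}, "owner_id = :user_id"),
}


def compile_query(role, table, columns):
    """Emit SQL with policy predicates injected, or fail before emitting anything."""
    if role not in POLICIES:
        raise CompilationError(f"role {role!r} has no policy for {table}")
    allowed, row_predicate = POLICIES[role]

    # Column-level check: reject the whole query if any column is not permitted.
    denied = sorted(set(columns) - allowed)
    if denied:
        raise CompilationError(f"columns not permitted for {role!r}: {denied}")

    sql = f"SELECT {', '.join(columns)} FROM {table}"
    # Row-level predicate becomes part of the statement itself.
    if row_predicate:
        sql += f" WHERE {row_predicate}"
    return sql
```

In this sketch an unauthorised request never yields a SQL string at all, so there is nothing to send to the engine. That is the sense in which the policy is structural: the caller cannot obtain an ungoverned statement and choose to run it anyway.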
-
Execute
Execute, audit, and trace
The compiled SQL runs on the underlying engine. Colrows does not store query results or move data; the warehouse remains the single source of truth.
Every query produces a point-in-time reproducible audit trail - the graph version, identity context, resolved entities, proven join paths, and the exact compiled SQL. Historical queries can be re-run with the definitions that were in force at that moment, which is what makes Colrows safe in regulated environments where audits arrive months after the query.
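One way to picture such an audit record is as an immutable structure holding the five items listed above. The field names are taken from that list, but the `AuditRecord` class and its `fingerprint` method are hypothetical: the content hash is an addition of this sketch, included to show how two runs could be checked for equivalence, and is not a documented Colrows feature.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    graph_version: str
    identity: dict
    resolved_entities: dict
    join_paths: list
    compiled_sql: str

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON encoding of every field."""
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical record for a single query.
record = AuditRecord(
    graph_version="v42",
    identity={"user": "alice", "persona": "finance_controller"},
    resolved_entities={"revenue": "SUM(invoices.recognised_amount)"},
    join_paths=["customer.id = invoice.customer_id"],
    compiled_sql="SELECT SUM(recognised_amount) FROM invoices",
)
```

Re-running a historical query against the same graph version and context should rebuild an identical record and therefore an identical fingerprint; a change to any field, including the graph version, produces a different one.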
-
Maintain
Maintain the graph autonomously
Schemas drift. Columns are renamed. New tables appear. Business logic evolves. Autonomous maintenance agents watch every connected source for drift, detect the change, and propose graph updates that humans approve in-product.
The result: the semantic graph stays current without a manual catalogue-update process. Human reviewers focus on edge cases and intent, not on chasing schema changes through documentation.
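The detection half of drift handling can be sketched as a diff between the last known schema snapshot and a freshly observed one. The snapshot shape (`{table: set of columns}`), the change labels, and the `detect_drift` function are assumptions for illustration; the approval workflow is out of scope here.

```python
def detect_drift(known, observed):
    """Compare {table: set(columns)} snapshots and return proposed changes.

    Each proposal is a (kind, table, column) tuple; column is None for
    table-level changes.
    """
    proposals = []
    for table in sorted(observed.keys() - known.keys()):
        proposals.append(("new_table", table, None))
    for table in sorted(known.keys() - observed.keys()):
        proposals.append(("dropped_table", table, None))
    for table in sorted(known.keys() & observed.keys()):
        for column in sorted(observed[table] - known[table]):
            proposals.append(("added_column", table, column))
        for column in sorted(known[table] - observed[table]):
            proposals.append(("removed_column", table, column))
    return proposals


# Hypothetical snapshots: `total` was renamed to `amount`, and `refunds` is new.
known = {"orders": {"id", "total"}, "customer": {"id"}}
observed = {"orders": {"id", "amount"}, "customer": {"id"}, "refunds": {"id"}}
proposals = detect_drift(known, observed)
```

Note that a plain diff reports a rename as one removed column plus one added column. Deciding that `total` became `amount` takes extra evidence, such as matching types or value distributions, which is one reason a human approval step is useful.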
Why this pipeline matters for AI
Generic text-to-SQL agents skip steps 2, 4, and 5 (Build, Plan, and Compile). They have no typed graph to resolve against, no constrained planner to prove the joins, and no compile-time governance to enforce policy. The output looks plausible but is fabricated; the audit trail is "the model said so."
Colrows compiles every query through all seven steps. The output is deterministic, the joins are proven, the policies are structural, and the audit trail is reproducible. That is the structural difference between a guess and a query.
Where to go next
- What is a semantic layer? - the foundational definition and the comparison to data catalogs and warehouses
- Glossary - precise definitions for every term used above
- Architecture documentation - the deeper engineering reference
- Pharma case study, BFSI case study, Travel-retail case study - the pipeline applied to real workloads