Datasources

Colrows connects to your warehouses, lakehouses, and operational databases through a uniform connector layer. The semantic graph is warehouse-agnostic: you define a concept once, and Colrows compiles it to SQL in the dialect of whichever backend serves the query.

Supported datasources

Category	Engines	Auth modes
Cloud warehouses	Snowflake, Databricks SQL, Google BigQuery, Amazon Redshift	Username/password, key-pair, OAuth, IAM
Lakehouse / query engines	Trino, Starburst, Presto, Athena, Dremio	JDBC + LDAP/Kerberos/IAM
OLAP	ClickHouse, Druid, Pinot, Exasol	JDBC, native auth
RDBMS	PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, IBM Db2	JDBC, AD/Kerberos, certificate
Cloud-native	AlloyDB, Aurora Postgres, Cloud SQL, Azure Database	IAM, password
NoSQL / Search	MongoDB, Elasticsearch (read-only)	Native auth, X.509

Onboarding a datasource

Open Datasources → Add datasource

Pick the connector. Each connector exposes only the fields it actually needs - there is no generic "advanced JDBC URL" trapdoor.

Provide connection details

Enter the host, port, database, and warehouse / catalog / schema as applicable, plus credentials. Colrows then runs a connection probe in your network and reports the latency.

# Snowflake
host        = acme.snowflakecomputing.com
warehouse   = COMPUTE_WH
database    = ANALYTICS
role        = COLROWS_READER
auth        = key-pair    # or password
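To check reachability yourself before filling in the form, a TCP connect timing is a rough stand-in for the latency probe described above. This is a minimal sketch, not Colrows' own probe: the function name `probe_latency_ms` is hypothetical, and it measures only DNS resolution plus the TCP handshake, not authentication or query time.

```python
import socket
import time


def probe_latency_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return the milliseconds taken to open a TCP connection to host:port.

    Hypothetical helper for illustration; raises OSError if the endpoint
    cannot be reached within `timeout` seconds.
    """
    start = time.perf_counter()
    # create_connection resolves the host name and completes the TCP handshake.
    with socket.create_connection((host, port), timeout=timeout):
        pass  # The connection is closed as soon as the handshake succeeds.
    return (time.perf_counter() - start) * 1000.0
```

For the Snowflake example above, `probe_latency_ms("acme.snowflakecomputing.com", 443)` would time the connection to the HTTPS endpoint, assuming that host is reachable from where you run it.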

Choose how Colrows reaches the database

Direct, SSH tunnel, or SSL with a custom CA. The network section below has the full set.

Initial crawl

Colrows scans the schema, registers candidate datasets and columns into the semantic graph, and runs distribution fingerprinting to bootstrap drift detection. You can scope the crawl to specific catalogs / schemas - new objects can be promoted later.
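The fingerprinting method Colrows uses is not described here. As an illustration of the general idea only, one common approach stores a normalized histogram per column at crawl time and later compares it against a fresh histogram. The sketch below assumes numeric values in a known range and uses total variation distance as the drift score; both function names are hypothetical.

```python
def fingerprint(values, bins=10, lo=0.0, hi=1.0):
    """Return a normalized histogram: the fraction of values in each bin.

    Assumes numeric values; anything outside [lo, hi] is clamped to the
    first or last bin.
    """
    if not values:
        raise ValueError("cannot fingerprint an empty column")
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def total_variation(p, q):
    """Total variation distance between two histograms: 0 = identical, 1 = disjoint."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

A baseline fingerprint taken during the initial crawl can then be compared with one taken later; a score near 0 suggests the distribution is stable, and a score near 1 suggests it has shifted substantially. The threshold that counts as drift is a judgment call and is not specified by this page.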

Bind concepts

Open Consensus and bind your business concepts to anchors in the new datasource. Once bound, every query - analyst, dashboard, or AI - runs through governance.

Driver setup

Colrows ships with built-in drivers for every supported engine. For self-hosted deployments, drivers live in /opt/colrows/drivers and are version-pinned per release. Custom dialects can be registered through the SQL Engine SDK - contact support for the SDK.

Network options

Direct connection - Colrows Cloud reaches your database from a fixed set of egress IPs (provided in Datasources → Network). Simplest setup; works for any internet-reachable database.
SSH tunnel - Colrows opens a tunnel through a bastion you control. Useful when the database has no public endpoint.
SSL / mTLS - upload your CA bundle and, optionally, a client certificate. Required by some regulated deployments.
Private link - AWS PrivateLink, Azure Private Endpoint, GCP Private Service Connect. Available on Enterprise plans.
Self-hosted runner - for fully air-gapped environments, deploy the Colrows runner inside your VPC and let it call out to the control plane over HTTPS.
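For the direct-connection option, the database firewall has to admit the Colrows egress IPs. The sketch below shows one way to sanity-check an allowlist before applying it. The ranges are placeholders from the RFC 5737 documentation blocks, not real Colrows addresses, and `is_allowed_source` is a hypothetical helper.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks). Substitute the egress
# IPs listed under Datasources → Network for your account.
EGRESS_RANGES = ["203.0.113.0/28", "198.51.100.7/32"]


def is_allowed_source(ip: str, ranges=EGRESS_RANGES) -> bool:
    """Return True if `ip` falls inside any of the allowlisted CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)
```

Running a few known addresses through a check like this before changing firewall rules helps catch a mistyped CIDR prefix.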

Use a read-only role.

Colrows needs SELECT, SHOW, and DESCRIBE on the catalogs you want governed. It never asks for write privileges. Granting more than necessary is a control failure waiting to happen.
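One way to enforce this is to audit the role's grants before connecting it. The following is a minimal sketch that assumes you have already exported the role's privileges as a list of names. Privilege names differ between engines, so treat the `REQUIRED` set as illustrative and `audit_grants` as a hypothetical helper.

```python
# Privileges named on this page; actual privilege names vary by engine.
REQUIRED = frozenset({"SELECT", "SHOW", "DESCRIBE"})


def audit_grants(granted):
    """Compare a role's privileges with the read-only set.

    Returns the required privileges that are missing and any granted
    privileges beyond the read-only set.
    """
    normalized = {g.strip().upper() for g in granted}
    return {
        "missing": sorted(REQUIRED - normalized),
        "excess": sorted(normalized - REQUIRED),
    }
```

A non-empty `excess` list is the signal to tighten the role before using it with Colrows.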

Cross-source semantics

Colrows can compile queries that span multiple datasources where a valid join path exists. The planner pushes down per dialect and only materializes intermediate results when no pushdown is possible. This is how the same metric definition can be served from Snowflake for analytics and from Postgres for operational queries - without duplicating logic.
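To make "one definition, many backends" concrete, the sketch below renders a single metric definition into SQL for several dialects that differ in identifier quoting. It illustrates the idea only and is not the Colrows planner: the `Metric` class, `QUOTES` table, and `compile_metric` function are hypothetical, and the code does not escape quote characters embedded in identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """A dialect-independent metric definition (hypothetical shape)."""
    name: str
    aggregate: str  # e.g. "SUM"
    column: str
    table: str


# Identifier quoting per dialect. Quoted identifiers are case-sensitive
# in PostgreSQL and Snowflake.
QUOTES = {
    "postgres": ('"', '"'),
    "snowflake": ('"', '"'),
    "mysql": ("`", "`"),
    "sqlserver": ("[", "]"),
}


def quote(identifier: str, dialect: str) -> str:
    """Wrap an identifier in the dialect's quote characters."""
    open_q, close_q = QUOTES[dialect]
    return f"{open_q}{identifier}{close_q}"


def compile_metric(metric: Metric, dialect: str) -> str:
    """Render the metric as a SELECT statement for the given dialect."""
    col = quote(metric.column, dialect)
    alias = quote(metric.name, dialect)
    table = quote(metric.table, dialect)
    return f"SELECT {metric.aggregate}({col}) AS {alias} FROM {table}"
```

The same `Metric("revenue", "SUM", "amount", "orders")` compiles to double-quoted identifiers for Postgres and Snowflake and to bracketed identifiers for SQL Server, so the business logic lives in one place while the SQL text varies by backend.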
