# Datasources
Colrows connects to your warehouses, lakehouses, and operational databases through a uniform connector layer. The semantic graph is warehouse-agnostic - one definition, many backends, dialect-perfect SQL at the edge.
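The "one definition, many backends" idea can be sketched as a metric definition rendered into backend-specific SQL. The `Metric` class and quoting table below are illustrative stand-ins, not the Colrows API:

```python
# Hypothetical sketch: one metric definition, rendered per backend dialect.
# Only identifier quoting differs here; real dialect differences go further
# (functions, date math, casts), but the shape of the idea is the same.
from dataclasses import dataclass

QUOTE = {"snowflake": '"', "postgres": '"', "bigquery": "`"}

@dataclass
class Metric:
    name: str
    agg: str
    column: str
    table: str

    def to_sql(self, dialect: str) -> str:
        q = QUOTE[dialect]
        return (f"SELECT {self.agg}({q}{self.column}{q}) AS {q}{self.name}{q} "
                f"FROM {q}{self.table}{q}")

revenue = Metric("revenue", "SUM", "amount", "orders")
print(revenue.to_sql("bigquery"))
# SELECT SUM(`amount`) AS `revenue` FROM `orders`
```

The definition is written once; the dialect is chosen at compile time, at the edge.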
## Supported datasources
| Category | Engines | Auth modes |
|---|---|---|
| Cloud warehouses | Snowflake, Databricks SQL, Google BigQuery, Amazon Redshift | Username/password, key-pair, OAuth, IAM |
| Lakehouse / query engines | Trino, Starburst, Presto, Athena, Dremio | JDBC + LDAP/Kerberos/IAM |
| OLAP | ClickHouse, Druid, Pinot, Exasol | JDBC, native auth |
| RDBMS | PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, IBM Db2 | JDBC, AD/Kerberos, certificate |
| Cloud-native | AlloyDB, Aurora Postgres, Cloud SQL, Azure Database | IAM, password |
| NoSQL / Search | MongoDB, Elasticsearch (read-only) | Native auth, X.509 |
## Onboarding a datasource
1. **Open Datasources → Add datasource**

   Pick the connector. Each connector exposes only the fields it actually needs - there is no generic "advanced JDBC URL" trapdoor.

2. **Provide connection details**

   Host, port, database, warehouse / catalog / schema as applicable, plus credentials. Colrows runs a probe in your network and reports the latency.

   ```
   # Snowflake
   host      = acme.snowflakecomputing.com
   warehouse = COMPUTE_WH
   database  = ANALYTICS
   role      = COLROWS_READER
   auth      = key-pair   # or password
   ```

3. **Choose how Colrows reaches the database**

   Direct, SSH tunnel, or SSL with a custom CA. The network section below has the full set.

4. **Initial crawl**

   Colrows scans the schema, registers candidate datasets and columns into the semantic graph, and runs distribution fingerprinting to bootstrap drift detection. You can scope the crawl to specific catalogs / schemas - new objects can be promoted later.

5. **Bind concepts**

   Open Consensus and bind your business concepts to anchors in the new datasource. Once bound, every query - analyst, dashboard, or AI - runs through governance.
## Driver setup
Colrows ships with built-in drivers for every supported engine. For self-hosted deployments, drivers live in /opt/colrows/drivers and are version-pinned per release. Custom dialects can be registered through the SQL Engine SDK - contact support for the SDK.
## Network options
- Direct connection - Colrows Cloud reaches your database from a fixed set of egress IPs (provided in Datasources → Network). Simplest setup; works for any internet-reachable database.
- SSH tunnel - Colrows opens a tunnel through a bastion you control. Useful when the database has no public endpoint.
- SSL / mTLS - upload your CA bundle and (optionally) client cert. Required by some regulated deployments.
- Private link - AWS PrivateLink, Azure Private Endpoint, GCP Private Service Connect. Available on Enterprise plans.
- Self-hosted runner - for fully air-gapped environments, deploy the Colrows runner inside your VPC and let it call out to the control plane over HTTPS.
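Before choosing between direct connection and a tunnel, a quick TCP reachability check from your own network is useful - a rough stand-in for the probe Colrows runs during setup, not the probe itself:

```python
# Quick reachability / latency check for a database endpoint.
# This only measures TCP connect time; it does not authenticate.
import socket
import time

def probe(host: str, port: int, timeout: float = 3.0):
    """Return TCP connect latency in milliseconds, or None if unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        return None

# e.g. probe("acme.snowflakecomputing.com", 443)
```

If this returns `None` from the network where Colrows (or the runner) will sit, a direct connection will not work and you need a tunnel, private link, or self-hosted runner.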
Colrows needs SELECT, SHOW, and DESCRIBE on the catalogs you want governed. It never asks for write privileges. Granting more than necessary is a control failure waiting to happen.
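The minimal grant set can be written out explicitly. The sketch below emits Snowflake-style SQL; `COLROWS_READER` and `ANALYTICS` follow the example config above, and the exact statements vary by engine (on Snowflake, `SHOW`/`DESCRIBE` work once the role has `USAGE` on the container and `SELECT` on the tables):

```python
# Sketch of a minimal read-only grant set, Snowflake-style.
# Swap in your own role, database, and schema names; other engines
# use different syntax for the same read-only scope.
ROLE = "COLROWS_READER"

def readonly_grants(database: str, schema: str) -> list[str]:
    scope = f"{database}.{schema}"
    return [
        f"GRANT USAGE ON DATABASE {database} TO ROLE {ROLE};",
        f"GRANT USAGE ON SCHEMA {scope} TO ROLE {ROLE};",
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {scope} TO ROLE {ROLE};",
    ]

for stmt in readonly_grants("ANALYTICS", "PUBLIC"):
    print(stmt)
```

Nothing in the set grants writes - which is the point.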
## Cross-source semantics
Colrows can compile queries that span multiple datasources wherever a valid join path exists. The planner pushes work down to each engine in its native dialect and materializes intermediate results only when no pushdown is possible. This is how the same metric definition can be served from Snowflake for analytics and from Postgres for operational queries - without duplicating logic.
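The materialization path can be illustrated with two in-memory SQLite databases standing in for, say, a warehouse and an operational store - a sketch of the technique, not of Colrows' planner:

```python
# When no cross-engine pushdown is possible, materialize the smaller side
# into the other engine as a temp table, then join locally.
import sqlite3

warehouse = sqlite3.connect(":memory:")   # stands in for the analytical source
ops = sqlite3.connect(":memory:")         # stands in for the operational source

warehouse.execute("CREATE TABLE orders (customer_id INT, amount REAL)")
warehouse.executemany("INSERT INTO orders VALUES (?, ?)",
                      [(1, 120.0), (1, 80.0), (2, 40.0)])

ops.execute("CREATE TABLE customers (id INT, name TEXT)")
ops.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Acme"), (2, "Globex")])

# Materialize the operational rows into the warehouse connection...
rows = ops.execute("SELECT id, name FROM customers").fetchall()
warehouse.execute("CREATE TEMP TABLE _customers (id INT, name TEXT)")
warehouse.executemany("INSERT INTO _customers VALUES (?, ?)", rows)

# ...then run the cross-source join as a single local query.
result = warehouse.execute(
    "SELECT c.name, SUM(o.amount) FROM orders o "
    "JOIN _customers c ON c.id = o.customer_id "
    "GROUP BY c.name ORDER BY c.name").fetchall()
print(result)  # [('Acme', 200.0), ('Globex', 40.0)]
```

A real planner would also decide which side to move based on cardinality estimates; the sketch hard-codes that choice.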