ClickHouse
Overview
ClickHouse is a high-performance open-source columnar database management system (DBMS) for online analytical processing (OLAP). SaddleData connects to ClickHouse as a Destination, enabling high-throughput data ingestion into your analytical tables.
Capabilities
- Destination: Load data from any SaddleData source into ClickHouse.
- High-Throughput: Uses the native ClickHouse protocol with batched inserts to maximize ingestion throughput.
- Schema Evolution: Automatically creates tables (MergeTree) and adds columns (ALTER TABLE) as your data changes.
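To illustrate what "batched inserts" means in practice, here is a minimal sketch of grouping rows into fixed-size batches before each insert. The helper name `iter_batches` and the default batch size are assumptions for illustration only; SaddleData's actual batching logic and batch size are not documented on this page.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_batches(rows: Iterable[dict], batch_size: int = 10_000) -> Iterator[list[dict]]:
    """Yield lists of at most batch_size rows (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    # islice pulls up to batch_size rows per pass; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch
```

Each yielded batch would then be sent as a single insert, which is generally much cheaper for ClickHouse than many single-row inserts.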
Configuration
To connect to ClickHouse, provide the following credentials:
- Host: The hostname or IP address of your ClickHouse server.
- Port: The Native TCP port (typically 9000), NOT the HTTP port (8123).
- Database: The target database name.
- Username: The database user.
- Password: The user's password.
- SSL: Enable if your server requires a secure connection (e.g., ClickHouse Cloud).
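The port setting is the easiest of these to get wrong. The following is a minimal sketch of a pre-flight check, assuming ClickHouse's default port assignments (9000 native, 9440 native over TLS, 8123 HTTP, 8443 HTTPS). The function `check_native_port` is hypothetical and not part of SaddleData, and a server may be configured with different ports.

```python
# Default ClickHouse ports (assumption: the server uses the defaults).
NATIVE_PORTS = {9000: False, 9440: True}  # native port -> whether it expects TLS
HTTP_PORTS = {8123, 8443}                 # HTTP and HTTPS interface ports


def check_native_port(port: int, ssl: bool) -> list[str]:
    """Return warnings for a port/ssl combination (hypothetical helper)."""
    warnings = []
    if port in HTTP_PORTS:
        warnings.append(
            f"port {port} is an HTTP interface port; use the native TCP port instead"
        )
    elif port in NATIVE_PORTS and NATIVE_PORTS[port] != ssl:
        expected = "enabled" if NATIVE_PORTS[port] else "disabled"
        warnings.append(f"port {port} normally expects SSL {expected}")
    return warnings
```

For example, `check_native_port(8123, False)` returns one warning, while `check_native_port(9000, False)` returns an empty list.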
Sync Modes
Supported destination modes:
- Full Refresh - Overwrite: Truncates the destination table and reloads all data.
- Incremental - Append: Appends new records to the destination table.
Note: "Incremental - Deduped (Upsert)" is currently not supported for ClickHouse because mutations are expensive in OLAP systems. We recommend using a ReplacingMergeTree engine or handling deduplication at query time.
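As a rough sketch of the two workarounds mentioned in the note, the helpers below build the ClickHouse SQL for a ReplacingMergeTree table and for a query that deduplicates at read time. The function names and the example table and column names are hypothetical. This page does not confirm whether SaddleData loads into a table you created yourself, so treat this as an illustration of the ClickHouse side only.

```python
def replacing_table_ddl(table: str, key: str, version: str, columns: dict[str, str]) -> str:
    """Build a CREATE TABLE statement using ReplacingMergeTree (hypothetical helper).

    Rows sharing the same ORDER BY key are collapsed during background merges,
    keeping the row with the highest value in the version column.
    """
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
    return (
        f"CREATE TABLE IF NOT EXISTS {table} ({cols}) "
        f"ENGINE = ReplacingMergeTree({version}) ORDER BY {key}"
    )


def latest_rows_query(table: str) -> str:
    """Build a query that deduplicates at read time (hypothetical helper).

    Merges run asynchronously, so duplicates can still be on disk;
    FINAL applies the replacement logic while reading.
    """
    return f"SELECT * FROM {table} FINAL"
```

For instance, `replacing_table_ddl("events", "id", "updated_at", {"id": "Int64", "updated_at": "DateTime64(3)"})` produces a table in which the newest `updated_at` wins for each `id`. `FINAL` adds read-time cost, so it is a trade-off rather than a free replacement for upserts.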
Schema Handling
- Automatic Table Creation: Creates tables using the MergeTree engine.
- Drift Detection: Detects new columns in the source data and alters the ClickHouse table to include them.
- Type Inference: Automatically maps source types to ClickHouse types (Int64, Float64, DateTime64, Nullable(String)).
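The type inference and drift detection described above can be sketched as follows. This is an illustration under stated assumptions, not SaddleData's implementation: the function names are hypothetical, the DateTime64 precision of 3 is assumed, and treating booleans and all other values as Nullable(String) is a guess, since the page lists only the four target types.

```python
from datetime import datetime


def infer_clickhouse_type(value: object) -> str:
    """Map a Python value to one of the ClickHouse types listed above (hypothetical)."""
    # bool is a subclass of int, so check it first and send it to the fallback.
    if isinstance(value, bool):
        return "Nullable(String)"
    if isinstance(value, int):
        return "Int64"
    if isinstance(value, float):
        return "Float64"
    if isinstance(value, datetime):
        return "DateTime64(3)"  # precision 3 (milliseconds) is an assumption
    return "Nullable(String)"


def drift_statements(table: str, existing_columns: set[str], record: dict) -> list[str]:
    """Build one ALTER TABLE statement per field missing from the table (hypothetical)."""
    return [
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {name} {infer_clickhouse_type(value)}"
        for name, value in record.items()
        if name not in existing_columns
    ]
```

Given an existing `id` column and a record `{"id": 1, "score": 0.5}`, `drift_statements("events", {"id"}, ...)` yields a single statement that adds `score` as Float64.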
Declarative Configuration
apiVersion: v1
kind: Connection
metadata:
  name: clickhouse-connection
spec:
  connectorId: clickhouse
  configuration:
    host: localhost
    port: 9000
    database: my_database
    username: saddledata
    password: '********'
    ssl: false
  capability: destination