ClickHouse

Overview

ClickHouse is a high-performance open-source columnar database management system (DBMS) for online analytical processing (OLAP). SaddleData connects to ClickHouse as a Destination, enabling high-throughput data ingestion into your analytical tables.

Capabilities

  • Destination: Load data from any SaddleData source into ClickHouse.
  • High-Throughput: Uses the native ClickHouse protocol with batched inserts to maximize ingestion throughput.
  • Schema Evolution: Automatically creates tables (MergeTree) and adds columns (ALTER TABLE) as your data changes.
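
The batched inserts mentioned above can be pictured with a small sketch. This is not SaddleData's implementation: the function name `batched` and the batch size are hypothetical, and the sketch only shows the general idea of grouping records so that each insert carries many rows instead of one.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(records)
    while True:
        # Pull the next chunk; an empty chunk means the input is exhausted.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded list would then be sent as one insert. ClickHouse generally favors fewer, larger inserts over many small ones, which is why batching matters for throughput.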

Configuration

To connect to ClickHouse, provide the following credentials:

  • Host: The hostname or IP address of your ClickHouse server.
  • Port: The native TCP port (typically 9000, or 9440 when SSL is enabled), not the HTTP port (8123).
  • Database: The target database name.
  • Username: The database user.
  • Password: The user's password.
  • SSL: Enable if your server requires a secure connection (e.g., ClickHouse Cloud).
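
A common mistake is entering an HTTP port where the native port is expected. The sketch below shows one way to sanity-check the Port and SSL settings before saving a connection. The helper `check_port` is hypothetical, and the port numbers are ClickHouse's defaults (9000 native, 9440 native with TLS, 8123 HTTP, 8443 HTTPS); a server may be configured to use different ones.

```python
from typing import Optional

# ClickHouse default ports (assumption: your server has not overridden them).
NATIVE_PORTS = {False: 9000, True: 9440}  # keyed by "ssl enabled?"
HTTP_PORTS = {8123, 8443}


def check_port(port: int, ssl: bool) -> Optional[str]:
    """Return a warning if the port looks wrong for the native protocol, else None."""
    if port in HTTP_PORTS:
        return f"port {port} is an HTTP port; try native port {NATIVE_PORTS[ssl]}"
    if port == NATIVE_PORTS[not ssl]:
        state = "enabled" if ssl else "disabled"
        return f"port {port} does not match ssl {state}; expected {NATIVE_PORTS[ssl]}"
    # Unknown ports are accepted: the server may use a custom port.
    return None
```

For example, `check_port(8123, False)` returns a warning that points to port 9000, while `check_port(9440, True)` returns `None`.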

Sync Modes

Supported destination modes:

  • Full Refresh - Overwrite: Truncates the destination table and reloads all data.
  • Incremental - Append: Appends new records to the destination table.

Note: "Incremental - Deduped (Upsert)" is not currently supported for ClickHouse because row-level mutations are expensive in OLAP systems. To deduplicate, use a ReplacingMergeTree engine or deduplicate at query time.
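
To make the recommendation concrete, the sketch below mimics in plain Python what "keep the latest row per key" means. It is an illustration of the semantics only, not SaddleData or ClickHouse code; the function name and the `id`/`v` column names are hypothetical. In ClickHouse itself, a ReplacingMergeTree table with a version column keeps the highest-version row per sorting key when parts merge, and the `FINAL` modifier on a `SELECT` applies the same collapse at query time.

```python
from typing import Any, Dict, Iterable, List


def dedupe_latest(
    rows: Iterable[Dict[str, Any]], key: str, version: str
) -> List[Dict[str, Any]]:
    """Keep one row per key: the one with the highest version value."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        current = latest.get(row[key])
        # On a version tie, the row seen later wins.
        if current is None or row[version] >= current[version]:
            latest[row[key]] = row
    return list(latest.values())


appended = [
    {"id": 1, "v": 1, "name": "a"},
    {"id": 2, "v": 1, "name": "b"},
    {"id": 1, "v": 2, "name": "c"},  # newer version of id 1
]
deduped = dedupe_latest(appended, key="id", version="v")
```

Here `deduped` holds two rows: the version-2 row for `id` 1 and the only row for `id` 2. With "Incremental - Append", all three rows are stored, so the collapse has to happen in the engine or in the query.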

Schema Handling

  • Automatic Table Creation: Creates tables using the MergeTree engine.
  • Drift Detection: Detects new columns in the source data and alters the ClickHouse table to include them.
  • Type Inference: Automatically maps source types to ClickHouse types such as Int64, Float64, DateTime64, and Nullable(String).
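
Drift detection and type inference can be sketched together as follows. This is a rough illustration under stated assumptions, not the connector's actual logic: the function names are hypothetical, the mapping covers only the four types listed above, and sending booleans and any other values to Nullable(String) is a guess.

```python
from datetime import datetime
from typing import Any, Dict, List, Set


def infer_clickhouse_type(value: Any) -> str:
    """Map a Python value to one of the ClickHouse type names listed above."""
    # bool is a subclass of int in Python, so it must be checked first.
    # No boolean type is listed above; falling back to string is an assumption.
    if isinstance(value, bool):
        return "Nullable(String)"
    if isinstance(value, int):
        return "Int64"
    if isinstance(value, float):
        return "Float64"
    if isinstance(value, datetime):
        return "DateTime64"
    return "Nullable(String)"


def plan_new_columns(table: str, existing: Set[str], record: Dict[str, Any]) -> List[str]:
    """Build one ALTER TABLE statement per field missing from the table."""
    # Note: names are not escaped here; real code must sanitize identifiers.
    return [
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS `{name}` {infer_clickhouse_type(value)}"
        for name, value in record.items()
        if name not in existing
    ]
```

Calling `plan_new_columns("events", {"id"}, {"id": 1, "score": 0.5, "note": "x"})` yields two statements, one adding `score` as Float64 and one adding `note` as Nullable(String); the existing `id` column is left alone.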

Declarative Configuration

apiVersion: v1
kind: Connection
metadata:
  name: clickhouse-connection
spec:
  connectorId: clickhouse
  configuration:
    host: localhost
    port: 9000
    database: my_database
    username: saddledata
    password: '********'
    ssl: false
  capability: destination