PostgreSQL
Overview
PostgreSQL is a powerful, open-source object-relational database system. SaddleData can use PostgreSQL as both a data source and a destination.
Prerequisites
Before connecting to PostgreSQL, please ensure you have the following:
- A dedicated, read-only user for your source database to ensure data security.
- Whitelisted SaddleData's IP addresses in your firewall to allow for a successful connection. Our IP addresses are
100.20.10.1and100.20.10.2.
Configuration
When creating a PostgreSQL Integration, you will need to provide the following information:
- Host: The hostname or IP address of your PostgreSQL server.
- Port: The port your PostgreSQL server is listening on (default is
5432). - User: The username for your dedicated read-only user.
- Password: The password for the user.
- Database: The name of the database you want to connect to.
Sync Modes
PostgreSQL as a Source
When using PostgreSQL as a source, you can choose from the following sync modes:
- Full Refresh: Reads all data from the table.
- Incremental: Reads only new rows from the table based on a cursor column.
PostgreSQL as a Destination
When using PostgreSQL as a destination, you can choose from the following sync modes:
- Full Refresh - Overwrite: Replaces all data in the destination table.
- Incremental - Append: Appends new records to the destination table.
- Incremental - Deduped (Upsert): Updates existing rows and inserts new rows based on a primary key, using an
INSERT ... ON CONFLICT DO UPDATEstatement.
Schema Evolution
PostgreSQL supports full Schema Drift handling:
- Source: Detects new and dropped columns.
- Destination: Supports
ALTER TABLEoperations to automatically add new columns when the "Automatically Update Destination" policy is active or when a drift is manually approved.
Advanced Destination Options
Reset Sequences
When cloning or syncing data between PostgreSQL databases, explicit IDs are often copied. However, PostgreSQL internal sequences (used for SERIAL and BIGSERIAL columns) do not update automatically when IDs are inserted explicitly. This can lead to "duplicate key" errors on subsequent new inserts.
Saddle Data provides a Reset Sequences after Sync option in the Flow Destination settings. When enabled, Saddle Data will automatically update all sequences in the target table to match the current maximum ID value at the end of every successful sync.
This feature requires the database user to have sufficient permissions to modify sequences (typically USAGE and SELECT on the sequence objects).
Declarative Configuration
apiVersion: v1
kind: Connection
metadata:
name: postgres-connection
spec:
connectorId: postgres
configuration:
host: localhost
port: 5432
user: saddledata
password: '********'
dbname: my_database
sslmode: prefer
capability: both