BigQuery

Overview

Google BigQuery is a serverless, highly scalable data warehouse. SaddleData can use BigQuery as a data destination.

Prerequisites

To connect to BigQuery, you will need:

  • A Google Cloud Platform project with the BigQuery API enabled.
  • A service account with the "BigQuery Data Editor" and "BigQuery Job User" roles.
  • The JSON key file for the service account.

Configuration

When creating a BigQuery Integration, you will need to provide the following information:

  • Project ID: The ID of your Google Cloud Platform project.
  • Dataset: The BigQuery dataset you want to write to.
  • Service Account JSON: The contents of the JSON key file for your service account.
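Before pasting the key into the Service Account JSON field, it can help to confirm locally that the file is a well-formed service account key. The following is a minimal sketch using only the Python standard library. `check_service_account_key` is a hypothetical helper, not part of SaddleData, and the field list is an assumption based on the usual layout of Google service account key files.

```python
import json

# Fields normally present in a Google service account key file (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(raw_json):
    """Return a list of problems found in the key; an empty list means it looks usable."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    # Report every required field that is absent or empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # Key files for service accounts carry the type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

This only checks the shape of the file. It does not verify that the service account holds the roles listed under Prerequisites or that the key is still active.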

Sync Modes

When using BigQuery as a destination, you can choose from the following sync modes:

  • Full Refresh - Overwrite: Replaces all data in the destination table.
  • Incremental - Append: Appends new records to the destination table.
  • Incremental - Deduped (Upsert): Updates existing records and inserts new records based on a primary key.
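The difference between the three modes is easiest to see on a small example. The sketch below models a destination table as an in-memory list of rows and applies one incoming batch under each mode. The function names and sample rows are illustrative assumptions; this shows the intended semantics only, not how SaddleData or BigQuery performs the write.

```python
def full_refresh_overwrite(table, batch):
    """Full Refresh - Overwrite: discard existing rows, keep only the batch."""
    return list(batch)


def incremental_append(table, batch):
    """Incremental - Append: keep existing rows and add every incoming row."""
    return list(table) + list(batch)


def incremental_deduped(table, batch, primary_key):
    """Incremental - Deduped (Upsert): update rows with a matching key, insert the rest."""
    merged = {row[primary_key]: row for row in table}
    for row in batch:
        merged[row[primary_key]] = row  # same key replaces, new key inserts
    return list(merged.values())


# Hypothetical sample data: one existing row is updated, one row is new.
table = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Bob"}]
batch = [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]

overwritten = full_refresh_overwrite(table, batch)
appended = incremental_append(table, batch)
deduped = incremental_deduped(table, batch, "id")
```

With this sample data, overwrite leaves two rows, append leaves four (both versions of `id` 2 are kept), and deduped leaves three, with `id` 2 updated to the incoming values.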

Declarative Configuration

apiVersion: v1
kind: Connection
metadata:
  name: bigquery-connection
spec:
  connectorId: bigquery
  integrationId: gcp-integration-id
  configuration:
    capability: destination
    project_id: my-project-id
    dataset_id: my-dataset-id
    gcs_staging_bucket: my-gcs-staging-bucket