Data Catalog & Governance
The Data Catalog is the central nervous system of Saddle Data. It provides a single source of truth for all schemas across your organization, moving beyond simple data replication to true data management and governance.
Key Features
1. Centralized Schema Registry
When you run Discovery on any connection, Saddle Data automatically snapshots the schema and saves it as a Data Asset in your catalog. These assets can then be reused across multiple flows, ensuring consistent naming and typing throughout your infrastructure.
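To make the reuse idea concrete, here is a minimal sketch of a schema snapshot stored once and referenced by several flows. It is an illustration only, not Saddle Data's API: the `Column` and `DataAsset` classes, the `snapshot_schema` helper, the in-memory `catalog`, and the flow dictionaries are all hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str


@dataclass(frozen=True)
class DataAsset:
    asset_id: str
    columns: tuple[Column, ...]


# Hypothetical in-memory stand-in for the catalog (asset ID -> asset).
catalog: dict[str, DataAsset] = {}


def snapshot_schema(asset_id: str, discovered: list[tuple[str, str]]) -> DataAsset:
    """Freeze a discovered (name, type) list into an immutable asset and store it."""
    asset = DataAsset(asset_id, tuple(Column(n, t) for n, t in discovered))
    catalog[asset_id] = asset
    return asset


# Simulated discovery result for one table.
orders = snapshot_schema("orders", [("id", "int64"), ("email", "string")])

# Two flows point at the same asset, so both see identical names and types.
flow_a = {"name": "orders_to_warehouse", "source_asset": orders.asset_id}
flow_b = {"name": "orders_to_lake", "source_asset": orders.asset_id}
```

Because both flows hold only the asset ID, a column's name and type are defined in exactly one place.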
2. Virtual Data Assets
In addition to discovered schemas, you can create Virtual Assets manually. These schemas are not tied to a physical database connection and are designed specifically for use with the LLM Validation Gateway, where they act as "Contract Registries" for AI services.
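The contract idea can be sketched as a schema that an AI service's JSON output is checked against. This is an assumption-laden illustration, not the LLM Validation Gateway's actual behavior: the `ticket_contract` structure, its asset ID, and the `validate_payload` function are hypothetical.

```python
import json

# Hypothetical Virtual Asset: a schema with no backing database connection.
ticket_contract = {
    "asset_id": "virtual-ticket-summary",  # made-up ID for illustration
    "columns": {"ticket_id": int, "summary": str, "priority": str},
}


def validate_payload(raw_json: str, contract: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the payload conforms."""
    payload = json.loads(raw_json)
    errors = []
    for name, expected in contract["columns"].items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    for name in payload:
        if name not in contract["columns"]:
            errors.append(f"unexpected field: {name}")
    return errors


good = '{"ticket_id": 7, "summary": "Login fails", "priority": "high"}'
bad = '{"ticket_id": "7", "summary": "Login fails"}'

good_errors = validate_payload(good, ticket_contract)
bad_errors = validate_payload(bad, ticket_contract)
```

Here `good_errors` is empty, while `bad_errors` reports the string-typed `ticket_id` and the missing `priority` field.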
3. Schema Time Machine
Every time a schema change is detected during discovery, a sync run, or a manual update, Saddle Data creates a new Version.
- Historical Audit: View exactly what a table looked like at any point in history.
- AI Change Summaries: Our Vertex AI engine analyzes the differences between versions and provides a human-readable summary of what changed and why.
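The versioning behavior described above can be approximated with a small sketch: a new version is recorded only when the schema differs from the latest one, and two versions can be diffed column by column. The `record_version` and `diff_schemas` functions and the `versions` list are hypothetical; the AI-generated summary is not modeled here, only the structural diff such a summary would presumably start from.

```python
# Hypothetical version history for one asset: a list of {column: type} dicts.
versions: list[dict[str, str]] = []


def record_version(schema: dict[str, str]) -> int:
    """Append a version only if the schema changed; return the current version number."""
    if versions and versions[-1] == schema:
        return len(versions)
    versions.append(dict(schema))
    return len(versions)


def diff_schemas(old: dict[str, str], new: dict[str, str]) -> dict:
    """Compare two schema versions and report added, removed, and retyped columns."""
    added = {c: new[c] for c in new if c not in old}
    removed = {c: old[c] for c in old if c not in new}
    retyped = {c: (old[c], new[c]) for c in old if c in new and old[c] != new[c]}
    return {"added": added, "removed": removed, "retyped": retyped}


v1 = record_version({"id": "int64", "email": "string"})
v1_again = record_version({"id": "int64", "email": "string"})  # unchanged, no new version
v2 = record_version({"id": "int64", "email": "string", "signup_ts": "timestamp"})

change = diff_schemas(versions[0], versions[1])
```

The unchanged second call leaves the history at one version, and `change` lists `signup_ts` as the only added column.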
4. Global Impact Analysis
Because schemas live in the catalog rather than inside individual flows, you can see the Blast Radius of a potential change: every flow that depends on the affected asset.
- In the Catalog, view an asset to see a list of every flow currently dependent on it.
- Before you modify an upstream database, you can identify exactly which pipelines will be affected.
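Conceptually, impact analysis is a reverse lookup from an asset to the flows that reference it. The sketch below shows that lookup under simplifying assumptions; the `flows` list, its field names, and `build_dependency_index` are hypothetical and do not reflect how Saddle Data stores dependencies.

```python
from collections import defaultdict

# Hypothetical flow definitions, each listing the catalog assets it reads.
flows = [
    {"name": "orders_to_warehouse", "assets": ["orders", "customers"]},
    {"name": "customers_to_crm", "assets": ["customers"]},
    {"name": "events_to_lake", "assets": ["events"]},
]


def build_dependency_index(flow_defs: list[dict]) -> dict[str, list[str]]:
    """Invert flow -> assets into asset -> flows."""
    index = defaultdict(list)
    for flow in flow_defs:
        for asset_id in flow["assets"]:
            index[asset_id].append(flow["name"])
    return index


index = build_dependency_index(flows)

# Flows affected if the "customers" asset changes upstream.
blast_radius = index.get("customers", [])
```

Building the index once keeps each lookup cheap, which matters when the same question is asked for many assets.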
5. Policy Integration
The Data Catalog works in tandem with the Data Governance platform.
- Centralized Tagging: Apply global security tags (e.g., PII, PHI) directly to columns within the catalog.
- Enforced Compliance: Once tagged, columns are subject to organization-wide policies that automatically inject mandatory transformations into your sync pipelines.
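One way to picture tag-driven enforcement is a mapping from tags to mandatory transformations that is applied to every row before it is written. The sketch below is an assumption, not the Data Governance platform's implementation: the `column_tags` and `policies` mappings, and the choice of hashing for PII and redaction for PHI, are hypothetical examples.

```python
import hashlib

# Hypothetical tags applied to columns in the catalog.
column_tags = {"email": {"PII"}, "diagnosis": {"PHI"}, "order_total": set()}


def hash_value(value) -> str:
    """Replace a value with its SHA-256 hex digest."""
    return hashlib.sha256(str(value).encode()).hexdigest()


def redact(value) -> str:
    """Replace a value with a fixed placeholder."""
    return "[REDACTED]"


# Hypothetical organization-wide policies: tag -> mandatory transformation.
policies = {"PII": hash_value, "PHI": redact}


def required_transforms(column: str) -> list:
    """Look up the transformations a column's tags require (sorted for determinism)."""
    return [policies[t] for t in sorted(column_tags.get(column, ())) if t in policies]


def apply_policies(row: dict) -> dict:
    """Apply every required transformation to each tagged column in a row."""
    out = {}
    for col, val in row.items():
        for transform in required_transforms(col):
            val = transform(val)
        out[col] = val
    return out


row = {"email": "a@example.com", "diagnosis": "flu", "order_total": 42}
masked = apply_policies(row)
```

Untagged columns such as `order_total` pass through unchanged, while tagged columns are transformed regardless of which flow carries them.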
Workflow
- Cataloging: Run discovery on your connections or create a Virtual Asset manually.
- Tagging: Review your assets in the Data Catalog and apply security tags to sensitive columns.
- Integration: Copy the Asset ID for use in the LLM Validation Gateway or map the asset in the Flow Editor.
- Monitoring: Use the Version History to track how your data contracts evolve over time.
Plan Gating
The Data Catalog is available on all plans. However, advanced features such as AI Change Summaries and integration with Automated Policy Enforcement require an Enterprise or Enterprise+ subscription.