Flow Orchestration (Lightweight DAG)
Flow Orchestration allows you to chain multiple Flows together to create automated data pipelines. This is often referred to as a Lightweight DAG (Directed Acyclic Graph).
Instead of scheduling every Flow to run at a fixed time and hoping the previous one has finished, you can configure a Flow to start as soon as its upstream dependency completes.
Key Concepts
Upstream and Downstream
- Upstream Flow: The flow whose completion triggers the downstream flow.
- Downstream Flow: The flow that is "listening" for the completion of the upstream flow.
Trigger Conditions
You can specify the condition under which the downstream flow should be triggered:
- On Success: The downstream flow runs only if the upstream flow finishes successfully. This is the most common condition.
- On Failure: The downstream flow runs only if the upstream flow fails. This is useful for triggering cleanup tasks or error notifications.
- Always: The downstream flow runs regardless of the upstream flow's outcome.
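The three conditions above can be read as a simple decision rule. The following is a minimal sketch in Python, not Saddle Data's implementation: the names `TriggerCondition` and `should_trigger` are hypothetical and exist only to illustrate the behavior described in the list.

```python
from enum import Enum


class TriggerCondition(Enum):
    # Hypothetical names mirroring the three documented conditions.
    ON_SUCCESS = "on_success"
    ON_FAILURE = "on_failure"
    ALWAYS = "always"


def should_trigger(condition: TriggerCondition, upstream_succeeded: bool) -> bool:
    """Decide whether the downstream flow runs, given the upstream outcome."""
    if condition is TriggerCondition.ALWAYS:
        return True
    if condition is TriggerCondition.ON_SUCCESS:
        return upstream_succeeded
    # Remaining case is ON_FAILURE: run only when the upstream flow failed.
    return not upstream_succeeded
```

For example, `should_trigger(TriggerCondition.ON_FAILURE, upstream_succeeded=False)` evaluates to `True`, which matches the cleanup and notification use of On Failure.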
Use Cases
1. Sync then Model (ELT)
The most common use case is syncing raw data from a source (like Salesforce or Postgres) into a warehouse, and then triggering a Post-Load (dbt) Transformation flow once that data is fresh.
2. Multi-Step Pipelines
If you have a complex process where data needs to move through multiple stages (e.g., Source -> S3 -> BigQuery -> Analytics Table), you can chain these steps together. Each stage starts only after the previous one completes, so no stage reads partially loaded data, and no stage sits idle waiting for a fixed schedule slot.
3. Cross-Plane Orchestration
Saddle Data's orchestration works across different execution planes. A Flow running on a Remote Agent inside your VPC can trigger a Flow running in Saddle Cloud, and vice-versa.
Configuration
To set up a chained trigger:
1. Open the Flow Editor for the flow you want to run as a downstream step.
2. Click the Schedule button.
3. Navigate to the Chained tab.
4. Select the Upstream Flow from the dropdown list of your organization's flows.
5. Select the Trigger Condition (e.g., "Trigger when upstream succeeds").
6. Click Save.
Once configured, the downstream flow's schedule type will change to dependency, and it will automatically wait for the upstream flow to complete before starting its own execution.
Saddle Data automatically prevents direct circular dependencies (e.g., Flow A triggers Flow B, and Flow B triggers Flow A) to ensure your pipelines don't enter an infinite loop.
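To make the idea of a direct circular dependency concrete, here is a small Python sketch of the kind of check involved. It is an illustration under assumptions, not the product's actual validation: the function name `creates_direct_cycle` and the `upstream_of` mapping (each chained flow mapped to its single upstream flow) are hypothetical.

```python
def creates_direct_cycle(upstream_of: dict[str, str], downstream: str, upstream: str) -> bool:
    """Return True if chaining `downstream` to `upstream` forms a direct loop.

    `upstream_of` is a hypothetical mapping of each chained flow to the
    flow that triggers it.
    """
    # A flow cannot be its own upstream.
    if downstream == upstream:
        return True
    # Direct loop: the proposed upstream is itself triggered by `downstream`.
    return upstream_of.get(upstream) == downstream


# Existing configuration: Flow B is triggered by Flow A.
existing = {"flow_b": "flow_a"}

# Proposing "Flow A is triggered by Flow B" would close the loop.
print(creates_direct_cycle(existing, downstream="flow_a", upstream="flow_b"))
# Proposing "Flow C is triggered by Flow B" is a plain chain.
print(creates_direct_cycle(existing, downstream="flow_c", upstream="flow_b"))
```

This sketch covers only the direct case named above (A triggers B, B triggers A). It makes no claim about how longer loops such as A -> B -> C -> A are handled.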
Visualization & Observability
Saddle Data provides built-in tools to visualize and monitor your complex data pipelines.
Pipeline View (Topology)
The Pipeline View provides a visual map of all flows within your organization and their dependencies. This DAG view allows you to:
- Understand Dependencies: See at a glance which flows trigger others.
- Monitor Health: Nodes are color-coded based on their last run status (Green for success, Red for failure, Grey for idle).
- Take Action: Manually trigger any flow in the DAG or navigate directly to its details.
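As a rough mental model of what the Pipeline View displays, the dependencies form a graph that can be ordered from upstream to downstream, and each node carries a color derived from its last run status. The Python sketch below uses the standard library's `graphlib`; the flow names, the status strings, and the `node_color` helper are hypothetical, and only the color mapping itself comes from the description above.

```python
from graphlib import TopologicalSorter

# Hypothetical flows: each key maps to the set of flows that trigger it.
upstream_of = {
    "model_orders": {"sync_salesforce"},
    "publish_dashboard": {"model_orders"},
}

# Color mapping as documented: green = success, red = failure, grey = idle.
STATUS_COLORS = {"success": "green", "failure": "red", "idle": "grey"}


def node_color(last_status: str) -> str:
    """Map a flow's last run status to its node color in the view."""
    return STATUS_COLORS[last_status]


# Upstream-first ordering of the whole pipeline.
order = list(TopologicalSorter(upstream_of).static_order())

# Hypothetical last-run statuses for each flow.
last_status = {
    "sync_salesforce": "success",
    "model_orders": "failure",
    "publish_dashboard": "idle",
}

for flow in order:
    print(f"{flow}: {node_color(last_status[flow])}")
```

`TopologicalSorter` raises `graphlib.CycleError` if the graph contains a loop, which is one reason an acyclic structure is a precondition for this kind of view.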
Execution Traces
When a chain of flows is triggered (e.g., Flow A -> Flow B -> Flow C), Saddle Data treats this as a single "Transaction."
- Trace Timeline: From any flow run, you can open the Trace Timeline to see the entire cascade of executions, including start times, durations, and outcomes for every step in the chain.
- Unified Logging: Drill down into the execution logs of any specific run directly from the trace view.
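One way to picture a trace is as a set of flow runs that share an identifier and are displayed in start-time order. The Python sketch below is a simplified model under that assumption: the `FlowRun` record, its `trace_id` field, and the `trace_timeline` function are hypothetical and do not describe Saddle Data's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FlowRun:
    flow: str
    trace_id: str          # hypothetical: shared by every run in one cascade
    started_at: datetime
    duration: timedelta
    outcome: str           # e.g. "success" or "failure"


def trace_timeline(runs: list[FlowRun], trace_id: str) -> list[FlowRun]:
    """Return the runs belonging to one cascade, ordered by start time."""
    return sorted(
        (r for r in runs if r.trace_id == trace_id),
        key=lambda r: r.started_at,
    )


t0 = datetime(2024, 1, 1, 2, 0)
runs = [
    FlowRun("flow_b", "trace-1", t0 + timedelta(minutes=5), timedelta(minutes=3), "success"),
    FlowRun("flow_a", "trace-1", t0, timedelta(minutes=5), "success"),
    FlowRun("flow_x", "trace-2", t0, timedelta(minutes=1), "failure"),
    FlowRun("flow_c", "trace-1", t0 + timedelta(minutes=8), timedelta(minutes=2), "success"),
]

for run in trace_timeline(runs, "trace-1"):
    print(f"{run.started_at:%H:%M}  {run.flow:<8} {run.duration}  {run.outcome}")
```

Here the unrelated run `flow_x` is excluded, and the three remaining runs appear in the order Flow A, Flow B, Flow C regardless of how they were stored.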
Audit Log Grouping
In the Audit Logs, you can toggle "Group by Trigger" to collapse related executions into a single entry, making it easier to audit high-frequency or complex pipeline cascades.
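The effect of grouping can be sketched as collapsing individual execution entries into one summary per originating trigger. This Python example is illustrative only: the entry fields (`trigger_id`, `flow`, `outcome`) and the `group_by_trigger` function are hypothetical and are not the audit log's actual schema.

```python
def group_by_trigger(entries: list[dict]) -> list[dict]:
    """Collapse execution entries into one summary per trigger.

    Each entry is a hypothetical dict with "trigger_id", "flow", and
    "outcome" keys. Groups keep the order in which triggers first appear.
    """
    groups: dict[str, dict] = {}
    for entry in entries:
        group = groups.setdefault(
            entry["trigger_id"],
            {"trigger_id": entry["trigger_id"], "flows": [], "failed": 0},
        )
        group["flows"].append(entry["flow"])
        if entry["outcome"] == "failure":
            group["failed"] += 1
    return list(groups.values())


entries = [
    {"trigger_id": "t1", "flow": "flow_a", "outcome": "success"},
    {"trigger_id": "t1", "flow": "flow_b", "outcome": "failure"},
    {"trigger_id": "t2", "flow": "flow_a", "outcome": "success"},
]

for group in group_by_trigger(entries):
    print(group)
```

Three individual entries collapse into two groups, which is the kind of reduction that makes high-frequency cascades easier to scan.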