Using the LLM Validation Gateway
This guide shows how to set up a schema contract and use the LLM Validation Gateway to clean and validate AI-generated data.
Step 1: Create a Schema Contract
Before you can validate data, you need to define what the "perfect" output looks like in the Data Catalog.
- Navigate to the Data Catalog in the Saddle Data UI.
- Click "New Asset" to create a Virtual Asset.
- Give your asset a name (e.g., Customer Inquiry Extraction).
- Define your fields. For a customer inquiry, you might have:
  - customer_id (integer)
  - email (string)
  - priority (integer)
  - is_resolved (boolean)
- (Optional) Add security tags. If you tag email as PII, the gateway will automatically mask it based on your governance policies.
- Click "Create Asset".
Step 2: Get your IDs
To call the API, you need two pieces of information:
- Asset ID: In the Data Catalog table, find your new asset and click the Copy ID icon.
- API Key: Go to Organization Settings > API Keys and generate a new key (or copy an existing one).
Step 3: Call the Gateway API
The gateway is a single POST endpoint: https://api.saddledata.io/v1/gateway/validate.
Curl Example
Send your "dirty" LLM output in the payload_raw field.
curl -X POST https://api.saddledata.io/v1/gateway/validate \
-H "X-Api-Key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"asset_id": "YOUR_ASSET_UUID",
"payload_raw": "Here is the data from the email: \n\n ```json\n{\n \"customer_id\": \"12345\",\n \"email\": \"[email protected]\",\n \"priority\": 1,\n \"is_resolved\": \"yes\"\n}\n```"
}'
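The same request can also be made from Python. The following is a minimal sketch using only the standard library; the function names `build_validate_request` and `validate` and the 10-second timeout are illustrative assumptions, not part of any Saddle Data SDK. Only the URL, the `X-Api-Key` header, and the `asset_id` and `payload_raw` fields come from this guide.

```python
import json
import urllib.request

# Endpoint as given in Step 3 of this guide.
GATEWAY_URL = "https://api.saddledata.io/v1/gateway/validate"


def build_validate_request(api_key: str, asset_id: str, payload_raw: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"asset_id": asset_id, "payload_raw": payload_raw}).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def validate(api_key: str, asset_id: str, payload_raw: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body.

    The timeout is an assumed default, not a documented gateway setting.
    """
    request = build_validate_request(api_key, asset_id, payload_raw)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

This guide does not state which HTTP status accompanies a validation failure. `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a caller may need to catch that exception and read the error body from it.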
Step 4: Handle the Response
The gateway returns a unified JSON response.
Successful Validation
If the data was successfully extracted and coerced to your schema:
{
"success": true,
"data": {
"customer_id": 12345, // Coerced to integer
"email": "sup****@saddledata.com", // Masked by governance
"priority": 1,
"is_resolved": true // Coerced to boolean
},
"metadata": {
"governance_applied": ["email_MASK"],
"safety_findings": []
}
}
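To make the extraction and coercion steps concrete, here is a rough local illustration in Python. It is not the gateway's implementation: the `SCHEMA` dict is a hypothetical mirror of the Step 1 contract, the accepted boolean spellings are assumptions, and governance masking is left out entirely.

```python
import json
import re

# Hypothetical local mirror of the schema contract defined in Step 1.
SCHEMA = {"customer_id": int, "email": str, "priority": int, "is_resolved": bool}

# Assumed spellings; the gateway's actual coercion rules are not documented here.
TRUTHY = {"true", "yes", "1"}
FALSY = {"false", "no", "0"}


def extract_json_object(raw: str) -> dict:
    """Parse the span from the first '{' to the last '}' in free-form LLM text."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in payload")
    return json.loads(match.group(0))


def coerce(value, target):
    """Coerce one value to the target type, raising ValueError if it cannot be."""
    if target is bool:
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in TRUTHY:
            return True
        if text in FALSY:
            return False
        raise ValueError(f"cannot coerce {value!r} to boolean")
    if target is int:
        # Reject booleans and fractional floats rather than silently truncating.
        if isinstance(value, bool) or (isinstance(value, float) and not value.is_integer()):
            raise ValueError(f"cannot coerce {value!r} to integer")
        return int(value)
    return str(value)


def coerce_record(record: dict) -> dict:
    """Coerce every schema field; a missing field raises KeyError."""
    return {field: coerce(record[field], target) for field, target in SCHEMA.items()}
```

Passing the `payload_raw` string from the curl example through `extract_json_object` and then `coerce_record` turns `"12345"` into the integer `12345` and `"yes"` into `True`, which matches the coercions annotated in the response above.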
Schema Drift (Validation Failure)
If the LLM returned an object that doesn't match your contract (e.g., missing a field or an un-coercible type), the gateway returns success: false with detailed drift metrics.
{
"success": false,
"error": "Schema Validation Failed",
"drift_details": {
"missing_fields": ["priority"],
"type_mismatches": [
{
"field": "customer_id",
"expected": "integer",
"received": "string"
}
]
}
}
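Client code typically branches on the `success` flag. The sketch below assumes only the response fields shown in the two examples above; `describe_result` is a hypothetical helper name, and real responses may carry additional fields.

```python
from typing import List


def describe_result(response: dict) -> List[str]:
    """Turn a gateway response into human-readable lines, e.g. for logging."""
    if response.get("success"):
        data = response.get("data", {})
        return ["validated fields: " + ", ".join(sorted(data))]

    lines = [f"failed: {response.get('error', 'unknown error')}"]
    drift = response.get("drift_details", {})
    for field in drift.get("missing_fields", []):
        lines.append(f"missing field: {field}")
    for mismatch in drift.get("type_mismatches", []):
        lines.append(
            f"type mismatch on {mismatch['field']}: "
            f"expected {mismatch['expected']}, received {mismatch['received']}"
        )
    return lines
```

Reporting each missing field and type mismatch on its own line makes it easier to see whether the prompt or the schema contract needs to change.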
Tips for Success
- Version your prompts: If you update your LLM prompt, check the Version History in the Data Catalog to ensure your schema contract still aligns with the model's new behavior.
- Enable Random Sampling: If you are validating against a physical database asset, use the --random flag in your CLI scans to find potential drift before it hits the gateway.
- Monitor Usage: Keep an eye on your monthly validation limits in the dashboard to ensure your agents stay online.