AI writes your SQL.
sqlsure makes sure it's right.

A query can be perfectly valid, run without error, and return a number that's silently wrong — revenue double-counted by a join, an average summed, a patient ID exposed. Databases don't catch it. Linters don't catch it. LLMs reviewing their own SQL don't catch it.

sqlsure does — deterministically, in 0.1 ms, before the query runs.

pip install sqlsure · View on GitHub · Read the BIRD finding →
2,568 expert-written benchmark queries audited
45 flags raised — every one traced to a real defect
0 false alarms on code we didn't write
8× how wrong a BIRD gold answer we disproved is

The bug that ships to your dashboard

SELECT o.region, SUM(o.amount) AS revenue
FROM orders o
JOIN order_items i ON o.order_id = i.order_id
GROUP BY 1

Valid SQL. Runs clean. Every order's amount counted once per line item. The total is inflated, the dashboard looks normal, and nobody knows — that's the fan-out trap, one of nine silent-error classes sqlsure rejects with an exact fix:

✗ FANOUT — SUM(amount) after one-to-many join to order_items:
  amount will be double-counted.
fix: Pre-aggregate orders to [order_id] in a CTE before joining.
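
To see the inflation concretely, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are invented for illustration, and the rewritten query is only one way to restore the order grain (collapsing order_items to one row per order_id in a CTE before joining); it is not necessarily the exact rewrite sqlsure proposes.

```python
import sqlite3

# Toy data, invented for this example: order 1 has three line items,
# order 2 has one, order 3 has two.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
CREATE TABLE order_items (item_id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
INSERT INTO orders VALUES (1, 'EU', 100), (2, 'EU', 50), (3, 'US', 80);
INSERT INTO order_items VALUES
  (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'),
  (4, 2, 'a'),
  (5, 3, 'a'), (6, 3, 'b');
""")

# The query from above: each order row repeats once per line item.
fanned_out = dict(conn.execute("""
SELECT o.region, SUM(o.amount) AS revenue
FROM orders o
JOIN order_items i ON o.order_id = i.order_id
GROUP BY 1
""").fetchall())

# One possible repair: reduce order_items to one row per order_id first,
# so the join no longer multiplies order rows.
pre_aggregated = dict(conn.execute("""
WITH items_per_order AS (
  SELECT order_id, COUNT(*) AS item_count
  FROM order_items
  GROUP BY order_id
)
SELECT o.region, SUM(o.amount) AS revenue
FROM orders o
JOIN items_per_order p ON o.order_id = p.order_id
GROUP BY 1
""").fetchall())

print(fanned_out)      # EU and US totals inflated by the line-item counts
print(pre_aggregated)  # totals match the orders table
```

Both queries run without error; only the second returns totals that agree with the orders table (150 for EU, 80 for US, versus 350 and 160).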

No new language. Your dbt tests are the rulebook.

sqlsure judges SQL against facts your team already declared: a dbt unique test is a grain declaration, a relationships test is join cardinality, a one-line meta tag marks what's safe to sum. Plain PK/FK declarations work too. Rules are dictionary lookups, not LLM calls — same input, same verdict, every time, offline, and it never reads your data.
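
The idea that a rule can be a dictionary lookup can be sketched in a few lines of Python. Everything below is a hypothetical illustration: the FACTS layout, the safe_to_sum key, the NOT_ADDITIVE label, and the check_sum_after_join helper are assumptions made for this example, not sqlsure's actual model format or API. Only the FANOUT label comes from the output shown earlier.

```python
# Hypothetical, hand-written facts standing in for what dbt tests declare.
# A "unique" entry plays the role of a grain declaration; "safe_to_sum"
# plays the role of the one-line meta tag. Not sqlsure's real format.
FACTS = {
    "unique": {
        "orders": {"order_id"},
        "order_items": {"item_id"},
        "customers": {"customer_id"},
    },
    "safe_to_sum": {
        ("orders", "amount"): True,
        ("orders", "avg_basket"): False,
    },
}


def check_sum_after_join(facts, sum_table, sum_column, joined_table, join_column):
    """Return finding labels for SUM(sum_table.sum_column) after a join.

    Pure lookups: the same facts and the same query shape always give
    the same answer, and no table data is read.
    """
    findings = []
    # Summing a column that was never declared additive is suspicious.
    if not facts["safe_to_sum"].get((sum_table, sum_column), False):
        findings.append("NOT_ADDITIVE")
    # If the join column is not unique in the joined table, rows of
    # sum_table can repeat, so the sum can be inflated.
    if join_column not in facts["unique"].get(joined_table, set()):
        findings.append("FANOUT")
    return findings


fanout = check_sum_after_join(FACTS, "orders", "amount", "order_items", "order_id")
clean = check_sum_after_join(FACTS, "orders", "amount", "customers", "customer_id")
averaged = check_sum_after_join(FACTS, "orders", "avg_basket", "customers", "customer_id")
print(fanout, clean, averaged)
```

Because the verdict depends only on declared facts, the sketch needs no database connection, which mirrors the claim that the check runs offline and never reads your data.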

Three doors, one engine

CI gate

One GitHub Action. PRs that double-count get a red X and the exact fix as a comment.

MCP server

Your AI agent checks every query before executing; rejections carry fixes the agent applies itself. Draft → check → fix → execute.
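
The draft, check, fix, execute loop can be sketched as a small control function. This is an illustrative assumption about how an agent might wire the steps together, not the MCP server's interface: run_guarded, the callables it receives, and the stub functions are all hypothetical names introduced for this example.

```python
from typing import Any, Callable, Optional


def run_guarded(
    draft_sql: str,
    check: Callable[[str], Optional[str]],   # returns a rejection message, or None if accepted
    revise: Callable[[str, str], str],       # rewrites the SQL using the rejection's fix
    execute: Callable[[str], Any],           # runs the SQL; only reached after a pass
    max_rounds: int = 3,
) -> Any:
    """Execute SQL only after it passes the check, revising on each rejection."""
    sql = draft_sql
    for _ in range(max_rounds):
        rejection = check(sql)
        if rejection is None:
            return execute(sql)
        sql = revise(sql, rejection)
    raise RuntimeError("query still rejected after %d rounds" % max_rounds)


# Stubs so the sketch runs on its own; a real agent would plug in its
# own checker, rewriter, and database client here.
BAD = "SELECT SUM(o.amount) FROM orders o JOIN order_items i ON o.order_id = i.order_id"
GOOD = "SELECT SUM(amount) FROM orders"
executed = []


def fake_check(sql: str) -> Optional[str]:
    return "FANOUT" if "JOIN order_items" in sql else None


def fake_revise(sql: str, rejection: str) -> str:
    return GOOD


def fake_execute(sql: str) -> str:
    executed.append(sql)
    return "ok"


result = run_guarded(BAD, fake_check, fake_revise, fake_execute)
print(result, executed)
```

The point of the shape is that execute is unreachable for a rejected query: the rejected draft never touches the database, and only the revised query is run.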

Python library

check(sql, model) inside any text-to-SQL product — plus a drop-in gate for Vanna/WrenAI-style generators and a semantic eval metric.

Receipts, not promises. We ran sqlsure over the gold answers of Spider and BIRD, the benchmarks every text-to-SQL model is graded on. It mechanically found a schema defect affecting 13 questions and a gold answer that is provably wrong by 8×, caused by the exact bug class sqlsure targets. Ten of its fifteen BIRD flags were independently confirmed by a human expert review we discovered afterward. The full story →

What sqlsure is not

Start in 60 seconds

pip install sqlsure

# audit any dbt repo (no dbt install needed)
python -m sqlsure.scan path/to/repo --report report.md

# give your agent the inspector
claude mcp add sqlsure -- python -m sqlsure.mcp_server --manifest target/manifest.json