SQL Trust Lens: Schema Validation for LLM-Generated SQL

View on GitHub

Pipeline Architecture

The Problem

LLMs generate SQL that references tables and columns that do not exist. These queries reach the database, crash pipelines, and sometimes leak information through detailed error messages. There is no standard gate between generation and execution.

NEO built SQL Trust Lens to intercept LLM-generated SQL before it hits the database, validate every identifier against the live schema, and block any query that fails the threshold.

How the Trust Score Works

SQL Trust Lens assigns each query a score from 0 to 100 using a weighted formula:

trust_score = (table_component × 0.5 + col_component × 0.5) × 100

The tool validates every table and column reference in the query against the actual database schema. Missing identifiers reduce the score. The three score bands map to actions:

This means a query that gets one table right but references five non-existent columns will score below 50 and never reach the database.

The EvalEngine

The core component is EvalEngine, which wraps the full validation and execution pipeline. It returns a ValidationResult struct with:

The suggested_corrections field is particularly useful. If a column is named customer_name and the query says custmer_name, the engine catches the edit distance and suggests the fix rather than just rejecting the query.

Typo Correction with difflib

LLMs sometimes generate near-correct identifiers. Instead of treating all mismatches as hallucinations, SQL Trust Lens uses Python's difflib to compute similarity ratios against every schema identifier.

When the ratio exceeds a threshold, the engine classifies the mismatch as a likely typo and adds it to suggested_corrections. This prevents valid queries from being blocked due to minor spelling errors while still catching entirely fabricated identifiers.

LLM Backend Priority

The tool supports three backends, tried in this order when in auto mode:

  1. llama.cpp local GGUF model, no API cost
  2. OpenRouter cloud API fallback
  3. Mock built-in templates for testing without any API key

This lets you run the full pipeline offline during development and switch to a cloud backend in production without changing application code.

How to Build This with NEO

Open NEO in VS Code or Cursor and describe what you want to build. A good starting prompt for this project:

"Build a Python library that validates LLM-generated SQL against a live SQLite schema before execution. Score each query 0-100 using a weighted formula: table validity at 50% and column validity at 50%. Return a ValidationResult with trust_score, can_execute boolean, valid/invalid table and column lists, suggested corrections using difflib fuzzy matching for near-miss identifiers, and complexity metrics counting joins, subqueries, and aggregations. Block queries scoring below 50. Support three LLM backends in priority order: local llama.cpp, OpenRouter API, and a mock client for testing. Add a Streamlit UI for interactive querying and history export."

Build with NEO →

NEO generates the EvalEngine, difflib correction logic, backend priority chain, and Streamlit UI. From there you iterate ask it to add a Northwind sample database for out-of-the-box testing, add per-query execution history with CSV export, or add support for PostgreSQL schemas via SQLAlchemy introspection.

To run the finished project:

git clone https://github.com/dakshjain-1616/sql-trust-lens
cd sql-trust-lens
pip install -r requirements.txt
streamlit run app.py

The Streamlit UI opens with the bundled Northwind schema loaded submit any SQL query and immediately see its trust score, identifier breakdown, and suggested corrections for any mismatched column or table names.

NEO built a SQL validation layer that stops LLM hallucinations before they hit the database. See what else NEO ships at heyneo.com.

Try NEO in Your IDE

Install the NEO extension to bring AI-powered development directly into your workflow: