QueryGPT: NL-to-SQL Pipeline
A multi-agent NL-to-SQL pipeline built from four staged agents: intent classification, table selection, column pruning, and SQL generation. Each stage narrows the schema context passed to the next, which cuts token usage. The pipeline uses pydantic-ai structured outputs with automatic retries, RAG over a llama-index and Neo4j hybrid vector store, and end-to-end SQL validation.
The problem
NL-to-SQL systems often fail in predictable ways:
- Selecting the wrong tables
- Over-selecting columns (token bloat)
- Generating syntactically valid SQL that’s semantically wrong
- Breaking as schemas change
What I built
A pipeline of four staged agents that decomposes the task into smaller, testable steps:
- Intent classification
- Table selection
- Column pruning
- SQL generation
Each stage passes a narrower schema context to the next, which cuts token usage and leaves later stages fewer irrelevant tables and columns to pick by mistake. Each step emits structured outputs via pydantic-ai with automatic retries rather than free-form text, which makes the pipeline easier to debug, cheaper to run, and more reliable.
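A minimal sketch of the two ideas above, schema narrowing and validated structured outputs with retries. It uses only the Python standard library as a stand-in for pydantic-ai, and the LLM call is replaced by a deterministic stub. Every name here (SCHEMA, TableSelection, run_with_retries, narrow_schema, the table and column names) is hypothetical and chosen for illustration, not taken from the project.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical toy schema: table name -> column names.
SCHEMA = {
    "orders": ["id", "customer_id", "total", "created_at", "status"],
    "customers": ["id", "name", "email", "region"],
    "products": ["id", "title", "price"],
}


@dataclass(frozen=True)
class TableSelection:
    """Structured output of the table-selection stage."""
    tables: list

    def validate(self, schema: dict) -> None:
        # Reject any table name that does not exist in the schema.
        unknown = [t for t in self.tables if t not in schema]
        if unknown:
            raise ValueError(f"unknown tables: {unknown}")


def run_with_retries(stage: Callable, validate: Callable, max_retries: int = 2):
    """Call a stage, validate its output, and retry with the error as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):
        output = stage(feedback)
        try:
            validate(output)
            return output
        except ValueError as exc:
            feedback = str(exc)  # passed back to the stage on the next attempt
    raise RuntimeError(f"stage failed after {max_retries + 1} attempts: {feedback}")


def stub_table_selector(feedback: Optional[str]) -> TableSelection:
    """Stand-in for an LLM call: wrong on the first try, corrected after feedback."""
    if feedback is None:
        return TableSelection(tables=["orders", "invoices"])  # "invoices" is invented
    return TableSelection(tables=["orders", "customers"])


def narrow_schema(schema: dict, tables: list, columns: Optional[dict] = None) -> dict:
    """Keep only the selected tables and, if given, only the selected columns."""
    narrowed = {t: list(schema[t]) for t in tables}
    if columns is not None:
        narrowed = {
            t: [c for c in cols if c in columns.get(t, [])]
            for t, cols in narrowed.items()
        }
    return narrowed


def context_size(schema: dict) -> int:
    """Crude proxy for prompt size: number of columns still in context."""
    return sum(len(cols) for cols in schema.values())


def demo() -> tuple:
    selection = run_with_retries(stub_table_selector, lambda o: o.validate(SCHEMA))
    after_tables = narrow_schema(SCHEMA, selection.tables)
    after_columns = narrow_schema(
        SCHEMA,
        selection.tables,
        {"orders": ["customer_id", "total"], "customers": ["id", "region"]},
    )
    return context_size(SCHEMA), context_size(after_tables), context_size(after_columns)
```

In this toy run, demo() returns (12, 9, 4): the column count in context drops from 12 to 9 after table selection and to 4 after column pruning. The first stub output is rejected because "invoices" is not in the schema, and the retry succeeds.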
The stack uses llama-index and a Neo4j hybrid vector store to retrieve similar query examples (RAG), and validates generated SQL end-to-end so errors surface immediately.
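One way such a validation step could look, as a sketch only: the source does not say how the project validates SQL, so this assumes an in-memory SQLite database built from the schema DDL and uses EXPLAIN to compile the query without running it. The DDL, table names, and the validate_sql helper are hypothetical.

```python
import sqlite3
from typing import Optional

# Hypothetical schema used only for this illustration.
DDL = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""


def validate_sql(sql: str, ddl: str = DDL) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        # EXPLAIN makes SQLite compile the statement without executing it,
        # which surfaces syntax errors and unknown tables or columns.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return str(exc)
    finally:
        conn.close()


good = validate_sql(
    "SELECT c.region, SUM(o.total) FROM orders o "
    "JOIN customers c ON c.id = o.customer_id GROUP BY c.region"
)
bad_column = validate_sql("SELECT discount FROM orders")
bad_table = validate_sql("SELECT * FROM invoices")
```

Here good is None, while bad_column and bad_table carry SQLite's error messages, which could be passed back to the SQL generation stage as retry feedback. A check like this catches syntax errors and unknown identifiers only; it does not detect a query that compiles but answers the wrong question.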
Why it’s interesting
- It demonstrates “LLM engineering as software engineering”: interfaces, validation, and failure containment.
- It’s a realistic data product: schema-aware, maintainable, and designed for iteration.