Perl SQL parser
Posted on March 6th, 2010 by Daniel NichterI know “great programmers reuse code”, or some adage like that, but I wrote my own SQL query parser in Perl for various, uninteresting reasons. My goals were:
- Single package/file without external dependencies (rules out SQL::Parser)
- MySQL-specific, i.e. parse MySQL’s “flexible” syntax
- Full FROM/JOIN clause and subquery parsing
- Parse nested clauses/statements/(sub)queries
- Parse everything into logical structure (logical to me at least)
- Extensively tested
- Handle 90% of cases
The result is SQLParser.pm (only available via that link, not on CPAN).
It successfully parses:
select
now(),
(select foo from bar where id=1)
from
t1,
t2
join (select * from sqt1) as t3 using (`select`)
join t4 on t4.id=t3.id
where
c1 > any(
select col2 as z from sqt2 zz where sqtc<(
select max(col) from l where col<100
)
)
and s in ("select", "tricky")
or s <> "select"
group by 1
limit 10
(That syntax is valid but the query is a complete fabrication, i.e. not sane.) Apart from the MySQL source I don’t know of any other publicly available code that can successfully parse a SQL statement like that. Granted, my code has limitations, too, like CASE statements, but my goal was only 90% of cases (relative; what’s 90% for me may not be 90% for you).
The code was easier than I thought at first. Parsing FROM/JOIN clauses and subqueries are the biggest challenges. There’s a fair amount of code comments so you should be able to follow along and understand why and how I do things.
If you know of a better Perl SQL parser that meets my goals, please let me know!
Leave a Reply