codenode

DB<3> x $code

Perl SQL parser

Posted on March 6th, 2010 by Daniel Nichter

I know “great programmers reuse code”, or some adage like that, but I wrote my own SQL query parser in Perl for various, uninteresting reasons. My goals were:

  1. Single package/file without external dependencies (rules out SQL::Parser)
  2. MySQL-specific, i.e. parse MySQL’s “flexible” syntax
  3. Full FROM/JOIN clause and subquery parsing
  4. Parse nested clauses/statements/(sub)queries
  5. Parse everything into logical structure (logical to me at least)
  6. Extensively tested
  7. Handle 90% of cases

The result is SQLParser.pm (only available via that link, not on CPAN).

It successfully parses:

select
   now(),
   (select foo from bar where id=1)
from
   t1,
   t2
   join (select * from sqt1) as t3 using (`select`)
   join t4 on t4.id=t3.id
where
   c1 > any(
      select col2 as z from sqt2 zz where sqtc<(
         select max(col) from l where col<100
      )
   )
   and s in ("select", "tricky")
   or s <> "select"
group by 1
limit 10

(That syntax is valid but the query is a complete fabrication, i.e. not sane.) Apart from the MySQL source I don’t know of any other publicly available code that can successfully parse a SQL statement like that. Granted, my code has limitations, too, like CASE statements, but my goal was only 90% of cases (relative; what’s 90% for me may not be 90% for you).

The code was easier than I thought at first. Parsing FROM/JOIN clauses and subqueries are the biggest challenges. There’s a fair amount of code comments so you should be able to follow along and understand why and how I do things.

If you know of a better Perl SQL parser that meets my goals, please let me know!

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Spam protection by WP Captcha-Free

Copyright © 2009 codenode. Theme by THAT Agency powered by WordPress.