Regular expression to match “select…from…where” sql queries
19-09-2019
Question
I want to find a "select ... from ... where ..." sequence in a string. For example,
"select * from T WHERE t.c1 =1"
should be matched.
Here is the regular expression pattern I wrote:
"SELECT\s\.*FROM\s\.*WHERE\s\.*";
But it doesn't work. What is wrong with it?
Solution
You shouldn't have backslash-escaped the dots; since you did, the engine is trying to match a literal dot, not "any character" like you're expecting.
Try:
SELECT\s.*FROM\s.*WHERE\s.*
Also, as others have posted, make sure it's in case-insensitive mode. How you do that depends on the language you're using.
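To see the difference concretely, here is a minimal Java sketch (Java is only one possible choice; the class name DotEscapeDemo and the helper containsQuery are made up for illustration). It runs the original pattern, with escaped dots, and the corrected one against the sample query, both in case-insensitive mode.

```java
import java.util.regex.Pattern;

class DotEscapeDemo {
    // Original pattern: "\." is a literal dot, so "\.*" means "zero or more dots".
    static final Pattern ESCAPED =
        Pattern.compile("SELECT\\s\\.*FROM\\s\\.*WHERE\\s\\.*", Pattern.CASE_INSENSITIVE);

    // Corrected pattern: an unescaped "." matches any character except a line terminator.
    static final Pattern UNESCAPED =
        Pattern.compile("SELECT\\s.*FROM\\s.*WHERE\\s.*", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the pattern occurs anywhere in the text.
    static boolean containsQuery(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String sql = "select * from T WHERE t.c1 =1";
        System.out.println(containsQuery(ESCAPED, sql));   // false
        System.out.println(containsQuery(UNESCAPED, sql)); // true
    }
}
```

The escaped version fails because, after "select ", it only allows a run of literal dots before FROM, and the sample has "*" there.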
OTHER TIPS
I'm not sure what regex engine you're targeting, but you might try this:
# note the trailing i, which in perl means case-insensitive
# this will also put the interesting bits into regex backreferences
#
# This also changes \s to \s+ in case you have a situation with multiple
# spaces between terms
#
/select\s+(.*)\s+from\s+(.*)\s+where\s+(.*)/i
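As a rough illustration of what the capture groups give you, here is the same pattern in Java rather than Perl (the class name SelectParts and the method split are hypothetical names for this sketch). It returns the column list, the table part and the condition as three strings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SelectParts {
    // Same pattern as the Perl version; CASE_INSENSITIVE plays the role of the trailing /i.
    static final Pattern QUERY = Pattern.compile(
        "select\\s+(.*)\\s+from\\s+(.*)\\s+where\\s+(.*)",
        Pattern.CASE_INSENSITIVE);

    // Returns {columns, tables, condition}, or null when the text contains no match.
    static String[] split(String text) {
        Matcher m = QUERY.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("select * from T WHERE t.c1 =1");
        System.out.println(parts[0]); // *
        System.out.println(parts[1]); // T
        System.out.println(parts[2]); // t.c1 =1
    }
}
```

Because each `.*` is greedy, a statement that contains more than one "from" or "where" (a subquery, for instance) is split at the last occurrence of each keyword, which may not be what you want.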
One problem with the RE is that it's case sensitive. Depending on the form of RE, there is probably a flag to specify case insensitive matching. For example, Perl-compatible REs use a "/i" flag: /SELECT\s.*FROM\s.*WHERE\s.*/i
Assumptions:
- Your sql statement will not span across lines.
- You will not have two sql statements on a single line.
This works for me.
SELECT\s+?[^\s]+?\s+?FROM\s+?[^\s]+?\s+?WHERE.*
Java escaped version:
String regex = "SELECT\\s+?[^\\s]+?\\s+?FROM\\s+?[^\\s]+?\\s+?WHERE.*";
You may append a terminator instead of the .* depending on your case. Of course, you have to run it in case-insensitive mode, OR modify the regex appropriately.
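A quick way to check what this pattern accepts is to run it against a few inputs. The Java sketch below does that in case-insensitive mode (the class name StrictSelectDemo and the sample queries are illustrative, not part of the answer). One thing it shows: `[^\s]+?` cannot span whitespace, so a column list written with spaces after the commas is not matched.

```java
import java.util.regex.Pattern;

class StrictSelectDemo {
    static final Pattern QUERY = Pattern.compile(
        "SELECT\\s+?[^\\s]+?\\s+?FROM\\s+?[^\\s]+?\\s+?WHERE.*",
        Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: true if the pattern occurs anywhere in the text.
    static boolean matches(String text) {
        return QUERY.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(matches("select * from T WHERE t.c1 =1"));   // true
        System.out.println(matches("select a,b from T where x = 1"));   // true
        // "a, b" contains a space, which [^\s]+? cannot cross
        System.out.println(matches("select a, b from T where x = 1"));  // false
    }
}
```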
Thanks for the replies, everyone. I got the answer from mmyers; here is my final solution:
string szPattern = @"SELECT\s.*FROM\s.*WHERE\s.*";
Regex rRegEX = new Regex(szPattern, RegexOptions.IgnoreCase | RegexOptions.Multiline);
Match match = rRegEX.Match(testCase.Statement);
if (match.Success)
{
    // match.Value holds the matched SELECT ... FROM ... WHERE text
}
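One caveat about the options used above: in .NET, RegexOptions.Multiline only changes how ^ and $ behave, and this pattern uses neither. If a statement can span several lines, the dot has to be allowed to match line breaks, which is RegexOptions.Singleline in .NET. The sketch below shows the same effect in Java, where the corresponding flags are Pattern.MULTILINE and Pattern.DOTALL (the class name MultiLineSelectDemo and the sample statement are assumptions made for illustration).

```java
import java.util.regex.Pattern;

class MultiLineSelectDemo {
    static final String REGEX = "SELECT\\s.*FROM\\s.*WHERE\\s.*";

    // Hypothetical helper: compile with the given flags and search the text.
    static boolean found(String text, int flags) {
        return Pattern.compile(REGEX, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String sql = "select *\nfrom T\nwhere t.c1 = 1";
        int base = Pattern.CASE_INSENSITIVE;

        // MULTILINE affects only ^ and $, so "." still stops at each line break.
        System.out.println(found(sql, base | Pattern.MULTILINE)); // false

        // DOTALL lets "." match line terminators as well.
        System.out.println(found(sql, base | Pattern.DOTALL));    // true
    }
}
```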
Try this regex. I tried to be a bit more restrictive with table and column names, and it also allows for filters using =, <=, >=, <, > and IN ():
string regex = @"SELECT\W(([a-zA-Z0-9]+)([,]*)\W)+\WFROM\W([a-zA-Z0-9#]+)(\W[a-zA-Z])*\WWHERE\W([a-zA-Z0-9_]+)\W([=<>IN(]+)\W(.+)";