Question

I have the weirdest problem and my extremely basic knowledge of SQL must be terribly wrong, but I cannot make sense of the behaviour illustrated below.

I have this file test.csv

id,field
A,0
B,1
C,2
D,"0"
E,"1"
F,"2"
G,
H,""
I," "

And this test code:

#! /usr/bin/perl

use strict;
use warnings;
use DBI;
use Devel::VersionDump qw(dump_versions);

my $dbh = DBI->connect ("dbi:CSV:");
$dbh->{RaiseError} = 1;
$dbh->{TraceLevel} = 0;
my $i = 0;

foreach my $cond ("TRUE",
                  "field <> 0 AND field <> 1",
                  "field = 0 OR field = 1",
                  "NOT (field = 0 OR field = 1)",
                  "NOT field = 0 OR field = 1",
                  "field <> 0",
                  "NOT field <> 0",
                  ) {
  print "Condition #" . $i++ . " is $cond:\n";
  my $sth = $dbh->prepare("SELECT * FROM test.csv WHERE $cond");
  $sth->execute();
  $sth->dump_results();
};

print "\n\n";
dump_versions();

When run, this is the output:

Condition #0 is TRUE:
'A', '0'
'B', '1'
'C', '2'
'D', '0'
'E', '1'
'F', '2'
'G', ''
'H', ''
'I', ' '
9 rows
Condition #1 is field <> 0 AND field <> 1:
'C', '2'
'F', '2'
'G', ''
'H', ''
'I', ' '
5 rows
Condition #2 is field = 0 OR field = 1:
'A', '0'
'B', '1'
'D', '0'
'E', '1'
4 rows
Condition #3 is NOT (field = 0 OR field = 1):
'A', '0'
'B', '1'
'D', '0'
'E', '1'
4 rows
Condition #4 is NOT field = 0 OR field = 1:
'B', '1'
'C', '2'
'E', '1'
'F', '2'
'G', ''
'H', ''
'I', ' '
7 rows
Condition #5 is field <> 0:
'B', '1'
'C', '2'
'E', '1'
'F', '2'
'G', ''
'H', ''
'I', ' '
7 rows
Condition #6 is NOT field <> 0:
'A', '0'
'D', '0'
2 rows


Perl version: v5.16.3 on MSWin32 (C:\Program Files\Perl64\bin\perl.exe)
ActivePerl::Config                                     -  Unknown
ActiveState::Path                                      -     1.01
AutoLoader                                             -     5.73
C:::Program Files::Perl64::site::lib::sitecustomize.pl -  Unknown
Carp                                                   -     1.26
Class::Struct                                          -     0.63
Clone                                                  -     0.34
Config                                                 -  Unknown
Config_git.pl                                          -  Unknown
Config_heavy.pl                                        -  Unknown
Cwd                                                    -     3.40
DBD::CSV                                               -     0.41
DBD::File                                              -     0.42
DBI                                                    -    1.631
DBI::DBD::SqlEngine                                    -     0.06
DBI::SQL::Nano                                         - 1.015544
Data::Dumper                                           -    2.139
Devel::VersionDump                                     -     0.02
DynaLoader                                             -     1.14
Encode                                                 -     2.49
Encode::Alias                                          -     2.16
Encode::Config                                         -     2.05
Encode::Encoding                                       -     2.05
Errno                                                  -     1.15
Exporter                                               -     5.67
Exporter::Heavy                                        -     5.67
Fcntl                                                  -     1.11
File::Basename                                         -     2.84
File::Spec                                             -     3.40
File::Spec::Unix                                       -     3.40
File::Spec::Win32                                      -     3.40
File::stat                                             -     1.05
IO                                                     -  1.25_06
IO::Dir                                                -      1.1
IO::File                                               -     1.16
IO::Handle                                             -     1.33
IO::Seekable                                           -      1.1
List::Util                                             -     1.27
Math::BigFloat                                         -    1.997
Math::BigInt                                           -    1.998
Math::BigInt::Calc                                     -    1.997
Math::Complex                                          -     1.59
Math::Trig                                             -     1.23
Params::Util                                           -     1.07
SQL::Dialects::AnyData                                 -    1.405
SQL::Dialects::Role                                    -    1.405
SQL::Eval                                              -    1.405
SQL::Parser                                            -    1.405
SQL::Statement                                         -    1.405
SQL::Statement::Function                               -    1.405
SQL::Statement::Functions                              -    1.405
SQL::Statement::Operation                              -    1.405
SQL::Statement::Placeholder                            -    1.405
SQL::Statement::RAM                                    -    1.405
SQL::Statement::Term                                   -    1.405
SQL::Statement::TermFactory                            -    1.405
SQL::Statement::Util                                   -    1.405
Scalar::Util                                           -     1.27
SelectSaver                                            -     1.02
Symbol                                                 -     1.07
Text::CSV_XS                                           -     1.07
Tie::Hash                                              -     1.04
Time::HiRes                                            -   1.9725
Win32                                                  -     0.47
XSLoader                                               -     0.16
base                                                   -     2.18
bytes                                                  -     1.04
constant                                               -     1.25
integer                                                -     1.00
overload                                               -     1.18
overloading                                            -     0.02
sort                                                   -     2.01
strict                                                 -     1.07
unicore::Heavy.pl                                      -  Unknown
unicore::lib::Perl::Word.pl                            -  Unknown
unicore::lib::Perl::_PerlIDS.pl                        -  Unknown
utf8                                                   -     1.09
utf8_heavy.pl                                          -  Unknown
vars                                                   -     1.02
warnings                                               -     1.13
warnings::register                                     -     1.02

Condition #0 shows the complete dataset and is fine.

Condition #1 is just some compound condition and works fine.

Condition #2 is the opposite condition (basic logic rules used to invert it), and works fine too.

Yet, condition #3 should be the opposite of #2 and thus equal to #1, but the result is the same as #2: I cannot make any sense of this.

Condition #4 shows that, omitting the parentheses, NOT does work fine, but of course this query is different from any of the previous ones.

Conditions #5 and #6 show a situation where NOT acts exactly as one would expect.

So why does NOT on a compound condition act as if the NOT were not specified at all?

By the way, I read the scary post titled 'Perl DBD::CSV - SQL Syntax - "AND" clause is not working properly' and added Devel::VersionDump to check whether I have a similar issue, but as far as I can tell all the relevant packages are the newest available. Hence, I really have no clue about this.

Solution

I can confirm this is a bug in SQL::Parser. This is the structure it produces for the WHERE clause:

'where_clause' => HASH(0x7f9686737480)
  'arg1' => HASH(0x7f9686808248)
     'arg1' => HASH(0x7f96866b50f8)
        'fullorg' => 'field'
        'type' => 'column'
        'value' => 'field'
     'arg2' => HASH(0x7f968588dfe0)
        'fullorg' => 0
        'type' => 'number'
        'value' => 0
     'neg' => 0
     'nots' => HASH(0x7f96866b55d8)
          empty hash
     'op' => '='
  'arg2' => HASH(0x7f9684498ce0)
     'arg1' => HASH(0x7f96845fb798)
        'fullorg' => 'field'
        'type' => 'column'
        'value' => 'field'
     'arg2' => HASH(0x7f96866b5158)
        'fullorg' => 1
        'type' => 'number'
        'value' => 1
     'neg' => 0
     'nots' => HASH(0x7f96866b55a8)
          empty hash
     'op' => '='
  'neg' => 0
  'nots' => HASH(0x7f9686808320)
       empty hash
  'op' => 'OR'

The top-most "neg" should be 1. Please open a ticket at https://rt.cpan.org/Dist/Display.html?Name=SQL-Statement - if you refer to this thread, the test case is already proven :)
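
To inspect the flag yourself, something like the following might work. It is only a sketch: it assumes the 'AnyData' dialect (which appears in the version list above), uses a plain table name `test` instead of `test.csv`, and the helper name `where_clause_for` is made up for illustration. The exact keys in the structure may differ between SQL::Statement versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use SQL::Parser;
use Data::Dumper;

# Hypothetical helper: parse a statement and return the WHERE clause
# portion of the parser's structure (a nested hash reference).
sub where_clause_for {
    my ($sql) = @_;
    my $parser = SQL::Parser->new('AnyData', { RaiseError => 1, PrintError => 0 });
    $parser->parse($sql);
    return $parser->structure->{where_clause};
}

my $where = where_clause_for(
    'SELECT * FROM test WHERE NOT (field = 0 OR field = 1)'
);

# Sort hash keys so the dump is stable between runs.
$Data::Dumper::Sortkeys = 1;
print Dumper($where);

# If the parser handled the outer NOT, this flag should be 1.
print "top-level neg = ", ($where->{neg} // 'undef'), "\n";
```

With the affected version, the last line would presumably report 0, matching the dump above.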

Cheers, Jens

OTHER TIPS

SQL logic for DBD::CSV is NOT contained in DBD::CSV, which is just a thin glue layer between Text::CSV_XS and DBI.

All SQL knowledge is dealt with by SQL::Statement. If you think you found a real bug, please try to dig in that module and find the cause, create a patch and post the issue with the patch on RT :)

If this indeed turns out to be a bug in how NOT is applied to a parenthesized expression, a quick and dirty workaround is to replace

NOT (A OR B)

with

NOT A AND NOT B

which is logically equivalent to the former. That probably does not answer the question of why your code fails, but if this works and the other does not, then I would assume it is a bug (or maybe distributivity just is not implemented; I have no idea what is advertised as supposed to work and what is not).
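
The equivalence can be checked in plain Perl against the rows from the question, without going through DBD::CSV at all. This is just a sketch: the helper names are invented, and it assumes string comparison so that the empty and blank fields do not count as 0, which is consistent with the results shown for conditions #1 and #5.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Values of "field" from test.csv, keyed by id (copied from the question).
my %field = (
    A => '0', B => '1', C => '2',
    D => '0', E => '1', F => '2',
    G => '',  H => '',  I => ' ',
);

# Assumption: compare as strings, so '' and ' ' are not treated as 0.
sub is_zero { return $_[0] eq '0' }
sub is_one  { return $_[0] eq '1' }

# NOT (A OR B): the form that returns the wrong rows through the parser.
sub ids_not_or {
    my @ids = grep { !( is_zero($field{$_}) || is_one($field{$_}) ) }
              sort keys %field;
    return @ids;
}

# NOT A AND NOT B: the rewritten form suggested as a workaround.
sub ids_not_and_not {
    my @ids = grep { !is_zero($field{$_}) && !is_one($field{$_}) }
              sort keys %field;
    return @ids;
}

print "NOT (A OR B):    @{[ ids_not_or() ]}\n";
print "NOT A AND NOT B: @{[ ids_not_and_not() ]}\n";
```

Both lines list C F G H I, the same rows that condition #1 returns, so the rewritten condition is a drop-in replacement for the query.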

Licensed under: CC-BY-SA with attribution