Question

I posted this in the BigQuery issue tracker (please star the issue if it affects you): https://code.google.com/p/google-bigquery/issues/detail?id=89&q=join%20each

What steps will reproduce the problem?

  1. See job personal-real-estate:job_up2I9A31Bo8NSvwD0XTWG2tBoVA
  2. I run:

     SELECT *
     FROM (SELECT *, INTEGER(AD_STREET_NO_PROP) AS str_no_prop, INTEGER(CD_ADDR_ZIP_PROP) AS CD_ADDR_ZIP_PROP1
           FROM [acris_nyc.nyc_dof_SOA]
           WHERE NM_RECIPIENT_1 LIKE '%THE MICHAEL R. BLOOMBERG REVOCABLE%') AS s
     JOIN EACH
          (SELECT *, INTEGER(hnum_lo) AS str_num, INTEGER(zip) AS zip1
           FROM [acris_nyc.nyc_dof_tc_Tentative_Assessment_Roll]
           WHERE owner LIKE '%BLOOM%' AND txcl = '1') AS a
     ON s.str_no_prop = a.str_num AND s.ad_street_1_prop = a.str_name
     ORDER BY NEW_FV_T DESC
     LIMIT 100

What is the expected output? What do you see instead?

I expect one record to be returned, containing 17 as the str_num and "EAST 79 STREET" as the str_name.

What version of the product are you using? On what operating system?

BigQuery on 4/22/2014, from the Chrome browser.

Please provide any additional information below.

I tried a very similar query on a much smaller set of tables, and it works as expected:

SELECT *
FROM (SELECT *, INTEGER(number) AS inumber FROM [test_1.table1] WHERE owner LIKE '%BLOOM%') AS a
JOIN EACH
     (SELECT *, INTEGER(number) AS inumber FROM [test_1.table2] WHERE owner LIKE '%BLOOM%') AS b
ON a.inumber = b.inumber AND a.street = b.street

returns 

Row | a_number | a_street       | a_owner              | a_inumber | b_number | b_street       | b_owner                            | b_inumber
1   | 00000017 | EAST 79 STREET | BLOOMBERG, MICHAEL R | 17        | 17       | EAST 79 STREET | THE MICHAEL R. BLOOMBERG REVOCABLE | 17

If I query the individual tables in the 1-million-row case, each contains the data that should match when the join completes.

Is there any way to debug the actual join operation?

Thanks.

Solution

Just to close the loop on this question: after investigation, it turned out to be an error in the data that was masked by the automatic whitespace deletion performed by the browser. See https://code.google.com/p/google-bigquery/issues/detail?id=89&q=join%20each for more information.
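
For anyone hitting a similar mismatch, one way to check a join key for whitespace that the browser display hides is to compare string lengths and make the spaces visible. The query below is only a sketch in legacy BigQuery SQL. It assumes the problem sits in the street-name column, which this thread does not confirm, and the output column names (raw_len, trimmed_len, spaces_made_visible) are made up for illustration.

```sql
-- Sketch only. Assumption: stray whitespace in the street-name join key;
-- the thread does not say which column actually held the bad data.
SELECT
  ad_street_1_prop,
  LENGTH(ad_street_1_prop) AS raw_len,
  LENGTH(LTRIM(RTRIM(ad_street_1_prop))) AS trimmed_len,
  REPLACE(ad_street_1_prop, ' ', '_') AS spaces_made_visible
FROM [acris_nyc.nyc_dof_SOA]
WHERE NM_RECIPIENT_1 LIKE '%THE MICHAEL R. BLOOMBERG REVOCABLE%'
```

If raw_len differs from trimmed_len, the value has leading or trailing whitespace. Doubled underscores in spaces_made_visible point to repeated internal spaces, which can look like a single space when the result is rendered in a browser. Running the same check on str_name in the other table shows whether the two sides really hold identical strings.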

Licensed under: CC-BY-SA with attribution