Question

I need to write a SQL statement against 3 tables: a table listing web URLs, a table listing possible categories, and a table called URL_CATEGORIES (similar to the classic Student / Classes / Enrollment SQL problem). Each row of URL_CATEGORIES pairs one URL with one CATEGORY. One URL might have 1 or more categories, so a URL can appear in multiple rows of URL_CATEGORIES, but each combination of URL and CATEGORY is unique in that table. The applicable CREATE TABLE definitions are:

CREATE TABLE URL (
  ID AUTOINC,
  SOURCE_DATE DATETIME,
  SITE VARCHAR(30),
 ...
); -- 186 rows


CREATE TABLE CATEGORY (
  ID AUTOINC,
  CATEGORY_NAME VARCHAR(20)
);  -- 9 rows

CREATE TABLE URL_CATEGORIES (
  URL_ID INTEGER,
  CAT_ID INTEGER
); -- 195 rows

In short, I want to see all columns. Since URL_CATEGORIES has 195 rows, my output should have 195 rows. For each row in the URL_CATEGORIES table, select all the corresponding columns in the URL table where URL_CATEGORIES.URL_ID = URL.ID and all the columns in the CATEGORY table where URL_CATEGORIES.CAT_ID = CATEGORY.ID.

The SQL I am using returns 38025 rows (195 × 195), which tells me I have a Cartesian product. The SQL is:

select U1."*", C2."*", U3."*"
  from "URL_CATEGORIES" U1 
 inner join "CATEGORY" C2
    on (U1."CAT_ID" = C2."ID"),
      "URL_CATEGORIES" U1 
 inner join "URL" U3
    on (U1."URL_ID" = U3."ID")

I think I may need a subselect, rather than a join, to get the rows from the 3rd table. How should I rewrite my SQL?

Thanks

Solution

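You have a comma in the FROM clause followed by a second reference to URL_CATEGORIES, so you are getting a Cartesian product: the comma is interpreted as a cross join between the two joined pairs, which yields 195 × 195 = 38025 rows. No subselect is needed; chain both joins off a single reference to URL_CATEGORIES instead: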

select U1.*, C2.*, U3.*
from URL_CATEGORIES U1 inner join
      CATEGORY C2
      on U1.CAT_ID = C2.ID inner join
      URL U3
      on U1.URL_ID = U3.ID

The double quotes aren't necessary for this query.
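
If you want to check the row counts yourself, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. Several things in it are assumptions and not from the question: SQLite stands in for the original database, the AUTOINC columns become INTEGER PRIMARY KEY, the elided URL columns are left out, the sample data is made up (186 URLs, 9 categories, 195 link rows), a composite primary key is added to enforce the stated uniqueness, and the second URL_CATEGORIES reference is aliased U1B instead of reusing U1 so the query is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified schema (assumption: only the columns listed in the question).
conn.executescript("""
    CREATE TABLE URL (ID INTEGER PRIMARY KEY, SOURCE_DATE TEXT, SITE VARCHAR(30));
    CREATE TABLE CATEGORY (ID INTEGER PRIMARY KEY, CATEGORY_NAME VARCHAR(20));
    -- Composite key is an assumption; it enforces unique (URL, CATEGORY) pairs.
    CREATE TABLE URL_CATEGORIES (
        URL_ID INTEGER,
        CAT_ID INTEGER,
        PRIMARY KEY (URL_ID, CAT_ID)
    );
""")

# Made-up data matching the row counts in the question: 186 / 9 / 195.
urls = [(i, "2020-01-01", f"site{i}.example") for i in range(1, 187)]
categories = [(i, f"cat{i}") for i in range(1, 10)]
# Every URL gets one category; the first 9 URLs get a second, different one.
links = [(i, i % 9 + 1) for i in range(1, 187)]
links += [(i, (i + 1) % 9 + 1) for i in range(1, 10)]

conn.executemany("INSERT INTO URL VALUES (?, ?, ?)", urls)
conn.executemany("INSERT INTO CATEGORY VALUES (?, ?)", categories)
conn.executemany("INSERT INTO URL_CATEGORIES VALUES (?, ?)", links)

# Shape of the original query: the comma cross joins two separate joined pairs.
faulty_sql = """
    SELECT COUNT(*)
      FROM URL_CATEGORIES U1
     INNER JOIN CATEGORY C2 ON U1.CAT_ID = C2.ID,
           URL_CATEGORIES U1B
     INNER JOIN URL U3 ON U1B.URL_ID = U3.ID
"""

# Corrected query: both joins hang off a single URL_CATEGORIES reference.
fixed_sql = """
    SELECT COUNT(*)
      FROM URL_CATEGORIES U1
     INNER JOIN CATEGORY C2 ON U1.CAT_ID = C2.ID
     INNER JOIN URL U3 ON U1.URL_ID = U3.ID
"""

faulty_count = conn.execute(faulty_sql).fetchone()[0]
fixed_count = conn.execute(fixed_sql).fetchone()[0]

print(faulty_count)  # 38025, i.e. 195 * 195
print(fixed_count)   # 195, one row per URL_CATEGORIES row
```

The counts only confirm the shape of the problem; swap COUNT(*) for U1.*, C2.*, U3.* to see the actual columns.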

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow