Question

How do I structure an SQL statement to run across multiple flat unrelated tables and display the result with the result of the select and the name of the table where the result came from.

The scenario is such that I have several tables with the same column name in each. It is data that I have received from outside parties that I store as it is in different tables.

Same tables look like:

Table 1: pid, parent_name, student_name, student_number, class_name, columnN
Table 2: pid, previous_school, previous_school, student_number, columnN
Table 3: pid, student_name, student_number, parent_name, column4, columnN
Table 14: pid, student_number, parent_name, column4, columnN
Table N: pid, previous_school, parent_name, column4, columnN

I need an SQL statement that searches for student_name across all tables In pseudo code: for each table, find a student named john doe and return to me the row where you got the result and the table where you found the result

Give the result in the following presentation:

john doe, Table 1, pid
john doe, Table 9, pid

To make it a bit complicated, the column student_name might not be in all tables so the query needs to proceed graciously if doesn't find the column there.

Was it helpful?

Solution

You are looking for dynamic SQL. Assemble your query from the system catalog automatically:

SELECT string_agg('SELECT student_name, '''
                   || c.oid::regclass || ''' AS tbl, pid FROM '
                   || c.oid::regclass
                   || $$ WHERE student_name = 'John Doe'$$
                 , E'\nUNION ALL\n')
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = 'public'         -- schema name where your tables lie
AND    c.relname LIKE 't%'          -- and / or filter table names
AND    EXISTS (
   SELECT 1 FROM pg_attribute 
   WHERE  attrelid = c.oid
   AND    attname = 'student_name'  -- make sure column exists
   AND    NOT attisdropped          -- and is alive
   );

Produces the query string:

SELECT student_name, 'tbl1' AS tbl, pid FROM tbl1 WHERE student_name = 'John Doe'
UNION ALL
SELECT student_name, 'tbl2' AS tbl, pid FROM tbl2 WHERE student_name = 'John Doe'
UNION ALL
SELECT student_name, 'tbl3' AS tbl, pid FROM tbl3 WHERE student_name = 'John Doe'
...

Then run it in a second call or completely automate it with a PL/pgSQL function using EXECUTE. Example:
Select a dynamic set of columns from a table and get the sum for each

This query produces safe code with sanitized identifiers preventing SQL injection. (Explanation for oid::regclass here.)

There are more related answers. Use a search.

BTW, LIKE in student_name LIKE 'John Doe' is pointless. Without wildcards, just use =.

OTHER TIPS

Just add a literal value to each select that describes the source table:

select student_name, 'Table 1', pid
from table1
where ...
union
select some_name, 'Table 2', pid
from table2
where ...
union
...

You could get the names of the tables from the all_tables view, create an union query dinamically and then execute it with execute immediate. Something like this --be careful the code might have errors--:

DECLARE
  v_query VARCHAR2(4000) := '';
  v_student_name VARCHAR2(50) := 'John Doe';

  type r_results is record (
    student_name VARCHAR2(500),
    table_name VARCHAR2(100),
    pid NUMBER
  );

  v_results r_results;

  CURSOR c_tables IS
    SELECT table_name 
    FROM all_tables 
    WHERE upper(table_name) LIKE '%_STUDENT_RECORDS';
BEGIN      
  FOR client_table IN c_tables LOOP
    IF v_query IS NOT NULL THEN
      v_query := v_query || ' UNION ';
    END IF;

    v_query := v_query || 'SELECT student_name, ' || client_table.table_name || ', ' || pid FROM ' || client_table.table_name || ' WHERE student_name = ''' || v_student_name || '''';
  END LOOP;


  EXECUTE IMMEDIATE v_query INTO v_results;
END;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top