Question

Found a similar question but without an answer that worked succuessfully.

I need to select a sample of 50 of each status type within a single table.

TABLE1

MEMBER  STATUS
1234       A
1324       A
3424       R
3432       S
3232       R
2783       A
2413       S
4144       R
2387       S

I tried:

SEL Member, status FROM TABLE1 Qualify Row_Number ( ) OVER (PARTITION BY status ORDER BY random (1,10000)) <=50

As suggested in the previous question/answer but Teradata does not like RANDOM in an Aggregate or Ordered Analytical Function.

Was it helpful?

Solution

Since, as you say in the comments, you do not need to use random you can do this:

SEL     Member, status 
FROM    TABLE1 
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;

Edit: based on a question in the comments related to a spool-space error, an approach we can try when the table is large is to create a temporary table with a portion of the records from the original table.

CREATE MULTISET VOLATILE TABLE tmp_tbl AS (
    SEL     Member, status
    FROM    TABLE1
    WHERE   somefield = 'somecriterion'
) WITH DATA 
ON COMMIT PRESERVE ROWS;

And then try again:

SEL     Member, status 
FROM    tmp_tbl /* now from temporary table */
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;

OTHER TIPS

If you have a discrete number of Status values the following approach may work given that the TOP operator is processed after the WHERE clause has been evaluated by Teradata. TOP n is actually the preferred method over the Window Aggregate approach of QUALIFY ROW_NUMBER() OVER() or QUALIFY RANK () OVER() providing better performance at best or comparable performance at worst:

SELECT TOP 50 
       Member
     , Status
FROM Table1
WHERE Status = 'A'
UNION ALL
SELECT TOP 50 
       Member
     , Status
From Table1
WHERE Status = 'R'
UNION ALL
SELECT TOP 50 
       Member
     , Status
FROM Table1
WHERE Status = 'S';

You may be also have success using multiple queries with the SAMPLE clause filtering each query by the status code you wish to sample.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top