Question

Found a similar question but without an answer that worked succuessfully.

I need to select a sample of 50 of each status type within a single table.

TABLE1

MEMBER  STATUS
1234       A
1324       A
3424       R
3432       S
3232       R
2783       A
2413       S
4144       R
2387       S

I tried:

SEL Member, status FROM TABLE1 Qualify Row_Number ( ) OVER (PARTITION BY status ORDER BY random (1,10000)) <=50

As suggested in the previous question/answer but Teradata does not like RANDOM in an Aggregate or Ordered Analytical Function.

Était-ce utile?

La solution

Since, as you say in the comments, you do not need to use random you can do this:

SEL     Member, status 
FROM    TABLE1 
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;

Edit: based on a question in the comments related to a spool-space error, an approach we can try when the table is large is to create a temporary table with a portion of the records from the original table.

CREATE MULTISET VOLATILE TABLE tmp_tbl AS (
    SEL     Member, status
    FROM    TABLE1
    WHERE   somefield = 'somecriterion'
) WITH DATA 
ON COMMIT PRESERVE ROWS;

And then try again:

SEL     Member, status 
FROM    tmp_tbl /* now from temporary table */
QUALIFY ROW_NUMBER() OVER (
            PARTITION BY status 
            ORDER BY NULL) <= 50;

Autres conseils

If you have a discrete number of Status values the following approach may work given that the TOP operator is processed after the WHERE clause has been evaluated by Teradata. TOP n is actually the preferred method over the Window Aggregate approach of QUALIFY ROW_NUMBER() OVER() or QUALIFY RANK () OVER() providing better performance at best or comparable performance at worst:

SELECT TOP 50 
       Member
     , Status
FROM Table1
WHERE Status = 'A'
UNION ALL
SELECT TOP 50 
       Member
     , Status
From Table1
WHERE Status = 'R'
UNION ALL
SELECT TOP 50 
       Member
     , Status
FROM Table1
WHERE Status = 'S';

You may be also have success using multiple queries with the SAMPLE clause filtering each query by the status code you wish to sample.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top