Question

First question here so please be gentle :)

I am struggling with extracting information from one of our OLTP databases, which stores several types of information, including the multiple-choice answers given to questions. These answers provide nice insight for us, so we want to store them in our data warehouse.

The challenge is that the answer for all ticked boxes is stored as one integer value. While this may be an elegant solution for programming purposes and works fine for displaying values in real time, it is less helpful when processing data for a data warehouse.

This is how the answer data is stored:

customer    question    answer
----------- ----------- -----------
1           1           6
2           1           2
3           1           62

After a while I noticed that it stores the SUM() of the ticked answers, where each answer contributes 2^position, as in the sample below:

question    answer_desc          position    answer_value
----------- -------------------- ----------- ------------
1           a                    1           2
1           b                    2           4
1           c                    3           8
1           d                    4           16
1           e                    5           32

Which gives the following answers:

customer 1 will have answers 'a' and 'b' to question 1
customer 2 will have answer 'a' to question 1
customer 3 will have answers 'a', 'b', 'c', 'd' and 'e' to question 1
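
To make the encoding concrete, here is a minimal T-SQL sketch that rebuilds the stored values from the individual ticks. The CTE names `choices` and `ticked` are hypothetical stand-ins for the real tables, and the rows are just the sample data from above.

```sql
-- Hypothetical inline copies of the sample data; the real table names differ.
WITH choices (question, answer_desc, [position], answer_value) AS (
    SELECT 1, 'a', 1, 2  UNION ALL
    SELECT 1, 'b', 2, 4  UNION ALL
    SELECT 1, 'c', 3, 8  UNION ALL
    SELECT 1, 'd', 4, 16 UNION ALL
    SELECT 1, 'e', 5, 32
),
ticked (customer, question, answer_desc) AS (
    SELECT 1, 1, 'a' UNION ALL SELECT 1, 1, 'b' UNION ALL
    SELECT 2, 1, 'a' UNION ALL
    SELECT 3, 1, 'a' UNION ALL SELECT 3, 1, 'b' UNION ALL
    SELECT 3, 1, 'c' UNION ALL SELECT 3, 1, 'd' UNION ALL SELECT 3, 1, 'e'
)
-- Summing the values of the ticked boxes gives the single stored integer.
SELECT t.customer, t.question, SUM(c.answer_value) AS answer
FROM ticked AS t
JOIN choices AS c
  ON c.question = t.question
 AND c.answer_desc = t.answer_desc
GROUP BY t.customer, t.question
ORDER BY t.customer;
```

For the sample data this gives 6, 2 and 62 for customers 1, 2 and 3, matching the stored answers.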

I have come up with a mathematical formula that extracts the highest possible 2^n value from the answer and subtracts it from the total, repeating until every ticked box for the provided answer has been found, and put it in a function:

ALTER FUNCTION [dbo].[ZZ_answers] (@input1 VARCHAR(50), @input2 BIGINT)
RETURNS @Table TABLE
(
    inputwaarde VARCHAR(50) NULL,
    waarde BIGINT NOT NULL,   -- BIGINT so it can hold any @input2 value
    exponent INT NOT NULL
)
AS
BEGIN

    DECLARE @exponent INT
    DECLARE @waarde BIGINT

    SET @waarde = @input2

    -- Repeatedly peel off the highest power of two still contained in @waarde.
    WHILE @waarde > 0
    BEGIN
        -- Exponent of the highest power of two that is <= @waarde.
        SET @exponent = FLOOR(LOG(@waarde) / LOG(2))

        -- Cast the base to BIGINT so POWER does not overflow INT for exponents above 30.
        SET @waarde = @waarde - POWER(CAST(2 AS BIGINT), @exponent)

        -- One output row per ticked box: original key, original value, bit exponent.
        INSERT @Table
        SELECT RTRIM(@input1), @input2, @exponent;
    END
    RETURN
END
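
As a usage sketch, the function could be applied to each stored answer with CROSS APPLY. This assumes `dbo.ZZ_answers` already exists in the current database; the table variable `@answers` is a hypothetical stand-in holding the sample rows from above.

```sql
-- Assumes dbo.ZZ_answers (defined above) exists in the current database.
-- @answers is a hypothetical stand-in for the real answers table.
DECLARE @answers TABLE (customer INT, question INT, answer BIGINT);

INSERT INTO @answers (customer, question, answer)
SELECT 1, 1, 6 UNION ALL
SELECT 2, 1, 2 UNION ALL
SELECT 3, 1, 62;

-- One output row per ticked box; the exponent is the position from the sample table.
SELECT a.customer, a.question, a.answer, f.exponent AS [position]
FROM @answers AS a
CROSS APPLY dbo.ZZ_answers(CAST(a.customer AS VARCHAR(50)), a.answer) AS f
ORDER BY a.customer, f.exponent;
```

For customer 1 (answer 6) this returns positions 1 and 2, i.e. answers 'a' and 'b'. Because the function is a multi-statement table-valued function with a loop, it is evaluated once per input row, which is likely to be slow on millions of rows.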

I want to ask what the best approach would be for using this to fill my data warehouse.

Currently I have two approaches in mind:

1) Select all distinct answer values from my answers table and use the function above to generate all possible answers currently being used by the system. I would implement this as part of the ETL procedure in the SSIS package when filling the data warehouse. Although this would give an accurate result, it would be performance-heavy to process each result in real time and would thus slow down the generation of the data warehouse. Our answers table has approximately 11 million entries and is growing.

2) Generate a new table with all possible answer values based on the questions table. I would have to loop through each question and the possible variations in ticked boxes and provide the right answer value for each specific combination. Needless to say, generating all x^y answers for each possible combination would be a heavy operation. However, this would result in a data table which can then be used for processing the data warehouse. We would need to regenerate the answers table whenever new questions are added, which is very unlikely to happen.

Which of the two would you encourage me to use? How would I go about looping through my questions as efficiently as possible if I were to choose option 2? And is there another option I'm not seeing?


Solution

Here is an example of using bitwise operators to extract what you want. After that, it's a matter of pivoting and processing the result into what you require.

-- A non-zero result means the corresponding box was ticked.
-- pos1 tests 2^0 = 1, so answer 'a' (value 2) from the question shows up in pos2.
SELECT Customer, answer, 
answer & POWER(2,0) pos1,
answer & POWER(2,1) pos2,
answer & POWER(2,2) pos3,
answer & POWER(2,3) pos4,
answer & POWER(2,4) pos5,
answer & POWER(2,5) pos6,
answer & POWER(2,6) pos7
from (
SELECT 1 Customer, 6 answer
union all
SELECT 2 Customer, 2 answer
union all
SELECT 3 Customer, 62 answer
) F
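
As one possible way of "processing that into what you require", the same bitwise test can be used directly as a join condition against the table of possible answers, which avoids the pivot step. This is only a sketch: the CTE names `answers` and `choices` are hypothetical, and the rows are the sample data from the question.

```sql
-- Hypothetical inline copies of the question's sample tables.
WITH answers (customer, question, answer) AS (
    SELECT 1, 1, 6 UNION ALL
    SELECT 2, 1, 2 UNION ALL
    SELECT 3, 1, 62
),
choices (question, answer_desc, [position], answer_value) AS (
    SELECT 1, 'a', 1, 2  UNION ALL
    SELECT 1, 'b', 2, 4  UNION ALL
    SELECT 1, 'c', 3, 8  UNION ALL
    SELECT 1, 'd', 4, 16 UNION ALL
    SELECT 1, 'e', 5, 32
)
-- A choice was ticked when its bit is set in the stored answer.
SELECT a.customer, a.question, c.answer_desc
FROM answers AS a
JOIN choices AS c
  ON c.question = a.question
 AND (a.answer & c.answer_value) = c.answer_value
ORDER BY a.customer, c.[position];
```

For the sample data this returns 'a' and 'b' for customer 1, 'a' for customer 2, and 'a' through 'e' for customer 3.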

Other tips

The following solution worked for my problem; it is based on the selected answer.

With the selected answer I was able to create a pivoted table with the possibilities for each answer:

Select * FROM(
SELECT  itemid, answer, 
answer & POWER(2,0) pos01,
answer & POWER(2,1) pos02,
answer & POWER(2,2) pos03,
answer & POWER(2,3) pos04,
answer & POWER(2,4) pos05,
answer & POWER(2,5) pos06,
answer & POWER(2,6) pos07,
answer & POWER(2,7) pos08,
answer & POWER(2,8) pos09,
answer & POWER(2,9) pos10,
answer & POWER(2,10) pos11,
answer & POWER(2,11) pos12,
answer & POWER(2,12) pos13,
answer & POWER(2,13) pos14,
answer & POWER(2,14) pos15,
answer & POWER(2,15) pos16,
answer & POWER(2,16) pos17,
answer & POWER(2,17) pos18,
answer & POWER(2,18) pos19,
answer & POWER(2,19) pos20,
answer & POWER(2,20) pos21,
answer & POWER(2,21) pos22,
answer & POWER(2,22) pos23, 
answer & POWER(2,23) pos24,
answer & POWER(2,24) pos25,
answer & POWER(2,25) pos26,
answer & POWER(2,26) pos27,
answer & POWER(2,27) pos28,
answer & POWER(2,28) pos29,
answer & POWER(2,29) pos30
from (
SELECT  itemid,data1 as answer from Dossieritems where data1 != 0
) P ) pvt

This results in:

itemid      answer      pos01       pos02       pos03       pos04       pos05       pos06       pos07       pos08       pos09       pos10       pos11       pos12       pos13       pos14       pos15       pos16       pos17       pos18       pos19       pos20       pos21       pos22       pos23       pos24       pos25       pos26       pos27       pos28       pos29       pos30
----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
498         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
499         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
500         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
501         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
502         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
503         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
520         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
548         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
549         512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1330        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1331        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1332        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1366        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1422        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1238        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1240        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
1300        512         0           0           0           0           0           0           0           0           0           512         0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
259234      333405704   0           0           0           8           0           0           0           0           0           512         1024        2048        4096        0           16384       0           65536       131072      262144      524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259237      536829448   0           0           0           8           0           0           0           0           0           512         1024        2048        4096        0           16384       0           65536       131072      262144      524288      1048576     2097152     4194304     8388608     16777216    33554432    67108864    134217728   268435456   0
259238      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259239      400211226   0           2           0           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    67108864    0           268435456   0
259240      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259245      400211226   0           2           0           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    67108864    0           268435456   0
259257      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259263      390741246   0           2           4           8           16          32          64          128         0           0           1024        2048        4096        8192        0           0           0           131072      0           524288      0           0           4194304     0           16777216    33554432    67108864    0           268435456   0
259270      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259277      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259279      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259280      333102366   0           2           4           8           16          0           0           0           256         0           1024        2048        4096        8192        0           32768       0           131072      0           524288      1048576     0           4194304     8388608     16777216    33554432    0           0           268435456   0
259286      390741246   0           2           4           8           16          32          64          128         0           0           1024        2048        4096        8192        0           0           0           131072      0           524288      0           0           4194304     0           16777216    33554432    67108864    0           268435456   0 

The next step is unpivoting the results and (in my case) converting the column name to a joinable integer:

Select *, CONVERT(int,SUBSTRING(pos,4,2)) as positie FROM(
SELECT  itemid, answer, 
answer & POWER(2,0) pos01,
answer & POWER(2,1) pos02,
answer & POWER(2,2) pos03,
answer & POWER(2,3) pos04,
answer & POWER(2,4) pos05,
answer & POWER(2,5) pos06,
answer & POWER(2,6) pos07,
answer & POWER(2,7) pos08,
answer & POWER(2,8) pos09,
answer & POWER(2,9) pos10,
answer & POWER(2,10) pos11,
answer & POWER(2,11) pos12,
answer & POWER(2,12) pos13,
answer & POWER(2,13) pos14,
answer & POWER(2,14) pos15,
answer & POWER(2,15) pos16,
answer & POWER(2,16) pos17,
answer & POWER(2,17) pos18,
answer & POWER(2,18) pos19,
answer & POWER(2,19) pos20,
answer & POWER(2,20) pos21,
answer & POWER(2,21) pos22,
answer & POWER(2,22) pos23, 
answer & POWER(2,23) pos24,
answer & POWER(2,24) pos25,
answer & POWER(2,25) pos26,
answer & POWER(2,26) pos27,
answer & POWER(2,27) pos28,
answer & POWER(2,28) pos29,
answer & POWER(2,29) pos30
from (
SELECT  itemid,data1 as answer from Dossieritems where data1 != 0
) P ) pvt 

UNPIVOT
    (ans for pos IN 
     (pos01,pos02,pos03,pos04,pos05,pos06,pos07,pos08,pos09,pos10,pos11,pos12,pos13,pos14,pos15,pos16,pos17,pos18,pos19,pos20,pos21,pos22,pos23,pos24,pos25,pos26,pos27,pos28,pos29,pos30)
    )AS unpvt --INNER JOIN Diagsoorten DS ON (DS.Positie = CONVERT(int,SUBSTRING(pos,4,2)))

    Where ans != 0

This code is able to process 9 million results in just over a minute.

The end result looks like this:

itemid      answer      ans         pos                                                                                                                              positie
----------- ----------- ----------- -------------------------------------------------------------------------------------------------------------------------------- -----------
498         512         512         pos10                                                                                                                            10
499         512         512         pos10                                                                                                                            10
500         512         512         pos10                                                                                                                            10
501         512         512         pos10                                                                                                                            10
502         512         512         pos10                                                                                                                            10
503         512         512         pos10                                                                                                                            10
520         512         512         pos10                                                                                                                            10
548         512         512         pos10                                                                                                                            10
549         512         512         pos10                                                                                                                            10
1330        512         512         pos10                                                                                                                            10
1331        512         512         pos10                                                                                                                            10
1332        512         512         pos10                                                                                                                            10
1366        512         512         pos10                                                                                                                            10
1422        512         512         pos10                                                                                                                            10
1238        512         512         pos10                                                                                                                            10
1240        512         512         pos10                                                                                                                            10
1300        512         512         pos10                                                                                                                            10
259234      333405704   8           pos04                                                                                                                            4
259234      333405704   512         pos10                                                                                                                            10
259234      333405704   1024        pos11                                                                                                                            11
259234      333405704   2048        pos12                                                                                                                            12
259234      333405704   4096        pos13                                                                                                                            13
259234      333405704   16384       pos15                                                                                                                            15
259234      333405704   65536       pos17                                                                                                                            17
259234      333405704   131072      pos18                                                                                                                            18
259234      333405704   262144      pos19                                                                                                                            19
259234      333405704   524288      pos20                                                                                                                            20
259234      333405704   1048576     pos21                                                                                                                            21
259234      333405704   4194304     pos23                                                                                                                            23
259234      333405704   8388608     pos24                                                                                                                            24
259234      333405704   16777216    pos25                                                                                                                            25
259234      333405704   33554432    pos26                                                                                                                            26
259234      333405704   268435456   pos29                                                                                                                            29
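
For comparison, a similar long-format result can probably be produced without writing out the thirty columns by hand, by joining against a small generated table of positions. This is a sketch of a different technique than the PIVOT/UNPIVOT approach above, not a tested replacement: the CTE names `positions` and `items` are hypothetical, and `items` stands in for the `Dossieritems` query with two of the rows shown above.

```sql
-- positions: generates positie 1..30 with bit values 2^0..2^29 (recursive CTE).
WITH positions (positie, bitvalue) AS (
    SELECT 1, 1
    UNION ALL
    SELECT positie + 1, bitvalue * 2
    FROM positions
    WHERE positie < 30
),
-- items: hypothetical stand-in for
-- SELECT itemid, data1 AS answer FROM Dossieritems WHERE data1 != 0
items (itemid, answer) AS (
    SELECT 498, 512 UNION ALL
    SELECT 259234, 333405704
)
-- Keep only the positions whose bit is set in the stored answer.
SELECT i.itemid, i.answer, i.answer & p.bitvalue AS ans, p.positie
FROM items AS i
JOIN positions AS p
  ON (i.answer & p.bitvalue) <> 0
ORDER BY i.itemid, p.positie;
```

Item 498 should yield a single row with positie 10, in line with the output above. Whether this performs better or worse than the unpivot query on millions of rows would need to be measured.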

Thank you ElectricLlama!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow