Question

I'm struggling to write a SQL query for the following scenario, and any pointers would be appreciated:

I have my data stored like this:

Zone Name | SessionId
Zone 1    | 1
Zone 2    | 1
Zone 3    | 1
Zone 1    | 1
Zone 1    | 2
Zone 2    | 2
Zone 2    | 3
Zone 3    | 3
Zone 2    | 3

(note that people can enter the same zone more than once in a session)

I want to produce an output that shows the percentage of people who, having entered one zone, also entered another (e.g. 66% of people who entered Zone 1 also entered Zone 2):

       Zone 1 | Zone 2 | Zone 3
Zone 1 100%   | 66%    | 33%
Zone 2 66%    | 100%   | 66%
Zone 3 33%    | 66%    | 100%

Is there a specific built-in SQL function to do something like this? Can anyone suggest how to achieve it?

Thanks, Tom

P.S. I'm using PostgreSQL.

fiddle here: http://sqlfiddle.com/#!15/f379c/1/0

Solution

If you can handle the conversion into a matrix on the application side, this returns one row per pair of zones:

DECLARE @ZoneCounts TABLE
    (
  Zone nvarchar(max),
  sessionId int
    )

INSERT INTO @ZoneCounts
values
('Zone 1', 1),
('Zone 2', 1),
('Zone 3', 1),
('Zone 1', 2),
('Zone 2', 2)

SELECT v3.down, v3.across,
       CONVERT(decimal(13,2), (CONVERT(decimal(13,2), v3.matchingSessions) / v3.sessionsVisitingDown) * 100) AS percentage
FROM (
    SELECT v2.down, v2.across, v2.matchingSessions,
           MAX(v2.matchingSessions) OVER (PARTITION BY v2.down) AS sessionsVisitingDown
    FROM (
        SELECT down, across, matchingSessions
        FROM (
            SELECT lhs.Zone AS down, rhs.Zone AS across,
                   COUNT(lhs.sessionId) OVER (PARTITION BY lhs.Zone, rhs.Zone) AS matchingSessions
            FROM @ZoneCounts AS lhs
            INNER JOIN @ZoneCounts AS rhs ON lhs.sessionId = rhs.sessionId
        ) AS v1
        GROUP BY down, across, matchingSessions
    ) AS v2
) AS v3

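This gives you one row per (down, across) pair, with the percentage of sessions that visited the "down" zone and also visited the "across" zone.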

NB: this uses MS SQL, but it should convert.

Ah, Postgres isn't quite the same. The PostgreSQL version:

SELECT v3.down, v3.across,
       CAST(CAST(v3.matchingSessions AS decimal) / CAST(v3.sessionsVisitingDown AS decimal) * 100 AS decimal(13,2)) AS percentage
FROM (
    SELECT v2.down, v2.across, v2.matchingSessions,
           MAX(v2.matchingSessions) OVER (PARTITION BY v2.down) AS sessionsVisitingDown
    FROM (
        SELECT down, across, matchingSessions
        FROM (
            SELECT lhs.name AS down, rhs.name AS across,
                   COUNT(lhs.id) OVER (PARTITION BY lhs.name, rhs.name) AS matchingSessions
            FROM ItemList AS lhs
            INNER JOIN ItemList AS rhs ON lhs.id = rhs.id
        ) AS v1
        GROUP BY down, across, matchingSessions
    ) AS v2
) AS v3

Here's a SQL fiddle using PostgreSQL: http://sqlfiddle.com/#!15/f379c/13/0
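One caveat: the queries above count joined rows rather than distinct sessions, so a session that enters the same zone more than once (which the question says can happen) inflates the counts. A minimal sketch of one way around that is to collapse repeat visits first and divide by an explicit per-zone session count. It assumes the same ItemList table as above, with name holding the zone and id holding the session id; the CTE names visits and zone_totals are only illustrative.

```sql
-- Collapse repeat visits so each (zone, session) pair counts once.
WITH visits AS (
    SELECT DISTINCT name, id
    FROM ItemList
),
-- Distinct sessions that entered each zone (the denominator).
zone_totals AS (
    SELECT name, count(*) AS sessions
    FROM visits
    GROUP BY name
)
SELECT lhs.name AS down,
       rhs.name AS across,
       -- Sessions that entered both zones, as a share of the sessions
       -- that entered the "down" zone.
       round(100.0 * count(*) / t.sessions, 2) AS percentage
FROM visits AS lhs
JOIN visits AS rhs ON rhs.id = lhs.id
JOIN zone_totals AS t ON t.name = lhs.name
GROUP BY lhs.name, rhs.name, t.sessions
ORDER BY down, across;
```

The output is still one row per pair of zones, so turning it into a matrix is left to the application, as before. Each zone paired with itself comes out at 100, which is a quick sanity check on the result.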

Licensed under: CC-BY-SA with attribution