Cross Join! In SQL Server, but I really only need about a third of the rows?
-
13-02-2021 - |
Domanda
Ok, so I'm not sure if I'm going about this the right way. But my scenario is I've got a stored procedure I'd like to test using all the possible combinations of the inputs it takes.
Let's say I have a stored procedure that takes two parameters, like this:
set UsefulValue = exec spMyStoredProc @ProfileID, @RoleID
Now in my case, ProfileID means something like people and RoleID means something like system roles. I'm being a little general here on purpose. The point is, I have about 60,000 defined people, and about 600 defined roles.
Unfortunately, the system I'm testing is COMPLICATED, like bad-hair-day complicated, and I really need to run this procedure through its paces.
Ok, with me so far? Hopefully you are; check out this sql to generate the data that I will eventually pass to the stored procedure:
select profiles.ProfileID, roles.RoleID from Profile profiles
cross join dbo.DefinedRoles roles
This is actually pretty good, but the problem is it's taking forever to run, and frankly I don't need every permutation of these two values.
So I tried constricting the result set like so:
select top 300000 profiles.ProfileObjID, roles.RoleName from dbo.Profile profiles
cross join dbo.rj_v_DefinedRoles roles
But oops! That constricts the final result, so I end up with only a result that is pretty much 2-3 roles (depending on what I pass to the top verb) and with a ProfileID for each person.
What I'd like to have, my goal, is to get results for all the roles I have (about 600) and for each one, maybe only use half of the ProfileIDs instead of the whole shebang.
So does that make sense what I'm asking?
Soluzione 4
This worked well. GREATLY cuts down on the time it takes to execute the cross join.
select profiles.ProfileObjID, roles.RoleName
from dbo.rj_v_DefinedRoles roles
cross join dbo.Profile profiles
where ProfileObjId in (select ProfileObjId from dbo.Profile
where ProfileObjId between 10000 and 11000)
Altri suggerimenti
If I understand the problem you want a number of test cases, but do not want them generated in favour of a specific role / person (which cross join is doing to you).
select top 300000
profiles.ProfileObjID, roles.RoleName
from dbo.Profile profiles
cross join dbo.rj_v_DefinedRoles roles
order by newid()
The order by will randomise your results, then top the results to get a random test set of data. This will not of course guarentee any specific role / person is in the final results, it's a percentage chance etc.
You could use ROW_NUMBER() in a couple of different places to select a portion of your data...
-- Maximum of 20,000 rows per RoleID
WITH
combinations AS
(
SELECT
roles.RoleID,
profiles.ProfileID,
ROW_NUMBER() OVER (PARTITION BY roles.RoleID) AS profile_sequence_id
FROM
Profile profiles
CROSS JOIN
dbo.DefinedRoles roles
)
SELECT
*
FROM
combinations
WHERE
profile_sequence_id < 20000
Or...
-- Maximum of 20,000 rows per RoleID
WITH
profiles AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY 1) AS profile_sequence_id
FROM
Profile
)
SELECT
roles.RoleID,
profiles.ProfileID
FROM
dbo.DefinedRoles roles
CROSS JOIN
profiles
WHERE
profiles .profile_sequence_id < 20000
Have you tried using the DISTINCT
keyword?
select DISTINCT
profiles.ProfileObjID, roles.RoleName from dbo.Profile profiles cross join dbo.rj_v_DefinedRoles roles