Parsing URL Links
14-02-2021
Question
I have a large data set of more than 10,000 rows, and I'm trying to parse the URL link
that people have clicked on.
Here is the table: dbo.email_list
UserID Cliked_Linked
101012 https:// amz/profile_center?qp= 8eb6cbf33cfaf2bf0f51
052469 htpps:// lago/center=age_gap=email_address=caipaingn4535=English_USA
046894 https://itune/fr/unsub_email&utm=packing_345=campaign_6458_linkname=ghostrider
So I tried this code:
UPDATE email_list set Clicked_Link= REVERSE(SUBSTRING(REVERSE(Cliked_Link),,CHARINDEX('.', REVERSE(ColumnName)) + 1, 999))
Unfortunately this didn't work.
The goal is to split the link at each '=' sign and put each piece between the equal signs in its own column.
This is the result I hope to have:
UserID COL_1 COL_2 COL_3 COL_4
101012 https:// amz/profile_center?qp 8eb6cbf33cfaf2bf0f51 NaN
052469 htpps:// lago/center email_addres caipaingn4535 English_USA
046894 https://itune/fr/unsub_email&utm packing_345 campaign_6458_linknam ghostrider
Solution
If you are on SQL Server 2016 or above, you can use STRING_SPLIT() in combination with PIVOT. You do have to know in advance how many = signs can appear, though, because the PIVOT column list is fixed.
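One way to find out how many columns are needed is to count the = signs per row and take the maximum. The query below is only a sketch that assumes the table and column names from the question (dbo.email_list, Cliked_Linked); max_parts is a made-up alias. It relies on the common idiom of comparing the string length before and after removing the delimiter.

```sql
-- Sketch: largest number of '='-separated parts found in any row.
-- Number of '=' signs = length with them minus length without them;
-- the number of parts is one more than that.
-- Caveat: LEN ignores trailing spaces, so a value that ends in
-- spaces followed by '=' could be over-counted by this idiom.
SELECT MAX(LEN(Cliked_Linked) - LEN(REPLACE(Cliked_Linked, '=', ''))) + 1 AS max_parts
FROM dbo.email_list;
```

For the three sample rows in the question, the second link contains four = signs, so it splits into five parts, which is why the PIVOT list goes up to [5].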
SELECT UserID, Cliked_Linked, [1], [2], [3], [4], [5]
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY (SELECT NULL)) AS rn
    FROM dbo.email_list
    CROSS APPLY STRING_SPLIT(Cliked_Linked, '=')
) AS SourceT
PIVOT
(
    MAX(value)
    FOR rn IN ([1], [2], [3], [4], [5])
) AS Pvt;
Example result set
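One thing to keep in mind with the query above: STRING_SPLIT() does not guarantee the order of the rows it returns, so numbering them with ROW_NUMBER() OVER (... ORDER BY (SELECT NULL)) may not always match the position of each piece in the link. If you are on SQL Server 2022 or Azure SQL Database, a possible variation is to pass 1 as the third argument (enable_ordinal) and pivot on the ordinal column it returns. This is a sketch under that version assumption, not something tested against the original data; the aliases e and s are only illustrative.

```sql
-- Sketch only: requires SQL Server 2022 or Azure SQL Database, where
-- STRING_SPLIT accepts a third argument (enable_ordinal = 1) and returns
-- an extra "ordinal" column with each piece's 1-based position.
SELECT UserID, Cliked_Linked, [1], [2], [3], [4], [5]
FROM
(
    SELECT e.UserID, e.Cliked_Linked, s.value, s.ordinal
    FROM dbo.email_list AS e
    CROSS APPLY STRING_SPLIT(e.Cliked_Linked, '=', 1) AS s
) AS SourceT
PIVOT
(
    -- One output column per position; rows with fewer pieces get NULL.
    MAX(value)
    FOR ordinal IN ([1], [2], [3], [4], [5])
) AS Pvt;
```

Because the position comes from STRING_SPLIT() itself, the window function is no longer needed, and the column list still has to cover the maximum number of pieces.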
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange