How to find the longest sequence of letter in a string
-
03-07-2021 - |
Question
I want to find out the longest sequence of letter in a string
e.g. in the word Honorificabcdwert
, the output will be abcd
.
How can I do it?
My idea is to get the Ascii and then count the sequence until it breaks at some point. But I was able to proceed with only
DECLARE @t TABLE(ID INT IDENTITY,String VARCHAR(100))
INSERT INTO @t SELECT 'Honorificabcdwert'
;with Get_Individual_Chars_Cte AS
(
SELECT
ID
,Row_ID =ROW_NUMBER() Over(PARTITION by ID Order by ID)
,SUBSTRING(String,Number,1) AS [Char]
,ASCII(SUBSTRING(String,Number,1)) AS [Ascii Value]
FROM @t
INNER JOIN master.dbo.spt_values ON
Number BETWEEN 1 AND LEN(String)
AND type='P'
)
Select * from Get_Individual_Chars_Cte
After this I don't know what to do. Help needed for this or any other way of doing so.
La solution
Will this help
DECLARE @t TABLE(ID INT IDENTITY,String VARCHAR(100))
INSERT INTO @t
SELECT 'Honorificabcdwert' UNION ALL
SELECT 'AbCdEfxy' UNION ALL
SELECT 'abc1234defg' UNION ALL
SELECT 'XYZABCPPCKLMIDBABC' UNION ALL
SELECT 'MNOP$%^&~()MNOPQRS;:'
SELECT ID, OriginalString,Sequence
FROM (SELECT ID, REPLACE(string,'%','') AS Sequence,OriginalString,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY LEN(string) DESC, string) AS rn
FROM (SELECT OriginalString = b.String, CASE WHEN b.String LIKE a.strings THEN a.strings ELSE NULL END AS string,
b.ID, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY LEN(strings) DESC, strings) AS rn
FROM (SELECT COALESCE('%' + b.strings+a.strings + '%','%' + a.strings + '%') AS strings
FROM (SELECT SUBSTRING('ABCDEFGHIJKLMNOPQRSTUVWXYZ',t1.N,t2.N-t1.N+1) AS strings, t1.N
FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),
(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),
(23),(24),(25),(26)) t1(N)
CROSS JOIN (VALUES(1),(2),(3),(4),(5),(6),(7),(8),
(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),
(23),(24),(25),(26)) t2(N)
WHERE t1.N <= t2.N) a
LEFT OUTER JOIN (SELECT REVERSE(SUBSTRING('ZYXWVUTSRQPONMLKJIHGFEDCBA',1,N)) AS strings, 1 AS ID
FROM (VALUES(1),(2),(3),(4),(5),(6),(7),(8),
(9),(10),(11),(12),(13),(14),(15),
(16),(17),(18),(19),(20),(21),(22),
(23),(24),(25),(26)) t1(N)
UNION ALL SELECT '', 1) b ON a.N = b.ID) a
CROSS JOIN @t b) a ) a
WHERE a.rn = 1
ORDER BY a.ID
Result
ID OriginalString Sequence
1 Honorificabcdwert ABCD
2 AbCdEfxy ABCDEF
3 abc1234defg DEFG
4 XYZABCPPCKLMIDBABC XYZABC
5 MNOP$%^&~()MNOPQRS;: MNOPQRS
Based on your inputs provided in the course of discussion with @Martin Smith, the program is being developed. Please test it and let me know if it satisfies your requirement.
Autres conseils
For consecutive rows with characters rising in alphabetical order (equating alphabetical order with ASCII order here) ROW_NUMBER() OVER (ORDER BY Row_ID) - [Ascii Value]
will be the same.
This is not sufficient on its own however as for the string ABCZE
that would put E
in the same group as ABC
so then you need a second operation to find gaps in that grouping sequence.
Something like the following should do it.
DECLARE @t TABLE(ID INT IDENTITY,String VARCHAR(100))
INSERT INTO @t SELECT 'Honorificabcdwfrt'
;with Get_Individual_Chars_Cte AS
(
SELECT
ID
,Row_ID =ROW_NUMBER() Over(PARTITION by ID Order by ID)
,SUBSTRING(String,number,1) AS [Char]
,ASCII(SUBSTRING(String,number,1)) AS [Ascii Value]
FROM @t
INNER JOIN master.dbo.spt_values ON
number BETWEEN 1 AND LEN(String)
AND type='P'
)
, T1 AS
(
Select *,
ROW_NUMBER() OVER (ORDER BY Row_ID) - [Ascii Value] AS RN
from Get_Individual_Chars_Cte
), T2 AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY Row_ID) -
ROW_NUMBER() OVER (PARTITION BY RN ORDER BY Row_ID) AS Grp
FROM T1
)
SELECT TOP 1 WITH TIES *
FROM T2
ORDER BY COUNT(*) OVER (PARTITION BY RN, Grp) DESC