Question

Suppose, that I have a following resultset:

SELECT
    *
FROM
    (
        SELECT 1 as `no`, NULL as `sequence`
        UNION ALL
        SELECT 2, ''
        UNION ALL
        SELECT 3, '1'
        UNION ALL
        SELECT 4, '1,2,3,4,5'
        UNION ALL
        SELECT 5, '2,4,5'
        UNION ALL
        SELECT 6, '1, 5'
        UNION ALL
        SELECT 7, '1,3,5'
    ) as `sub`;

My task was to count sequence breaks / holes for each sequence listed in below. I've written following stored function:

DELIMITER $$

DROP FUNCTION IF EXISTS `countSequenceBreaks`$$

CREATE FUNCTION `countSequenceBreaks`(`sequence` VARCHAR(1000))
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE `delimiter` CHAR DEFAULT ',';

DECLARE `current`, `last` INT;
DECLARE `result` INT DEFAULT 0;

IF
    `sequence` IS NULL
    OR
    NOT LENGTH(`sequence`)
    OR
    NOT INSTR(`sequence`, `delimiter`)
THEN RETURN `result`;
END IF;

SET `current`   = SUBSTRING_INDEX(`sequence`, `delimiter`, 1);
SET `last`      = SUBSTRING_INDEX(`sequence`, `delimiter`, -1);

IF `last` < `current`
THEN
    SET `result`    = `last`;
    SET `last`      = `current`;
    SET `current`   = `result`;
    SET `result`    = 0;
END IF;

WHILE `current` < `last` DO
    IF NOT FIND_IN_SET(`current`, `sequence`)
    THEN SET `result` = `result` + 1;
    END IF;

    SET `current` = `current` + 1;
END WHILE;

RETURN `result`;
END$$

DELIMITER ;

But I'm worried about WHILE-loop might take too much iterations for different sequence members and cause query slowdown.

Questions:

  1. Is there any way to improve the stored function?
  2. If there is a way, then how?

My debug query:

SELECT
    `no`, `sequence`, `countSequenceBreaks`(`sequence`)
FROM
    (
        SELECT 1 as `no`, NULL as `sequence`
        UNION ALL
        SELECT 2, ''
        UNION ALL
        SELECT 3, '1'
        UNION ALL
        SELECT 4, '1,2,3,4,5'
        UNION ALL
        SELECT 5, '2,4,5'
        UNION ALL
        SELECT 6, '1, 5'
        UNION ALL
        SELECT 7, '1,3,5'
    ) as `sub`;

It's resultset:

no sequence   `countSequenceBreaks`(`sequence`)
-----------------------------------------------
1  NULL       0
2             0
3  1          0
4  1,2,3,4,5  0
5  2,4,5      1
6  1,5        3
7  1,3,5      2

Regards.

Was it helpful?

Solution

You can do it with one simple query:

select sequence,  
CASE WHEN NOT INSTR(IFNULL(sequence,''), ',') THEN 0
     ELSE 
       (
         SUBSTRING_INDEX(sequence,',' ,-1)
         -SUBSTRING_INDEX(sequence,',' , 1)
        ) 
        -
        (LENGTH(sequence)-LENGTH(REPLACE(sequence,',','')))
END countSequenceBreaks

from t

How to find count of sequence breaks?

For example for 1,3,5 sequence.

All we need to know breaks count is to calculate count of missed delimiters. In this case the full string 1,2,3,4,5 contains 5-1=4 delimiters but the 1,3,5 sequence contains only 2 delimiters so count of breaks (missed digits - what is equal to count of missed delimiters as you can see) = 4-2 = 2

How to know count of delimiters in the string?

In our case when delimiter has one symbol length it is (LENGTH(sequence)-LENGTH(REPLACE(sequence,',',''))

SQLFiddle demo

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top