Question

I have an old database with a gazillion records (more or less), each with a single pipe-delimited tags column that looks like this:

    Breakfast
    Breakfast|Brunch|Buffet|Burger|Cakes|Crepes|Deli|Dessert|Dim Sum|Fast Food|Fine Wine|Spirits|Kebab|Noodles|Organic|Pizza|Salad|Seafood|Steakhouse|Sushi|Tapas|Vegetarian
    Breakfast|Brunch|Buffet|Burger|Deli|Dessert|Fast Food|Fine Wine|Spirits|Noodles|Pizza|Salad|Seafood|Steakhouse|Vegetarian
    Breakfast|Brunch|Buffet|Cakes|Crepes|Dessert|Fine Wine|Spirits|Salad|Seafood|Steakhouse|Tapas|Teahouse
    Breakfast|Brunch|Burger|Crepes|Salad
    Breakfast|Brunch|Cakes|Dessert|Dim Sum|Noodles|Pizza|Salad|Seafood|Steakhouse|Vegetarian
    Breakfast|Brunch|Cakes|Dessert|Dim Sum|Noodles|Pizza|Salad|Seafood|Vegetarian
    Breakfast|Brunch|Deli|Dessert|Organic|Salad
    Breakfast|Brunch|Dessert|Dim Sum|Hot Pot|Seafood
    Breakfast|Brunch|Dessert|Dim Sum|Seafood
    Breakfast|Brunch|Dessert|Fine Wine|Spirits|Noodles|Pizza|Salad|Seafood
    Breakfast|Brunch|Dessert|Fine Wine|Spirits|Salad|Vegetarian

Is there a way one could retrieve each tag and insert it into a new table tag_id | tag_nm using MySQL only?

Solution 2

After finding that MySQL has no built-in split function, I solved the issue using only MySQL like so:

First, I created the function strSplit:

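    -- Returns field number pos (1-based) of x, or '' if x has fewer than pos fields.
    -- The prefix is cut off by length rather than removed with replace(), so a
    -- value that repeats earlier in the string (e.g. 'A|A') is not wiped out.
    -- DETERMINISTIC lets the function be created when binary logging is enabled.
    CREATE FUNCTION strSplit(x varchar(21845), delim varchar(255), pos int) returns varchar(255)
    deterministic
    return replace(
        substring(
            substring_index(x, delim, pos),
            char_length(substring_index(x, delim, pos - 1)) + 1
        ),
        delim,
        ''
    );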
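
A quick way to check the function is to call it on a literal string before running it against the real table. The sample value below is built from tags in the question; positions are 1-based, and a position past the last tag should come back as an empty string.

```sql
SELECT strSplit('Breakfast|Brunch|Buffet', '|', 1) AS first_tag,    -- 'Breakfast'
       strSplit('Breakfast|Brunch|Buffet', '|', 2) AS second_tag,   -- 'Brunch'
       strSplit('Breakfast|Brunch|Buffet', '|', 4) AS past_the_end; -- ''
```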

Second, I inserted the new tags into my new table (real names and columns changed to keep it simple). Note that INSERT IGNORE only skips duplicates if tag_nm has a unique index:

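    -- HAVING drops the empty string that strSplit returns when a row has no tag at this position.
    INSERT IGNORE INTO `tag`
    SELECT null, strSplit(`Tag`, '|', 1) AS T FROM `old_venue` GROUP BY T HAVING T <> '';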

Rinse and repeat, increasing pos by one for each tag position (in this case I had a maximum of 8 separators, so up to 9 tags per row).
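
The repeated runs could probably be folded into a single statement by joining against a small list of positions. The following is only a sketch under assumptions: the `tag` table definition (including the column sizes and the index name `uq_tag_nm`) is a guess, the inline list `n` of positions 1 to 9 is made up for illustration and must cover the largest number of tags in any row, and `old_venue`.`Tag` is the column used above.

```sql
-- Assumed schema for the new table; adjust types and sizes to your data.
CREATE TABLE IF NOT EXISTS `tag` (
    `tag_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    `tag_nm` VARCHAR(100) NOT NULL,
    UNIQUE KEY `uq_tag_nm` (`tag_nm`)  -- lets INSERT IGNORE skip existing tags
);

INSERT IGNORE INTO `tag` (`tag_id`, `tag_nm`)
SELECT DISTINCT null, strSplit(v.`Tag`, '|', n.pos)
FROM `old_venue` v
JOIN (
    -- Hypothetical helper: one row per tag position, 1 through 9.
    SELECT 1 AS pos UNION ALL SELECT 2 UNION ALL SELECT 3
    UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
    UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
) n
    -- Number of tags in a row = number of pipes + 1, so each row is only
    -- expanded to the positions it actually has.
    ON n.pos <= 1 + CHAR_LENGTH(v.`Tag`) - CHAR_LENGTH(REPLACE(v.`Tag`, '|', ''))
WHERE v.`Tag` <> '';
```

Passing null for `tag_id` relies on the column being AUTO_INCREMENT, as in the assumed definition.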

Third, to get the relationship:

    INSERT INTO `venue_tag_rel`
    SELECT a.`venue_id`, b.`tag_id`
    FROM `old_venue` a
    JOIN `tag` b
        -- Wrapping both sides in pipes lets one pattern match a tag at the start,
        -- in the middle, at the end, or as the only value in the column.
        ON CONCAT('|', a.`Tag`, '|') LIKE CONCAT('%|', b.`tag_nm`, '|%');
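
One way to sanity-check the result is to compare, per venue, the number of tags implied by the pipe count with the number of rows that ended up in the relationship table. This is a rough sketch that assumes `venue_tag_rel` has columns named `venue_id` and `tag_id` and that `venue_id` is unique in `old_venue`; the aliases `expected_tags` and `linked_tags` are made up here.

```sql
SELECT a.`venue_id`,
       -- number of pipes + 1 = number of tags in the original column
       1 + CHAR_LENGTH(a.`Tag`) - CHAR_LENGTH(REPLACE(a.`Tag`, '|', '')) AS expected_tags,
       COUNT(r.`tag_id`) AS linked_tags
FROM `old_venue` a
LEFT JOIN `venue_tag_rel` r ON r.`venue_id` = a.`venue_id`
GROUP BY a.`venue_id`, a.`Tag`
HAVING expected_tags <> linked_tags;
```

An empty result suggests every tag was linked. Rows with an empty `Tag`, or with the same tag listed twice, will also show up here even though nothing went wrong.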

OTHER TIPS

Here is my attempt, which uses PHP. I imagine this could be more efficient with a clever MySQL query. I've included the relationship part too. There's no escaping or error checking.

    // Note: the mysql_* extension was removed in PHP 7; on current PHP, port this to mysqli or PDO.
    $rs = mysql_query('SELECT `venue_id`, `tag` FROM `venue` AS a');
    while ($row = mysql_fetch_array($rs)) {
        $tag_array = explode('|', $row['tag']);
        $venueid = $row['venue_id'];
        foreach ($tag_array as $tag) {
            // Look up the tag's id, if the tag already exists.
            $rs2 = mysql_query("SELECT `tag_id` FROM `tag` WHERE tag_nm = '$tag'");
            $tagid = 0;
            while ($row2 = mysql_fetch_array($rs2)) $tagid = $row2['tag_id'];
            if (!$tagid) {
                // New tag: insert it and pick up its auto-generated id.
                mysql_query("INSERT INTO `tag` (`tag_nm`) VALUES ('$tag')");
                $tagid = mysql_insert_id();
            }
            // Link the venue to the tag.
            mysql_query("INSERT INTO `venue_tag_rel` (`venue_id`, `tag_id`) VALUES ($venueid, $tagid)");
        }
    }
Licensed under: CC-BY-SA with attribution