Question

The HTML content of my page is stored in a LONGTEXT column in MySQL.

My goal is to extract the URL of my Facebook page using the LOCATE, SUBSTRING or SUBSTRING_INDEX functions in MySQL.

I found the question "MySQL query to extract domains from URLs", but it doesn't really fit my problem.

How would you efficiently extract the substring between 'href="http://www.facebook.com/' and the next '"' using MySQL?


Solution

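This works, but it can certainly be improved. It returns the URL that follows the last occurrence of 'http://www.facebook.com/' in html_cache, or the whole column when there is no match: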

IF(
  LOCATE('http://www.facebook.com/', html_cache) > 0,
  CONCAT(
    'http://www.facebook.com/',
    SUBSTRING_INDEX(SUBSTRING_INDEX(html_cache, 'http://www.facebook.com/', -1), '"', 1)
  ),
  html_cache
) AS page_url
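
One possible refinement, as a rough sketch rather than a definitive answer: anchor the search on 'href="http://www.facebook.com/' as the question asks, take the first occurrence instead of the last, and return NULL instead of the whole HTML when nothing matches. The table name pages and the two sample rows are assumptions made up for illustration; only html_cache and page_url come from the expression above.

```sql
-- Hypothetical table and sample rows, for illustration only.
CREATE TEMPORARY TABLE pages (id INT PRIMARY KEY, html_cache LONGTEXT);
INSERT INTO pages VALUES
  (1, '<a href="http://www.facebook.com/mypage">Facebook</a> <a href="http://example.com/">x</a>'),
  (2, '<p>no link here</p>');

SELECT
  id,
  IF(
    LOCATE('href="http://www.facebook.com/', html_cache) > 0,
    -- Start right after href=" (at the "h" of "http"),
    -- then keep everything up to the next double quote.
    SUBSTRING_INDEX(
      SUBSTRING(
        html_cache,
        LOCATE('href="http://www.facebook.com/', html_cache) + CHAR_LENGTH('href="')
      ),
      '"', 1
    ),
    NULL
  ) AS page_url
FROM pages;
-- id 1 -> http://www.facebook.com/mypage
-- id 2 -> NULL
```

Because the start position comes from LOCATE, the prefix does not have to be stripped and re-added with CONCAT. This still only handles double-quoted href attributes on the exact http://www.facebook.com/ prefix.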
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow