Question

I am using cfhttp to fetch a web page, and I want to rewrite all the links inside the body tag. Importantly, I don't want to touch the stylesheets and other resources referenced in the head.

I want to do the following:

In the external web page body we may find a link:

<a href="http://www.externallink.com">External Link</a>

I want to replace it with the following:

<a href="http://www.mydomain.com?url=http://www.externallink.com">External Link</a>

It's easy enough using Replace(), but then I also replace all the linked stylesheets etc. I just want to edit the hrefs of clickable links.


Solution

I've modified an HTML document's DOM to add tracking parameters to links in outbound email messages using the jsoup library. (jsoup is an open source Java HTML parser and can be downloaded at http://jsoup.org/.) You'll note that it uses jQuery-like select methods, but all manipulations are performed on the server side. (I've also used it for removing ads from CFHTTP-fetched HTML.)

Here's a quick sample of working ColdFusion code that will do exactly what you want on the server-side:

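<CFSET jsoup = CreateObject("java", "org.jsoup.Jsoup")>
<CFSET TempHTML = jsoup.parse(CFHTTP.FileContent)>
<!--- Select only anchors inside <body>, so <link> tags in <head> are left untouched --->
<CFLOOP ARRAY="#TempHTML.select('body a[href]')#" INDEX="ThisLink">
    <CFSET TheLink = ThisLink.attr("href")>
    <!--- Set the attribute on the element itself instead of string-replacing in the raw HTML --->
    <CFSET ThisLink.attr("href", JavaCast("string", "http://mywebsite.com/?u=" & URLEncodedFormat(TheLink)))>
</CFLOOP>
<!--- Serialize the modified document back to an HTML string --->
<CFSET TheHTML = TempHTML.outerHtml()>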
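
The sample above does not deal with relative links (such as /about.html) or non-HTTP schemes (such as mailto:). Here is a slightly fuller sketch that may help with both. It assumes jsoup is on the ColdFusion class path. The names pageUrl, redirectPrefix, httpResult and rewrittenHtml are placeholders invented for illustration, not part of the original question.

```cfml
<!--- Hypothetical values: adjust the target page and redirect prefix for your site --->
<cfset pageUrl = "http://www.example.com/">
<cfset redirectPrefix = "http://www.mydomain.com?url=">

<cfhttp url="#pageUrl#" method="get" result="httpResult">

<cfset jsoup = CreateObject("java", "org.jsoup.Jsoup")>
<!--- Passing the page URL as the base URI lets jsoup resolve relative hrefs --->
<cfset doc = jsoup.parse(JavaCast("string", httpResult.FileContent), JavaCast("string", pageUrl))>

<!--- Only anchors inside <body> that actually carry an href attribute --->
<cfloop array="#doc.select('body a[href]')#" index="link">
    <!--- absUrl() returns an empty string if the href cannot be made absolute --->
    <cfset absoluteHref = link.absUrl("href")>
    <!--- Rewrite http/https links only; mailto:, javascript: etc. are left alone --->
    <cfif REFindNoCase("^https?://", absoluteHref)>
        <cfset link.attr("href", JavaCast("string", redirectPrefix & URLEncodedFormat(absoluteHref)))>
    </cfif>
</cfloop>

<!--- Serialize the modified document, head included, back to a string --->
<cfset rewrittenHtml = doc.outerHtml()>
```

Because the selector is scoped to body, stylesheet and script references in the head pass through unchanged. Note that jsoup normalizes the markup when it serializes, so the output may differ slightly in whitespace and tag casing from the fetched page.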
Licensed under: CC-BY-SA with attribution