Question

I want to extract the value 'AT401726' from the HTML below:

<td class="publicationInfoColumn">
  <h4>Publication info:</h4>
  AT401726<br>2008-08-15
</td>

I solved it with jQuery; the working code is below:

$('body').find('.publicationInfoColumn').clone().children().remove().end().text()

Is there a better technique for extracting this data? My crawled HTML page contains many cells like the one above.

Solution

The text you are looking for is the content of the text node that immediately follows the h4 element (its next sibling), so try:

var text = $.trim($('.publicationInfoColumn h4').prop('nextSibling').nodeValue);
console.log(text)

Demo: Fiddle
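
Since the question mentions many such cells, here is a minimal sketch of the same idea written so that it does not depend on the text sitting directly after the h4. The helper name firstOwnText is hypothetical (it is not part of jQuery); it relies only on the standard DOM properties childNodes, nodeType and nodeValue.

```javascript
// Hypothetical helper: return the first non-blank text node that is a direct
// child of the given element, trimmed, or null if there is none.
function firstOwnText(element) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the DOM
  for (const node of Array.from(element.childNodes)) {
    if (node.nodeType === TEXT_NODE) {
      const text = node.nodeValue.trim();
      if (text !== '') {
        return text;
      }
    }
  }
  return null;
}
```

With jQuery loaded, something like $('.publicationInfoColumn').map(function () { return firstOwnText(this); }).get() should collect one value per cell, because .map() drops null results and .get() turns the result into a plain array.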

OTHER TIPS

Use:

$('td.publicationInfoColumn').text(); // all text in the cell, including the heading and the date

or

$('td.publicationInfoColumn').html(); // the cell's inner HTML

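jQuery selectors match elements, not text nodes, so you can't target that text directly with a selector. The best you can probably do is run a regular expression over the cell's HTML (match returns an array, or null if nothing matches):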

$('.publicationInfoColumn').html().match(/\b.*(?=<br>)/)
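
If the crawled page is available as a string, the regular-expression idea can be stretched to cover every cell at once. This is only a sketch under assumptions: the function name extractPublicationNumbers is made up for illustration, and it assumes each cell has the same shape as in the question (an h4, then the value, then a br). Regular expressions are fragile against arbitrary HTML, so treat it as a fallback.

```javascript
// Hypothetical helper: collect the text between </h4> and <br> in every
// publicationInfoColumn cell of an HTML string.
function extractPublicationNumbers(html) {
  // Assumes the layout from the question: <td class="..."> <h4>...</h4> VALUE <br>
  const cellPattern =
    /<td[^>]*class="[^"]*\bpublicationInfoColumn\b[^"]*"[^>]*>[\s\S]*?<\/h4>\s*([^<]*?)\s*<br\s*\/?>/gi;
  const results = [];
  for (const match of html.matchAll(cellPattern)) {
    results.push(match[1]); // capture group 1 is the trimmed value
  }
  return results;
}
```

In a browser, the input could be something like $('body').html(); a cell that lacks the h4 or the br would break the assumption above and may yield wrong matches.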
Licensed under: CC-BY-SA with attribution