I'm trying to get the text (in this case it's '10-Q') of an entry from XBRL using cheerio.js with nodejs. The line is below:

<dei:DocumentType contextRef="D2013Q3YTD" id="Fact-DB2A50C2A485F9CC21D51934C6E61D42">10-Q</dei:DocumentType>

I've tried:

$('dei:DocumentType').text

and a few others, to no avail. There is no unique id or anything else that I can see.

Sample file:

http://www.sec.gov/Archives/edgar/data/1018724/000144530513002495/amzn-20130930.xml

So how could I go about extracting this text? Thanks.


Solution

It turns out that parsing the file above is possible with Cheerio.

This works using Cheerio:

$('dei\\:CurrentFiscalYearEndDate').text().trim();

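The colon has to be escaped twice: once with a backslash, so that the selector engine does not read the colon as the start of a pseudo-class, and once more because the backslash itself must be escaped inside a JavaScript string literal. The same pattern applies to dei:DocumentType.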
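To make the double escaping concrete, here is a minimal sketch in plain JavaScript. The helper escapeTagName is hypothetical (it is not part of Cheerio), and the commented-out Cheerio call assumes the document was loaded with the xmlMode option; treat it as an illustration rather than a definitive recipe.

```javascript
// Hypothetical helper (not part of cheerio): put a backslash in front of
// every colon so a CSS selector engine reads it as part of the tag name.
function escapeTagName(name) {
  return name.replace(/:/g, '\\:');
}

const selector = escapeTagName('dei:DocumentType');

// The source literal 'dei\\:DocumentType' holds a single backslash at runtime.
console.log(selector);                                   // dei\:DocumentType
console.log(selector === 'dei\\:DocumentType');          // true
console.log(selector === String.raw`dei\:DocumentType`); // true

// Assumed usage with cheerio (not executed here):
//   const $ = cheerio.load(xml, { xmlMode: true });
//   const documentType = $(selector).text().trim();
```

Note that text is a method, so it needs the call parentheses; without them the expression evaluates to the function itself rather than the element's text.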

Other tips

XBRL is XML, and it cannot be treated as an HTML DOM with libraries like cheerio. You will need an XML parser with XPath support, such as xpath, libxml, or o3-xml.

Then you can get the value with an XPath expression like this:

/*/dei:DocumentType/text()
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow