Question

I have a really simple Python script on ScraperWiki:

import scraperwiki
import lxml.html

html = scraperwiki.scrape("http://www.westphillytools.org/toolsListing.php")
print html

I haven't written anything to parse it yet... for now I just want the html.

When I run it in edit mode it works perfectly.

When a scheduled scrape runs (or I manually run it), it omits dozens (or even hundreds) of lines.

It's a very small webpage so data overload shouldn't be a problem. Any ideas?

Solution

In the editor, individual print statements are rolled up into one line for display. You can click "more..." in the editor's console to view the whole lot.

When run on a schedule, it's output exactly as in any console. So if there are carriage returns in the HTML, you'll get lots of lines of output.

To reduce the amount of output we store, we truncate large outputs from scheduled runs. That's where you've seen "[53 lines, 159000 characters omitted]".

stdout from scheduled runs isn't really intended for anything other than debugging. Any output you want to use needs to be saved to the datastore.
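A minimal sketch of saving the page to the datastore, assuming the classic ScraperWiki Python library; the key and column names here are illustrative:

import scraperwiki

URL = "http://www.westphillytools.org/toolsListing.php"
html = scraperwiki.scrape(URL)

# Store the raw page in the datastore; the URL acts as the unique key,
# so repeated runs overwrite the row instead of duplicating it.
scraperwiki.sqlite.save(unique_keys=['url'], data={'url': URL, 'html': html})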

Other tips

It sounds like the data are there in your variable. Try printing it a line at a time.
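For example, a quick way to see where the output stops (Python 2, to match the original script; numbering each line is just for readability):

import scraperwiki

html = scraperwiki.scrape("http://www.westphillytools.org/toolsListing.php")

# Print one numbered line at a time so any truncation point is obvious.
for i, line in enumerate(html.splitlines()):
    print i, line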
