Looking at the python-scribd documentation or the scribd API reference, any object that can give you a document ID or website URL can also give you a download URL. Or, if you already have a document ID, you can just call get to get an object that can give you a download URL.
Most likely, you've got a Document object, which has this method:

get_download_url(self, doc_type='original')
    Returns a link that can be used to download a static version of the document.
So, wherever you're calling get_scribd_url, just call get_download_url instead.
And then, to download the result, Python has urllib2 (2.x) or urllib.request (3.x) built into the standard library, or you can use requests or any other third-party library instead.
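For instance, a minimal download helper using only the standard library might look like this (the names filename_from_url and download are my own, not part of any library):

```python
import os
import urllib.parse
import urllib.request

def filename_from_url(url):
    # Use the last component of the URL path as the local filename,
    # falling back to a fixed name if the path is empty.
    path = urllib.parse.urlparse(url).path
    return os.path.basename(path) or 'download'

def download(url, dest_dir='.'):
    # Fetch the URL and write the raw bytes to a local file.
    target = os.path.join(dest_dir, filename_from_url(url))
    with urllib.request.urlopen(url) as response:
        with open(target, 'wb') as f:
            f.write(response.read())
    return target
```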
Putting it all together as an example:
import os
import urllib.parse
import urllib.request

# do all the stuff to set up the api_key, get a `User` object, etc.

def is_document_i_want(document):
    return document.author == "Me"

urls = [document.get_download_url() for document in user.all()
        if is_document_i_want(document)]

for url in urls:
    # Name the local file after the last component of the URL path.
    path = urllib.parse.urlparse(url).path
    name = os.path.basename(path)
    u = urllib.request.urlopen(url)
    # The download is binary data, so write in 'wb' mode, not 'w'.
    with open(name, 'wb') as f:
        f.write(u.read())
    print('Wrote {} as {}'.format(url, name))
Presumably you're going to want to use something like user.find instead of user.all. Or, if you've already written the code that gets the document IDs and don't want to change it, you can use user.get with each one.
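That last option would look something like this sketch (download_urls_for_ids is a name I made up; it assumes user.get returns a Document with a get_download_url method, as in python-scribd):

```python
def download_urls_for_ids(user, doc_ids, doc_type='original'):
    # Look up each document by ID, then ask it for a download link.
    return [user.get(doc_id).get_download_url(doc_type)
            for doc_id in doc_ids]
```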
And if you want to post-filter the results, you probably want to use attributes beyond the basic ones (or you would have just passed them to the query), which means you need to call load on each document before you can access them (so add document.load() at the top of the is_document_i_want function). But really, there's nothing complicated here.
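So the filter function would become something like this (assuming, per the python-scribd docs, that Document.load fetches the document's remaining attributes from the server):

```python
def is_document_i_want(document):
    # load() retrieves the full attribute set for this document; the
    # objects returned by a query only carry the basic attributes.
    document.load()
    return document.author == "Me"
```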