Download a Google Sites page Content Feed using gdata-python-client

https://stackoverflow.com/questions/12634738

04-07-2021

Question

My final goal is to import some data from Google Sites pages. I'm trying to use gdata-python-client (v2.0.17) to download a specific Content Feed:

self.client = gdata.sites.client.SitesClient(source=SOURCE_APP_NAME)
self.client.client_login(USERNAME, PASSWORD, source=SOURCE_APP_NAME, service=self.client.auth_service)     
self.client.site = SITE
self.client.domain = DOMAIN

uri = '%s?path=%s' % (self.client.MakeContentFeedUri(), '[PAGE PATH]')
feed = self.client.GetContentFeed(uri=uri)
entry = feed.entry[0]
...

The resulting entry.content holds the page content in XHTML format, but the tree doesn't contain any plain text from the page, only the HTML page structure and links.

For example, my test page has:

 <div>Some text</div>

The ContentFeed entry has only a div node with text=None.

I debugged the gdata-python-client request/response and checked the raw buffer received from the server: there is no plain text in the content there either. Hence it appears to be a Google API bug.

Maybe there is a workaround? Maybe I can use some common request parameter? What's going wrong here?

Answer

This code works for me against a Google Apps domain and gdata 2.0.17:

import atom.data
import gdata.sites.client
import gdata.sites.data

# Site and domain are passed to the constructor; all values here are placeholders.
client = gdata.sites.client.SitesClient(source='yourCo-yourAppName-v1', site='examplesite', domain='example.com')
client.ClientLogin('admin@example.com', 'examplepassword', client.source)

# Request only the content entry for the page at /home.
uri = '%s?path=%s' % (client.MakeContentFeedUri(), '/home')
feed = client.GetContentFeed(uri=uri)
entry = feed.entry[0]
print(entry)

Granted, it's pretty much identical to yours, but it might help you prove or disprove something. Good luck!
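
One way to prove or disprove it without relying on gdata's object model is to parse the raw response with the standard library and collect every text node under each entry's content element. This is only a minimal sketch: it assumes you have already captured the raw Atom XML as a string, and both the SAMPLE_FEED document and the content_text helper are made up for illustration, not an actual Google Sites response or part of gdata.

```python
import xml.etree.ElementTree as ET

ATOM_NS = '{http://www.w3.org/2005/Atom}'

# Hypothetical sample; a real Sites content feed carries many more elements.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>home</title>
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml"><div>Some text</div></div>
    </content>
  </entry>
</feed>"""


def content_text(raw_xml):
    """Return, per <content> element, the list of non-empty text nodes."""
    root = ET.fromstring(raw_xml)
    results = []
    for content in root.iter(ATOM_NS + 'content'):
        # itertext() yields the text and tail of every descendant in document order.
        chunks = [t.strip() for t in content.itertext() if t.strip()]
        results.append(chunks)
    return results


print(content_text(SAMPLE_FEED))
```

If the list for an entry comes back empty when run on the real response, the text is missing from what the server sent; if it contains the page text, the text is being lost somewhere on the client side.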

Licensed under: CC-BY-SA with attribution
