Question

I'm building a custom harvester for importing data from an external site into CKAN (version 1.8).

It works pretty well and creates the metadata and the resources associated with it. During the import phase of harvesting, I'd like to aggregate these resources into a new CSV and save it in the DataStore.

I know I can use the DataStore API, but I'd prefer not to use HTTP (it makes no sense to me to give an API key / user / URL / ... to a harvester that already has permission to add stuff).

Is it possible to call the DataStore API functions directly from the harvester? https://github.com/okfn/ckan/blob/master/ckanext/datastore/logic/action.py

Every function takes a context parameter which is not documented.


Solution

You are doing two distinct things here:

  • Converting CSV to an appropriate Python (or JSON) structure for insertion into the DataStore
  • Inserting into the DataStore

For the latter you can use either:

  • the DataStore web API (over HTTP)
  • the DataStore logic action functions, called directly from Python

The API just calls the logic actions (plus does auth), so the two are pretty similar, but the logic approach will likely be faster and could feel more natural if you are already writing code. That said, the API could be conceptually cleaner, since well-defined web APIs give you nice boundaries between your components.
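As a rough sketch of the logic-action route (not a confirmed recipe for CKAN 1.8): inside a CKAN process, actions are normally looked up with `ckan.logic.get_action` and called as `action(context, data_dict)`. The context is conventionally a plain dict carrying `model`, `session` and `user`. The helper names `build_context` and `datastore_insert`, the `'harvest'` user name, and the exact `data_dict` keys (taken from the `datastore_create` action in the linked `action.py`) are assumptions for illustration. `get_action` is passed in as a parameter so the helper can be tried outside CKAN.

```python
def build_context(user_name, model=None, session=None):
    """Assemble the context dict passed to CKAN logic actions.

    Assumption: the conventional keys are 'model', 'session' and 'user'.
    """
    context = {'user': user_name}
    if model is not None:
        context['model'] = model
        # Fall back to the model's own session when none is given.
        context['session'] = session if session is not None else model.Session
    return context


def datastore_insert(get_action, context, resource_id, fields, records):
    """Call the 'datastore_create' logic action directly, without HTTP.

    `get_action` is expected to behave like ckan.logic.get_action:
    given an action name, it returns a callable(context, data_dict).
    """
    data_dict = {
        'resource_id': resource_id,  # id of an existing CKAN resource
        'fields': fields,            # e.g. [{'id': 'name'}, {'id': 'value'}]
        'records': records,          # e.g. [{'name': 'foo', 'value': '1'}]
    }
    return get_action('datastore_create')(context, data_dict)


# Inside a harvester's import stage this might look like (hypothetical):
#   from ckan import model
#   from ckan.logic import get_action
#   context = build_context('harvest', model)
#   datastore_insert(get_action, context, resource_id, fields, records)
```

Authorization still runs inside the action, so the user named in the context needs permission to write to the resource.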

For the former (i.e. conversion of CSV to JSON), I recommend the Data Converters library, especially the commas.py part, which converts to exactly the format you need. A full web service based on Data Converters is being developed, but it is not yet fully operational.
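To give an idea of the target structure, here is a minimal sketch that uses Python's standard `csv` module in place of Data Converters. It is not the `commas.py` implementation. The function name `csv_to_records` and the sample data are made up for illustration, and every value is left as a string (no type guessing).

```python
import csv
import io


def csv_to_records(csv_text):
    """Turn CSV text into (fields, records) shaped for a DataStore insert."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # One field description per column, taken from the header row.
    fields = [{'id': name} for name in (reader.fieldnames or [])]
    # One plain dict per data row; values stay as strings.
    records = [dict(row) for row in reader]
    return fields, records


sample = "name,value\nfoo,1\nbar,2\n"
fields, records = csv_to_records(sample)
```

The resulting `fields` and `records` lists can then be handed to whichever insertion route you choose.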

Other tips

I solved this by using ckanext-datastorer (for the DataStore) and ckanclient (for uploading the file).

ckanclient has a bug with CKAN 1.8: it doesn't handle redirects correctly. We worked around it with this quick and dirty patch: https://gist.github.com/mammadori/4945812

A better fix would be to drop urllib completely and switch all of ckanclient to requests instead.

Thanks for your support

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow