Formatting string output with line breaks and tabs in Python?
22-02-2021
Question
I'm trying to extract some data from a large batch of files and convert it to a specific (JSON) format for importing into a database using Django fixtures.
I've been able to get this far:
'{ {\n "pk":2,\n "model": trials.conditions,\n "fields": {\n "trial_id": NCT00109798,\n "keyword": Brain and Central Nervous System Tumors,\n }{\n "pk":3,\n "model": trials.conditions,\n "fields": {\n "trial_id": NCT00109798,\n "keyword": Lymphoma,\n }{\n "pk": 2,\n "model": trials.criteria,\n "fields": {\n "trial_id": NCT00109798,\n "gender": Both,\n "minimum_age": 18 Years,\n "maximum_age": N/A,\n "healthy_volunteers": No,\n "textblock": ,\n }\n\t\t"pk":2,\n\t\t"model": trials.keyword,\n\t\t"fields": {\n\t\t"trial_id": NCT00109798,\n\t\t"keyword": primary central nervous system non-Hodgkin lymphoma,\n\t\t}\n\t\t
...many lines later.....
After completion of study treatment, patients are followed every 3 months for 1 year, every\n 4 months for 1 year, and then every 6 months for 3 years.\n\n PROJECTED ACCRUAL: A total of 6-25 patients will be accrued for this study.\n ,\n "overall_status": Recruiting,\n "phase": Phase 2,\n "enrollment": 25,\n "study_type": Interventional,\n "condition": 2,3,\n "criteria": 1,\n "overall_contact": testdata,\n "location": 4,\n "lastchanged_date": March 31, 2010,\n "firstreceived_date": May 3, 2005,\n "keyword": 2,3,\n "condition_mesh": ,\n }\n \n {\n "pk": testdata,\n "model": trials.contact,\n "fields": {\n "trial_id": NCT00109798,\n "last_name": Pamela Z. New, MD,\n "phone": ,\n "email": ,\n }}'
The output actually needs to look like this:
{
"pk": trial_id,
"model": trials.trial,
"fields": {
"trial_id": trial_id,
"brief_title": brief_title,
"official_title": official_title,
"brief_summary": brief_summary,
"detailed_Description": detailed_description,
"overall_status": overall_status,
"phase": phase,
"enrollment": enrollment,
"study_type": study_type,
"condition": _______________,
"elligibility": elligibility,
"criteria": ______________,
"overall_contact": _______________,
"location": ___________,
"lastchanged_date": lastchanged_date,
"firstreceived_date": firstreceived_date,
"keyword": __________,
"condition_mesh": condition_mesh,
}
"pk": null,
"model": trials.locations,
"fields": {
"trials_id": trials_id,
"facility": facility,
"city": city,
"state": state,
"zip": zip,
"country": country,
}
Any advice would be much appreciated.
Solution
An alternative to the indent parameter of json.dumps:
Python has a pretty printer at http://docs.python.org/library/pprint.html. It is extremely simple to use, but it only pretty-prints Python objects (you can't give it a JSON string and expect formatted output).
For example:
pydict = {"name":"Chateau des Tours Brouilly","code":"chateau-des-tours-brouilly-2009-1","region":"France > Burgundy > Beaujolais > Brouilly","winery":"Chateau Des Tours","winery_id":"chateau-des-tours","varietal":"Gamay","price":"14.98","vintage":"2009","type":"Red Wine","link":"http://www.snooth.com/wine/chateau-des-tours-brouilly-2009-1/","tags":"colorful, mauve, intense, purple, floral, violet, lively, rich, raspberry, berry","image":"http://ei.isnooth.com/wine/b/7/8/wine_6316762_search.jpeg","snoothrank":3,"available":1,"num_merchants":10,"num_reviews":1}
from pprint import pprint
pprint(pydict)
The output is:
{'available': 1,
 'code': 'chateau-des-tours-brouilly-2009-1',
 'image': 'http://ei.isnooth.com/wine/b/7/8/wine_6316762_search.jpeg',
 'link': 'http://www.snooth.com/wine/chateau-des-tours-brouilly-2009-1/',
 'name': 'Chateau des Tours Brouilly',
 'num_merchants': 10,
 'num_reviews': 1,
 'price': '14.98',
 'region': 'France > Burgundy > Beaujolais > Brouilly',
 'snoothrank': 3,
 'tags': 'colorful, mauve, intense, purple, floral, violet, lively, rich, raspberry, berry',
 'type': 'Red Wine',
 'varietal': 'Gamay',
 'vintage': '2009',
 'winery': 'Chateau Des Tours',
 'winery_id': 'chateau-des-tours'}
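If the data starts out as a JSON string rather than a Python object, one option is to parse it with json.loads first and then hand the result to pprint. The following is only a minimal sketch: raw_json is a made-up sample string, not something taken from the question.

```python
import json
from pprint import pformat

# Hypothetical JSON text; pprint alone would treat this as one plain string.
raw_json = '{"pk": 2, "model": "trials.conditions", "fields": {"keyword": "Lymphoma"}}'

# Parse the JSON text into Python objects (dict, str, int, ...).
parsed = json.loads(raw_json)

# pformat returns the pretty-printed text instead of printing it;
# a narrow width forces the dict onto several lines.
formatted = pformat(parsed, width=40)
print(formatted)
```

Note that the result is Python repr text (single quotes, None instead of null), so it is useful for inspection but is not valid JSON to feed back into a fixture.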
Other tips
The json module has a pretty printer built in. Try something like print(json.dumps(s, indent=4)):
>>> import json
>>> s = {'pk': 5678, 'model': 'trial model', 'fields': {'brief_title': 'a short title', 'trial_id': 1234}}
>>> print(json.dumps(s, indent=4))
{
    "pk": 5678,
    "model": "trial model",
    "fields": {
        "brief_title": "a short title",
        "trial_id": 1234
    }
}
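Applied to the question, the same idea suggests building each record as a Python dict and letting json.dumps do the quoting, commas, and indentation, instead of assembling the string by hand. The sketch below assumes the usual fixture layout (a list of objects with "pk", "model", and "fields"); the helper make_condition is hypothetical, and the sample values are copied from the question's output.

```python
import json


def make_condition(pk, trial_id, keyword):
    """Build one fixture record as a plain dict (hypothetical helper)."""
    return {
        "pk": pk,
        "model": "trials.conditions",
        "fields": {"trial_id": trial_id, "keyword": keyword},
    }


# Sample values taken from the output shown in the question.
records = [
    make_condition(2, "NCT00109798", "Brain and Central Nervous System Tumors"),
    make_condition(3, "NCT00109798", "Lymphoma"),
]

# json.dumps quotes the strings, places the commas, and indents by 4 spaces.
fixture_text = json.dumps(records, indent=4)
print(fixture_text)

# Round-trip check: the generated text parses back to the same data.
assert json.loads(fixture_text) == records
```

Because the serializer produces the text, problems visible in the hand-built string, such as unquoted values and trailing commas, should not arise.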