Question
I have a JSON file that I want to convert to a CSV file. How can I do this with Python?
I tried:
import json
import csv
f = open('data.json')
data = json.load(f)
f.close()
f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    f.writerow(item)
f.close()
However, it did not work. I am using Django, and the error I received is:
'file' object has no attribute 'writerow'
So then I tried the following:
import json
import csv
f = open('data.json')
data = json.load(f)
f.close()
f = open('data.csv')
csv_file = csv.writer(f)
for item in data:
    csv_file.writerow(item)
f.close()
Then I got the error:
sequence expected
Sample JSON file:
[
{
"pk": 22,
"model": "auth.permission",
"fields": {
"codename": "add_logentry",
"name": "Can add log entry",
"content_type": 8
}
},
{
"pk": 23,
"model": "auth.permission",
"fields": {
"codename": "change_logentry",
"name": "Can change log entry",
"content_type": 8
}
},
{
"pk": 24,
"model": "auth.permission",
"fields": {
"codename": "delete_logentry",
"name": "Can delete log entry",
"content_type": 8
}
},
{
"pk": 4,
"model": "auth.permission",
"fields": {
"codename": "add_group",
"name": "Can add group",
"content_type": 2
}
},
{
"pk": 10,
"model": "auth.permission",
"fields": {
"codename": "add_message",
"name": "Can add message",
"content_type": 4
}
}
]
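No accepted solution
Other answers
I am not sure whether this question has already been solved, but let me paste what I did, for reference.
First, your JSON has nested objects, so it normally cannot be converted to CSV directly. You need to change it to something like this: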
{
"pk": 22,
"model": "auth.permission",
"codename": "add_logentry",
"content_type": 8,
"name": "Can add log entry"
},
......]
Here is my code to generate the CSV from that:
import csv
import json
x = """[
{
"pk": 22,
"model": "auth.permission",
"fields": {
"codename": "add_logentry",
"name": "Can add log entry",
"content_type": 8
}
},
{
"pk": 23,
"model": "auth.permission",
"fields": {
"codename": "change_logentry",
"name": "Can change log entry",
"content_type": 8
}
},
{
"pk": 24,
"model": "auth.permission",
"fields": {
"codename": "delete_logentry",
"name": "Can delete log entry",
"content_type": 8
}
}
]"""
x = json.loads(x)
# Python 3: open in text mode with newline="" (the Python 2 original used "wb+")
with open("test.csv", "w", newline="") as out:
    f = csv.writer(out)
    # Write CSV header; if you don't need that, remove this line
    f.writerow(["pk", "model", "codename", "name", "content_type"])
    for item in x:
        f.writerow([item["pk"],
                    item["model"],
                    item["fields"]["codename"],
                    item["fields"]["name"],
                    item["fields"]["content_type"]])
You will get the following output:
pk,model,codename,name,content_type
22,auth.permission,add_logentry,Can add log entry,8
23,auth.permission,change_logentry,Can change log entry,8
24,auth.permission,delete_logentry,Can delete log entry,8
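With the pandas library, this is as easy as using two commands!
pandas.read_json()
to convert a JSON string to a pandas object (either a Series or a DataFrame). Then, assuming the result was stored as df:
df.to_csv()
which can either return a string or write directly to a CSV file.
Given the verbosity of the previous answers, we should all thank pandas for the shortcut.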
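For the nested sample in the question, reading the JSON directly would leave each "fields" dict in a single column. A minimal sketch of one way around that, assuming pandas 1.0 or newer is installed and using pandas.json_normalize (a different call from the read_json mentioned above); the records list is a shortened copy of the question's data:

```python
import pandas as pd

records = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8}},
    {"pk": 23, "model": "auth.permission",
     "fields": {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8}},
]

# json_normalize expands nested dicts into dotted column names such as "fields.codename".
df = pd.json_normalize(records)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument of to_csv writes the file instead of returning a string.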
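I am assuming that your JSON file will decode into a list of dictionaries. First we need a function that will flatten the JSON objects:
def flattenjson( b, delim ):
    val = {}
    for i in b.keys():
        if isinstance( b[i], dict ):
            # recurse into the nested dict and prefix its keys with the parent key
            get = flattenjson( b[i], delim )
            for j in get.keys():
                val[ i + delim + j ] = get[j]
        else:
            val[i] = b[i]
    return val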
The result of running this snippet on your JSON object:
flattenjson( {
    "pk": 22,
    "model": "auth.permission",
    "fields": {
        "codename": "add_message",
        "name": "Can add message",
        "content_type": 8
    }
}, "__" )
is
{
    "pk": 22,
    "model": "auth.permission",
    "fields__codename": "add_message",
    "fields__name": "Can add message",
    "fields__content_type": 8
}
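After applying this function to each dict in the input array of JSON objects:
# list(...) is needed on Python 3, where map() returns a one-shot iterator
input = list( map( lambda x: flattenjson( x, "__" ), input ) )
and finding the relevant column names:
columns = [ x for row in input for x in row.keys() ]
columns = list( set( columns ) )
it's not hard to run this through the csv module:
# Python 3: text mode with newline='' (the Python 2 original used 'wb')
with open( fname, 'w', newline='' ) as out_file:
    csv_w = csv.writer( out_file )
    csv_w.writerow( columns )
    for i_r in input:
        csv_w.writerow( [ i_r.get( x, "" ) for x in columns ] )
I hope this helps!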
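The steps above can be put together into one runnable sketch. This is an illustration rather than the answer's exact code: the helper name flatten, the first-seen column ordering (instead of a set, whose order is not predictable) and the in-memory io.StringIO target are all assumptions made so the example is self-contained.

```python
import csv
import io

def flatten(obj, delim="__", prefix=""):
    """Return a flat dict; nested dict keys are joined with `delim`."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, delim, name + delim))
        else:
            flat[name] = value
    return flat

records = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "content_type": 8}},
    {"pk": 23, "model": "auth.permission",
     "fields": {"codename": "change_logentry", "content_type": 8}},
]

rows = [flatten(r) for r in records]

# Collect column names in first-seen order so the output is deterministic.
columns = []
for row in rows:
    for key in row:
        if key not in columns:
            columns.append(key)

out = io.StringIO()  # stands in for open(fname, "w", newline="")
writer = csv.DictWriter(out, fieldnames=columns, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

csv.DictWriter is used here so that a row missing one of the columns gets an empty cell through restval, which plays the role of the i_r.get( x, "" ) call above.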
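JSON can represent a wide variety of data structures: a JS "object" is roughly like a Python dict (with string keys), a JS "array" is roughly like a Python list, and you can nest them as long as the final "leaf" elements are numbers or strings.
CSV can essentially represent only a 2-D table, optionally with a first row of "headers", i.e. "column names", which can make the table interpretable as a list of dicts instead of the normal interpretation, a list of lists (again, "leaf" elements can be numbers or strings).
So, in the general case, you can't translate an arbitrary JSON structure to a CSV. In a few special cases you can (an array of arrays with no further nesting; an array of objects which all have exactly the same keys). Which special case, if any, applies to your problem? The details of the solution depend on which special case you have. Given the astonishing fact that you don't even mention which one applies, I suspect you may not have considered the constraint, that neither usable case in fact applies, and that your problem is impossible to solve. But please do clarify!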
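To make the two special cases concrete, here is a small sketch using only the standard csv module. The sample values are made up for illustration, and io.StringIO stands in for a real output file.

```python
import csv
import io

# Case 1: an array of arrays with no further nesting maps directly onto rows.
array_of_arrays = [["pk", "model"], [22, "auth.permission"], [23, "auth.permission"]]
out1 = io.StringIO()
csv.writer(out1, lineterminator="\n").writerows(array_of_arrays)

# Case 2: an array of objects that all share the same keys; the keys become the header.
array_of_objects = [
    {"pk": 22, "model": "auth.permission"},
    {"pk": 23, "model": "auth.permission"},
]
out2 = io.StringIO()
writer = csv.DictWriter(out2, fieldnames=list(array_of_objects[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(array_of_objects)

# Both inputs describe the same table, so the CSV text is identical.
print(out1.getvalue() == out2.getvalue())
```

The question's data fits neither case as it stands, because each object carries a nested "fields" object; it has to be flattened or split first.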
A generic solution which translates any JSON list of flat objects to CSV.
Pass the input.json file as the first argument on the command line.
import csv, json, sys
input = open(sys.argv[1])
data = json.load(input)
input.close()
output = csv.writer(sys.stdout)
output.writerow(data[0].keys()) # header row
for row in data:
    output.writerow(row.values())
This code should work for you, assuming that your JSON data is in a file called data.json.
import json
import csv
with open("data.json") as file:
    data = json.load(file)
# newline="" avoids blank lines between rows on Windows
with open("data.csv", "w", newline="") as file:
    csv_file = csv.writer(file)
    for item in data:
        fields = list(item['fields'].values())
        csv_file.writerow([item['pk'], item['model']] + fields)
It would be easy to use csv.DictWriter(); a detailed implementation could look like this:
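import csv
import json
def read_json(filename):
    # the with block closes the file once it has been parsed
    with open(filename) as f:
        return json.load(f)
def write_csv(data, filename):
    with open(filename, 'w+', newline='') as outf:
        writer = csv.DictWriter(outf, data[0].keys())
        writer.writeheader()
        for row in data:
            writer.writerow(row)
# implement
write_csv(read_json('test.json'), 'output.csv')
Note that this assumes that all of your JSON objects have the same fields.
Here is the reference which may help you.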
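When the objects do not all share the same fields, csv.DictWriter can still be used if the header is built from every record. A minimal sketch under that assumption; the records are invented for illustration and io.StringIO stands in for the output file:

```python
import csv
import io

records = [
    {"pk": 1, "name": "a"},
    {"pk": 2, "name": "b", "extra": "x"},  # has a field the first record lacks
]

# Build the header from every record, keeping first-seen order.
fieldnames = list(dict.fromkeys(key for rec in records for key in rec))

out = io.StringIO()  # stands in for a file opened with newline=""
# restval is what gets written for a field that a given row does not have.
writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
writer.writeheader()
writer.writerows(records)
print(out.getvalue())
```

If the header were taken from data[0].keys() alone, the second record would raise a ValueError, because DictWriter rejects keys that are not in fieldnames unless extrasaction='ignore' is passed.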
I was having trouble with Dan's proposed solution, but this worked for me:
import json
import csv
f = open('test.json')
data = json.load(f)
f.close()
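# Python 3: text mode with newline='' (the Python 2 original used 'wb+')
with open('test.csv', 'w', newline='') as out:
    writer = csv.writer(out)
    for item in data:
        # dict.values() is a view in Python 3, so convert it before concatenating
        writer.writerow([item['pk'], item['model']] + list(item['fields'].values()))
where "test.json" contains the following: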
[
{"pk": 22, "model": "auth.permission", "fields":
  {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8 } },
{"pk": 23, "model": "auth.permission", "fields":
  {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8 } },
{"pk": 24, "model": "auth.permission", "fields":
  {"codename": "delete_logentry", "name": "Can delete log entry", "content_type": 8 } }
]
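As mentioned in the previous answers, the difficulty in converting JSON to CSV is that a JSON file can contain nested dictionaries and is therefore a multidimensional data structure, whereas a CSV is a 2-D data structure. However, a good way to turn a multidimensional structure into CSV is to have multiple CSVs that tie together with primary keys.
In your example, the first CSV output has "pk", "model" and "fields" as its columns. The values for "pk" and "model" are easy to get, but because the "fields" column contains a dictionary, it should be its own CSV, and because "codename" appears to be the primary key, you can use it as the value for "fields" to complete the first CSV. The second CSV contains the dictionary from the "fields" column, with codename as the primary key that can be used to tie the two CSVs together.
Here is a solution for your JSON file which converts the nested dictionaries to two CSVs.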
import csv
import json
def readAndWrite(inputFileName, primaryKey=""):
    with open(inputFileName + ".json") as input:
        data = json.load(input)
    header = set()
    if primaryKey != "":
        outputFileName = inputFileName + "-" + primaryKey
        if inputFileName == "data":
            for i in data:
                for j in i["fields"].keys():
                    header.add(j)
    else:
        outputFileName = inputFileName
        for i in data:
            for j in i.keys():
                header.add(j)
    # Python 3: text mode with newline='' (the Python 2 original used 'wb')
    with open(outputFileName + ".csv", 'w', newline='') as output_file:
        fieldnames = list(header)
        writer = csv.DictWriter(output_file, fieldnames, delimiter=',', quotechar='"')
        writer.writeheader()
        for x in data:
            row_value = {}
            if primaryKey == "":
                for y in x.keys():
                    yValue = x.get(y)
                    if isinstance(yValue, (int, bool, float, list)):
                        row_value[y] = str(yValue)
                    elif not isinstance(yValue, dict):
                        row_value[y] = yValue
                    else:
                        if inputFileName == "data":
                            # store the codename as the key into the second CSV
                            row_value[y] = yValue["codename"]
                            readAndWrite(inputFileName, primaryKey="codename")
                writer.writerow(row_value)
            elif primaryKey == "codename":
                for y in x["fields"].keys():
                    yValue = x["fields"].get(y)
                    if isinstance(yValue, (int, bool, float, list)):
                        row_value[y] = str(yValue)
                    elif not isinstance(yValue, dict):
                        row_value[y] = yValue
                writer.writerow(row_value)
readAndWrite("data")
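The same two-table idea can be shown in a more compact form. This is a sketch rather than a drop-in replacement for the function above: the helper to_csv_text is hypothetical, the column order is fixed by hand, and the CSV text is built in memory instead of being written to data.csv and data-codename.csv.

```python
import csv
import io

records = [
    {"pk": 22, "model": "auth.permission",
     "fields": {"codename": "add_logentry", "name": "Can add log entry", "content_type": 8}},
    {"pk": 23, "model": "auth.permission",
     "fields": {"codename": "change_logentry", "name": "Can change log entry", "content_type": 8}},
]

def to_csv_text(rows, fieldnames):
    """Render a list of flat dicts as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

# Parent table: the nested dict is replaced by its codename, which acts as the join key.
parent_rows = [
    {"pk": r["pk"], "model": r["model"], "fields": r["fields"]["codename"]}
    for r in records
]
# Child table: one row per nested "fields" dict, including the same codename.
child_rows = [r["fields"] for r in records]

parent_csv = to_csv_text(parent_rows, ["pk", "model", "fields"])
child_csv = to_csv_text(child_rows, ["codename", "name", "content_type"])
print(parent_csv)
print(child_csv)
```

Joining the two tables on parent "fields" = child "codename" recovers the original nested records.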
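I know it has been a long time since this question was asked, but I thought I might add to everyone else's answers and share a blog post that I think explains the solution in a very concise way.
Here is the link
# open a file for writing
employ_data = open('/tmp/EmployData.csv', 'w')
# create the csv writer object
csvwriter = csv.writer(employ_data)
count = 0
# emp_data is not defined in this excerpt; it is assumed to be the list of dicts parsed from the JSON
for emp in emp_data:
    if count == 0:
        header = emp.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(emp.values())
# make sure to close the file in order to save the contents
employ_data.close()
This works relatively well. It flattens the JSON and writes it to a CSV file. Nested elements are handled :)
This is for Python 3.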
import json
o = json.loads('your json string') # Be careful, o must be a list, each of its objects will make a line of the csv.
def flatten(o, k='/'):
    global l, c_line
    if isinstance(o, dict):
        for key, value in o.items():
            flatten(value, k + '/' + key)
    elif isinstance(o, list):
        for ov in o:
            flatten(ov, '')
    elif isinstance(o, str):
        # only string leaves are recorded; line breaks and ';' would break the output
        o = o.replace('\r',' ').replace('\n',' ').replace(';', ',')
        if not k in l:
            l[k]={}
        l[k][c_line]=o
def render_csv(l):
    # header row: one column per flattened key
    for k in l:
        print('%s;' % k, end='')
    print()
    # one line per input object (c_line holds the object count by now);
    # the original looped over range(100) and never printed row 0
    for i in range(c_line):
        for k in l:
            # objects without this key get an empty cell
            print('%s;' % l[k].get(i, ''), end='')
        print()
def json_to_csv(object_list):
    global l, c_line
    l = {}
    c_line = 0
    for ov in object_list : # Assumes json is a list of objects
        flatten(ov)
        c_line += 1
    render_csv(l)
json_to_csv(o)
Enjoy.
My simple way to solve this:
Create a new Python file, for example: json_to_csv.py
Add this code:
import csv, json, sys
# check that both the input file and the output file were passed
if len(sys.argv) >= 3:
    fileInput = sys.argv[1]
    fileOutput = sys.argv[2]
    # if you are not using UTF-8 files, change the encoding argument
    # (sys.setdefaultencoding from the original does not exist in Python 3)
    with open(fileInput, encoding="utf-8") as inputFile:
        data = json.load(inputFile)
    with open(fileOutput, 'w', encoding="utf-8", newline='') as outputFile:
        output = csv.writer(outputFile)
        output.writerow(data[0].keys())  # header row
        for row in data:
            output.writerow(row.values())
After adding this code, save the file and run it in the terminal:
python json_to_csv.py input.txt output.csv
I hope this helps you.
SEEYA!
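This is not a very smart way to do it, but I had the same problem and this worked for me:
import csv
import json
f = open('data.json')
data = json.load(f)
f.close()
new_data = []
for i in data:
    flat = {}
    names = i.keys()
    for n in names:
        try:
            if len(i[n].keys()) > 0:
                for ii in i[n].keys():
                    flat[n+"_"+ii] = i[n][ii]
        except AttributeError:
            # not a dict, so keep the value as it is
            flat[n] = i[n]
    new_data.append(flat)
filename = 'data.csv'  # assumed output name; the original left filename undefined
f = open(filename, "w", newline="")  # "w", not "r": the file is being written
writer = csv.DictWriter(f, new_data[0].keys())
writer.writeheader()
for row in new_data:
    writer.writerow(row)
f.close()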
Modified Alec McGail's answer to support JSON with lists inside
def flattenjson(self, mp, delim="|"):
    ret = []
    if isinstance(mp, dict):
        for k in mp.keys():
            csvs = self.flattenjson(mp[k], delim)
            for csv in csvs:
                # str() so that numeric leaves can be joined to the key
                ret.append(k + delim + str(csv))
    elif isinstance(mp, list):
        for k in mp:
            csvs = self.flattenjson(k, delim)
            for csv in csvs:
                ret.append(csv)
    else:
        ret.append(mp)
    return ret
Thanks!
import json, csv
data = None
write_header = True
item_keys = []
try:
    with open('kk.json') as json_file:
        json_data = json_file.read()
    data = json.loads(json_data)
except Exception as e:
    print(e)
with open('bar.csv', 'at', newline='') as csv_file:
    writer = csv.writer(csv_file)  # , quoting=csv.QUOTE_MINIMAL)
    for item in data:
        item_values = []
        for key in item:
            if write_header:
                item_keys.append(key)
            # Python 3 strings are already text, so no .encode('utf-8') is needed
            item_values.append(item.get(key, ''))
        if write_header:
            writer.writerow(item_keys)
            write_header = False
        writer.writerow(item_values)
Try this
import csv, json, sys
input = open(sys.argv[1])
data = json.load(input)
input.close()
output = csv.writer(sys.stdout)
output.writerow(data[0].keys()) # header row
for item in data:
    output.writerow(item.values())
This code works for any given JSON file, as long as it holds a list of flat objects
# -*- coding: utf-8 -*-
"""
Created on Mon Jun 17 20:35:35 2019
author: Ram
"""
import json
import csv
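with open("file1.json") as file:
    data = json.load(file)
# create the csv writer object
pt_data1 = open('pt_data1.csv', 'w')
csvwriter = csv.writer(pt_data1)
count = 0
for pt in data:
    if count == 0:
        header = pt.keys()
        csvwriter.writerow(header)
        count += 1
    csvwriter.writerow(pt.values())
pt_data1.close()
Alec's answer is great, but it doesn't work in cases where there are multiple levels of nesting. Here's a modified version that supports multiple levels of nesting. It also makes the header names a bit nicer if the nested object already specifies its own key (e.g. Firebase Analytics / BigTable / BigQuery data):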
"""Converts JSON with nested fields into a flattened CSV file.
"""
import sys
import json
import csv
import os
import jsonlines
from orderedset import OrderedSet
# from https://stackoverflow.com/a/28246154/473201
def flattenjson( b, prefix='', delim='/', val=None ):
    if val is None:
        val = {}
    if isinstance( b, dict ):
        for j in b.keys():
            flattenjson(b[j], prefix + delim + j, delim, val)
    elif isinstance( b, list ):
        get = b
        for j in range(len(get)):
            key = str(j)
            # If the nested data contains its own key, use that as the header instead.
            if isinstance( get[j], dict ):
                if 'key' in get[j]:
                    key = get[j]['key']
            flattenjson(get[j], prefix + delim + key, delim, val)
    else:
        val[prefix] = b
    return val
def main(argv):
    if len(argv) < 2:
        # ValueError replaces the undefined name Error used in the original
        raise ValueError('Please specify a JSON file to parse')
    filename = argv[1]
    allRows = []
    fieldnames = OrderedSet()
    with jsonlines.open(filename) as reader:
        for obj in reader:
            #print obj
            flattened = flattenjson(obj)
            #print 'keys: %s' % flattened.keys()
            fieldnames.update(flattened.keys())
            allRows.append(flattened)
    outfilename = filename + '.csv'
    with open(outfilename, 'w') as file:
        csvwriter = csv.DictWriter(file, fieldnames=fieldnames)
        csvwriter.writeheader()
        for obj in allRows:
            csvwriter.writerow(obj)
if __name__ == '__main__':
    main(sys.argv)
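Since the data appears to be in a dictionary format, it would appear that you should actually use csv.DictWriter() to output the lines with the appropriate header information. This should make the conversion somewhat easier to handle. The fieldnames parameter would then set up the order properly, while writing the first line as the header would allow it to be read and processed later by csv.DictReader().
For example, Mike Repass used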
output = csv.writer(sys.stdout)
output.writerow(data[0].keys()) # header row
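for row in data:
    output.writerow(row.values())
However, just change the initial setup to output = csv.DictWriter(filesetting, fieldnames=data[0].keys())
Note that since the order of elements in a dictionary is not defined, you might have to create the fieldnames entries explicitly. Once you do that, writerow will work. The writes then work as originally shown.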
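A short sketch of that change, with the field names listed explicitly and a round trip through csv.DictReader. The sample rows are made up, and an io.StringIO buffer stands in for the filesetting object mentioned above.

```python
import csv
import io

data = [
    {"model": "auth.permission", "pk": 22},
    {"model": "auth.permission", "pk": 23},
]

# Name the columns explicitly instead of relying on data[0].keys() ordering.
fieldnames = ["pk", "model"]

buffer = io.StringIO()  # stands in for the output file
output = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
output.writeheader()
for row in data:
    output.writerow(row)

# The header row lets csv.DictReader rebuild the dicts (all values come back as strings).
buffer.seek(0)
round_trip = list(csv.DictReader(buffer))
print(round_trip)
```

Note that DictWriter.writerow takes the dict itself, not row.values(), and that the header is only written when writeheader() is called.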
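Unfortunately I don't have enough reputation to make a small contribution to the amazing @Alec McGail answer. I was using Python 3 and I needed to convert the map to a list, following the @Alexis R comment.
Additionally, I found that the csv writer was adding an extra CR to the file (I had an empty line after each line of data inside the CSV file). The solution was very easy, following @Jason R. Coombs' answer to this thread: CSV in Python adding an extra carriage return
You simply need to add the lineterminator='\n' parameter to csv.writer. It will be: csv_w = csv.writer( out_file, lineterminator='\n' )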
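An alternative to changing the line terminator, and the one the csv module documentation recommends on Python 3, is to open the output file with newline=''. A minimal sketch; the temporary path is only there so the example can run anywhere:

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.csv")

# newline="" stops Python's text layer from translating the line endings the csv module writes.
with open(path, "w", newline="") as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(["pk", "model"])
    csv_w.writerow([22, "auth.permission"])

# Read the raw bytes back to see which line endings actually reached the file.
with open(path, "rb") as check:
    raw = check.read()
print(raw)
```

Each row ends with a single \r\n, the writer's default terminator. Without newline='', Windows would turn that into \r\r\n, which is what shows up as an empty line after every row.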
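Surprisingly, I found that none of the answers posted here so far correctly deal with all possible scenarios (e.g. nested dicts, nested lists, None values, etc.).
This solution should work across all scenarios: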
def flatten_json(json):
    def process_value(keys, value, flattened):
        if isinstance(value, dict):
            for key in value.keys():
                process_value(keys + [key], value[key], flattened)
        elif isinstance(value, list):
            for idx, v in enumerate(value):
                process_value(keys + [str(idx)], v, flattened)
        else:
            flattened['__'.join(keys)] = value
    flattened = {}
    for key in json.keys():
        process_value([key], json[key], flattened)
    return flattened
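To see what this approach produces, here is a self-contained sketch that restates the same idea in slightly condensed form and runs it on a hypothetical record containing a nested dict, a nested list and a None value. The record is invented for illustration.

```python
def flatten_json(obj):
    """Same approach as above: join the path to each leaf with '__'."""
    flattened = {}

    def process_value(keys, value):
        if isinstance(value, dict):
            for key, child in value.items():
                process_value(keys + [key], child)
        elif isinstance(value, list):
            for idx, child in enumerate(value):
                process_value(keys + [str(idx)], child)
        else:
            flattened["__".join(keys)] = value

    process_value([], obj)
    return flattened

# Hypothetical record with a nested dict, a nested list and a None value.
record = {"pk": 22, "fields": {"name": None, "tags": ["a", {"id": 7}]}}
flat = flatten_json(record)
print(flat)
```

List items get their index as a path component, and None is kept as a value (the csv module writes it as an empty cell). One thing to be aware of is that an empty dict or list produces no column at all.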
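You can use this code to convert a JSON file to a CSV file. After reading the file, I convert the objects to a pandas DataFrame and then save it as a CSV file.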
import os
import pandas as pd
import json
import numpy as np
data = []
os.chdir('D:\\Your_directory\\folder')
with open('file_name.json', encoding="utf8") as data_file:
    # the file is expected to hold one JSON object per line
    for line in data_file:
        data.append(json.loads(line))
dataframe = pd.DataFrame(data)
## Saving the dataframe to a csv file
dataframe.to_csv("filename.csv", encoding='utf-8',index= False)
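I might be late to the party, but I think I have dealt with a similar problem. I had JSON files which looked like this
I only wanted to extract a few keys/values from these JSON files. So I wrote the following code to extract them.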
"""json_to_csv.py
This script reads n numbers of json files present in a folder and then extract certain data from each file and write in a csv file.
The folder contains the python script i.e. json_to_csv.py, output.csv and another folder descriptions containing all the json files.
"""
import os
import json
import csv
def get_list_of_json_files():
    """Returns the list of filenames of all the Json files present in the folder
    Parameter
    ---------
    directory : str
        'descriptions' in this case
    Returns
    -------
    list_of_files: list
        List of the filenames of all the json files
    """
    list_of_files = os.listdir('descriptions')  # creates list of all the files in the folder
    return list_of_files
def create_list_from_json(jsonfile):
    """Returns a list of the extracted items from json file in the same order we need it.
    Parameter
    _________
    jsonfile : json
        The json file containing the data
    Returns
    -------
    one_sample_list : list
        The list of the extracted items needed for the final csv
    """
    with open(jsonfile) as f:
        data = json.load(f)
    data_list = []  # create an empty list
    # append the items to the list in the same order.
    data_list.append(data['_id'])
    data_list.append(data['_modelType'])
    data_list.append(data['creator']['_id'])
    data_list.append(data['creator']['name'])
    data_list.append(data['dataset']['_accessLevel'])
    data_list.append(data['dataset']['_id'])
    data_list.append(data['dataset']['description'])
    data_list.append(data['dataset']['name'])
    data_list.append(data['meta']['acquisition']['image_type'])
    data_list.append(data['meta']['acquisition']['pixelsX'])
    data_list.append(data['meta']['acquisition']['pixelsY'])
    data_list.append(data['meta']['clinical']['age_approx'])
    data_list.append(data['meta']['clinical']['benign_malignant'])
    data_list.append(data['meta']['clinical']['diagnosis'])
    data_list.append(data['meta']['clinical']['diagnosis_confirm_type'])
    data_list.append(data['meta']['clinical']['melanocytic'])
    data_list.append(data['meta']['clinical']['sex'])
    data_list.append(data['meta']['unstructured']['diagnosis'])
    # In few json files, the race was not there so using KeyError exception to add '' at the place
    try:
        data_list.append(data['meta']['unstructured']['race'])
    except KeyError:
        data_list.append("")  # will add an empty string in case race is not there.
    data_list.append(data['name'])
    return data_list
def write_csv():
    """Creates the desired csv file
    Parameters
    __________
    list_of_files : file
        The list created by get_list_of_json_files() method
    result.csv : csv
        The csv file containing the header only
    Returns
    _______
    result.csv : csv
        The desired csv file
    """
    list_of_files = get_list_of_json_files()
    # open the output once in append mode; the with block closes it, so no c.close() is needed
    with open('output.csv', 'a', newline='') as c:
        writer = csv.writer(c)
        for file in list_of_files:
            row = create_list_from_json(f'descriptions/{file}')  # create the row to be added to csv for each file (json-file)
            writer.writerow(row)
if __name__ == '__main__':
    write_csv()
I hope this helps. For details on how this code works, you can check here