Writing with Python's built-in csv module

https://stackoverflow.com/questions/1020053

06-07-2019

Question

[Please note that this is a different question from the one already answered: How to replace a column using Python's built-in .csv writer module?]

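I need to do a find and replace (specific to one column of URLs) in a huge Excel .csv file. Since I'm in the early stages of trying to teach myself a scripting language, I figured I would try to implement the solution in Python.

I run into trouble when I try to write back to the .csv file after changing the contents of an entry. I've read the official csv module documentation on how to use the writer, but none of the examples cover this case. Specifically, I'm trying to do the read, replace, and write operations in a single loop. However, it seems the same 'row' reference can't be used both as the for loop's variable and as the argument to writer.writerow(). So, once I've made the change inside the for loop, how should I write it back to the file?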

Edit: I implemented the suggestions from S. Lott and Jimmy; the result is still the same.

Edit #2: I added "rb" and "wb" to the open() calls, per S. Lott's suggestion.

import csv

#filename = 'C:/Documents and Settings/username/My Documents/PALTemplateData.xls'

csvfile = open("PALTemplateData.csv","rb")
csvout = open("PALTemplateDataOUT.csv","wb")
reader = csv.reader(csvfile)
writer = csv.writer(csvout)

changed = 0;

for row in reader:
    row[-1] = row[-1].replace('/?', '?')
    writer.writerow(row)                  #this is the line that's causing issues
    changed=changed+1

print('Total URLs changed:', changed)

Edit: For reference, here is the new full traceback from the interpreter:

Traceback (most recent call last):
  File "C:\Documents and Settings\g41092\My Documents\palScript.py", line 13, in <module>
    for row in reader:
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

Solution

You cannot read from and write to the same file.

source = open("PALTemplateData.csv","rb")
reader = csv.reader(source , dialect)

target = open("AnotherFile.csv","wb")
writer = csv.writer(target , dialect)

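The usual approach for ALL file operations is to create a modified COPY of the original file. Don't try to update a file in place. That's just a bad plan.

Edit

In the lines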

source = open("PALTemplateData.csv","rb")
target = open("AnotherFile.csv","wb")

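the "rb" and "wb" are absolutely required. Every time you leave them out, you open the file in the wrong mode.

You must use "rb" to read a .CSV file. With Python 2.x there is no choice. With Python 3.x you can omit it, but use "r" anyway to make it explicit.

You must use "wb" to write a .CSV file. With Python 2.x there is no choice. With Python 3.x you must use "w".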

Edit

It looks like you're using Python 3. You need to drop the "b" from "rb" and "wb".

Read this: http://docs.python.org/3.0/library/functions.html#open
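
As a rough illustration of the advice above, here is a minimal Python 3 sketch of the asker's loop with the files opened in text mode. The function name `fix_urls` and its parameters are made up for this example. It also assumes `newline=""` on both files, which is what the Python 3 csv documentation recommends but which is not mentioned in the thread, and it counts only rows that actually changed rather than every row.

```python
import csv


def fix_urls(src_path, dst_path):
    """Copy src_path to dst_path, replacing '/?' with '?' in the last column.

    Hypothetical helper for illustration. Returns the number of rows
    whose last column actually changed.
    """
    changed = 0
    # Text mode ("r"/"w"), not binary; newline="" lets the csv module
    # handle line endings itself (assumption taken from the csv docs).
    with open(src_path, "r", newline="") as source, \
         open(dst_path, "w", newline="") as target:
        reader = csv.reader(source)
        writer = csv.writer(target)
        for row in reader:
            if row:  # blank lines come back as empty lists
                fixed = row[-1].replace("/?", "?")
                if fixed != row[-1]:
                    changed += 1
                row[-1] = fixed
            writer.writerow(row)
    return changed
```

Calling `fix_urls("PALTemplateData.csv", "PALTemplateDataOUT.csv")` would mirror the file names used in the question.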

Other tips

Opening csv files as binary is wrong. CSV files are plain text files, so you need to open them with

source = open("PALTemplateData.csv","r")
target = open("AnotherFile.csv","w")

The error

_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

occurs because you are opening them in binary mode.
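
The difference can be reproduced without touching the disk. The sketch below is only an illustration under Python 3: the sample data is invented, and in-memory streams from `io` stand in for files opened with "rb" and "r". The exact wording of the error message varies between Python versions, so the code only stores it instead of asserting a specific text.

```python
import csv
import io

# Invented sample data standing in for the real CSV file.
data = b"id,url\r\n1,http://example.com/?q=1\r\n"

# Binary stream (like open(..., "rb")): iterating yields bytes,
# which the Python 3 csv reader refuses.
try:
    list(csv.reader(io.BytesIO(data)))
    binary_error = None
except csv.Error as exc:
    binary_error = str(exc)

# Text stream (like open(..., "r")): the same data decoded to str
# parses without complaint.
rows = list(csv.reader(io.StringIO(data.decode("utf-8"), newline="")))

print(binary_error)
print(rows)
```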

When I opened Excel csv files with Python, I used something like this:

try:
    # checking if file exists
    f = csv.reader(open(filepath, "r", encoding="cp1250"), delimiter=";", quotechar='"')
except IOError:
    f = []

for record in f:
    # do something with record
    pass

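It worked quite fast (I opened two csv files of about 10 MB each, although I did that with Python 2.6, not version 3.0).

There are only a few working modules for handling Excel csv files in Python; pyExcelerator is one of them.

The problem is that you're trying to write to the same file you're reading from. Write to a different file, then rename it after deleting the original.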
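
One way to sketch the "write elsewhere, then swap" idea in Python 3 is shown below. The helper name `rewrite_in_place` and its `transform` callback are hypothetical. Note that it swaps in a different technique from the one described: instead of deleting the original and then renaming, it calls `os.replace`, which overwrites the destination in a single step.

```python
import csv
import os
import tempfile


def rewrite_in_place(path, transform):
    """Apply transform(row) to every row of the CSV file at path.

    Hypothetical helper: rows are written to a temporary file in the
    same directory, which then replaces the original, so the original
    is never open for reading and writing at the same time.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    try:
        # Wrap the temp file descriptor first so it is always closed.
        with os.fdopen(fd, "w", newline="") as target, \
             open(path, "r", newline="") as source:
            writer = csv.writer(target)
            for row in csv.reader(source):
                writer.writerow(transform(row))
        # Swap the finished copy over the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial copy and leave the original untouched.
        os.remove(tmp_path)
        raise
```

Keeping the temporary file in the same directory as the original is a deliberate choice here, since a rename across different drives or filesystems may fail.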

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow