Fixing escape-character and quoting parameter issues with Python's csv.writer


聖光之護

Published: 2025-07-10 16:38:24

Original article



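Summary

This article addresses a problem that arises with Python's csv.writer when delimiter, quotechar, escapechar, and related parameters are not set correctly: the content of the output CSV file ends up wrapped in double quotes. Working through a practical case, we explain how to configure these parameters correctly, avoid unwanted quoting, and provide corrected example code so that the CSV file is written in the expected format.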

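Main text

When processing CSV files with Python's csv module, csv.writer is a very commonly used tool. If its parameters are not configured correctly, however, it can produce unexpected results, such as the fields of the output CSV file being wrapped in double quotes. This article uses a concrete example to show how to avoid the problem and provides a workable solution.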

Problem description

Suppose we need to write a Python script that can:


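Read a CSV file.
Select certain columns of that CSV file.
Replace a string A with a string B in the selected columns.
Write the modified data to a new CSV file.

If the script uses csv.reader and csv.writer with their default settings, every data row of the output CSV file may come out wrapped in double quotes, which is not the result we want.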

Example

Suppose we have the following CSV file (myreport.csv):

code1;code2;money1;code3;type_payment;money2
74;1;185.04;10;AMEXCO;36.08
74;1;8.06;11;MASTERCARD;538.30
74;1;892.46;12;VISA;185.04
74;1;75.10;15;MAESTRO;8.06
74;1;63.92;16;BANCOMAT;892.46


We want to replace . with , in the money1 and money2 columns. The expected output is:

code1;code2;money1;code3;type_payment;money2
74;1;185,04;10;AMEXCO;36,08
74;1;8,06;11;MASTERCARD;538,30
74;1;892,46;12;VISA;185,04
74;1;75,10;15;MAESTRO;8,06
74;1;63,92;16;BANCOMAT;892,46


With an incorrect csv.writer configuration, however, we may get the following output instead:

code1;code2;money1;code3;type_payment;money2
"74;1;185,04;10;AMEXCO;36,08"
"74;1;8,06;11;MASTERCARD;538,30"
"74;1;892,46;12;VISA;185,04"
"74;1;75,10;15;MAESTRO;8,06"
"74;1;63,92;16;BANCOMAT;892,46"

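The quoted output above can be reproduced in a few lines. The sketch below is not the original script; it assumes a reader and a writer created with default settings, and the names SAMPLE and naive_rewrite are made up for illustration. It works on in-memory text rather than on myreport.csv.

```python
import csv
import io

# Two columns are enough to show the effect (hypothetical sample data).
SAMPLE = "code1;money1\n74;185.04\n74;8.06\n"

def naive_rewrite(text):
    """Read and write with default csv settings, i.e. delimiter=','."""
    out = io.StringIO()
    reader = csv.reader(io.StringIO(text))          # no ',' in the data: each line is ONE field
    writer = csv.writer(out, lineterminator="\n")   # default quoting is QUOTE_MINIMAL
    for i, row in enumerate(reader):
        if i > 0:  # leave the header row untouched
            row = [cell.replace(".", ",") for cell in row]
        writer.writerow(row)
    return out.getvalue()

print(naive_rewrite(SAMPLE))
# code1;money1
# "74;185,04"
# "74;8,06"
```

The header stays unquoted because nothing in it was replaced, which matches the faulty output shown above.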

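Solution

The root of the problem lies in the default behavior of csv.reader and csv.writer. Both use a comma as the delimiter unless told otherwise, and csv.writer defaults to quoting=csv.QUOTE_MINIMAL, which wraps in double quotes any field containing the delimiter, the quotechar, or a line-terminator character. If the delimiter is left at its default, each semicolon-separated line is read as a single field; once . has been replaced with , that field contains the writer's delimiter, so the whole line gets quoted. To avoid this, we need to explicitly specify the delimiter, quotechar, and quoting parameters.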
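
To make the quoting rule concrete, here is a minimal sketch that writes the same row with two writer configurations. The helper write_row is a hypothetical name introduced only for this illustration.

```python
import csv
import io

def write_row(row, **fmt):
    """Write one row with the given formatting parameters and return the text."""
    out = io.StringIO()
    csv.writer(out, lineterminator="\n", **fmt).writerow(row)
    return out.getvalue()

row = ["74", "185,04", "AMEXCO"]

# Default delimiter ',': the middle field contains it, so QUOTE_MINIMAL quotes that field.
default_out = write_row(row)

# Delimiter ';': a comma inside a field is no longer special, so nothing is quoted.
semicolon_out = write_row(row, delimiter=";")

print(default_out, end="")    # 74,"185,04",AMEXCO
print(semicolon_out, end="")  # 74;185,04;AMEXCO
```

For this data, passing delimiter=';' already removes the quotes; setting quoting=csv.QUOTE_NONE as well makes the intent explicit and disables quoting in every case.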

The modified code is as follows:

import csv
import os, shutil

# Collects warnings and summary information about the run
result = {'warning_messages': []}

csv_file_path = 'myreport.csv'
columns_to_process = ['money1', 'money2']
string_to_be_replaced = "."
string_to_replace_with = ","
mydelimiter = ";"

# Check that the file exists
if not os.path.isfile(csv_file_path):
    raise IOError("csv_file_path is not valid or does not exist: {}".format(csv_file_path))

# Check that the delimiter appears in the first line
with open(csv_file_path, 'r') as csvfile:
    first_line = csvfile.readline()
    if mydelimiter not in first_line:
        delimiter_warning_message = "No delimiter found in file first line."
        result['warning_messages'].append(delimiter_warning_message)

# Count the lines in the file
with open(csv_file_path, 'r') as csvfile:
    NOL = sum(1 for _ in csvfile)

if NOL > 0:
    # Get the column names from the header line
    # (fieldnames is read from the header, so a header-only file works too)
    with open(csv_file_path, 'r', newline='') as csvfile:
        list_of_column_names = list(csv.DictReader(csvfile, delimiter=mydelimiter).fieldnames)
    number_of_columns = len(list_of_column_names)

    # Check that the requested columns exist
    column_existence = [(column_name in list_of_column_names) for column_name in columns_to_process]
    if not all(column_existence):
        raise ValueError("File {} does not contain all the columns given in input for processing:\nFile columns names: {}\nInput columns names: {}".format(csv_file_path, list_of_column_names, columns_to_process))

    # Determine the indexes of the columns to process
    indexes_of_columns_to_process = [i for i, column_name in enumerate(list_of_column_names) if column_name in columns_to_process]
    print("indexes_of_columns_to_process: ", indexes_of_columns_to_process)

    # Build the output file path
    inputcsv_absname, inputcsv_extension = os.path.splitext(csv_file_path)
    csv_output_file_path = inputcsv_absname + '__output' + inputcsv_extension

    # Define the processing function
    def replace_string_in_columns(input_csv, output_csv, indexes_of_columns_to_process, string_to_be_replaced, string_to_replace_with):
        number_of_replacements = 0

        with open(input_csv, 'r', newline='') as infile, open(output_csv, 'w', newline='') as outfile:
            # quotechar=None rather than '': an empty quotechar raises an error on Python 3.11+
            reader = csv.reader(infile, quoting=csv.QUOTE_NONE, delimiter=mydelimiter, quotechar=None, escapechar='\\')
            writer = csv.writer(outfile, quoting=csv.QUOTE_NONE, delimiter=mydelimiter, quotechar=None, escapechar='\\')

            row_index = 0

            for row in reader:
                for col_index in indexes_of_columns_to_process:
                    # Skip empty rows
                    if not row:
                        continue

                    cell = row[col_index]
                    if string_to_be_replaced in cell and row_index != 0:
                        # Do the replacement (the header row, index 0, is left untouched)
                        cell = cell.replace(string_to_be_replaced, string_to_replace_with)
                        number_of_replacements += 1
                        row[col_index] = cell  # Update the row with the replaced cell

                # Write the row to the new file
                writer.writerow(row)
                row_index += 1

        return number_of_replacements

    # Run the replacement
    result['number_of_modified_cells'] = replace_string_in_columns(csv_file_path, csv_output_file_path, indexes_of_columns_to_process, string_to_be_replaced, string_to_replace_with)

    # Overwrite the original file with the output, then remove the temporary output
    shutil.copyfile(csv_output_file_path, csv_file_path)
    os.remove(csv_output_file_path)

    result['changed'] = result['number_of_modified_cells'] > 0
else:
    result['changed'] = False

result['source_csv_number_of_raw_lines'] = NOL
result['source_csv_number_of_lines'] = NOL - 1

print("result:\n\n", result)

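For comparison, the same replacement can be written more compactly by addressing columns by name. This is an alternative sketch, not the script above: the function replace_in_columns is hypothetical, it works on in-memory text, and it assumes a non-empty input whose rows all have the same number of fields as the header.

```python
import csv
import io

def replace_in_columns(text, columns, old, new, delimiter=";"):
    """Replace `old` with `new` in the named columns; return (new_text, cells_changed)."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()  # the header is written unchanged
    changed = 0
    for row in reader:
        for name in columns:
            if old in row[name]:
                row[name] = row[name].replace(old, new)
                changed += 1
        writer.writerow(row)
    return out.getvalue(), changed

sample = "code1;money1;money2\n74;185.04;36.08\n74;8;538.30\n"
new_text, changed = replace_in_columns(sample, ["money1", "money2"], ".", ",")
print(new_text, end="")
print(changed)  # 3
```

Because the delimiter is ';' on both the reader and the writer, the commas introduced by the replacement do not trigger any quoting here.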

Key changes

In the code above, we changed how csv.reader and csv.writer are initialized:

reader = csv.reader(infile, quoting=csv.QUOTE_NONE, delimiter=mydelimiter, quotechar=None, escapechar='\\')
writer = csv.writer(outfile, quoting=csv.QUOTE_NONE, delimiter=mydelimiter, quotechar=None, escapechar='\\')
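
The role of escapechar under csv.QUOTE_NONE is easiest to see on a field that itself contains the delimiter. The following is a small sketch with a hypothetical helper, write_unquoted, and made-up rows.

```python
import csv
import io

def write_unquoted(row, escapechar=None):
    """Write one row with quoting disabled and return the resulting text."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter=";", quoting=csv.QUOTE_NONE,
                        quotechar=None, escapechar=escapechar,
                        lineterminator="\n")
    writer.writerow(row)
    return out.getvalue()

# A comma is not special when the delimiter is ';', so no escaping is needed.
plain = write_unquoted(["74", "185,04"])

# A field containing ';' cannot be quoted, so the writer escapes the delimiter instead.
escaped = write_unquoted(["a;b", "c"], escapechar="\\")

# Without an escapechar the writer has no way to represent the field and raises csv.Error.
try:
    write_unquoted(["a;b", "c"])
    error = None
except csv.Error as exc:
    error = exc

print(plain, end="")    # 74;185,04
print(escaped, end="")  # a\;b;c
print(type(error).__name__)
```

This is why an escapechar is supplied together with QUOTE_NONE: it only comes into play when a field contains a character the writer would otherwise have to quote.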


Here, we made the following settings: