Python Excel Spreadsheet Processing: A Software Tester's Daily Routine (Copy-Paste Is Too Tedious) [Part 1]

文851 · Published 2025-08-21 16:25:13

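Preface

The company has cranked up the workload for us interns and handed me a big pile of document-tidying tasks. On impulse I started copying and pasting by hand, and it wore me out. So I decided to let AI solve the problem and reached for Python. Because the material can't be uploaded to the internet and my work computer is a different machine, I teamed up with the AI (I'm the real tool here 🐕‍🦺) and wrote a Python script that inserts the row and column content I want with one click, links content between tables with one click, and so on.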

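1. The Prompt

First, open our beloved plugin panel, the familiar Tongyi Lingma (通义灵码), which works well. Here is my prompt. The more complete it is, the better, so the model doesn't fill in the gaps with its own guesses. Also pay attention to which model you select, to reduce hallucinations.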

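Here's the situation: I need you to write a program that inserts content into a spreadsheet. Requirements:

1. Both the insert-content table and the insert-target table have a 问题标识 (issue ID) column, and the two columns correspond to each other. Take the issue IDs from the insert-target table, look them up in the issue-ID column of the insert-content table, and, once found, insert the content of the corresponding 标题 (title) column into the insert-target table.
2. I don't need you to generate any other spreadsheet. Just insert the content from the insert-content table into the insert-target table I have already created.
3. The issue-ID column in the insert-target table may hold more than one issue per cell. For example, you may see SJZYZX-WT001SJZYZX-WT002 and treat it as one issue ID, but it is actually two, because in the spreadsheet they are separated by a line break. Pay attention to this and don't misidentify them. You should also add detection, because next time I may separate them with the ideographic comma "、" or some other symbol, and you must still recognise them.
4. My insert-target table may already contain things I inserted earlier (see: 插入目标-副本). When inserting, you must not overwrite or clear the existing data; you may only append after it.
5. What I need now is this: based on the issue-ID column of the insert-target table, search the insert-content table and insert its title-column data into the 实际输出 (actual output) column of the insert-target table.
6. If the actual-output column has no numbering, number the entries yourself: 1, 2, 3, 4, and so on. Numbering is per cell. For example, the row with serial number 1 has two issue IDs, SJZYZX-WT002SJZYZX-WT054 (I separate them with a line break, which doesn't show here, so just be aware that these are two IDs). Look up the titles for both IDs in the insert-content table and insert them into that row of the insert-target table; because there are two IDs, the entries are numbered 1 and 2. Then move on to the row with serial number 2, look up its issue IDs, and number again. The numbering restarts: if that row has one entry, it is 1, not a continuation from the previous row. If the cell already has content, say entries 1 and 2, then the newly inserted title continues the existing numbering, so it should be 3.
7. Generate a graphical interface so I can operate the tool and test its features easily.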
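Requirement 3 is the one most likely to trip up a generated script, so here is a minimal sketch of the idea, separate from the generated tool. The helper name `split_issue_ids` and the separator set are my own assumptions, and it assumes an issue ID never contains whitespace or any of these punctuation marks.

```python
import re

# Hypothetical helper, not part of the generated tool: split one cell's text
# into individual issue IDs on any of several candidate separators.
# Assumption: an ID never contains whitespace or these punctuation marks.
_SEPARATORS = re.compile(r"[\r\n、；;，,\s]+")


def split_issue_ids(cell_text):
    """Return the non-empty IDs found in cell_text, in their original order."""
    if not cell_text:
        return []
    return [part for part in _SEPARATORS.split(str(cell_text)) if part]


print(split_issue_ids("SJZYZX-WT002\nSJZYZX-WT054"))
print(split_issue_ids("SJZYZX-WT002、SJZYZX-WT054"))
```

Splitting on a whole character class at once avoids having to guess which single separator a given cell uses, at the cost of breaking IDs that legitimately contain one of those characters.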

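Then just wait for it to generate the code, copy it, test it, and you're done. It's pretty exciting to think about: the AI was a huge help and seriously impressive.

2. Code Generation

The generated code is as follows: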

import tkinter as tk
from tkinter import filedialog, messagebox, ttk
import openpyxl
# Insertion tool, first draft

class ExcelInserter:
    def __init__(self):
        self.root = tk.Tk()
        self.root.title("Excel内容插入工具")
        self.root.geometry("600x500")

        self.target_file = tk.StringVar()
        self.content_file = tk.StringVar()

        self.create_widgets()

    def create_widgets(self):
        # Main frame
        main_frame = ttk.Frame(self.root, padding="10")
        main_frame.grid(row=0, column=0, sticky=(tk.W, tk.E, tk.N, tk.S))

        # File selection area
        file_frame = ttk.LabelFrame(main_frame, text="文件选择", padding="10")
        file_frame.grid(row=0, column=0, columnspan=2, sticky=(tk.W, tk.E), pady=(0, 10))

        # Insert-target file
        ttk.Label(file_frame, text="插入目标文件:").grid(row=0, column=0, sticky=tk.W, pady=(0, 5))
        ttk.Entry(file_frame, textvariable=self.target_file, width=50).grid(row=1, column=0, sticky=(tk.W, tk.E))
        ttk.Button(file_frame, text="浏览", command=self.browse_target).grid(row=1, column=1, padx=(5, 0))

        # Insert-content file
        ttk.Label(file_frame, text="插入内容文件:").grid(row=2, column=0, sticky=tk.W, pady=(10, 5))
        ttk.Entry(file_frame, textvariable=self.content_file, width=50).grid(row=3, column=0, sticky=(tk.W, tk.E))
        ttk.Button(file_frame, text="浏览", command=self.browse_content).grid(row=3, column=1, padx=(5, 0))

        # Separator settings
        sep_frame = ttk.LabelFrame(main_frame, text="分隔符设置", padding="10")
        sep_frame.grid(row=1, column=0, columnspan=2, sticky=(tk.W, tk.E), pady=(0, 10))

        self.separator = tk.StringVar(value="\\n")
        ttk.Radiobutton(sep_frame, text="换行符", variable=self.separator, value="\\n").grid(row=0, column=0,
                                                                                             sticky=tk.W)
        ttk.Radiobutton(sep_frame, text="顿号(、)", variable=self.separator, value="、").grid(row=0, column=1,
                                                                                            sticky=tk.W)
        ttk.Radiobutton(sep_frame, text="分号(；)", variable=self.separator, value="；").grid(row=0, column=2,
                                                                                            sticky=tk.W)
        ttk.Radiobutton(sep_frame, text="逗号(,)", variable=self.separator, value=",").grid(row=1, column=0,
                                                                                            sticky=tk.W)
        ttk.Radiobutton(sep_frame, text="空格", variable=self.separator, value=" ").grid(row=1, column=1, sticky=tk.W)

        # Custom separator
        ttk.Label(sep_frame, text="自定义分隔符:").grid(row=2, column=0, sticky=tk.W, pady=(10, 0))
        self.custom_separator = tk.StringVar()
        ttk.Entry(sep_frame, textvariable=self.custom_separator, width=20).grid(row=2, column=1, sticky=tk.W,
                                                                                pady=(10, 0))

        # Action buttons
        button_frame = ttk.Frame(main_frame)
        button_frame.grid(row=2, column=0, columnspan=2, pady=(20, 0))

        ttk.Button(button_frame, text="执行插入", command=self.insert_content).pack(side=tk.LEFT, padx=(0, 10))
        ttk.Button(button_frame, text="退出", command=self.root.quit).pack(side=tk.LEFT)

        # Progress and status display
        self.status_var = tk.StringVar(value="请选择文件并点击执行插入")
        self.status_bar = ttk.Label(main_frame, textvariable=self.status_var, relief=tk.SUNKEN)
        self.status_bar.grid(row=3, column=0, columnspan=2, sticky=(tk.W, tk.E), pady=(20, 0))

        # Configure grid weights
        self.root.columnconfigure(0, weight=1)
        self.root.rowconfigure(0, weight=1)
        main_frame.columnconfigure(0, weight=1)
        file_frame.columnconfigure(0, weight=1)

    def browse_target(self):
        filename = filedialog.askopenfilename(
            title="选择插入目标文件",
            filetypes=[("Excel文件", "*.xlsx"), ("所有文件", "*.*")]
        )
        if filename:
            self.target_file.set(filename)

    def browse_content(self):
        filename = filedialog.askopenfilename(
            title="选择插入内容文件",
            filetypes=[("Excel文件", "*.xlsx"), ("所有文件", "*.*")]
        )
        if filename:
            self.content_file.set(filename)

    def get_separator(self):
        if self.custom_separator.get():
            return self.custom_separator.get()
        elif self.separator.get() == "\\n":
            return "\n"
        else:
            return self.separator.get()

    def parse_problem_ids(self, text, separator):
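        """Parse an issue-ID string, supporting multiple separators."""
        if not text:
            return []

        # Any separator other than a line break: split on it directly
        if separator != "\n":
            return [pid.strip() for pid in text.split(separator) if pid.strip()]
        else:
            # Line-break separated: splitlines() also handles \r\n
            return [pid.strip() for pid in text.splitlines() if pid.strip()]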

    def insert_content(self):
        try:
            # Check that both files have been selected
            if not self.target_file.get() or not self.content_file.get():
                messagebox.showerror("错误", "请先选择两个Excel文件")
                return

            target_path = self.target_file.get()
            content_path = self.content_file.get()
            separator = self.get_separator()

            self.status_var.set("正在读取文件...")
            self.root.update()

            # Open the workbooks
            target_wb = openpyxl.load_workbook(target_path)
            content_wb = openpyxl.load_workbook(content_path)

            target_ws = target_wb.active
            content_ws = content_wb.active

            # Read the header rows to locate the key columns
            target_headers = [cell.value for cell in target_ws[1]]
            content_headers = [cell.value for cell in content_ws[1]]

            # Determine each column's position (0-based)
            problem_id_col = None
            actual_output_col = None
            serial_number_col = None

            for i, header in enumerate(target_headers):
                if header and "问题标识" in str(header):
                    problem_id_col = i
                elif header and "实际输出" in str(header):
                    actual_output_col = i
                elif header and "序号" in str(header):
                    serial_number_col = i

            if problem_id_col is None or actual_output_col is None:
                messagebox.showerror("错误", "在目标文件中未找到'问题标识'或'实际输出'列")
                return

            # Find the title column in the content file
            title_col = None
            content_problem_id_col = None

            for i, header in enumerate(content_headers):
                if header and "标题" in str(header):
                    title_col = i
                elif header and "问题标识" in str(header):
                    content_problem_id_col = i

            if title_col is None or content_problem_id_col is None:
                messagebox.showerror("错误", "在内容文件中未找到'标题'或'问题标识'列")
                return

            # Build the issue-ID -> title mapping
            problem_to_title = {}
            for row in range(2, content_ws.max_row + 1):
                problem_id = content_ws.cell(row=row, column=content_problem_id_col + 1).value
                title = content_ws.cell(row=row, column=title_col + 1).value
                if problem_id and title:
                    problem_to_title[str(problem_id).strip()] = str(title).strip()

            # Process every row in the target file
            for row_idx in range(2, target_ws.max_row + 1):
                problem_ids_cell = target_ws.cell(row=row_idx, column=problem_id_col + 1)
                actual_output_cell = target_ws.cell(row=row_idx, column=actual_output_col + 1)

                problem_ids_text = problem_ids_cell.value
                if not problem_ids_text:
                    continue

                # Parse the issue IDs
                problem_ids = self.parse_problem_ids(str(problem_ids_text), separator)
                if not problem_ids:
                    continue

                # Lines already present in the output cell
                existing_content = []
                if actual_output_cell.value:
                    existing_content = str(actual_output_cell.value).splitlines()

                # New entries continue numbering after the existing lines
                new_content = existing_content[:]

                # Look up each issue ID and append its title. Numbering from
                # len(new_content) keeps the numbers consecutive even when an
                # ID has no match (start_number + i would leave gaps).
                for problem_id in problem_ids:
                    if problem_id in problem_to_title:
                        number = len(new_content) + 1
                        new_content.append(f"{number}.{problem_to_title[problem_id]}")

                # Update the cell content
                if new_content:
                    actual_output_cell.value = "\n".join(new_content)

            # Save the file (overwrites the target file in place)
            self.status_var.set("正在保存文件...")
            self.root.update()

            target_wb.save(target_path)

            messagebox.showinfo("成功", "内容插入完成！")
            self.status_var.set("操作完成")

        except Exception as e:
            messagebox.showerror("错误", f"操作过程中出现错误:\n{str(e)}")
            self.status_var.set("操作失败")

    def run(self):
        self.root.mainloop()


# Create and run the application
if __name__ == "__main__":
    app = ExcelInserter()
    app.run()

In fact, testing showed that some parts still didn't meet my requirements; see the tests below.
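The per-cell numbering from requirement 6 is easy to get subtly wrong, so here is a small standalone sketch of the rule. The function `append_numbered` is a hypothetical name of mine, and it assumes each existing entry occupies exactly one line in the cell.

```python
def append_numbered(existing_text, new_items):
    """Append new_items to a cell's text, numbering from the next free index.

    Assumption: every existing entry occupies exactly one line, so the
    number of non-blank lines is the number of entries already present.
    """
    lines = [ln for ln in (existing_text or "").splitlines() if ln.strip()]
    for item in new_items:
        # len(lines) grows with each append, so numbers stay consecutive.
        lines.append(f"{len(lines) + 1}.{item}")
    return "\n".join(lines)


# A cell that already holds two entries gets its third one numbered 3.
print(append_numbered("1.login fails\n2.page is blank", ["export is slow"]))
```

Numbering from the current line count, instead of from a loop index over the issue IDs, means an ID with no match in the content table cannot leave a gap in the sequence.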

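3. Testing the Results

Here is how the code runs:

There's now a graphical interface, which is quite convenient. Next came the file-insertion test, which revealed:

I then had the AI keep generating. AI really can handle a lot of problems, though it may not solve everything in one go. At that point you need to keep finding where things go wrong or what needs optimising. The worst thing you can do is describe the problem too vaguely: when I did that, the regenerated code was essentially unchanged, as if the AI were slacking off too. After a few attempts:

OK, the generated code now fits the requirements fairly well. On to more testing and optimisation.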

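4. Optimisation

I had the AI add a logging feature, so that errors and processing problems show up promptly and are easy to trace. The log also lets the AI fix the code quickly, which feels like cheating. For now, though, AI still has its limits: in my experience, once the code passes about a thousand lines it may fail to generate it in one go. You can try generating it piece by piece, but the AI's memory is limited, and before the code is finished it may start selectively forgetting, which breaks the logic or produces something other than what you wanted.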
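The generated tool writes its log into a Tk text widget. As a complementary sketch, and not what the AI produced, the standard `logging` module can also keep a copy on disk so the log survives closing the window. The function name `make_logger` and the file name in the usage comment are assumptions of mine.

```python
import logging


def make_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    # One logger per file path, so calling this twice with the same path
    # does not attach a second handler and duplicate every line.
    logger = logging.getLogger(f"excel_inserter.{log_path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these lines out of the root logger
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("[%(asctime)s] %(levelname)s %(message)s",
                              datefmt="%H:%M:%S"))
        logger.addHandler(handler)
    return logger


# Example use inside the row loop (file name and values are illustrative):
# log = make_logger("insert_tool.log")
# log.info("row %d: matched %d IDs", 3, 2)
```

A log file is also easier to paste into the next prompt than text copied out of a GUI pane.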

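Later I found a problem: splitting on line breaks during insertion could go wrong, so I had the AI fix it. (In fact, I asked many times at first and the AI still couldn't fix the error, so I had to edit the code by hand, which was a real hassle. Then it occurred to me that I had a log: I combined the problem shown in the log with a detailed description and had the AI regenerate the code, and this time it finally fixed the problem completely.)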
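The post doesn't record what exactly went wrong with the line breaks, so the following is only a sketch of the kind of log line that makes such a bug describable to the AI. It logs the raw cell text with `repr()` so invisible characters show up, plus a count of each candidate separator. The names `CANDIDATES` and `describe_separators` are hypothetical.

```python
# Hypothetical candidates; extend to match whatever appears in your sheets.
CANDIDATES = {
    "line break": "\n",
    "ideographic comma": "、",
    "full-width semicolon": "；",
    "comma": ",",
    "space": " ",
}


def describe_separators(cell_text):
    """Return a one-line summary of the raw text and the separators in it."""
    # Normalise Windows/old-Mac line endings so they all count as "\n".
    text = str(cell_text).replace("\r\n", "\n").replace("\r", "\n")
    counts = {name: text.count(sep)
              for name, sep in CANDIDATES.items() if sep in text}
    # repr() makes invisible characters such as \n visible in the log.
    return f"raw={text!r} separators={counts}"


print(describe_separators("SJZYZX-WT002\r\nSJZYZX-WT054"))
```

Pasting a line like this into the prompt, together with the expected result, gives the model something concrete to work from.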

The optimised code and a screenshot of it running are below:

import tkinter as tk
from tkinter import filedialog, messagebox, ttk
import openpyxl
from datetime import datetime
# Insertion tool, final version

class ExcelInserter:
    def __init__(self):
        self.root = tk.Tk()
        self.root.title("Excel内容插入工具")
        self.root.geometry("600x700")

        self.target_file = tk.StringVar()
        self.content_file = tk.StringVar()

        # Header (column) information for each file
        self.target_headers = []
        self.content_headers = []

        self.create_widgets()

    def create_widgets(self):
        # Main frame
        main_frame = ttk.Frame(self.root, padding="10")
        main_frame.grid(row=0, column=0, sticky=(tk.W, tk.E, tk.N, tk.S))

        # Title
        title_label = ttk.Label(main_frame, text="Excel处理工具", font=("Arial", 16, "bold"))
        title_label.grid(row=0, column=0, columnspan=3, pady=(0, 20), sticky=tk.W)

        # File selection area
        file_frame = ttk.LabelFrame(main_frame, text="文件选择", padding="10")
        file_frame.grid(row=1, column=0, columnspan=3, sticky=(tk.W, tk.E), pady=(0, 10))

        # Insert-target file
        ttk.Label(file_frame, text="插入目标文件:").grid(row=0, column=0, sticky=tk.W, pady=(0, 5))
        ttk.Entry(file_frame, textvariable=self.target_file, width=50).grid(row=1, column=0, sticky=(tk.W, tk.E))
        ttk.Button(file_frame, text="浏览", command=self.browse_target).grid(row=1, column=1, padx=(5, 0))

        # Insert-content file
        ttk.Label(file_frame, text="插入内容文件:").grid(row=2, column=0, sticky=tk.W, pady=(10, 5))
        ttk.Entry(file_frame, textvariable=self.content_file, width=50).grid(row=3, column=0, sticky=(tk.W, tk.E))
        ttk.Button(file_frame, text="浏览", command=self.browse_content).grid(row=3, column=1, padx=(5, 0))

        # Column selection area
        col_selection_frame = ttk.LabelFrame(main_frame, text="列选择设置", padding="10")
        col_selection_frame.grid(row=2, column=0, columnspan=3, sticky=(tk.W, tk.E), pady=(0, 10))

        # Target-file column selection
        ttk.Label(col_selection_frame, text="目标文件匹配列:").grid(row=0, column=0, sticky=tk.W, pady=(0, 5))
        self.target_match_col = tk.StringVar()
        self.target_match_combo = ttk.Combobox(col_selection_frame, textvariable=self.target_match_col,
                                               state="readonly", width=20)
        self.target_match_combo.grid(row=0, column=1, sticky=tk.W, pady=(0, 5), padx=(5, 0))

        ttk.Label(col_selection_frame, text="目标文件插入列:").grid(row=1, column=0, sticky=tk.W, pady=(0, 5))
        self.target_insert_col = tk.StringVar()
        self.target_insert_combo = ttk.Combobox(col_selection_frame, textvariable=self.target_insert_col,
                                                state="readonly", width=20)
        self.target_insert_combo.grid(row=1, column=1, sticky=tk.W, pady=(0, 5), padx=(5, 0))

        # Content-file column selection
        ttk.Label(col_selection_frame, text="内容文件匹配列:").grid(row=0, column=2, sticky=tk.W, pady=(0, 5),
                                                                    padx=(20, 0))
        self.content_match_col = tk.StringVar()
        self.content_match_combo = ttk.Combobox(col_selection_frame, textvariable=self.content_match_col,
                                                state="readonly", width=20)
        self.content_match_combo.grid(row=0, column=3, sticky=tk.W, pady=(0, 5), padx=(5, 0))

        ttk.Label(col_selection_frame, text="内容文件插入列:").grid(row=1, column=2, sticky=tk.W, pady=(0, 5),
                                                                    padx=(20, 0))
        self.content_insert_col = tk.StringVar()
        self.content_insert_combo = ttk.Combobox(col_selection_frame, textvariable=self.content_insert_col,
                                                 state="readonly", width=20)
        self.content_insert_combo.grid(row=1, column=3, sticky=tk.W, pady=(0, 5), padx=(5, 0))

        # Separator detection settings
        sep_frame = ttk.LabelFrame(main_frame, text="分隔符检测设置", padding="10")
        sep_frame.grid(row=3, column=0, columnspan=3, sticky=(tk.W, tk.E), pady=(0, 10))

        ttk.Label(sep_frame, text="检测分隔符:").grid(row=0, column=0, sticky=tk.W, pady=(0, 5))
        self.separator = tk.StringVar(value="换行符")
        separator_choices = ["换行符", "顿号(、)", "分号(；)", "逗号(,)", "空格"]
        self.separator_combo = ttk.Combobox(sep_frame, textvariable=self.separator, values=separator_choices,
                                            state="readonly", width=20)
        self.separator_combo.grid(row=0, column=1, sticky=tk.W, pady=(0, 5))

        # Custom separator
        ttk.Label(sep_frame, text="自定义分隔符:").grid(row=1, column=0, sticky=tk.W, pady=(5, 0))
        self.custom_separator = tk.StringVar()
        ttk.Entry(sep_frame, textvariable=self.custom_separator, width=20).grid(row=1, column=1, sticky=tk.W,
                                                                                pady=(5, 0))

        # Numbering style settings
        numbering_frame = ttk.LabelFrame(main_frame, text="编号方式设置", padding="10")
        numbering_frame.grid(row=4, column=0, columnspan=3, sticky=(tk.W, tk.E), pady=(0, 10))

        ttk.Label(numbering_frame, text="编号方式:").grid(row=0, column=0, sticky=tk.W, pady=(0, 5))
        self.numbering_style = tk.StringVar(value="数字(1,2,3...)")
        numbering_choices = ["数字(1,2,3...)", "字母(a,b,c...)", "罗马数字(i,ii,iii...)", "无编号", "顿号分隔"]
        self.numbering_combo = ttk.Combobox(numbering_frame, textvariable=self.numbering_style,
                                            values=numbering_choices, state="readonly", width=20)
        self.numbering_combo.grid(row=0, column=1, sticky=tk.W, pady=(0, 5))

        # Action buttons
        button_frame = ttk.Frame(main_frame)
        button_frame.grid(row=5, column=0, columnspan=3, pady=(20, 0))

        ttk.Button(button_frame, text="执行", command=self.insert_content).pack(side=tk.LEFT, padx=(0, 10))
        ttk.Button(button_frame, text="清空", command=self.clear_inputs).pack(side=tk.LEFT, padx=(0, 10))
        ttk.Button(button_frame, text="退出", command=self.root.quit).pack(side=tk.LEFT)

        # Log display area
        log_frame = ttk.LabelFrame(main_frame, text="执行日志", padding="10")
        log_frame.grid(row=6, column=0, columnspan=3, sticky=(tk.W, tk.E, tk.N, tk.S), pady=(10, 0))
        log_frame.columnconfigure(0, weight=1)
        log_frame.rowconfigure(0, weight=1)

        self.log_text = tk.Text(log_frame, height=10)
        log_scrollbar = ttk.Scrollbar(log_frame, orient="vertical", command=self.log_text.yview)
        self.log_text.configure(yscrollcommand=log_scrollbar.set)

        self.log_text.grid(row=0, column=0, sticky=(tk.W, tk.E, tk.N, tk.S))
        log_scrollbar.grid(row=0, column=1, sticky=(tk.N, tk.S))

        # Progress and status display
        self.status_var = tk.StringVar(value="就绪")
        self.status_bar = ttk.Label(main_frame, textvariable=self.status_var, relief=tk.SUNKEN)
        self.status_bar.grid(row=7, column=0, columnspan=3, sticky=(tk.W, tk.E), pady=(20, 0))

        # Configure grid weights
        self.root.columnconfigure(0, weight=1)
        self.root.rowconfigure(0, weight=1)
        main_frame.columnconfigure(0, weight=1)
        main_frame.rowconfigure(6, weight=1)  # let the log area expand
        file_frame.columnconfigure(0, weight=1)

    def log(self, message):
        """Append a timestamped message to the log pane."""
        timestamp = datetime.now().strftime("%H:%M:%S")
        self.log_text.insert(tk.END, f"[{timestamp}] {message}\n")
        self.log_text.see(tk.END)  # auto-scroll to the newest entry
        self.root.update_idletasks()

    def browse_target(self):
        filename = filedialog.askopenfilename(
            title="选择插入目标文件",
            filetypes=[("Excel文件", "*.xlsx"), ("所有文件", "*.*")]
        )
        if filename:
            self.target_file.set(filename)
            self.load_target_headers()

    def browse_content(self):
        filename = filedialog.askopenfilename(
            title="选择插入内容文件",
            filetypes=[("Excel文件", "*.xlsx"), ("所有文件", "*.*")]
        )
        if filename:
            self.content_file.set(filename)
            self.load_content_headers()

    def load_target_headers(self):
        """Load the header row of the target file."""
        try:
            if self.target_file.get():
                target_wb = openpyxl.load_workbook(self.target_file.get())
                target_ws = target_wb.active
                self.target_headers = [str(cell.value) if cell.value else f"列{i + 1}"
                                       for i, cell in enumerate(target_ws[1])]

                # Refresh the combobox options
                self.target_match_combo['values'] = self.target_headers
                self.target_insert_combo['values'] = self.target_headers

                # Defaults: match on the first column, insert into the second
                # (the original picked index 0 in both branches)
                if self.target_headers:
                    self.target_match_col.set(self.target_headers[0])
                    self.target_insert_col.set(
                        self.target_headers[1] if len(self.target_headers) > 1 else self.target_headers[0])

                self.log(f"已加载目标文件列头: {', '.join(self.target_headers)}")
        except Exception as e:
            messagebox.showerror("错误", f"读取目标文件列头时出错:\n{str(e)}")
            self.log(f"读取目标文件列头时出错: {str(e)}")

    def load_content_headers(self):
        """Load the header row of the content file."""
        try:
            if self.content_file.get():
                content_wb = openpyxl.load_workbook(self.content_file.get())
                content_ws = content_wb.active
                self.content_headers = [str(cell.value) if cell.value else f"列{i + 1}"
                                        for i, cell in enumerate(content_ws[1])]

                # Refresh the combobox options
                self.content_match_combo['values'] = self.content_headers
                self.content_insert_combo['values'] = self.content_headers

                # Defaults: match on the first column, insert from the second
                if self.content_headers:
                    self.content_match_col.set(self.content_headers[0])
                    self.content_insert_col.set(
                        self.content_headers[1] if len(self.content_headers) > 1 else self.content_headers[0])

                self.log(f"已加载内容文件列头: {', '.join(self.content_headers)}")
        except Exception as e:
            messagebox.showerror("错误", f"读取内容文件列头时出错:\n{str(e)}")
            self.log(f"读取内容文件列头时出错: {str(e)}")

    def clear_inputs(self):
        """Clear all inputs."""
        self.target_file.set("")
        self.content_file.set("")
        self.status_var.set("就绪")
        self.custom_separator.set("")

        # Clear the combobox options
        self.target_match_combo['values'] = []
        self.target_insert_combo['values'] = []
        self.content_match_combo['values'] = []
        self.content_insert_combo['values'] = []

        # Clear the selections
        self.target_match_col.set("")
        self.target_insert_col.set("")
        self.content_match_col.set("")
        self.content_insert_col.set("")

        # Clear the log
        self.log_text.delete(1.0, tk.END)

        self.log("已清空所有输入和日志")

    def get_separator(self):
        if self.custom_separator.get():
            return self.custom_separator.get()
        elif self.separator.get() == "换行符":
            return "\n"
        elif self.separator.get() == "顿号(、)":
            return "、"
        elif self.separator.get() == "分号(；)":
            return "；"
        elif self.separator.get() == "逗号(,)":
            return ","
        elif self.separator.get() == "空格":
            return " "
        else:
            return "\n"

    def get_number_prefix(self, number):
        """Return the numbering prefix for the selected numbering style."""
        style = self.numbering_style.get()

        if style == "无编号" or style == "顿号分隔":
            return ""
        elif style == "字母(a,b,c...)":
            # Simple letter numbering (a-z, then aa, ab, ...)
            result = ""
            n = number
            while n > 0:
                n, remainder = divmod(n - 1, 26)
                result = chr(97 + remainder) + result
            return result
        elif style == "罗马数字(i,ii,iii...)":
            # Simple Roman-numeral conversion (intended for 1-3999)
            return self.int_to_roman(number).lower()
        else:  # default: numeric numbering
            return str(number)

    def int_to_roman(self, num):
        """Convert an integer to a Roman numeral."""
        val = [
            1000, 900, 500, 400,
            100, 90, 50, 40,
            10, 9, 5, 4,
            1
        ]
        syms = [
            "M", "CM", "D", "CD",
            "C", "XC", "L", "XL",
            "X", "IX", "V", "IV",
            "I"
        ]
        roman_num = ''
        i = 0
        while num > 0:
            for _ in range(num // val[i]):
                roman_num += syms[i]
                num -= val[i]
            i += 1
        return roman_num

    def parse_problem_ids(self, text, separator):
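        """Parse an issue-ID string, supporting multiple separators."""
        if not text:
            return []

        # Any separator other than a line break: split on it directly
        if separator != "\n":
            return [pid.strip() for pid in text.split(separator) if pid.strip()]
        else:
            # Line-break separated: splitlines() also handles \r\n
            return [pid.strip() for pid in text.splitlines() if pid.strip()]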

    def smart_parse_problem_ids(self, text):
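        """
        Smart parsing of an issue-ID string when the separator is not known in advance.
        Uses the first separator found, checking in priority order:
        line break, ideographic comma (、), full-width semicolon (；), comma, space.
        """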
        if not text:
            return []

        # Try splitting on line breaks first
        lines = text.splitlines()
        if len(lines) > 1:
            # More than one line means line breaks are the main separator
            result = []
            for line in lines:
                if line.strip():
                    result.append(line.strip())
            self.log(f"检测到换行符分隔，解析出 {len(result)} 个项目")
            return result

        # Check for the ideographic comma (、)
        if "、" in text:
            items = [item.strip() for item in text.split("、") if item.strip()]
            self.log(f"检测到顿号分隔，解析出 {len(items)} 个项目")
            return items

        # Check for the full-width semicolon (；)
        if "；" in text:
            items = [item.strip() for item in text.split("；") if item.strip()]
            self.log(f"检测到分号分隔，解析出 {len(items)} 个项目")
            return items

        # Check for the comma
        if "," in text:
            items = [item.strip() for item in text.split(",") if item.strip()]
            self.log(f"检测到逗号分隔，解析出 {len(items)} 个项目")
            return items

        # Check for the space
        if " " in text:
            items = [item.strip() for item in text.split(" ") if item.strip()]
            self.log(f"检测到空格分隔，解析出 {len(items)} 个项目")
            return items

        # No obvious separator: return the text as a single item
        if text.strip():
            self.log("未检测到明显分隔符，作为单个项目处理")
            return [text.strip()]
        else:
            return []

    def insert_content(self):
        try:
            self.log("开始执行插入操作...")

            # Check that both files have been selected
            if not self.target_file.get() or not self.content_file.get():
                messagebox.showerror("错误", "请先选择两个Excel文件")
                self.log("错误: 未选择文件")
                return

            # Check that all four columns have been selected
            if not all([self.target_match_col.get(), self.target_insert_col.get(),
                        self.content_match_col.get(), self.content_insert_col.get()]):
                messagebox.showerror("错误", "请先选择所有列")
                self.log("错误: 未选择所有列")
                return

            target_path = self.target_file.get()
            content_path = self.content_file.get()
            separator = self.get_separator()
            numbering_style = self.numbering_style.get()

            self.status_var.set("正在读取文件...")
            self.log(f"读取目标文件: {target_path}")
            self.log(f"读取内容文件: {content_path}")
            self.log(f"使用分隔符设置: {self.separator.get()}")
            self.log(f"编号方式: {numbering_style}")
            self.root.update()

            # Open the workbooks
            target_wb = openpyxl.load_workbook(target_path)
            content_wb = openpyxl.load_workbook(content_path)

            target_ws = target_wb.active
            content_ws = content_wb.active

            # Look up the column indexes (0-based)
            target_match_idx = self.target_headers.index(self.target_match_col.get())
            target_insert_idx = self.target_headers.index(self.target_insert_col.get())
            content_match_idx = self.content_headers.index(self.content_match_col.get())
            content_insert_idx = self.content_headers.index(self.content_insert_col.get())

            self.log(f"目标文件匹配列: {self.target_match_col.get()} (列 {target_match_idx + 1})")
            self.log(f"目标文件插入列: {self.target_insert_col.get()} (列 {target_insert_idx + 1})")
            self.log(f"内容文件匹配列: {self.content_match_col.get()} (列 {content_match_idx + 1})")
            self.log(f"内容文件插入列: {self.content_insert_col.get()} (列 {content_insert_idx + 1})")

            # Build the match-value -> insert-content mapping (one-to-many supported)
            match_to_content = {}
            content_rows = content_ws.max_row - 1  # minus the header row
            self.log(f"开始处理内容文件，共 {content_rows} 行数据...")

            for row in range(2, content_ws.max_row + 1):
                match_value = content_ws.cell(row=row, column=content_match_idx + 1).value
                insert_value = content_ws.cell(row=row, column=content_insert_idx + 1).value
                if match_value:
                    # Use smart parsing for the match values in the content file
                    match_keys = self.smart_parse_problem_ids(str(match_value))
                    content_text = str(insert_value).strip() if insert_value else ""

                    for match_key in match_keys:
                        if match_key not in match_to_content:
                            match_to_content[match_key] = []
                        match_to_content[match_key].append(content_text)

                        if len(match_to_content[match_key]) <= 3:  # log only the first 3 entries per key to keep the log short
                            self.log(f"映射: {match_key} -> {content_text}")

            self.log(f"共创建 {len(match_to_content)} 个唯一匹配项映射")

            # Process every row in the target file
            total_rows = target_ws.max_row - 1  # minus the header row
            processed_rows = 0
            updated_rows = 0

            self.log(f"开始处理目标文件，共 {total_rows} 行数据...")

            for row_idx in range(2, target_ws.max_row + 1):
                # Update the progress display
                progress = (row_idx - 1) / total_rows * 100
                self.status_var.set(f"正在处理第 {row_idx - 1}/{total_rows} 行...")

                if row_idx % 10 == 0 or row_idx == target_ws.max_row:  # log progress every 10 rows
                    self.log(f"处理进度: {row_idx - 1}/{total_rows} ({progress:.1f}%)")

                self.root.update_idletasks()

                match_cell = target_ws.cell(row=row_idx, column=target_match_idx + 1)
                insert_cell = target_ws.cell(row=row_idx, column=target_insert_idx + 1)

                match_text = match_cell.value
                if not match_text:
                    continue

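                # Parse the match values with the configured separator when it
                # actually occurs in the cell; otherwise fall back to smart
                # detection. (Falling back only on an empty result never
                # triggers, because non-empty text always yields one item.)
                match_str = str(match_text)
                has_separator = (len(match_str.splitlines()) > 1 if separator == "\n"
                                 else separator in match_str)
                if has_separator:
                    match_values = self.parse_problem_ids(match_str, separator)
                else:
                    match_values = self.smart_parse_problem_ids(match_str)
                    if len(match_values) > 1:
                        self.log(f"第 {row_idx} 行: 使用智能解析检测到 {len(match_values)} 个匹配项")

                if not match_values:
                    continue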

                # Lines already present in the insert cell (recomputed below before numbering)
                existing_content = []
                if insert_cell.value:
                    existing_content = str(insert_cell.value).splitlines()

                # Look up each match value and collect its content (one-to-many supported)
                found_matches = []
                found_contents = []

                for match_value in match_values:
                    if match_value in match_to_content:
                        found_matches.append(match_value)
                        for content_text in match_to_content[match_value]:
                            if content_text:  # only add non-empty content
                                found_contents.append(content_text)

                # Update the cell content
                if found_contents:
                    updated_rows += 1

                    if numbering_style == "顿号分隔":
                        # Join with the ideographic comma, appending to any
                        # existing cell text instead of overwriting it
                        joined = "、".join(found_contents)
                        if insert_cell.value:
                            joined = f"{insert_cell.value}、{joined}"
                        insert_cell.value = joined
                        self.log(
                            f"第 {row_idx} 行: 找到匹配项 {', '.join(found_matches)}, 使用顿号分隔插入 {len(found_contents)} 条内容")
                    else:
                        # Numbered (or unnumbered) lines: continue after the
                        # lines already present in the cell
                        existing_content = []
                        if insert_cell.value:
                            existing_content = str(insert_cell.value).splitlines()

                        start_number = len(existing_content) + 1
                        new_content = existing_content[:]

                        # Append the new entries
                        for offset, content_text in enumerate(found_contents):
                            number_prefix = self.get_number_prefix(start_number + offset)
                            if number_prefix:  # numbered
                                new_content.append(f"{number_prefix}. {content_text}")
                            else:  # unnumbered
                                new_content.append(content_text)

                        insert_cell.value = "\n".join(new_content)
                        self.log(
                            f"第 {row_idx} 行: 找到匹配项 {', '.join(found_matches)}, 添加 {len(found_contents)} 条内容")

                processed_rows += 1

            # Save the file (overwrites the original file in place)
            self.status_var.set("正在保存文件...")
            self.log("正在保存文件...")
            self.root.update()

            target_wb.save(target_path)

            messagebox.showinfo("成功",
                                f"内容插入完成！\n处理了 {processed_rows} 行，更新了 {updated_rows} 行。\n结果已保存到原文件中")
            self.log(f"操作完成! 处理了 {processed_rows} 行，更新了 {updated_rows} 行")
            self.status_var.set("操作完成")

        except Exception as e:
            messagebox.showerror("错误", f"操作过程中出现错误:\n{str(e)}")
            self.log(f"错误: {str(e)}")
            self.status_var.set("操作失败")

    def run(self):
        self.root.mainloop()


# Create and run the application
if __name__ == "__main__":
    app = ExcelInserter()
    app.run()
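The final version saves straight over the target file with `target_wb.save(target_path)`. Since requirement 4 is about never losing existing data, one cautious addition would be to copy the file before saving. This is a sketch of my own, not part of the generated code, and the function name `backup_before_save` and the backup naming scheme are assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_before_save(target_path):
    """Copy target_path next to itself with a timestamp and return the copy's path."""
    src = Path(target_path)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # e.g. target.xlsx -> target.backup-20250821-162513.xlsx
    dst = src.with_name(f"{src.stem}.backup-{stamp}{src.suffix}")
    shutil.copy2(src, dst)  # copy2 also preserves file timestamps
    return dst


# Intended use, just before the save call in insert_content:
# backup_before_save(target_path)
# target_wb.save(target_path)
```

If a run produces wrong numbering or duplicate entries, the timestamped copy is there to restore from.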