Amazon Product Detail Page Forbidden-Word Check: A Walkthrough of the 实在智能 RPA Source Code

zy0803wyl

2052 views · 2025-08-10 08:00:00

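I. Project Overview

This project is a forbidden-word checker for Amazon product detail pages, built on 实在智能 RPA. It iterates over a given list of Amazon product links, extracts each product's title, images, and description, and checks them against a preset list of forbidden words. The forbidden-word library can be customized, and the results are exported to an Excel file for review and follow-up.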

II. Project Structure

/
├── components_var_type.json
├── config.json
├── customConfig/
├── diaFile/
├── edit-state.json
├── elements.json
├── extensions/
├── fileList.json
├── file_index_700.json
├── file_sourcemap.json
├── globalVariable
├── log/
├── main.sz
├── process.template.json
├── project.flow
├── projects/
│   ├── child_modules/
│   │   ├── IB7A5.py
│   │   └── g6cCz.py
│   ├── code_modules/
│   │   └── utils.py
│   ├── dependency_modules/
│   ├── flow.py
│   ├── flow_modules/
│   │   └── io5cd.py
│   ├── global_data.py
│   ├── gpt_modules/
│   ├── old_custom_components/
│   ├── requirements.txt
│   └── rpaRoot.py
├── res/
├── szrpa.json
├── variableInfo
├── 商品图片/    (product images)
└── 商品描述/    (product descriptions)

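III. Project Features and Core Code

1. Automated flow control

The project drives execution through a flow queue: the start node appends the name of the first flow node, and main keeps popping node names from the front of the queue and calling the matching function until the queue is empty: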

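# projects/flow.py
FLOW_QUEUE = []

# Start node: schedules the first flow node
def flow_node_hsxAW():
    FLOW_QUEUE.append("flow_node_io5cd")

# Run the flow
def main(rpa):
    flow_node_hsxAW()
    while len(FLOW_QUEUE) > 0:
        # Take the next node name in FIFO order
        flowName = FLOW_QUEUE.pop(0)
        # Resolve the node function by name and call it
        eval(flowName + "()")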
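
The same queue-driven pattern can be sketched without eval by looking node names up in a dictionary. This is only an illustrative sketch, not the project's code: the NODES registry, the node decorator, and the node names start, check_listings, and export_results are all hypothetical.

```python
from collections import deque

# Hypothetical registry mapping node names to functions (not part of the project).
NODES = {}
FLOW_QUEUE = deque()
executed = []  # records the order in which nodes run, for illustration only


def node(func):
    """Register a function as a flow node under its own name."""
    NODES[func.__name__] = func
    return func


@node
def start():
    # The start node only schedules the first real step.
    FLOW_QUEUE.append("check_listings")


@node
def check_listings():
    executed.append("check_listings")
    # A node decides what runs next by appending another node name.
    FLOW_QUEUE.append("export_results")


@node
def export_results():
    executed.append("export_results")


def run_flow():
    """Pop node names in FIFO order and call the registered function."""
    FLOW_QUEUE.clear()
    executed.clear()
    start()
    while FLOW_QUEUE:
        name = FLOW_QUEUE.popleft()
        NODES[name]()  # raises KeyError if the name was never registered
    return list(executed)


print(run_flow())  # ['check_listings', 'export_results']
```

A dictionary lookup fails with a KeyError on an unknown name instead of evaluating an arbitrary string, which is the main reason one might prefer it over eval in hand-written code.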

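2. Core check logic

For each row in the product link list, the tool opens the page in Chrome, extracts the product information, and checks it for forbidden words. Rows whose first cell does not contain "http" are skipped: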

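# projects/flow_modules/io5cd.py
# Identifier glossary: 循环位置 = loop index, 循环项 = loop item (one row),
# 违禁词列表 = forbidden-word list, 网页对象 = browser page object,
# 商品title违禁词 = forbidden words found in the title, 商品图片链接 = product image link data
for 循环位置,循环项 in get_list(globalVar['glv_dict']["商品链接列表"],0, 1,-1, version="1"):
    # Skip rows whose first cell is not a link
    if ("http" not in 循环项[0]):
        continue
    # Load the default English forbidden-word library (英文默认违禁词库)
    违禁词列表 = Basic.SetVariable(SZEnv['rpa'], globalVar['glv_dict']['英文默认违禁词库'], var_ret=0, var_names=["违禁词列表"])
    url = Basic.SetVariable(SZEnv['rpa'], 循环项[0], var_ret=0, var_names=["url"])
    # Open the product page in Chrome
    网页对象 = WebBrowser.CreateV3(SZEnv['rpa'], "chrome", url, 0, "", "61619", "", 1, 1)
    # Extract the product title and check it
    商品title = Element.GetElementInfo(SZEnv['rpa'], elementsFormatNew(SHIZAI_ELEMENT_DICT["cUVy16G8Sh"]["selector"]), 1, 网页对象, "gettext")
    商品title违禁词 = run_module({ "module_path": "code_modules.utils" }, "check_forbidden_words", 商品title, 违禁词列表)
    # Extract the product images (data-a-dynamic-image attribute) and check them
    商品图片链接 = Element.GetElementInfo(SZEnv['rpa'], elementsFormatNew(SHIZAI_ELEMENT_DICT["LP8xJ6NRRf"]["selector"]), 1, 网页对象, "attr", 0, "data-a-dynamic-image")
    # ... more code ...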
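
The excerpt stops after reading the data-a-dynamic-image attribute, and the project's own handling of that value is not shown. As a rough sketch only, assuming the attribute holds a JSON object that maps image URLs to [width, height] pairs, the largest image could be selected as follows. The function name largest_image_url and the sample URLs are hypothetical.

```python
import json


def largest_image_url(dynamic_image_attr: str) -> str:
    """Return the URL with the largest pixel area, or "" if nothing can be parsed.

    Assumes the attribute is a JSON object of the form {url: [width, height], ...}.
    """
    try:
        images = json.loads(dynamic_image_attr)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON.
        return ""
    if not isinstance(images, dict) or not images:
        return ""
    # Compare candidates by width * height.
    return max(images, key=lambda url: images[url][0] * images[url][1])


sample_attr = (
    '{"https://example.com/a.jpg": [300, 300],'
    ' "https://example.com/b.jpg": [679, 679]}'
)
print(largest_image_url(sample_attr))  # https://example.com/b.jpg
```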

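3. Forbidden-word check algorithm

The check function works in two steps. It first collects every forbidden word that occurs in the text as a substring. Words made up only of English letters and spaces must then also match as whole words, meaning they are not directly preceded or followed by another letter; for example, "gun" does not match inside "begun". All other words are kept on the substring match alone: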

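# projects/code_modules/utils.py
import re

def check_forbidden_words(text: str, forbidden_words: list) -> str:
    """
    Check whether the text contains forbidden words.
    @param text: the text to check
    @param forbidden_words: list of forbidden words
    @return: the forbidden words found, joined by commas (empty string if none)
    """
    # First pass: case-sensitive substring match
    hit_words = [item for item in forbidden_words if item in text]
    def _check_english_word(word: str) -> bool:
        # Words that are not purely English letters and spaces pass on the substring match alone
        if not re.match(r'^[a-zA-Z ]+$', word):
            return True
        # English words must match as whole words: no letter directly before or after
        return re.search(r'(^|[^a-zA-Z])' + re.escape(word) + r'($|[^a-zA-Z])', text) is not None
    return ",".join(filter(_check_english_word, hit_words))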
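
Because the first pass uses a plain `in` test, the function above is case-sensitive: "Best" in a title would not hit a library entry "best". The following is a hypothetical case-insensitive variant, not the project's code, that expresses the same whole-word rule with lookarounds. The name find_forbidden_words and the sample text are made up for illustration.

```python
import re
from typing import List


def find_forbidden_words(text: str, forbidden_words: List[str]) -> List[str]:
    """Return the forbidden words found in text, ignoring case (hypothetical variant)."""
    hits = []
    for word in forbidden_words:
        if re.fullmatch(r"[a-zA-Z ]+", word):
            # English term: must not be directly preceded or followed by a letter.
            pattern = r"(?<![a-zA-Z])" + re.escape(word) + r"(?![a-zA-Z])"
        else:
            # Any other term: plain substring match.
            pattern = re.escape(word)
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(word)
    return hits


sample = "Best Anti-Aging cream, cures acne. Begun shipping today."
print(find_forbidden_words(sample, ["cure", "anti-aging", "gun", "best"]))
# ['anti-aging', 'best']
```

Here "cure" and "gun" are rejected because they only occur inside "cures" and "Begun", while "anti-aging" contains a hyphen and is therefore matched as a plain substring.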