Build an API to Keep Your Marketing Emails Out of the Spam Folder


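When running an email marketing campaign, one of the biggest challenges is making sure your messages land in the inbox rather than the spam folder.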

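Apache SpamAssassin is a tool widely used by email clients and email filtering products to classify messages as spam. In this post, we'll look at how to use SpamAssassin to check whether your email would be flagged as spam, and why.
The logic will be packaged as an API and deployed online so that it can be integrated into your workflow.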

Why Apache SpamAssassin?

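Apache SpamAssassin is an open-source spam detection platform maintained by the Apache Software Foundation. It uses a variety of rules, Bayesian filtering, and network tests to assign a spam "score" to a given email. Generally, an email that scores 5 or higher is at high risk of being flagged as spam.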

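Because SpamAssassin's scoring is transparent and well documented, you can also use it to pinpoint exactly which aspects of your email are driving a high spam score, and improve your writing accordingly.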
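As a rough illustration of how per-rule scores can point to what needs fixing, here is a minimal sketch. The helper name `summarize_hits`, the dictionary shape, and the sample values are assumptions made for this example; the sum of rounded per-rule scores only approximates the total SpamAssassin itself reports.

```python
# Hypothetical helper: summarize rule hits shaped like
# {"rule": str, "score": float, "description": str}.
SPAM_THRESHOLD = 5.0  # the commonly used cutoff of 5 points


def summarize_hits(hits, threshold=SPAM_THRESHOLD):
    # Approximate total: per-rule scores are already rounded to one decimal.
    total = round(sum(hit["score"] for hit in hits), 1)
    # Rules with the largest positive scores are the first candidates to fix.
    worst_first = sorted(
        (hit for hit in hits if hit["score"] > 0),
        key=lambda hit: hit["score"],
        reverse=True,
    )
    return {
        "total": total,
        "likely_spam": total >= threshold,
        "top_offenders": [hit["rule"] for hit in worst_first[:3]],
    }


# Illustrative values only, not real SpamAssassin output.
example_hits = [
    {"rule": "RULE_A", "score": 3.2, "description": "made-up rule A"},
    {"rule": "RULE_B", "score": 2.1, "description": "made-up rule B"},
    {"rule": "RULE_C", "score": -0.5, "description": "made-up rule C"},
]
print(summarize_hits(example_hits))
```

Sorting by score puts the most expensive rules first, which is usually where rewriting the email pays off most.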

Getting started with SpamAssassin

SpamAssassin is designed to run on Linux systems. You'll need a Linux machine or a Docker container to install and run it.

On Debian or Ubuntu systems, install SpamAssassin with the following commands:

apt-get update && apt-get install -y spamassassin
sa-update

The sa-update command ensures that SpamAssassin's rules are up to date.

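Once installed, you can pipe an email message into SpamAssassin's command-line tool. The output includes an annotated version of the email with its spam score and an explanation of which rules were triggered.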

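A typical invocation might look like this (the input file name is a placeholder):

spamassassin -t < input_email.txt > results.txt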

results.txt will contain the processed email along with SpamAssassin's headers and score.
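To read the verdict out of that annotated output programmatically, one option is to look for the X-Spam-Status header that SpamAssassin typically adds. The sketch below assumes the usual `Yes|No, score=... required=...` layout of that header; the function name `parse_spam_status` and the sample text are hand-written for illustration.

```python
import re

# Assumed header shape: "X-Spam-Status: No, score=0.3 required=5.0 tests=..."
STATUS_RE = re.compile(
    r"^X-Spam-Status:\s*(Yes|No),\s*"
    r"score=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)",
    re.MULTILINE,
)


def parse_spam_status(annotated_email):
    """Return verdict, score and threshold, or None if the header is absent."""
    match = STATUS_RE.search(annotated_email)
    if match is None:
        return None
    verdict, score, required = match.groups()
    return {
        "is_spam": verdict == "Yes",
        "score": float(score),
        "required": float(required),
    }


# Hand-written sample, not captured from a real run.
sample = (
    "X-Spam-Checker-Version: SpamAssassin on example-host\n"
    "X-Spam-Status: No, score=0.3 required=5.0 tests=HTML_MESSAGE,MISSING_MID\n"
    "Subject: test email\n"
)
print(parse_spam_status(sample))
```

With a real run, the same function could be applied to the contents of results.txt, for example `parse_spam_status(open("results.txt").read())`.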

Wrapping SpamAssassin as an API with FastAPI

Next, let's create a simple API that accepts two email fields: subject and html_body. It passes the fields to SpamAssassin and returns the validation results.

FastAPI code example

from fastapi import FastAPI
from datetime import datetime, timezone
from email.utils import format_datetime
from pydantic import BaseModel
import subprocess
import re


def extract_analysis_details(text):
    # Grab the rule table that follows "Content analysis details:",
    # up to the first blank line after it.
    rules_section = re.search(
        r"Content analysis details:.*?(pts rule name.*?description.*?)\n\n",
        text,
        re.DOTALL,
    )
    if not rules_section:
        return []

    rules_text = rules_section.group(1)
    # Each row is "<score> <RULE_NAME> <description>". Requiring a real number
    # for the score skips the dashed separator row and wrapped description lines.
    pattern = r"^\s*(-?\d+(?:\.\d+)?)\s+(\S+)\s+(.+)$"
    rules = []
    for line in rules_text.splitlines()[1:]:  # skip the column header row
        match = re.match(pattern, line)
        if match:
            score, rule, description = match.groups()
            rules.append({
                "rule": rule,
                "score": float(score),
                "description": description.strip()
            })
    return rules


app = FastAPI()


class Email(BaseModel):
    subject: str
    html_body: str


@app.post("/spam_check")
def spam_check(email: Email):
    # Assemble the full email: headers, a blank line, then the HTML body
    message = f"""From: example@example.com
To: recipient@example.com
Subject: {email.subject}
Date: {format_datetime(datetime.now(timezone.utc))}
Content-Type: text/html; charset="utf-8"

{email.html_body}"""

    # Run SpamAssassin in test mode and capture the output directly
    output = subprocess.run(["spamassassin", "-t"],
                            input=message.encode('utf-8'),
                            capture_output=True)

    output_str = output.stdout.decode('utf-8', errors='replace')
    details = extract_analysis_details(output_str)
    return {"result": details}

The response will contain the analysis details from SpamAssassin's results.
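For integrating the endpoint into a workflow, a client could be sketched with only the standard library as follows. The URL (a server on localhost port 8000) and the helper names `build_request` and `check_email` are assumptions for this example, not part of the API itself.

```python
import json
import urllib.request

# Assumption: the API is running locally on port 8000.
API_URL = "http://localhost:8000/spam_check"


def build_request(subject, html_body, url=API_URL):
    """Build the POST request carrying the two fields the endpoint expects."""
    payload = json.dumps({"subject": subject, "html_body": html_body})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_email(subject, html_body, url=API_URL):
    """Send the request and return the list stored under "result"."""
    request = build_request(subject, html_body, url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]


# Example call (requires the server to be running):
# hits = check_email("test email", "<p>hello</p>")
```

Keeping request construction separate from the network call makes the payload easy to inspect or test without a running server.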

Let's take this input as an example:

subject: test email

html_body:
<html>
  <body>
    <p>this is an <b>html</b> test email.</p>
  </body>
</html>

The result field of the response would look like this:

[
  {
    "rule": "MISSING_MID",
    "score": 0.1,
    "description": "Missing Message-Id: header"
  },
  {
    "rule": "NO_RECEIVED",
    "score": -0.0,
    "description": "Informational: message has no Received headers"
  },
  {
    "rule": "NO_RELAYS",
    "score": -0.0,
    "description": "Informational: message was not relayed via SMTP"
  },
  {
    "rule": "HTML_MESSAGE",
    "score": 0.0,
    "description": "BODY: HTML included in message"
  },
  {
    "rule": "MIME_HTML_ONLY",
    "score": 0.1,
    "description": "BODY: Message only has text/html MIME parts"
  },
  {
    "rule": "MIME_HEADER_CTYPE_ONLY",
    "score": 0.1,
    "description": "'Content-Type' found without required MIME headers"
  }
]
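Two of these hits, MISSING_MID and MIME_HEADER_CTYPE_ONLY, appear to stem from how the test message is assembled rather than from the email's content. One possible variation, sketched here and not part of the API above, is to build the message with Python's `email.message.EmailMessage` so that a Message-ID and MIME-Version header are present. The function name `build_message` and the `example.com` domain are placeholders.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime, make_msgid


def build_message(subject, html_body):
    """Assemble a test message with the headers a real mailer would add."""
    msg = EmailMessage()
    msg["From"] = "example@example.com"
    msg["To"] = "recipient@example.com"
    msg["Subject"] = subject
    msg["Date"] = format_datetime(datetime.now(timezone.utc))
    # Placeholder domain; a real sender would use its own.
    msg["Message-ID"] = make_msgid(domain="example.com")
    # set_content fills in Content-Type, MIME-Version and a transfer encoding.
    msg.set_content(html_body, subtype="html")
    return msg.as_bytes()


raw = build_message("test email", "<p>this is a test</p>")
print(raw.decode("ascii"))
```

The returned bytes could be passed as the `input` argument of `subprocess.run` in place of the hand-built string. Content-related rules such as MIME_HTML_ONLY would still be expected to apply.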