Automate Content Moderation with n8n + Ollama (AI Comment Filter)

Published March 24, 2026 · 12 min read · By Workflow Forge

Content moderation is one of the most expensive and mentally draining tasks for any online community or platform. Cloud moderation APIs charge $1–3 per 1,000 requests and send every piece of user content to a third party. With n8n and Ollama, you can build a fully automated moderation pipeline that runs on your own hardware — zero per-request costs, unlimited volume, and complete data privacy.

In this tutorial you'll build a workflow that:

  1. Receives user-generated content (comments, posts, reviews) via webhook
  2. Classifies each piece of content into safe, toxic, spam, inappropriate, or needs review
  3. Auto-approves safe content and auto-rejects obvious spam/toxic content
  4. Routes borderline content to a human review queue
  5. Logs every moderation decision for audit compliance

Why Local AI for Content Moderation?

| Feature | Cloud APIs (Perspective, OpenAI) | Self-Hosted (n8n + Ollama) |
| --- | --- | --- |
| Cost per 1K requests | $1–3 | $0 (your hardware) |
| Data privacy | Content sent to third party | Never leaves your server |
| Custom rules | Limited to provider's categories | Fully customizable prompt |
| Latency | 200–500ms (network + inference) | 100–300ms (local inference only) |
| Rate limits | Yes (often 1–10 QPS) | No limits beyond your hardware |
| GDPR compliance | Requires DPA with provider | Inherent — data stays local |
Key advantage: For communities handling sensitive content (healthcare forums, children's platforms, internal corporate tools), keeping moderation fully local eliminates a major compliance headache. You never have to worry about a third party processing user data.

Prerequisites

  • A running self-hosted n8n instance (with the Webhook, Code, Switch, and HTTP Request nodes)
  • Ollama running locally with at least one model pulled (e.g. ollama pull mistral:7b)
  • An application that can send new content to the workflow via HTTP POST

Architecture Overview

User Content (webhook)
    ↓
[Preprocessing] — Strip HTML, normalize text, extract metadata
    ↓
[Ollama Classification] — Classify: safe / toxic / spam / inappropriate / needs_review
    ↓
[Decision Router] — Switch node based on classification
    ↓                   ↓                    ↓
[Auto-Approve]     [Auto-Reject]        [Human Review Queue]
    ↓                   ↓                    ↓
[Callback API]     [Notify User]        [Slack/Email Alert]
    ↓                   ↓                    ↓
[Audit Log]        [Audit Log]          [Audit Log]

Step-by-Step Build

STEP 1 — Webhook Trigger

Receive content to moderate

Add a Webhook node as the trigger. Your application sends a POST request whenever new content is submitted:

POST /webhook/moderate
Content-Type: application/json

{
  "content_id": "comment_12345",
  "text": "This is the user's comment text",
  "author_id": "user_789",
  "content_type": "comment",
  "context": "product-review"
}
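
On the application side, the request can be built with a small helper. A sketch, where the webhook URL assumes n8n's default port 5678 and the helper name is ours:

```javascript
// Hypothetical client-side helper: submit new content to the n8n webhook.
// Adjust WEBHOOK_URL to wherever your n8n instance is reachable.
const WEBHOOK_URL = "http://localhost:5678/webhook/moderate"; // assumption: default n8n port

function buildModerationRequest(contentId, text, authorId, contentType, context) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      content_id: contentId,
      text,
      author_id: authorId,
      content_type: contentType,
      context,
    }),
  };
}

// Usage (Node 18+ has a global fetch):
// await fetch(WEBHOOK_URL, buildModerationRequest(
//   "comment_12345", "Great product!", "user_789", "comment", "product-review"));
```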
STEP 2 — Preprocessing

Clean and normalize input

Add a Code node to strip HTML tags, normalize whitespace, and extract metadata that helps with classification:

// Strip HTML and normalize
const text = $input.item.json.text
  .replace(/<[^>]*>/g, '')
  .replace(/\s+/g, ' ')
  .trim();

// Extract basic signals
const hasUrls = /https?:\/\/\S+/i.test(text);
const hasExcessiveCaps = text.length > 0 && (text.replace(/[^A-Z]/g, '').length / text.length) > 0.5;
const wordCount = text ? text.split(/\s+/).length : 0;

return {
  json: {
    ...$input.item.json,
    cleaned_text: text,
    signals: { hasUrls, hasExcessiveCaps, wordCount }
  }
};
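
To sanity-check this logic outside n8n, the same preprocessing can be run as a standalone function (with a guard so an empty or all-markup input doesn't divide by zero):

```javascript
// Standalone version of the preprocessing Code node, for testing outside n8n.
function preprocess(raw) {
  // Strip HTML tags and collapse whitespace.
  const text = raw.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();

  // Basic signals that help the classifier.
  const hasUrls = /https?:\/\/\S+/i.test(text);
  const hasExcessiveCaps =
    text.length > 0 && text.replace(/[^A-Z]/g, "").length / text.length > 0.5;
  const wordCount = text ? text.split(/\s+/).length : 0;

  return { cleaned_text: text, signals: { hasUrls, hasExcessiveCaps, wordCount } };
}
```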
STEP 3 — Ollama Classification

Classify content with local AI

Add an HTTP Request node to call your local Ollama instance. The prompt is the most critical part — it defines your moderation policy:

POST http://localhost:11434/api/generate

{
  "model": "mistral:7b",
  "prompt": "You are a content moderator. Classify the following user-generated content into exactly one category.\n\nCategories:\n- SAFE: Normal, constructive content. Opinions, questions, feedback are fine.\n- TOXIC: Hate speech, threats, severe insults, harassment, discrimination.\n- SPAM: Promotional content, repeated text, SEO spam, phishing links.\n- INAPPROPRIATE: Sexual content, graphic violence, personal information exposure.\n- NEEDS_REVIEW: Borderline content that could go either way. Sarcasm, dark humor, heated but not hateful debate.\n\nContent to classify:\n\"{{cleaned_text}}\"\n\nAdditional signals:\n- Contains URLs: {{signals.hasUrls}}\n- Excessive capitals: {{signals.hasExcessiveCaps}}\n- Word count: {{signals.wordCount}}\n\nRespond with ONLY a JSON object:\n{\"category\": \"SAFE|TOXIC|SPAM|INAPPROPRIATE|NEEDS_REVIEW\", \"confidence\": 0.0-1.0, \"reason\": \"brief explanation\"}",
  "stream": false,
  "options": {
    "temperature": 0.1,
    "num_predict": 100
  }
}

Setting temperature: 0.1 keeps the output close to deterministic, so the same input reliably gets the same label. The num_predict: 100 cap prevents the model from rambling; moderation decisions should be short.
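
If you'd rather assemble the request in a Code node than in the HTTP Request node's expression syntax, the same body can be built in JavaScript. This is a condensed sketch of the prompt above; buildOllamaBody is our name, not an n8n built-in:

```javascript
// Sketch: build the Ollama /api/generate request body from a preprocessed item.
// The full category definitions from the prompt above belong here; this is abridged.
function buildOllamaBody(item) {
  const prompt =
    `You are a content moderator. Classify the following user-generated content ` +
    `into exactly one category (SAFE, TOXIC, SPAM, INAPPROPRIATE, NEEDS_REVIEW).\n\n` +
    `Content to classify:\n"${item.cleaned_text}"\n\n` +
    `Additional signals:\n` +
    `- Contains URLs: ${item.signals.hasUrls}\n` +
    `- Excessive capitals: ${item.signals.hasExcessiveCaps}\n` +
    `- Word count: ${item.signals.wordCount}\n\n` +
    `Respond with ONLY a JSON object: ` +
    `{"category": "SAFE|TOXIC|SPAM|INAPPROPRIATE|NEEDS_REVIEW", "confidence": 0.0-1.0, "reason": "brief explanation"}`;

  return {
    model: "mistral:7b",
    prompt,
    stream: false,
    options: { temperature: 0.1, num_predict: 100 },
  };
}
```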

STEP 4 — Parse Response

Extract the classification

Add a Code node to parse Ollama's JSON response with error handling:

const response = $input.item.json.response;

// Extract JSON from response (model sometimes wraps in markdown)
const jsonMatch = response.match(/\{[\s\S]*?\}/);
let classification;

try {
  classification = JSON.parse(jsonMatch[0]);
} catch (e) {
  // Fallback: if JSON parsing fails, send to human review
  classification = {
    category: "NEEDS_REVIEW",
    confidence: 0,
    reason: "Failed to parse AI response"
  };
}

// Normalize category
const validCategories = ["SAFE", "TOXIC", "SPAM", "INAPPROPRIATE", "NEEDS_REVIEW"];
if (!validCategories.includes(classification.category)) {
  classification.category = "NEEDS_REVIEW";
}

return {
  json: {
    ...$input.item.json,
    moderation: classification
  }
};
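
Because this parsing logic is pure, it's easy to unit-test outside n8n. The same code as a standalone function:

```javascript
// Standalone version of the parsing Code node: extract and validate the
// classification JSON, falling back to human review on any failure.
function parseClassification(response) {
  const validCategories = ["SAFE", "TOXIC", "SPAM", "INAPPROPRIATE", "NEEDS_REVIEW"];
  const jsonMatch = response.match(/\{[\s\S]*?\}/);
  let classification;

  try {
    classification = JSON.parse(jsonMatch[0]); // throws if no match or invalid JSON
  } catch (e) {
    return { category: "NEEDS_REVIEW", confidence: 0, reason: "Failed to parse AI response" };
  }

  // Normalize unknown categories to human review.
  if (!validCategories.includes(classification.category)) {
    classification.category = "NEEDS_REVIEW";
  }
  return classification;
}
```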
STEP 5 — Decision Router

Route based on classification

Add a Switch node that routes content based on the category:

  • SAFE → Auto-approve (callback to your API)
  • TOXIC / SPAM → Auto-reject (callback + notify user)
  • INAPPROPRIATE → Auto-reject with specific reason
  • NEEDS_REVIEW → Human review queue (Slack/email alert)

Switch condition: {{ $json.moderation.category }}
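
If you prefer a Code node over the Switch node, the same routing fits in a small function (the branch names are illustrative, not n8n identifiers):

```javascript
// Code-node alternative to the Switch node: map each category to the
// branch taken in the architecture diagram above.
function routeDecision(category) {
  switch (category) {
    case "SAFE":
      return "auto_approve"; // callback to your API
    case "TOXIC":
    case "SPAM":
    case "INAPPROPRIATE":
      return "auto_reject"; // callback + notify user
    default:
      return "human_review"; // NEEDS_REVIEW and anything unexpected
  }
}
```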

STEP 6 — Callbacks & Notifications

Act on the decision

For auto-approve, send a callback to your application:

POST https://your-app.com/api/moderation/callback

{
  "content_id": "{{content_id}}",
  "action": "approve",
  "category": "{{moderation.category}}",
  "confidence": {{moderation.confidence}}
}

For human review, send a Slack message with the content and one-click approve/reject buttons:

{
  "text": "Content needs review",
  "blocks": [
    {
      "type": "section",
      "text": {
        "type": "mrkdwn",
        "text": "*Content ID:* {{content_id}}\n*Author:* {{author_id}}\n*AI Classification:* {{moderation.category}} ({{moderation.confidence}})\n*Reason:* {{moderation.reason}}\n\n> {{cleaned_text}}"
      }
    }
  ]
}
STEP 7 — Audit Log

Log every decision for compliance

Every moderation action must be logged. Add a final Spreadsheet/Database node that records:

{
  "timestamp": "{{$now.toISO()}}",
  "content_id": "{{content_id}}",
  "author_id": "{{author_id}}",
  "category": "{{moderation.category}}",
  "confidence": {{moderation.confidence}},
  "reason": "{{moderation.reason}}",
  "action_taken": "approve|reject|review",
  "reviewed_by": "ai"
}

This audit trail is essential for GDPR compliance (users can request why their content was rejected) and for improving your moderation rules over time.

Tuning Your Moderation Policy

Confidence thresholds

Don't trust the AI blindly. Add confidence-based routing: even when the category is clear-cut, escalate to human review whenever the model's reported confidence falls below a threshold you choose.
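
A minimal sketch of such a gate, reflecting one reasonable policy; the 0.85 default is illustrative and should be tuned on your own moderation data:

```javascript
// Confidence gate: only let the AI act automatically when it is confident.
// Low-confidence verdicts are demoted to NEEDS_REVIEW for a human.
function applyConfidenceGate(moderation, threshold = 0.85) {
  if (moderation.category !== "NEEDS_REVIEW" && moderation.confidence < threshold) {
    return {
      ...moderation,
      category: "NEEDS_REVIEW",
      reason: `Low confidence (${moderation.confidence}): ${moderation.reason}`,
    };
  }
  return moderation;
}
```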

Custom rules per context

Different content types need different policies. A product review has different standards than a children's forum. Adjust the prompt per context field:

// In the preprocessing Code node:
const prompts = {
  "product-review": "Allow negative opinions but flag personal attacks...",
  "kids-forum": "Strict moderation. Flag anything remotely inappropriate...",
  "internal-chat": "Relaxed moderation. Only flag clearly toxic content..."
};

const moderationPrompt = prompts[$input.item.json.context] || prompts["product-review"];

Model selection

| Model | Speed | Accuracy | Best for |
| --- | --- | --- | --- |
| mistral:7b | Fast (~200ms) | Good | High-volume, clear-cut content |
| qwen2.5:14b | Medium (~500ms) | Very good | Nuanced content, sarcasm detection |
| llama3.1:8b | Fast (~250ms) | Good | General purpose, good multilingual |
| qwen2.5:32b | Slow (~1.5s) | Excellent | Second-pass review of borderline content |
Pro tip: Use a two-pass system for production. The first pass uses a fast 7B model for initial classification. Only content classified as NEEDS_REVIEW gets a second pass with the more accurate 14B or 32B model. This gives you the speed of the small model for 80%+ of content while maintaining accuracy on the edge cases.
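
The escalation rule for the second pass can be very small. A sketch, where the confidence threshold is again an illustrative value:

```javascript
// Two-pass escalation rule: run the bigger model only when the fast model
// either punted (NEEDS_REVIEW) or wasn't confident in its verdict.
function needsSecondPass(moderation, threshold = 0.85) {
  return moderation.category === "NEEDS_REVIEW" || moderation.confidence < threshold;
}
```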

Scaling to Production

Batch processing

For platforms with high volume, modify the webhook to accept arrays of content items and process them in parallel using n8n's Split In Batches node. A single 7B model on a modern GPU can handle 10–20 classifications per second.
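
If you handle the chunking yourself in a Code node instead of using Split In Batches, the batching logic is a few lines:

```javascript
// Split an array of content items into fixed-size batches for parallel
// classification (what n8n's Split In Batches node does for you).
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```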

Feedback loop

When human reviewers override AI decisions, log the correction. Periodically review these overrides to improve your prompts: recurring misclassifications usually point to a category definition that needs sharpening.
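
A small helper can summarize those overrides from your audit log. A sketch, assuming each row carries the fields from Step 7 plus a hypothetical human_decision field recorded when a reviewer overrides the AI:

```javascript
// Compute an override rate per category from audit-log rows.
// Field names (category, action_taken, human_decision) are assumptions
// matching the audit record shown in Step 7.
function overrideStats(auditRows) {
  const stats = {};
  for (const row of auditRows) {
    const s = (stats[row.category] ??= { total: 0, overridden: 0 });
    s.total += 1;
    // An override is a human decision that disagrees with the AI's action.
    if (row.human_decision && row.human_decision !== row.action_taken) {
      s.overridden += 1;
    }
  }
  return stats;
}
```

Categories with a high override rate are where your prompt needs work.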

Multi-language support

Ollama models like qwen2.5 and llama3.1 support 20+ languages natively. Add a language detection step before classification, and use language-specific prompt variants for better accuracy.
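
A minimal sketch of the prompt-variant lookup, assuming a language code supplied by an upstream detection step (the codes and variant texts are examples, not a complete policy):

```javascript
// Pick a language-specific moderation prompt, falling back to English.
// Language detection itself would happen in a prior node or library.
const promptVariants = {
  en: "You are a content moderator. Classify the following English content...",
  de: "You are a content moderator. The content below is German; judge it in context...",
};

function pickPrompt(langCode) {
  return promptVariants[langCode] || promptVariants.en;
}
```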

Common Use Cases

  • Product review and e-commerce comment screening
  • Forum posts and comment sections on community sites
  • Privacy-sensitive communities such as healthcare forums and children's platforms
  • Internal corporate tools and chat where user content must stay on-premises

Want the complete workflow + 10 more AI templates?

Get the full Self-Hosted AI Workflow Pack for n8n + Ollama. 11 production-ready templates including content moderation, email automation, lead scoring, and more.

Get the Pack — $39

Summary

Content moderation is a perfect use case for local AI: it's high-volume, privacy-sensitive, and the classification task is well-suited for smaller language models. By running it on your own hardware with n8n + Ollama, you eliminate per-request costs, keep user data private, and get full control over your moderation policy.

The two-pass architecture (fast model for initial screening + accurate model for edge cases) gives you the best of both worlds: speed for the 80% of content that's clearly safe or clearly bad, and accuracy for the 20% that needs careful judgment.

Start with the basic pipeline, then iterate on your prompts based on real moderation data. The audit log is your best friend — it shows you exactly where the AI gets it wrong so you can fix it.