Automate Content Moderation with n8n + Ollama (AI Comment Filter)
Content moderation is one of the most expensive and mentally draining tasks for any online community or platform. Cloud moderation APIs charge $1–3 per 1,000 requests and send every piece of user content to a third party. With n8n and Ollama, you can build a fully automated moderation pipeline that runs on your own hardware — zero per-request costs, unlimited volume, and complete data privacy.
In this tutorial you'll build a workflow that:
- Receives user-generated content (comments, posts, reviews) via webhook
- Classifies each piece of content into safe, toxic, spam, inappropriate, or needs review
- Auto-approves safe content and auto-rejects obvious spam/toxic content
- Routes borderline content to a human review queue
- Logs every moderation decision for audit compliance
Why Local AI for Content Moderation?
| Feature | Cloud APIs (Perspective, OpenAI) | Self-Hosted (n8n + Ollama) |
|---|---|---|
| Cost per 1K requests | $1–3 | $0 (your hardware) |
| Data privacy | Content sent to third party | Never leaves your server |
| Custom rules | Limited to provider's categories | Fully customizable prompt |
| Latency | 200–500ms (network + inference) | 100–300ms (local inference only) |
| Rate limits | Yes (often 1–10 QPS) | No limits beyond your hardware |
| GDPR compliance | Requires DPA with provider | Inherent — data stays local |
Prerequisites
- n8n — self-hosted instance (Docker or npm install)
- Ollama — running locally with a model pulled (we'll use mistral:7b for speed or qwen2.5:14b for accuracy)
- ~4GB RAM for the 7B model, ~8GB for the 14B model
Architecture Overview
User Content (webhook)
↓
[Preprocessing] — Strip HTML, normalize text, extract metadata
↓
[Ollama Classification] — Classify: safe / toxic / spam / inappropriate / needs_review
↓
[Decision Router] — Switch node based on classification
↓ ↓ ↓
[Auto-Approve] [Auto-Reject] [Human Review Queue]
↓ ↓ ↓
[Callback API] [Notify User] [Slack/Email Alert]
↓ ↓ ↓
[Audit Log] [Audit Log] [Audit Log]
Step-by-Step Build
Receive content to moderate
Add a Webhook node as the trigger. Your application sends a POST request whenever new content is submitted:
POST /webhook/moderate
Content-Type: application/json
{
"content_id": "comment_12345",
"text": "This is the user's comment text",
"author_id": "user_789",
"content_type": "comment",
"context": "product-review"
}
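From the application side, the submission can be sketched in a few lines of Node.js. The helper names and the webhook URL below are illustrative, not part of n8n; the payload fields match the example above.

```javascript
// Hypothetical helper that builds the moderation payload your app
// would POST to the n8n webhook. Field names match the example above.
function buildModerationPayload(contentId, text, authorId, contentType, context) {
  return {
    content_id: contentId,
    text: text,
    author_id: authorId,
    content_type: contentType,
    context: context
  };
}

// Example: POST with the built-in fetch (Node 18+); the URL is a placeholder
// for wherever your n8n instance exposes the webhook.
async function sendForModeration(payload) {
  const res = await fetch('http://localhost:5678/webhook/moderate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload)
  });
  return res.json();
}
```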
Clean and normalize input
Add a Code node to strip HTML tags, normalize whitespace, and extract metadata that helps with classification:
// Strip HTML and normalize
const text = $input.item.json.text
.replace(/<[^>]*>/g, '')
.replace(/\s+/g, ' ')
.trim();
// Extract basic signals
const hasUrls = /https?:\/\/\S+/i.test(text);
const hasExcessiveCaps = text.length > 0 && (text.replace(/[^A-Z]/g, '').length / text.length) > 0.5;
const wordCount = text === '' ? 0 : text.split(/\s+/).length;
return {
json: {
...$input.item.json,
cleaned_text: text,
signals: { hasUrls, hasExcessiveCaps, wordCount }
}
};
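Outside n8n, the same cleaning and signal logic can be exercised directly, which is handy for quick testing. A minimal sketch (the preprocess function name is illustrative):

```javascript
// Standalone version of the Code node's cleaning/signal logic.
function preprocess(rawText) {
  const text = rawText
    .replace(/<[^>]*>/g, '')   // strip HTML tags
    .replace(/\s+/g, ' ')      // collapse whitespace
    .trim();
  const hasUrls = /https?:\/\/\S+/i.test(text);
  const capsRatio = text.length > 0
    ? text.replace(/[^A-Z]/g, '').length / text.length
    : 0;
  return {
    cleaned_text: text,
    signals: {
      hasUrls,
      hasExcessiveCaps: capsRatio > 0.5,
      wordCount: text === '' ? 0 : text.split(/\s+/).length
    }
  };
}
```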
Classify content with local AI
Add an HTTP Request node to call your local Ollama instance. The prompt is the most critical part — it defines your moderation policy:
POST http://localhost:11434/api/generate
{
"model": "mistral:7b",
"prompt": "You are a content moderator. Classify the following user-generated content into exactly one category.\n\nCategories:\n- SAFE: Normal, constructive content. Opinions, questions, feedback are fine.\n- TOXIC: Hate speech, threats, severe insults, harassment, discrimination.\n- SPAM: Promotional content, repeated text, SEO spam, phishing links.\n- INAPPROPRIATE: Sexual content, graphic violence, personal information exposure.\n- NEEDS_REVIEW: Borderline content that could go either way. Sarcasm, dark humor, heated but not hateful debate.\n\nContent to classify:\n\"{{cleaned_text}}\"\n\nAdditional signals:\n- Contains URLs: {{signals.hasUrls}}\n- Excessive capitals: {{signals.hasExcessiveCaps}}\n- Word count: {{signals.wordCount}}\n\nRespond with ONLY a JSON object:\n{\"category\": \"SAFE|TOXIC|SPAM|INAPPROPRIATE|NEEDS_REVIEW\", \"confidence\": 0.0-1.0, \"reason\": \"brief explanation\"}",
"stream": false,
"options": {
"temperature": 0.1,
"num_predict": 100
}
}
Setting temperature: 0.1 keeps the output close to deterministic. The num_predict: 100 limit caps the response length so the model can't ramble; moderation decisions should be short.
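The same request body can be assembled programmatically, which keeps the signals interpolation out of the raw JSON. A sketch (buildClassifyRequest is an illustrative name; the /api/generate endpoint, stream, and options fields are standard Ollama, and the category definitions are abbreviated here):

```javascript
// Build the Ollama /api/generate request body with signals interpolated.
function buildClassifyRequest(cleanedText, signals, model = 'mistral:7b') {
  const prompt = [
    'You are a content moderator. Classify the following user-generated content into exactly one category.',
    '',
    'Categories:',
    '- SAFE: Normal, constructive content.',
    '- TOXIC: Hate speech, threats, severe insults, harassment.',
    '- SPAM: Promotional content, repeated text, phishing links.',
    '- INAPPROPRIATE: Sexual content, graphic violence, personal information exposure.',
    '- NEEDS_REVIEW: Borderline content that could go either way.',
    '',
    `Content to classify:\n"${cleanedText}"`,
    '',
    `Additional signals:\n- Contains URLs: ${signals.hasUrls}\n- Excessive capitals: ${signals.hasExcessiveCaps}\n- Word count: ${signals.wordCount}`,
    '',
    'Respond with ONLY a JSON object:',
    '{"category": "SAFE|TOXIC|SPAM|INAPPROPRIATE|NEEDS_REVIEW", "confidence": 0.0-1.0, "reason": "brief explanation"}'
  ].join('\n');

  return {
    model,
    prompt,
    stream: false,
    options: { temperature: 0.1, num_predict: 100 }
  };
}
```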
Extract the classification
Add a Code node to parse Ollama's JSON response with error handling:
const response = $input.item.json.response;
// Extract JSON from response (model sometimes wraps in markdown)
const jsonMatch = response.match(/\{[\s\S]*?\}/);
let classification;
try {
classification = JSON.parse(jsonMatch[0]);
} catch (e) {
// Fallback: if JSON parsing fails, send to human review
classification = {
category: "NEEDS_REVIEW",
confidence: 0,
reason: "Failed to parse AI response"
};
}
// Normalize category
const validCategories = ["SAFE", "TOXIC", "SPAM", "INAPPROPRIATE", "NEEDS_REVIEW"];
if (!validCategories.includes(classification.category)) {
classification.category = "NEEDS_REVIEW";
}
return {
json: {
...$input.item.json,
moderation: classification
}
};
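Isolated as a plain function, the parsing logic is easy to test against the two common failure modes: the model wrapping its JSON in markdown fences, and the model returning garbage. A sketch (parseClassification is an illustrative name):

```javascript
const VALID_CATEGORIES = ['SAFE', 'TOXIC', 'SPAM', 'INAPPROPRIATE', 'NEEDS_REVIEW'];

// Parse raw model output into a safe classification object.
// Anything unparseable falls back to human review.
function parseClassification(raw) {
  const fallback = {
    category: 'NEEDS_REVIEW',
    confidence: 0,
    reason: 'Failed to parse AI response'
  };
  const match = (raw || '').match(/\{[\s\S]*?\}/);
  if (!match) return fallback;
  try {
    const parsed = JSON.parse(match[0]);
    if (!VALID_CATEGORIES.includes(parsed.category)) {
      parsed.category = 'NEEDS_REVIEW';
    }
    return parsed;
  } catch (e) {
    return fallback;
  }
}
```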
Route based on classification
Add a Switch node that routes content based on the category:
- SAFE → Auto-approve (callback to your API)
- TOXIC / SPAM → Auto-reject (callback + notify user)
- INAPPROPRIATE → Auto-reject with specific reason
- NEEDS_REVIEW → Human review queue (Slack/email alert)
Switch condition: {{ $json.moderation.category }}
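The Switch node's branching can be expressed as a plain function, which mirrors the four routes listed above (the action strings are illustrative labels, not n8n constants):

```javascript
// Map a classification category to a pipeline action,
// mirroring the Switch node's four outputs.
function routeDecision(category) {
  switch (category) {
    case 'SAFE':          return 'approve';
    case 'TOXIC':
    case 'SPAM':          return 'reject';
    case 'INAPPROPRIATE': return 'reject_with_reason';
    default:              return 'review'; // NEEDS_REVIEW and anything unexpected
  }
}
```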
Act on the decision
For auto-approve, send a callback to your application:
POST https://your-app.com/api/moderation/callback
{
"content_id": "{{content_id}}",
"action": "approve",
"category": "{{moderation.category}}",
"confidence": {{moderation.confidence}}
}
For human review, send a Slack message with the content and one-click approve/reject buttons:
{
"text": "Content needs review",
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "*Content ID:* {{content_id}}\n*Author:* {{author_id}}\n*AI Classification:* {{moderation.category}} ({{moderation.confidence}})\n*Reason:* {{moderation.reason}}\n\n> {{cleaned_text}}"
}
}
]
}
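Building that payload in a Code node before the Slack node keeps the templating in one place. A sketch (buildReviewAlert is an illustrative name; the block layout follows Slack's Block Kit section block):

```javascript
// Build the Slack Block Kit message for the human review queue.
function buildReviewAlert(item) {
  const m = item.moderation;
  return {
    text: 'Content needs review',
    blocks: [{
      type: 'section',
      text: {
        type: 'mrkdwn',
        text: `*Content ID:* ${item.content_id}\n*Author:* ${item.author_id}\n` +
              `*AI Classification:* ${m.category} (${m.confidence})\n` +
              `*Reason:* ${m.reason}\n\n> ${item.cleaned_text}`
      }
    }]
  };
}
```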
Log every decision for compliance
Every moderation action must be logged. Add a final Spreadsheet/Database node that records:
{
"timestamp": "{{$now.toISO()}}",
"content_id": "{{content_id}}",
"author_id": "{{author_id}}",
"category": "{{moderation.category}}",
"confidence": {{moderation.confidence}},
"reason": "{{moderation.reason}}",
"action_taken": "approve|reject|review",
"reviewed_by": "ai"
}
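As a plain function, the log entry can be assembled like this (buildAuditRecord is an illustrative name; the fields match the record above):

```javascript
// Assemble one audit-log row for a moderation decision.
function buildAuditRecord(item, actionTaken) {
  return {
    timestamp: new Date().toISOString(),
    content_id: item.content_id,
    author_id: item.author_id,
    category: item.moderation.category,
    confidence: item.moderation.confidence,
    reason: item.moderation.reason,
    action_taken: actionTaken, // 'approve' | 'reject' | 'review'
    reviewed_by: 'ai'
  };
}
```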
This audit trail is essential for GDPR compliance (users can request why their content was rejected) and for improving your moderation rules over time.
Tuning Your Moderation Policy
Confidence thresholds
Don't trust the AI blindly. Add confidence-based routing:
- High confidence (>0.85) — Auto-action (approve or reject)
- Medium confidence (0.5–0.85) — Send to human review regardless of category
- Low confidence (<0.5) — Always send to human review
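Layered on top of the category, the thresholds reduce to a small helper (a sketch using the values from the list above; since both medium and low confidence route to review, the check collapses to a single cutoff):

```javascript
// Apply confidence thresholds on top of the category decision.
// High confidence: trust the AI's category. Otherwise: escalate to a human.
function applyConfidence(category, confidence) {
  if (confidence > 0.85) return category; // auto-action
  return 'NEEDS_REVIEW';                  // medium and low both go to review
}
```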
Custom rules per context
Different content types need different policies. A product review has different standards than a children's forum. Adjust the prompt per context field:
// In the preprocessing Code node:
const prompts = {
"product-review": "Allow negative opinions but flag personal attacks...",
"kids-forum": "Strict moderation. Flag anything remotely inappropriate...",
"internal-chat": "Relaxed moderation. Only flag clearly toxic content..."
};
const moderationPrompt = prompts[$input.item.json.context] || prompts["product-review"];
Model selection
| Model | Speed | Accuracy | Best for |
|---|---|---|---|
| mistral:7b | Fast (~200ms) | Good | High-volume, clear-cut content |
| qwen2.5:14b | Medium (~500ms) | Very good | Nuanced content, sarcasm detection |
| llama3.1:8b | Fast (~250ms) | Good | General purpose, good multilingual |
| qwen2.5:32b | Slow (~1.5s) | Excellent | Second-pass review of borderline content |
Scaling to Production
Batch processing
For platforms with high volume, modify the webhook to accept arrays of content items and process them in parallel using n8n's Split In Batches node. A single 7B model on a modern GPU can handle 10–20 classifications per second.
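Outside a workflow, the batching idea is just fixed-size chunking before classification; the Split In Batches node does the equivalent inside n8n. A sketch (chunk is an illustrative helper):

```javascript
// Split an array of content items into batches of a fixed size.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```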
Feedback loop
When human reviewers override AI decisions, log the correction. Periodically review these overrides to improve your prompts:
- If the AI flags too many safe comments as toxic → adjust the TOXIC definition in your prompt
- If spam is getting through → add more spam indicators to the signals
- If a specific topic keeps getting flagged incorrectly → add it as an exception in the prompt
Multi-language support
Ollama models like qwen2.5 and llama3.1 support 20+ languages natively. Add a language detection step before classification, and use language-specific prompt variants for better accuracy.
Common Use Cases
- Forum/community moderation — Auto-moderate user posts and comments
- Product review filtering — Filter fake reviews, spam, and inappropriate content
- Support ticket triage — Flag abusive customer messages for manager review
- Internal tool protection — Moderate AI chatbot outputs before they reach users
- Social media management — Pre-screen comments on your brand's social accounts
- E-commerce listing moderation — Verify product descriptions meet platform guidelines
Want the complete workflow + 10 more AI templates?
Get the full Self-Hosted AI Workflow Pack for n8n + Ollama. 11 production-ready templates including content moderation, email automation, lead scoring, and more.
Get the Pack — $39
Summary
Content moderation is a perfect use case for local AI: it's high-volume, privacy-sensitive, and the classification task is well-suited for smaller language models. By running it on your own hardware with n8n + Ollama, you eliminate per-request costs, keep user data private, and get full control over your moderation policy.
The two-pass architecture (fast model for initial screening + accurate model for edge cases) gives you the best of both worlds: speed for the 80% of content that's clearly safe or clearly bad, and accuracy for the 20% that needs careful judgment.
Start with the basic pipeline, then iterate on your prompts based on real moderation data. The audit log is your best friend — it shows you exactly where the AI gets it wrong so you can fix it.