n8n + AI Models 2026: GPT, Claude, Gemini, Ollama

Wito AI

n8n + AI Models Integration 2026: GPT, Claude, Gemini, Local Models

How SMEs connect AI models directly in n8n workflows — from cloud APIs (OpenAI, Anthropic, Google) to EU providers (Aleph Alpha) and 100% on-premises with Ollama. GDPR-compliant, pricing comparison, 6 production-ready workflow templates.


n8n + AI models refers to the direct integration of large language models and AI APIs (GPT, Claude, Gemini, Aleph Alpha, Ollama) into n8n workflow nodes — so that automated processes no longer just move data, but understand it, classify it, generate content and make decisions.

Why AI integration in n8n is the true multiplier

Classic workflow automation moves data: a trigger fires, a node processes, a recipient receives the result. Efficient, but limited. The qualitative leap comes when a language model is embedded in the workflow: n8n can then not just forward incoming emails but understand their content, assess their urgency and draft a context-sensitive reply. Workflows no longer automate only sequences — they automate judgements.

According to the Forrester State of AI 2024, 61% of companies already use AI APIs in at least one automation workflow — an increase of 28 percentage points since 2022. The Gartner AI Hype Cycle 2024 classifies "Generative AI in Workflow Automation" as a technology reaching the "Plateau of Productivity" faster than expected: the typical productivity effect is measurable, reproducible and transferable to new workflows.

For SMEs, this translates into a concrete three-step pattern:

Step 1: workflow automation without AI (data transfers, notifications, simple transformations).
Step 2: AI enrichment (a language model evaluates content, classifies, prioritises).
Step 3: autonomous AI agents (the model makes decisions and calls further tools).

n8n supports all three stages on a single platform.

The combination of structured workflow nodes and AI models eliminates the traditional either/or choice: either rule-based automation (deterministic but rigid) or AI (flexible but hard to control). n8n bridges both: the workflow logic stays transparent and auditable, while the AI node delivers semantic intelligence. This is why the Gartner AI Hype Cycle 2024 lists n8n as an early mover in the "AI-Augmented Automation" category.

73%

of SMEs are evaluating AI integration into existing workflows (2025)

Source: Bitkom AI Study 2025

47%

cost reduction with AI-assisted data enrichment

Source: McKinsey Global Institute AI Adoption 2024

0.002 EUR/1k tokens

GPT-4o-mini — cheapest capable cloud model

Source: OpenAI Pricing 2025

100%

local processing with Ollama — no data transfer to US cloud

Source: n8n Community Nodes Marketplace 2025

Three integration paths for AI models in n8n

n8n supports three fundamentally different ways to integrate AI models into workflows. Each has its own profile of setup effort, ongoing costs, latency and GDPR maturity. The right choice depends on the data flowing through the workflow — not on general preferences.

Path 1: Cloud API (OpenAI, Anthropic, Google Gemini)

The fastest starting point: enter an API key in the n8n credential store, drag the OpenAI, Claude or Gemini node into the workflow — done. OpenAI GPT-4o-mini costs 0.002 EUR per 1,000 input tokens (as of 2025) and is ideal for classifications, summaries and straightforward generation tasks. Anthropic Claude 3.5 Haiku is in a comparable price range and delivers particularly strong results for structured output and longer documents. Google Gemini 1.5 Flash is the cheapest option for long context windows (up to 1 million tokens).

The downside: all data sent to these APIs leaves the EU and is processed on US servers. For workflows that contain no personal data (e.g. public market data, internal product copy without customer references), this is often acceptable — provided a Data Processing Agreement (DPA) under Art. 28 GDPR is concluded with the respective provider. OpenAI, Anthropic and Google all offer such DPAs.

Path 2: EU API (Aleph Alpha Luminous)

Aleph Alpha from Heidelberg operates its entire inference infrastructure in German data centres (Hetzner, Schwandorf). The Luminous model is the only fully EU-based LLM with a commercial API that n8n natively supports via a community node. The price is higher than US competitors — approximately 0.008 EUR/1k tokens for the Luminous Base model — but it offers maximum legal certainty for sensitive data: no cross-border data transfers, full EU GDPR protection level, no US Cloud Act exposure.

Aleph Alpha is especially suited for workflows that process personal customer or employee data — for example, automated ticket classification in CRM, HR document analysis or medical correspondence summarisation.

Path 3: Local models (Ollama, Llama 3, Mistral)

Ollama is an open-source tool that runs language models such as Llama 3 (Meta), Mistral 7B or Phi-3 (Microsoft) locally on a server. n8n has a native Ollama community node — the model is hosted on your own server (or a GPU-equipped Hetzner VM), with no external data transfer. This is maximum GDPR sovereignty: data never leaves your own infrastructure.

The catch: Ollama needs a GPU for good performance. An Nvidia A10G (available on Hetzner GPU servers from approximately 3 EUR/hour on-demand) is sufficient for Llama 3 8B with 4-bit quantisation. For continuously running production workflows, a dedicated GPU server is recommended (approximately 200–400 EUR/month on Hetzner Robot). For SMEs without their own GPU infrastructure, Ollama is available on-demand via Hetzner Cloud GPU instances — for batch runs only, not for real-time webhooks.
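The choice between on-demand and dedicated GPU capacity is mostly arithmetic. The following minimal sketch uses only the approximate prices quoted above (3 EUR/hour on-demand, 200 to 400 EUR/month dedicated). The function name is illustrative, and actual prices should be checked before deciding.

```python
def break_even_hours(dedicated_eur_per_month: float, on_demand_eur_per_hour: float) -> float:
    """Monthly GPU hours above which a dedicated server is cheaper than on-demand."""
    return dedicated_eur_per_month / on_demand_eur_per_hour


# Approximate prices taken from the text above, not live quotes.
ON_DEMAND_EUR_PER_HOUR = 3.0

for monthly_price in (200.0, 400.0):
    hours = break_even_hours(monthly_price, ON_DEMAND_EUR_PER_HOUR)
    print(f"{monthly_price:.0f} EUR/month breaks even at about {hours:.0f} GPU hours per month")
```

With these figures, a dedicated server starts to pay off at roughly 67 to 133 GPU hours per month. Below that, occasional on-demand batch runs are likely the cheaper option.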

By the end of 2026, 80 percent of all production workflow automation platforms will offer native AI model integration. n8n is among the early movers that already implemented this convergence in production-ready form in 2023/2024, with a measurable advantage for early adopters.

Gartner Research, Gartner Hype Cycle for Artificial Intelligence 2024, Gartner, Inc., 2024

6 production-ready use cases: AI in n8n workflows

No abstract scenarios — these six workflows are live in German SMEs following the pattern described. Each follows the same principle: n8n handles the orchestration, an AI model handles the semantic interpretation.

1. Classify incoming emails (support / sales / spam)

Trigger: new email in a shared inbox (Gmail or Outlook via Microsoft Graph). Node 1: GPT-4o-mini analyses the subject and first paragraph, returning a JSON output with three fields (`category`, `urgency`, `suggested_assignee`). Node 2: a Switch node routes based on `category` to the responsible Slack channel or CRM contact. Result: no more manual triage in the support team, first response in under 3 minutes. Forrester State of AI 2024 documents 52% time savings with AI-assisted email routing.
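The classification and routing logic of this workflow can be sketched in a few lines. This is an illustration under assumptions, not the workflow's actual node configuration: the channel names, the urgency values and the fallback to manual triage are hypothetical. Only the three JSON fields come from the description above.

```python
import json
from typing import Optional

# Categories named in the use case; the channel names are hypothetical placeholders.
ROUTES = {
    "support": "#support-inbox",
    "sales": "#sales-leads",
    "spam": None,  # spam is dropped rather than routed
}


def build_prompt(subject: str, first_paragraph: str) -> str:
    """Ask the model for exactly the three fields described above."""
    return (
        "Classify the email below. Reply with JSON only, using the keys "
        '"category" (support, sales or spam), "urgency" (low, medium or high) '
        'and "suggested_assignee".\n\n'
        f"Subject: {subject}\n\n{first_paragraph}"
    )


def route(model_reply: str) -> Optional[str]:
    """Mimic the Switch node: map the model's category to a destination."""
    data = json.loads(model_reply)
    category = data.get("category")
    if category not in ROUTES:
        # Unknown labels go to a human instead of being guessed at.
        return "#manual-triage"
    return ROUTES[category]


reply = '{"category": "sales", "urgency": "high", "suggested_assignee": "anna"}'
print(route(reply))
```

Restricting the model to a fixed set of labels keeps the Switch logic deterministic. Anything outside that set is treated as a signal for manual review.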

2. Summarise customer requests and save them in the CRM

Trigger: new ticket in Zendesk or Freshdesk. Node 1: Claude 3.5 Haiku summarises the request in three sentences and extracts: problem category, sentiment (positive/neutral/negative), potential revenue relevance. Node 2: HubSpot node writes the summary as a note on the contact record and sets a category tag. Advantage over manual processing: every customer interaction is fully documented in HubSpot — with no effort for the customer service team.

3. Invoice OCR (Mindee or Azure Form Recognizer node)

Trigger: new PDF attachment in email (IMAP node). Node 1: Mindee API extracts structured data from the invoice (amount, vendor, IBAN, due date). Node 2: GPT-4o-mini validates the extracted fields against company master data and flags discrepancies. Node 3: DATEV or Lexware node creates the pre-entry booking record. According to McKinsey Global Institute AI Adoption 2024, AI-assisted document processing reduces manual accounting effort by an average of 47%.
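Part of the master-data comparison can also run as a deterministic check before or alongside the model. The sketch below is a swapped-in, rule-based complement to the GPT-4o-mini validation step described above, not a replacement for it. The field names and sample records are hypothetical.

```python
def normalise(value: str) -> str:
    """Ignore spacing and letter case, which often differ between OCR output and master data."""
    return "".join(value.split()).upper()


def find_discrepancies(extracted: dict, master: dict, fields=("vendor", "iban")) -> list:
    """Return the names of fields where the invoice disagrees with the master data."""
    return [
        name
        for name in fields
        if normalise(str(extracted.get(name, ""))) != normalise(str(master.get(name, "")))
    ]


# Hypothetical records: field names and values are illustrative only.
extracted = {"vendor": "Muster GmbH", "iban": "DE89 3704 0044 0532 0130 00"}
master = {"vendor": "MUSTER GMBH", "iban": "DE89370400440532013000"}

print(find_discrepancies(extracted, master))
```

An empty list means the checked fields match. Any listed field could be flagged for review before the booking record is created.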

4. Social media content generation (GPT)

Trigger: new blog post in CMS (via webhook or RSS). Node 1: GPT-4o generates five LinkedIn post variants in the defined brand voice, each with hashtag suggestions and a CTA. Node 2: human approval via a Slack approval node (thumbs up/down button). After approval: automatic scheduling in Buffer or direct publishing. Result: social media team saves 3–5 hours per article.

5. Translate a newsletter into 5 languages

Trigger: new newsletter draft in Notion or Google Docs. Node 1: text is split into five parallel branches (DE, EN, FR, ES, IT). Node 2 (per language): GPT-4o translates with a tone prompt tailored to each language. Node 3: results are written back to Notion and marked for review. What used to cost 300–800 EUR per newsletter via a translation agency now runs for under 0.10 EUR in API fees.

6. Sentiment analysis of customer reviews

Trigger: daily cron job fetching new reviews from Google Business, Trustpilot and kununu (via HTTP Request node). Node 1: Claude 3.5 Haiku analyses each review: sentiment score (-1 to +1), main topics (product, service, delivery, price), action needed (yes/no + urgency). Node 2: aggregated daily report is sent as a Slack message to management. Node 3: reviews with negative sentiment that are flagged as needing action (`handlungsbedarf: true`) create a ticket in Zendesk. Management receives a 2-minute daily briefing instead of hours of manual review reading.
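The aggregation and ticket filter in this workflow could look roughly like the sketch below. The sample reviews and the key names `score` and `topics` are assumptions made for illustration. `handlungsbedarf` is the flag named in the text.

```python
from collections import Counter

# Hypothetical model output per review; keys follow the fields listed above.
reviews = [
    {"score": -0.8, "topics": ["delivery"], "handlungsbedarf": True},
    {"score": 0.6, "topics": ["product", "price"], "handlungsbedarf": False},
    {"score": -0.2, "topics": ["service", "delivery"], "handlungsbedarf": False},
]


def daily_report(items: list) -> dict:
    """Aggregate review count, mean sentiment and most frequent topics."""
    scores = [item["score"] for item in items]
    topics = Counter(topic for item in items for topic in item["topics"])
    return {
        "count": len(items),
        "mean_score": round(sum(scores) / len(scores), 2) if scores else None,
        "top_topics": topics.most_common(3),
    }


def needs_ticket(review: dict) -> bool:
    """Only negative reviews that the model flagged for action become tickets."""
    return review["score"] < 0 and review["handlungsbedarf"]


report = daily_report(reviews)
tickets = [review for review in reviews if needs_ticket(review)]
print(report["mean_score"], report["top_topics"][0], len(tickets))
```

Requiring both conditions keeps the ticket volume low: a mildly negative review without an action flag appears in the report but does not open a ticket.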

AI API pricing comparison for n8n workflows (as of 2025)

The cost of AI integration in n8n workflows depends directly on the model chosen. Below is a practical comparison of the five most important options — with the criteria that matter for GDPR-conscious SMEs.

Cloud models: price-performance at a glance

GPT-4o-mini (OpenAI): ~0.002 EUR/1k input tokens, ~0.008 EUR/1k output tokens. Very affordable, excellent JSON output. Data residency: US. GDPR: DPA available, but cross-border transfer to the US. Latency: ~0.5–1 sec for 500-token requests.
Claude 3.5 Haiku (Anthropic): ~0.002 EUR/1k input tokens, ~0.010 EUR/1k output tokens. Best structured output, ideal for document analysis. Data residency: US. GDPR: DPA available, cross-border transfer. Latency: ~0.8–1.5 sec.
Gemini 1.5 Flash (Google): ~0.001 EUR/1k input tokens (under 128k context), cheapest cloud model. 1-million-token context window. Data residency: US/EU (selectable). GDPR: DPA available, EU processing available at extra cost. Latency: ~0.5–1 sec.
Aleph Alpha Luminous Base (EU): ~0.008 EUR/1k tokens. Full EU data residency, ISO 27001 certified, no US Cloud Act exposure. GDPR: fully compliant without restrictions. Latency: ~1–2 sec. Recommended for personal data workflows.
Llama 3 8B via Ollama (local): 0 EUR in API costs, server costs only (Hetzner GPU ~200–400 EUR/month for continuous operation). 100% local processing. GDPR: maximum sovereignty. Latency: 1–5 sec depending on GPU. Recommended for highly sensitive data or high-volume batches.

For most SME workflows the rule of thumb applies: GPT-4o-mini or Claude 3.5 Haiku for non-personal data at high volume (cost efficiency), Aleph Alpha for workflows involving customer personal data (GDPR safety), Ollama/Llama 3 for particularly confidential processes or batches where GPU costs are justified by volume.
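The rule of thumb above can be written down as a small decision function, for example as a routing step at the start of a workflow. The function and parameter names are hypothetical. How a workflow determines whether its data is personal or confidential is left open here.

```python
def pick_model(personal_data: bool, highly_confidential: bool = False) -> str:
    """Map the data classification of a workflow to the model class suggested above."""
    if highly_confidential:
        # Particularly confidential processes stay on your own infrastructure.
        return "Llama 3 8B via Ollama (local)"
    if personal_data:
        # Customer or employee data: EU API without cross-border transfer.
        return "Aleph Alpha Luminous (EU API)"
    # Non-personal data at high volume: optimise for cost.
    return "GPT-4o-mini or Claude 3.5 Haiku (cloud API)"


print(pick_model(personal_data=False))
print(pick_model(personal_data=True))
print(pick_model(personal_data=True, highly_confidential=True))
```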

Frequently asked questions about AI models in n8n

The cheapest capable cloud model is GPT-4o-mini from OpenAI (approximately 0.002 EUR/1k input tokens, as of 2025). For classifications and short text summaries, the cost per workflow execution is typically under 0.001 EUR. Google Gemini 1.5 Flash is even cheaper for very long contexts. If you want to avoid API costs entirely, Ollama with a local model is the alternative — though you will incur server costs for a GPU-capable instance.
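A quick way to sanity-check the per-execution cost is to compute it from token counts. The sketch below uses the approximate GPT-4o-mini prices quoted in this article. The token counts and the monthly volume are assumptions for a short classification task.

```python
# Approximate GPT-4o-mini prices as quoted in this article (EUR per 1,000 tokens).
INPUT_EUR_PER_1K = 0.002
OUTPUT_EUR_PER_1K = 0.008


def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Cost of one model call in EUR, from input and output token counts."""
    return (input_tokens / 1000) * INPUT_EUR_PER_1K + (output_tokens / 1000) * OUTPUT_EUR_PER_1K


# Assumed sizes for a short classification: 200 tokens in, 40 tokens out.
single = cost_per_call(200, 40)
monthly = single * 10_000  # assumed 10,000 executions per month

print(f"{single:.5f} EUR per call, {monthly:.2f} EUR per month")
```

Under these assumptions a single call costs about 0.0007 EUR. Output tokens are priced higher than input tokens, so requesting short, structured answers has a direct effect on cost.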

n8n has a native Ollama community node. Prerequisites: Ollama is running on an accessible server (local or a Hetzner Cloud GPU instance) and reachable over HTTP. Steps: (1) Install Ollama and pull a model (e.g. "ollama pull llama3"). (2) In n8n under "Credentials", create a new Ollama connection with the server URL (e.g. http://your-server:11434). (3) Drag the Ollama node into the workflow, select the model, configure the prompt. Important: bind Ollama to localhost only by default, and secure it with Traefik or nginx as a reverse proxy.
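Before wiring up the n8n node, it can help to confirm that the Ollama server is reachable and answers as expected. The sketch below calls Ollama's HTTP endpoint `/api/generate` with streaming disabled, using only the Python standard library. The helper names are illustrative, and the server URL is the placeholder from the steps above.

```python
import json
import urllib.request


def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Prepare a non-streaming POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(base_url: str, model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text from the 'response' field."""
    request = build_generate_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


# Example call (requires a running Ollama server with the llama3 model pulled):
# print(generate("http://your-server:11434", "llama3", "Reply with the single word: ready"))
```

If this call works from the machine that runs n8n, the same URL should work in the n8n Ollama credentials.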

All three paths can be GDPR-compliant — the difference lies in the effort required and the risk class of the data being processed. (1) Cloud API (OpenAI, Anthropic, Google): GDPR-compliant with a valid DPA, but cross-border transfer to the US — legally possible but with residual risk. (2) Aleph Alpha: fully EU-compliant, no cross-border transfer, recommended for personal data. (3) Ollama local: maximum sovereignty, no data leaves your own server. For workflows without personal data, US cloud APIs with a DPA are sufficient. For customer or employee data we recommend Aleph Alpha or Ollama.

Hallucination risk is real and must be considered when designing workflows. Best practices: (1) use AI only for classification and summarisation tasks, not for legally binding decisions without human review. (2) Always validate model output with a JSON schema (n8n code node or zod schema). (3) Define confidence thresholds: below a certain confidence score the workflow is redirected to a human approval queue. (4) Test and iterate prompts with few-shot examples. n8n error-handling nodes catch invalid model responses.
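Practices (2) and (3) can be combined in one validation step. The following is a minimal sketch in plain Python rather than a zod schema. The field names, the threshold of 0.8 and the idea that the model reports its own `confidence` value are assumptions for illustration.

```python
import json

REQUIRED_FIELDS = {"category": str, "confidence": float}  # hypothetical schema
CONFIDENCE_THRESHOLD = 0.8  # assumed cut-off; tune per workflow


def validate(raw: str) -> dict:
    """Parse the model reply and check field presence, types and value range."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        value = data.get(name)
        # Accept integers such as 1 where a float is expected, but not booleans.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
            data[name] = value
        if not isinstance(value, expected):
            raise ValueError(f"field '{name}' is missing or has the wrong type")
    if not 0.0 <= data["confidence"] <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return data


def decide(raw: str) -> str:
    """Return 'auto' for trusted output, 'human_review' for everything else."""
    try:
        data = validate(raw)
    except ValueError:
        return "human_review"
    if data["confidence"] < CONFIDENCE_THRESHOLD:
        return "human_review"
    return "auto"


print(decide('{"category": "support", "confidence": 0.93}'))
print(decide('{"category": "support", "confidence": 0.41}'))
print(decide("Sorry, I cannot answer that."))
```

Invalid and low-confidence replies end up in the same queue, so the workflow needs only one fallback path.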

Four effective measures: (1) rate limiting in the n8n workflow: use a "Wait" node or a "Split in Batches" node to bundle API calls. (2) Shorter prompts: precise system prompts drastically reduce token consumption. (3) Model hierarchy: use GPT-4o-mini for pre-classifications and escalate to a stronger model only when uncertain. (4) Caching: if similar requests recur frequently, store results in a PostgreSQL table or Redis cache and skip the API call on a cache hit. OpenAI also offers a prompt caching feature that automatically saves costs on longer system prompts.
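Measure (4) can be illustrated with a small cache wrapper. In this sketch an in-memory dictionary stands in for the PostgreSQL table or Redis cache, and `fake_model` is a hypothetical placeholder for the real API call. The cache only matches prompts that are identical after trimming and lower-casing. Recognising merely similar requests would need a different technique.

```python
import hashlib


class CachedModel:
    """Wrap a model call with a cache keyed by a hash of the normalised prompt."""

    def __init__(self, call_model):
        self._call_model = call_model
        self._cache = {}  # stand-in for a PostgreSQL table or Redis
        self.api_calls = 0

    def ask(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.strip().lower().encode("utf-8")).hexdigest()
        if key in self._cache:
            return self._cache[key]  # cache hit: no API call
        self.api_calls += 1
        answer = self._call_model(prompt)
        self._cache[key] = answer
        return answer


def fake_model(prompt: str) -> str:
    """Hypothetical placeholder for a real model API call."""
    return f"summary of: {prompt[:20]}"


model = CachedModel(fake_model)
model.ask("Where is my order?")
model.ask("where is my order?  ")  # same prompt after normalisation
print(model.api_calls)
```

The `api_calls` counter makes the saving visible: the second request is answered from the cache without touching the API.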

For production workflows, yes — without a GPU, local models are too slow for real-time webhooks. Llama 3 8B on a CPU-only instance takes 20–60 seconds per request; with an Nvidia A10G GPU it takes 1–3 seconds. For batch workflows (e.g. nightly processing of 500 documents), CPU operation is acceptable. Hetzner offers GPU cloud instances (GEX44 with Nvidia A16, from approximately 2.49 EUR/hour on-demand) — ideal for occasional batch runs. For permanent operation, a dedicated Hetzner Robot GPU server is recommended.

Community nodes are extensions developed by the n8n community, published in the npm registry, and not included in the official n8n node directory. Among them are integrations for Ollama, Aleph Alpha, HuggingFace Inference API, Pinecone (vector database), Qdrant and other AI tools. They are activated in the n8n admin interface under "Settings > Community Nodes" and used like regular nodes. Important: community nodes are not reviewed by n8n — before activating one, check its npm registry page for freshness and security.

For English-language content, the following strengths stand out in practice: GPT-4o and GPT-4o-mini (OpenAI) produce high-quality, idiomatically natural English text. Claude 3.5 Sonnet (Anthropic) delivers particularly strong results for structured English output and translations. Aleph Alpha Luminous was trained with a significant share of European-language data and is the first choice for regulatory or legal texts. Llama 3 8B has solid but not outstanding English capabilities — sufficient for classifications and simple summaries. For high-quality English content generation we recommend GPT-4o or Claude 3.5 Sonnet.