n8n Cloud 60-second timeout: dispatcher-worker pattern

Part of the Data Pipeline Reliability series. Post 3 of 4.

If you are running an AI-built data pipeline on n8n Cloud and seeing rows quietly missing from the destination after a long run, the most likely cause is not your code. It is n8n Cloud’s 60-second Code-node task-runner timeout firing silently mid-batch. The runner returns whatever it had finished, the workflow continues with the truncated result, and the downstream nodes process the partial input as if it were complete. No error appears in the run log. Operators only notice the next day when the row counts do not match.

The fix is the dispatcher-worker pattern: a parent workflow that chunks the input and triggers a child workflow per chunk via webhook, giving every chunk a fresh execution clock. This post covers how the pattern works, how to size chunks correctly, and where the boundary is between “stay on Cloud” and “graduate to self-hosted.”

What are n8n Cloud’s execution timeout limits?

n8n Cloud applies two distinct execution caps that interact in non-obvious ways:

Code-node task-runner timeout: 60 seconds. Hard-coded across all Cloud tiers. Code nodes are matched to a runner via an external task-runner architecture; if a Code node does not complete within 60 seconds, the runner returns whatever has been produced so far. This is documented in the n8n configuration reference and tracked in an open community thread that has been active since the v2 upgrade.
Total-workflow execution timeout. Around 5 minutes on the free tier, roughly 40 minutes on Pro, longer on Enterprise. This cap covers the entire workflow run from the trigger to the final node.

The Code-node cap is the one that hurts production migrations. Most operators design workflows assuming the workflow-level cap is the constraint and never check the Code-node cap, because the Code node feels like part of the same workflow. It is not; it runs out-of-process on a separate runner with its own clock.

How the 60-second cap fails silently in long migrations

The failure mode is the most expensive shape a bug can take: the workflow returns success, downstream nodes process the partial result, and the destination ends up with the first N records instead of all of them.

Here is what happens in detail. The pipeline reads a list of, say, 1,676 records from the source. A Code node runs a for loop that processes each record (transform, dedup check, write to destination). Each record takes about 50 milliseconds, so the full loop should take roughly 84 seconds. After 60 seconds, the runner has finished about 1,200 records; it returns the array of those 1,200 to the workflow context, and exits. The workflow’s next node (say, a “Verify count” or “Send Slack notification”) receives [1,200 items] and proceeds.

The workflow log shows green. The Code node’s “items processed” output reads 1,200, but there is no annotation that this was a truncation rather than a completion. The destination has the first 1,200 records. The 476 missing records produce no error, no alert, no log entry.
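The numbers in this example can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (1,676 records, 50 ms per record, a 60-second cap). The `assertComplete` helper is a hypothetical guard, not an n8n feature; it shows the sort of count check a downstream node would need in order to turn the silent truncation into a visible error.

```javascript
// Figures taken from the example in the text.
const totalRecords = 1676;
const msPerRecord = 50;
const capMs = 60000;

// Time the full loop would need, in seconds.
const fullRunSeconds = (totalRecords * msPerRecord) / 1000;

// Records that fit inside the cap, and records that never get processed.
const finishedBeforeCap = Math.min(totalRecords, Math.floor(capMs / msPerRecord));
const missingRecords = totalRecords - finishedBeforeCap;

// Hypothetical guard: throw when the output count does not match the input count.
function assertComplete(expectedCount, actualCount) {
  if (actualCount !== expectedCount) {
    throw new Error(
      `Truncated result: expected ${expectedCount} records, got ${actualCount}`
    );
  }
}

console.log({ fullRunSeconds, finishedBeforeCap, missingRecords });
```

Calling `assertComplete(totalRecords, finishedBeforeCap)` with these values throws, which is the behaviour you want from a verification step: a red run instead of a green one.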

I have walked into this exact failure on a Bubble.io migration; the silent failures case study covers it as failure #1 of four.

The dispatcher-worker pattern, in detail

The pattern decomposes the long-running job into two workflows:

The dispatcher (parent workflow)

Reads the full input set from the source.
Splits it into chunks small enough to fit under the 60-second cap with margin.
Loops over the chunks. For each chunk:
- Fires a webhook to the worker workflow with the chunk as the payload.
- Waits for the worker to respond (using HTTP Request with Wait for response or a Wait-for-webhook node).
- Logs success/failure for that chunk.
Reports a final summary at the end.

The dispatcher’s own Code work is trivial: it splits arrays and dispatches HTTP calls. None of its operations approach the 60-second cap.
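The dispatcher steps above can be sketched in plain JavaScript. This is a minimal illustration, not the workflow itself: `chunkArray` and `dispatchAll` are hypothetical names, and `sendChunk` is an assumed async function standing in for whatever fires the worker webhook and waits for its response (in n8n that is typically an HTTP Request node rather than hand-written code). The `chunk_index` and `total_chunks` fields follow the payload shape described later in this post.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunkArray(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Dispatch chunks one at a time, waiting for each worker response
// before firing the next, and log success or failure per chunk.
async function dispatchAll(items, size, sendChunk) {
  const chunks = chunkArray(items, size);
  const log = [];
  for (let index = 0; index < chunks.length; index++) {
    const payload = {
      chunk_index: index,
      total_chunks: chunks.length,
      records: chunks[index],
    };
    try {
      const response = await sendChunk(payload);
      log.push({ chunk_index: index, ok: true, processed: response?.processed });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      log.push({ chunk_index: index, ok: false, error: message });
    }
  }
  // Final summary reported at the end of the run.
  const succeeded = log.filter((entry) => entry.ok).length;
  return { total_chunks: chunks.length, succeeded, failed: chunks.length - succeeded, log };
}
```

With the 1,676 records from the earlier example and a chunk size of 50, `chunkArray` yields 34 chunks, the last one holding 26 records.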

The worker (child workflow)

Triggered by the webhook the dispatcher fires.
Receives one chunk’s worth of records.
Processes them (transform, dedup, write).
Returns a response with success/failure status and processed-count.

The worker’s processing happens in a fresh execution context. Its Code-node clock starts at zero when the webhook fires. As long as one chunk’s processing fits under 60 seconds, the worker never hits the cap.
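A worker's core logic might look like the sketch below. It assumes the payload shape `{ chunk_index, records }`, and `handleRecord` is a hypothetical async function standing in for the transform, dedup check, and destination write. The response fields are illustrative; the point is that the worker reports both how many records it received and how many it processed, so the dispatcher can see a short count.

```javascript
// Process one chunk and build the response body the dispatcher will read.
async function processChunk(payload, handleRecord) {
  const startedAt = Date.now();
  let processed = 0;
  const failures = [];

  for (const record of payload.records) {
    try {
      await handleRecord(record);
      processed += 1;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push({ record, error: message });
    }
  }

  return {
    chunk_index: payload.chunk_index,
    // Success only when every received record was processed.
    status: processed === payload.records.length ? "success" : "failure",
    received: payload.records.length,
    processed,
    failures,
    // Elapsed time is what you tune chunk size against.
    elapsed_ms: Date.now() - startedAt,
  };
}
```

Returning `elapsed_ms` with every response gives the dispatcher the timing data it needs to keep each chunk well under the cap.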

Why webhook triggering specifically

n8n offers several ways for one workflow to invoke another (the “Execute Workflow” node, sub-workflows, webhook calls). Webhook calls are the right primitive here because:

They cross the execution-context boundary cleanly. The worker’s invocation does not inherit the dispatcher’s clock.
They are observable. Each call produces an HTTP request you can log and replay.
They are recoverable. If a worker fails, the dispatcher can retry the same payload by re-firing the webhook.
They scale across instances. If you graduate to self-hosted with multiple workers, the webhook URL can fan out to a load-balanced pool without changing the dispatcher logic.
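The recoverability point can be sketched as a small retry wrapper. Everything here is an assumption for illustration: `sendWithRetry` is a hypothetical name, `sendChunk` stands in for the webhook call, and the attempt count and backoff delays are arbitrary defaults. Re-firing the same payload is only safe when the worker's writes are idempotent, which is where the natural-key fingerprint pattern from this series comes in.

```javascript
// Re-fire the same payload until the worker responds or attempts run out.
async function sendWithRetry(payload, sendChunk, maxAttempts = 3, baseDelayMs = 500) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const response = await sendChunk(payload);
      return { ok: true, attempts: attempt, response };
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  const message = lastError instanceof Error ? lastError.message : String(lastError);
  return { ok: false, attempts: maxAttempts, error: message };
}
```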

Sizing chunks correctly

The chunk size is the single tuning parameter that determines whether the pattern works. Three rules:

Target 30 seconds of worker processing per chunk, not 60. This leaves headroom for transient slowness in the destination platform, network blips, and OAuth token refreshes that happen mid-chunk.
Start empirically. Do not pre-compute the optimal size. Start at 50 records per chunk, run the worker, time it. If it lands at 5 seconds, increase to 200. If it lands at 25 seconds, hold. If it lands at 45 seconds, halve.
Make chunk size a parameter on the dispatcher. Hard-coding it forces a workflow re-edit every time the destination platform’s latency profile changes. Reading it from a configuration node or the workflow’s start parameters means tuning is a one-line change.
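The tuning loop in the second rule can be written as a small function. The three reference points come from the text (about 5 seconds: quadruple, as in 50 to 200; about 25 seconds: hold; about 45 seconds: halve). The exact band boundaries of 10 and 40 seconds are assumptions chosen to sit between those points, and `nextChunkSize` is a hypothetical helper name.

```javascript
// Suggest the next chunk size from the last measured worker time.
function nextChunkSize(currentSize, elapsedSeconds) {
  if (elapsedSeconds >= 40) {
    // Too close to the 60-second cap: halve, but never go below 1 record.
    return Math.max(1, Math.floor(currentSize / 2));
  }
  if (elapsedSeconds <= 10) {
    // Far below the 30-second target: grow aggressively.
    return currentSize * 4;
  }
  // Near the target: hold.
  return currentSize;
}

console.log(nextChunkSize(50, 5));   // 200
console.log(nextChunkSize(200, 25)); // 200
console.log(nextChunkSize(200, 45)); // 100
```

Because the function takes the current size as an input, it pairs naturally with the third rule: the dispatcher reads the size from configuration, and tuning never touches the workflow logic.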

Reasonable starting points by workload:

Workload	Chunk size
Bubble.io single-record upsert with verification re-query	25–50 records
n8n API-Connector calls with retry middleware	25–100 records
Pure-JavaScript transforms, no network	200–500 records
Heavy text processing (LLM tokenization, similarity scoring)	5–20 records

State management and resume-ability

The dispatcher should persist progress between chunks so a partial run can be resumed without re-processing already-completed chunks. Two storage approaches:

External store (Postgres, Airtable, or a Bubble Thing). Each chunk has a row with status: pending | done | failed. The dispatcher reads the table at start, dispatches only pending chunks, and marks each one done as the worker returns. Failures are explicit.
In-payload checkpoint. The chunk payload includes a chunk_index and total_chunks. The dispatcher can resume from any index by reading the destination’s natural-key fingerprints (see the natural-key fingerprint pattern) and skipping already-written chunks.

The external-store approach is more explicit and easier to debug. The in-payload approach is lighter weight and avoids the coordination overhead. For the migrations I have run, external store wins on production work and in-payload wins for one-shot dev migrations.
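As a rough sketch of the external-store approach, the code below uses an in-memory `Map` as a stand-in for the Postgres, Airtable, or Bubble table; a real implementation would read and write rows in that store instead. `createChunkLedger`, `runPending`, and `sendChunk` are hypothetical names. Following the description above, only `pending` chunks are dispatched, and failed chunks stay marked `failed` until someone resets them.

```javascript
// One ledger row per chunk, keyed by chunk index, all starting as "pending".
function createChunkLedger(totalChunks) {
  const ledger = new Map();
  for (let index = 0; index < totalChunks; index++) {
    ledger.set(index, "pending");
  }
  return ledger;
}

// Dispatch only the pending chunks and record the outcome of each.
async function runPending(ledger, chunks, sendChunk) {
  for (const [index, status] of ledger) {
    if (status !== "pending") continue; // skip done and failed chunks
    try {
      await sendChunk({
        chunk_index: index,
        total_chunks: chunks.length,
        records: chunks[index],
      });
      ledger.set(index, "done");
    } catch {
      ledger.set(index, "failed"); // failures are explicit and inspectable
    }
  }
  return ledger;
}
```

Resuming a partial run is then just calling `runPending` again with the same ledger: chunks already marked `done` are never sent twice.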

When to graduate to self-hosted n8n

The dispatcher-worker pattern lets you stay on Cloud Pro for migrations that would otherwise require self-hosting. Three signals it is time to graduate anyway:

Operations that cannot be chunked. Long-running ML inference, video processing, large file transforms where each unit takes more than 60 seconds. Self-hosted lets you raise EXECUTIONS_TIMEOUT and the task-runner config to fit.
Execution volume. Above roughly 50,000 executions per month, self-hosted on a small VPS is cheaper than Cloud Pro. The break-even point depends on workflow complexity but is reliably within an order of magnitude of that number.
Workflows longer than 40 minutes. Cloud Pro’s total-workflow cap is the second constraint. Multi-hour ETLs need self-hosted.

Until at least one of these is true, stay on Cloud and use the dispatcher-worker pattern. Cloud removes the operational surface of running n8n yourself, which is worth a real amount on a single-operator practice.
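The three signals reduce to a simple checklist. The sketch below encodes the thresholds quoted above (60 seconds per unit, roughly 50,000 executions per month, roughly 40 minutes per workflow); the function and field names are hypothetical, and the execution-volume threshold in particular is an approximation, as noted.

```javascript
// Return the list of signals that currently apply; empty means stay on Cloud.
function selfHostSignals({ longestUnitSeconds, monthlyExecutions, longestWorkflowMinutes }) {
  const signals = [];
  if (longestUnitSeconds > 60) {
    signals.push("operation that cannot be chunked under 60 seconds");
  }
  if (monthlyExecutions > 50000) {
    signals.push("execution volume above roughly 50,000 per month");
  }
  if (longestWorkflowMinutes > 40) {
    signals.push("workflow longer than roughly 40 minutes");
  }
  return signals;
}

console.log(
  selfHostSignals({ longestUnitSeconds: 25, monthlyExecutions: 12000, longestWorkflowMinutes: 15 })
); // []
```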

The general rule

Long-running n8n work fails silently on Cloud unless you give every chunk a fresh execution clock.

The dispatcher-worker pattern is how you do that. It is not the only way (you can also write everything as scheduled jobs that run in 60-second windows, or use the Wait-on-webhook resume pattern), but it is the most operationally clean and the most resumable.

If you are running production migrations on n8n Cloud across North America, the UK and Ireland, the EU and EEA, or the ANZ region, and your batches are silently truncating at the 60-second mark, let’s talk. The dispatcher-worker architecture is what gets installed on day one of every n8n migration engagement.

Silent failures: the bug class no AI tool catches in your data pipeline : the case study where this pattern fixed a real-world 639-row gap.
Idempotent data pipelines: the natural-key fingerprint pattern : pairs naturally with dispatcher-worker for resumable, dedup-safe runs.
Why I stopped trusting Bubble.io’s list fields and re-query the database instead : the verification-side companion.

Frequently asked questions

What are n8n Cloud's execution timeout limits?

Two limits apply on n8n Cloud. First, Code nodes are matched to a runner within a 60-second task-runner timeout (this is hard-coded; you cannot configure it on Cloud). Second, total workflow execution has an upper bound that varies by plan tier (around 5 minutes on the free tier, roughly 40 minutes on Pro, longer on Enterprise). The 60-second Code-node cap is the one that surprises most operators because it fires silently: the runner returns whatever was produced and the workflow continues without an error log.

Why do Code nodes time out at 60 seconds in n8n Cloud?

n8n Cloud runs Code nodes through an external task-runner architecture introduced in v2.x. Each Code node task is queued and matched to a runner; if no runner is available within 60 seconds, or the task takes longer than 60 seconds to complete, the runner stops and returns whatever it has produced so far. The [n8n community has tracked this](https://community.n8n.io/t/task-request-timed-out-after-60-seconds-n8n-cloud/260133) as a recurring issue since the v2 upgrade. It is infrastructural, not a configurable timeout, so the workaround is architectural: keep individual Code nodes well under 60 seconds by chunking the input.

How does the dispatcher-worker pattern work in n8n?

The dispatcher is a parent workflow that takes the full input set and splits it into chunks small enough to comfortably fit under the 60-second cap (typically 50-200 records per chunk depending on processing complexity). For each chunk, the dispatcher fires a webhook to a child workflow (the worker), which processes that chunk and reports back. The dispatcher waits for the worker's response before firing the next chunk. Every worker execution starts a fresh clock, so the only thing that has to fit in 60 seconds is one chunk's worth of work, not the whole job.

Can I increase n8n Cloud's timeout limit?

Not on the Code-node side; the 60-second runner timeout is fixed across all Cloud tiers. The total-workflow cap can be raised by upgrading from free (around 5 min) to Pro (around 40 min) to Enterprise (longer), but the Code-node cap stays the same. If you need individual operations to run longer than 60 seconds, you have two options: (1) decompose the operation into a chunked dispatcher-worker, or (2) graduate to self-hosted n8n where both timeouts are configurable via environment variables (`EXECUTIONS_TIMEOUT` and the task-runner config).

When should I move from n8n Cloud to self-hosted n8n?

Three signals. First, you have individual operations that genuinely cannot be chunked into sub-60-second units (long-running ML inference, video processing, large file transforms). Second, your monthly execution count is hitting Cloud Pro limits; self-hosted on a small VPS becomes cheaper above roughly 50,000 executions per month. Third, you need to run workflows longer than 40 minutes (large data migrations, multi-hour ETLs). Until at least one of these is true, Cloud is usually the right call because it removes the operational surface of running n8n yourself.

How big should each chunk be in the dispatcher-worker pattern?

Size each chunk so that worker processing time lands at or just under 30 seconds, leaving headroom for transient slowness. For a Bubble.io pipeline doing single-record upserts with a verification re-query, that is typically 25-50 records per chunk. For API calls with retries, 25-100. For pure JavaScript transforms with no network, 200-500. The right number is empirical: start at 50, time the worker, and halve or increase until you hit the 30-second target. Building this in as a parameter at the dispatcher level means you can tune without redesigning the workflow.
