Scheduled Batch Processor
Process a queue of items on a recurring schedule — daily reconciliation, weekly reports, hourly sync jobs. The workhorse pattern for routine data processing.
When to Use This Pattern
Use a scheduled batch processor when:
- You need to process items on a recurring schedule (hourly, daily, weekly)
- Items accumulate in a queue and should be processed in bulk (not one-by-one as they arrive)
- The processing involves external systems that prefer batch API calls over individual requests
- You run regular jobs like reconciliation, report generation, or data cleanup
How It Works
| Phase | Description |
|---|---|
| Trigger | Schedule fires (e.g., daily at 6 AM) |
| Fetch | Query for unprocessed items (status = "pending") |
| Process | Loop through items, applying business logic to each |
| Update | Mark each item as processed/failed |
| Report | Generate a summary of what was processed |
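The five phases above can be condensed into a single run function. A minimal in-memory sketch (the item fields, status values, and `process_item` callback are illustrative, not tied to any framework):

```python
def run_batch(items, process_item, batch_size=100):
    """One scheduled run: fetch pending items, process each, report a summary."""
    # Fetch: up to batch_size items still marked pending.
    batch = [i for i in items if i["status"] == "pending"][:batch_size]
    summary = {"processed": 0, "succeeded": 0, "failed": 0}
    for item in batch:
        item["status"] = "processing"          # lock before doing any work
        try:
            process_item(item)                 # business logic lives here
            item["status"] = "completed"
            summary["succeeded"] += 1
        except Exception as exc:
            item["status"] = "failed"          # keep the error for later review
            item["error"] = str(exc)
            summary["failed"] += 1
        summary["processed"] += 1
    return summary
```

The trigger itself comes from whatever scheduler you already run (cron, a workflow engine, a cloud scheduler); this function is what it invokes.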
Implementation Guide
Step 1: Define the Schedule and Scope
| Parameter | Example | Notes |
|---|---|---|
| Frequency | Daily at 6:00 AM UTC | Use UTC to avoid DST issues |
| Batch size | 100 items per run | Prevents timeout on large queues |
| Scope query | WHERE status = 'pending' AND created_on < NOW() - INTERVAL 1 HOUR | Only process items at least 1 hour old, so the batch never races with writers still populating a new row |
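These parameters translate directly into bind values for the scope query. A sketch, assuming a parameterized query; the constant names are illustrative:

```python
from datetime import datetime, timedelta, timezone

BATCH_SIZE = 100              # items per run
MIN_AGE = timedelta(hours=1)  # skip items newer than this

def scope_params(now=None):
    """Bind values for: WHERE status = 'pending' AND created_on < :cutoff LIMIT :limit."""
    now = now or datetime.now(timezone.utc)  # always UTC, never local time
    return {"cutoff": now - MIN_AGE, "limit": BATCH_SIZE}
```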
Step 2: Fetch the Batch
Query your data source for items to process:
- Filter by status (pending, ready, queued)
- Order by priority, then by age (oldest first)
- Limit to batch size
- Lock the items — update their status to "processing" immediately to prevent another workflow instance from picking them up
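A sketch of fetch-and-lock over an in-memory list (field names are illustrative). Against a real database, do the same thing in one statement, e.g. Postgres-style `UPDATE … WHERE id IN (SELECT … FOR UPDATE SKIP LOCKED)`, so concurrent instances take disjoint batches:

```python
def fetch_batch(items, batch_size):
    """Pick pending items, highest priority first then oldest first, and lock them."""
    pending = [i for i in items if i["status"] == "pending"]
    pending.sort(key=lambda i: (-i["priority"], i["created_on"]))
    batch = pending[:batch_size]
    for item in batch:
        item["status"] = "processing"  # lock immediately, before any work
    return batch
```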
Step 3: Process Each Item
For each item in the batch:
- Validate the data
- Apply business logic (calculations, lookups, transformations)
- Call external APIs if needed
- On success: update status to "completed"
- On failure: update status to "failed", log the error
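Per item, the key move is catching failures so one bad record never aborts the whole batch. A minimal sketch (the validation rule and the `* 1.2` transformation are placeholders for real business logic):

```python
def process_item(item):
    """Validate and transform one item, recording success or failure in place."""
    try:
        if "amount" not in item:              # validate the data
            raise ValueError("missing amount")
        item["total"] = item["amount"] * 1.2  # business logic (illustrative)
        item["status"] = "completed"
    except Exception as exc:
        item["status"] = "failed"
        item["error"] = str(exc)              # logged for Step 4
```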
Step 4: Handle Stragglers
After the loop completes:
- Failed items: Log them for investigation. After N failures, flag for manual review.
- Stuck items: If an item has been in "processing" for too long (>1 hour), it probably failed silently. Reset to "pending" for the next batch.
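Both rules can run as one sweep at the end of the batch. A sketch with illustrative thresholds, statuses, and field names:

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(hours=1)  # "processing" longer than this = likely silent failure
MAX_FAILURES = 3                  # after this many attempts, stop retrying

def handle_stragglers(items, now=None):
    """Reset stuck items for the next run; flag repeat failures for a human."""
    now = now or datetime.now(timezone.utc)
    for item in items:
        if item["status"] == "processing" and now - item["locked_at"] > STUCK_AFTER:
            item["status"] = "pending"       # retry on the next batch
        elif item["status"] == "failed" and item.get("failures", 0) >= MAX_FAILURES:
            item["status"] = "needs_review"  # manual investigation
```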
Step 5: Generate a Summary
At the end of each run, produce a report:
| Metric | Value |
|---|---|
| Total processed | 95 |
| Successful | 90 |
| Failed | 3 |
| Skipped (already done) | 2 |
| Duration | 4m 32s |
| Next scheduled run | Tomorrow 6:00 AM |
Send this summary to the operations team or write it to a log.
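A sketch of turning the run's counters into that report (the field names mirror the table above; the plain-text formatting is illustrative):

```python
def format_summary(stats):
    """Render the end-of-run report as plain text, one metric per line."""
    lines = [
        f"Total processed: {stats['processed']}",
        f"Successful: {stats['succeeded']}",
        f"Failed: {stats['failed']}",
        f"Skipped (already done): {stats['skipped']}",
        f"Duration: {stats['duration']}",
        f"Next scheduled run: {stats['next_run']}",
    ]
    return "\n".join(lines)
```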
Tips & Best Practices
Never let two instances of the same batch job run simultaneously. Use a lock mechanism: check for a running instance before starting, and always release the lock when done (even on failure).
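One portable way to sketch that lock is an exclusive-create file: `os.O_CREAT | os.O_EXCL` makes create-if-absent atomic, so only one instance wins. The path handling is illustrative, and note that a crash can leave a stale lock file behind, which a real job should detect and expire:

```python
import os

def acquire_lock(path):
    """Return True if we created the lock file (we run), False if it exists (skip)."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return True
    except FileExistsError:
        return False

def release_lock(path):
    os.remove(path)  # always call from a finally block, even on failure
```

Wrap the batch body in `try/finally` so `release_lock` runs no matter how the run ends.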
- Idempotency is critical. If the workflow crashes mid-batch and restarts, processing the same item twice should not create duplicate data.
- Stagger large batches. If you have 10,000 items, process them in chunks of 100 with a short pause between chunks to avoid overloading systems.
- Monitor no-ops. If the batch runs and processes 0 items repeatedly, it might mean your queue is empty (good) or your filter query is wrong (bad). Alert if appropriate.
- Log timing data. Track how long each batch run takes. If it's creeping upward, you'll spot capacity issues before they become outages.
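The staggering tip above can be sketched as chunked processing with a pause between chunks (the chunk size and pause are illustrative tuning knobs):

```python
import time

def chunks(seq, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

def process_staggered(items, process, chunk_size=100, pause_s=1.0):
    """Process in chunks, pausing between them to avoid overloading downstream systems."""
    for i, chunk in enumerate(chunks(items, chunk_size)):
        if i > 0:
            time.sleep(pause_s)  # brief breather between chunks
        for item in chunk:
            process(item)
```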
Related patterns
Saga with Compensating Transactions
Coordinate a multi-step business transaction that spans several systems by pairing each step with a rollback action. If a later step fails, run the rollbacks in reverse to restore a consistent state.
State Machine Workflow
Model a business process as a set of defined states with explicit transitions between them. Unlike linear workflows, items can move forwards, backwards, and loop — matching how real business processes actually behave.
Fan-Out / Fan-In
Split a workload into parallel branches, process them simultaneously, then aggregate the results. Dramatically reduces processing time for batch operations.