
Scheduled Batch Processor

Process a queue of items on a recurring schedule — daily reconciliation, weekly reports, hourly sync jobs. The workhorse pattern for routine data processing.

When to Use This Pattern

Use a scheduled batch processor when:

  • You need to process items on a recurring schedule (hourly, daily, weekly)
  • Items accumulate in a queue and should be processed in bulk (not one-by-one as they arrive)
  • The processing involves external systems that prefer batch API calls over individual requests
  • You run regular jobs like reconciliation, report generation, or data cleanup

How It Works

| Phase | Description |
| --- | --- |
| Trigger | Schedule fires (e.g., daily at 6 AM) |
| Fetch | Query for unprocessed items (status = "pending") |
| Process | Loop through items, applying business logic to each |
| Update | Mark each item as processed/failed |
| Report | Generate a summary of what was processed |

Implementation Guide

Step 1: Define the Schedule and Scope

| Parameter | Example | Notes |
| --- | --- | --- |
| Frequency | Daily at 6:00 AM UTC | Use UTC to avoid DST issues |
| Batch size | 100 items per run | Prevents timeout on large queues |
| Scope query | WHERE status = 'pending' AND created_on < NOW() - INTERVAL 1 HOUR | Only process items older than 1 hour (avoids race conditions) |
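The parameters above can be captured in a small configuration object. This is a minimal sketch, not a prescribed API: `BatchConfig`, `next_run`, and `scope_cutoff` are hypothetical names, and the defaults simply mirror the example column of the table. All timestamps are assumed to be timezone-aware UTC values.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone


@dataclass(frozen=True)
class BatchConfig:
    """Hypothetical settings mirroring the Step 1 table."""
    run_at: time = time(6, 0)                # daily trigger time, read as UTC
    batch_size: int = 100                    # maximum items per run
    min_age: timedelta = timedelta(hours=1)  # leave newer items for a later run


def next_run(now: datetime, cfg: BatchConfig) -> datetime:
    """Return the next daily trigger strictly after `now` (assumed UTC)."""
    candidate = datetime.combine(now.date(), cfg.run_at, tzinfo=timezone.utc)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def scope_cutoff(now: datetime, cfg: BatchConfig) -> datetime:
    """Items created at or after this instant are out of scope for this run."""
    return now - cfg.min_age
```

Keeping the schedule arithmetic in UTC means the trigger never shifts or repeats when local clocks change for daylight saving time.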

Step 2: Fetch the Batch

Query your data source for items to process:

  1. Filter by status (pending, ready, queued)
  2. Order by priority, then by age (oldest first)
  3. Limit to batch size
  4. Lock the items: update their status to "processing" immediately, ideally in the same transaction as the fetch, so that another workflow instance cannot pick them up
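The four fetch steps can be sketched with Python's built-in `sqlite3` module. The `items` table and its columns (`id`, `status`, `priority`, `created_on`, `locked_at`) are assumptions made for illustration, as is the choice that a larger `priority` number means more urgent. Timestamps are stored as ISO 8601 UTC strings so that string comparison matches chronological order.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def claim_batch(conn: sqlite3.Connection, now: datetime,
                batch_size: int = 100,
                min_age: timedelta = timedelta(hours=1)) -> list:
    """Pick eligible pending items and mark them 'processing'."""
    cutoff = (now - min_age).isoformat()
    with conn:  # commits on success, rolls back if anything raises
        rows = conn.execute(
            "SELECT id FROM items "
            "WHERE status = 'pending' AND created_on < ? "
            "ORDER BY priority DESC, created_on ASC LIMIT ?",
            (cutoff, batch_size),
        ).fetchall()
        ids = [row[0] for row in rows]
        # The status guard keeps the update from touching rows that
        # changed state after the SELECT.
        conn.executemany(
            "UPDATE items SET status = 'processing', locked_at = ? "
            "WHERE id = ? AND status = 'pending'",
            [(now.isoformat(), item_id) for item_id in ids],
        )
    return ids
```

This sketch assumes a single worker per database file. On a shared server database with several workers, the select and update would need real row locking (for example `SELECT ... FOR UPDATE SKIP LOCKED` in PostgreSQL) to make the claim atomic.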

Step 3: Process Each Item

For each item in the batch:

  1. Validate the data
  2. Apply business logic (calculations, lookups, transformations)
  3. Call external APIs if needed
  4. On success: update status to "completed"
  5. On failure: update status to "failed", log the error
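One way to structure the per-item loop is shown below. It is a sketch under stated assumptions: `validate` is a hypothetical check (an `id` plus a non-negative `amount`), `handle` stands in for your business logic and API calls, and `set_status` is whatever function writes the item's status back to your store.

```python
import logging

log = logging.getLogger("batch")


def validate(item: dict) -> None:
    """Hypothetical validation: require an 'id' and a non-negative 'amount'."""
    if "id" not in item or item.get("amount", -1) < 0:
        raise ValueError(f"invalid item: {item!r}")


def process_batch(items, handle, set_status) -> dict:
    """Process each item independently and return success/failure counts."""
    counts = {"completed": 0, "failed": 0}
    for item in items:
        try:
            validate(item)
            handle(item)  # business logic and external calls (caller-supplied)
        except Exception as exc:  # one bad item must not abort the batch
            log.error("item %s failed: %s", item.get("id"), exc)
            set_status(item.get("id"), "failed", str(exc))
            counts["failed"] += 1
        else:
            set_status(item["id"], "completed", None)
            counts["completed"] += 1
    return counts
```

Catching the exception per item, rather than around the whole loop, is the design choice that lets a single failure be recorded without stopping the rest of the batch.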

Step 4: Handle Stragglers

After the loop completes:

  • Failed items: Log them for investigation. After N failures, flag for manual review.
  • Stuck items: If an item has been in "processing" for too long (>1 hour), it probably failed silently. Reset to "pending" for the next batch.
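Both straggler rules can be expressed as a short sweep that runs after the loop. The sketch below again uses `sqlite3` and assumes hypothetical `locked_at` and `failure_count` columns on an `items` table, plus a `needs_review` status for manual follow-up. `MAX_FAILURES = 3` is an arbitrary stand-in for the "N" in the text.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

MAX_FAILURES = 3                  # the "N" above; the value is an assumption
STUCK_AFTER = timedelta(hours=1)  # how long "processing" may last


def sweep_stragglers(conn: sqlite3.Connection, now: datetime) -> tuple:
    """Reset stuck items to 'pending' and flag repeat failures for review.

    Returns (number reset, number flagged).
    """
    cutoff = (now - STUCK_AFTER).isoformat()
    with conn:  # both updates commit together
        reset = conn.execute(
            "UPDATE items SET status = 'pending', locked_at = NULL "
            "WHERE status = 'processing' AND locked_at < ?",
            (cutoff,),
        ).rowcount
        flagged = conn.execute(
            "UPDATE items SET status = 'needs_review' "
            "WHERE status = 'failed' AND failure_count >= ?",
            (MAX_FAILURES,),
        ).rowcount
    return reset, flagged
```

Resetting a stuck item is only safe if processing is idempotent, since the item may have partially completed before it stalled.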

Step 5: Generate a Summary

At the end of each run, produce a report:

| Metric | Value |
| --- | --- |
| Total processed | 95 |
| Successful | 90 |
| Failed | 3 |
| Skipped (already done) | 2 |
| Duration | 4m 32s |
| Next scheduled run | Tomorrow 6:00 AM |

Send this summary to the operations team or write it to a log.
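A summary like the one above can be built from a few counters. `RunSummary` is a hypothetical helper, and it assumes "total processed" is the sum of successful, failed, and skipped items, which matches the sample figures (90 + 3 + 2 = 95).

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RunSummary:
    successful: int
    failed: int
    skipped: int
    duration: timedelta

    @property
    def total(self) -> int:
        # Assumption: every item the run touched counts toward the total.
        return self.successful + self.failed + self.skipped

    def render(self) -> str:
        """Format the summary as plain text for a log line or a message."""
        minutes, seconds = divmod(int(self.duration.total_seconds()), 60)
        return "\n".join([
            f"Total processed: {self.total}",
            f"Successful: {self.successful}",
            f"Failed: {self.failed}",
            f"Skipped (already done): {self.skipped}",
            f"Duration: {minutes}m {seconds:02d}s",
        ])


summary = RunSummary(successful=90, failed=3, skipped=2,
                     duration=timedelta(minutes=4, seconds=32))
print(summary.render())
```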

Tips & Best Practices

Warning

Never let two instances of the same batch job run simultaneously. Use a lock mechanism: check for a running instance before starting, and always release the lock when done (even on failure).
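One simple lock mechanism is an exclusive lock file, sketched below. `single_instance` and `AlreadyRunning` are hypothetical names, and the approach assumes all instances of the job share one filesystem. A distributed setup would need a shared lock instead, such as a database row.

```python
import os
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when another instance holds the lock."""


@contextmanager
def single_instance(lock_path: str):
    """Hold an exclusive lock file for the duration of the run."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(f"lock held: {lock_path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)  # released even if the batch raised
```

Wrap the whole run in `with single_instance("/tmp/batch.lock"): ...` (the path is illustrative). Note that a hard kill of the process skips the `finally` block and leaves a stale lock file, so a production version would also need some way to detect and clear locks whose owner is gone.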

  • Idempotency is critical. If the workflow crashes mid-batch and restarts, processing the same item twice should not create duplicate data.
  • Stagger large batches. If you have 10,000 items, process them in chunks of 100 with a short pause between chunks to avoid overloading systems.
  • Monitor no-ops. If the batch runs and processes 0 items repeatedly, it might mean your queue is empty (good) or your filter query is wrong (bad). Alert if appropriate.
  • Log timing data. Track how long each batch run takes. If it's creeping upward, you'll spot capacity issues before they become outages.
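Three of these tips translate into small helpers, sketched here under stated assumptions. `record_result` assumes a hypothetical `results` table whose `item_id` column is a primary key, so a repeated write is ignored rather than duplicated. `chunked` and `NoOpMonitor` are likewise illustrative names, and the default threshold of three empty runs is arbitrary.

```python
import sqlite3
import time
from typing import Iterator, Sequence


def record_result(conn: sqlite3.Connection, item_id: int, payload: str) -> bool:
    """Idempotent write: a second call for the same item_id changes nothing."""
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO results (item_id, payload) VALUES (?, ?)",
            (item_id, payload),
        )
    return cur.rowcount == 1  # True only when a new row was written


def chunked(items: Sequence, size: int = 100,
            pause_s: float = 0.0) -> Iterator[Sequence]:
    """Yield consecutive slices of `size` items, pausing between chunks."""
    for start in range(0, len(items), size):
        if start and pause_s:
            time.sleep(pause_s)  # give downstream systems room to breathe
        yield items[start:start + size]


class NoOpMonitor:
    """Signal when `threshold` consecutive runs processed zero items."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.streak = 0

    def record(self, processed: int) -> bool:
        self.streak = self.streak + 1 if processed == 0 else 0
        return self.streak >= self.threshold
```

Whether a streak of empty runs deserves an alert depends on the queue: for a job that normally always has work, even a threshold of one may be appropriate.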
