
Janardhan Chejarla


Distributed Spring Batch Coordination, Part 2: How Database-Backed Partitioning Works

📘 Part 2: How Database-Backed Partitioning Works

In Part 1, we discussed the challenges with traditional Spring Batch scaling, especially when relying on Kafka or RabbitMQ. In this part, let's explore how we can simplify distributed coordination using a relational database as the central source of truth.


💡 Key Idea

Rather than broadcasting partition instructions via messaging middleware, the master node writes coordination state into the database. Worker nodes read from the database to discover which partitions they are responsible for, so no messaging layer is required.


βš™οΈ Core Coordination Tables

This model relies on three lightweight tables:

  • BATCH_NODES: Registers active nodes in the cluster
  • BATCH_JOB_COORDINATION: Tracks coordination for each partitioned step
  • BATCH_PARTITIONS: Stores partition metadata and execution state (assigned node, status, result)

These tables allow for real-time visibility into job execution without external queues or in-memory state.
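
To make the three tables more concrete, here is a minimal sketch of how their rows might be modeled in Java. Only the table names and the "assigned node, status, result" fields come from the description above; every other field name (`nodeId`, `lastSeen`, `stepExecutionId`, `masterNodeId`, `partitionCount`, `partitionName`) and the status values are illustrative assumptions, not the project's actual schema.

```java
import java.time.Instant;

// Hypothetical row models for the three coordination tables.
// Field names are illustrative assumptions, not the project's real columns.
final class CoordinationModel {

    // Assumed set of states a partition can be in.
    enum PartitionStatus { PENDING, RUNNING, COMPLETED, FAILED }

    // One row of BATCH_NODES: a node that has registered itself as active.
    record BatchNode(String nodeId, Instant lastSeen) {}

    // One row of BATCH_JOB_COORDINATION: one partitioned step being coordinated.
    record JobCoordination(long stepExecutionId, String masterNodeId, int partitionCount) {}

    // One row of BATCH_PARTITIONS: partition metadata plus execution state.
    record Partition(long stepExecutionId, String partitionName,
                     String assignedNode, PartitionStatus status, String result) {

        // Records are immutable, so a status change produces a new row value.
        Partition withStatus(PartitionStatus newStatus, String newResult) {
            return new Partition(stepExecutionId, partitionName, assignedNode, newStatus, newResult);
        }
    }

    private CoordinationModel() {}
}
```

Keeping the assigned node and status on the partition row itself is what lets both the master and the workers answer "who owns what, and how far along is it" with a single query.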


πŸ” Execution Flow

  1. Master node receives the job request
  2. It queries BATCH_NODES to find all currently active nodes
  3. The master assigns partitions and stores them in BATCH_PARTITIONS, using either:
    • 🌀 Round-Robin, or
    • 🎯 Fixed-Node allocation
  4. Workers poll for tasks where assigned_node = self
  5. Once complete, they update their partition status
  6. The master monitors for completion and performs final aggregation, if needed
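
The flow above can be sketched in plain Java, with an in-memory list standing in for the BATCH_PARTITIONS table. This is a simplified illustration under stated assumptions, not the project's implementation: the class, method, and field names (`CoordinationFlowSketch`, `assignRoundRobin`, `pollFor`, `allCompleted`, `PartitionRow`) and the two status values are hypothetical, and only the Round-Robin strategy is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// In-memory sketch of the coordination flow; a List stands in for BATCH_PARTITIONS.
final class CoordinationFlowSketch {

    enum Status { PENDING, COMPLETED }

    // Mutable stand-in for one BATCH_PARTITIONS row (field names are assumptions).
    static final class PartitionRow {
        final String partitionName;
        final String assignedNode;
        Status status = Status.PENDING;

        PartitionRow(String partitionName, String assignedNode) {
            this.partitionName = partitionName;
            this.assignedNode = assignedNode;
        }
    }

    // Steps 2-3: spread partitions over the active nodes in round-robin order.
    static List<PartitionRow> assignRoundRobin(List<String> activeNodes, int partitionCount) {
        if (activeNodes.isEmpty()) {
            throw new IllegalStateException("no active nodes registered");
        }
        List<PartitionRow> rows = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            String node = activeNodes.get(i % activeNodes.size());
            rows.add(new PartitionRow("partition" + i, node));
        }
        return rows;
    }

    // Step 4: a worker selects only the pending rows assigned to itself.
    static List<PartitionRow> pollFor(String self, List<PartitionRow> table) {
        return table.stream()
                .filter(r -> r.assignedNode.equals(self) && r.status == Status.PENDING)
                .collect(Collectors.toList());
    }

    // Step 6: the master treats the step as finished once every row is COMPLETED.
    static boolean allCompleted(List<PartitionRow> table) {
        return table.stream().allMatch(r -> r.status == Status.COMPLETED);
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("node-a", "node-b");
        List<PartitionRow> table = assignRoundRobin(nodes, 5);
        for (String node : nodes) {
            for (PartitionRow row : pollFor(node, table)) {
                row.status = Status.COMPLETED; // step 5: worker records completion
            }
        }
        System.out.println("all completed: " + allCompleted(table));
    }
}
```

With two nodes and five partitions, round-robin gives `node-a` partitions 0, 2, and 4 and `node-b` partitions 1 and 3. In a real deployment each of these methods would be a database read or write rather than a list operation.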

✅ This works with any JDBC-compatible database and integrates well with Kubernetes, CI/CD workflows, or container-based clusters.
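
Because the coordination state lives in ordinary tables, a worker's poll and status update can be plain JDBC calls. The sketch below shows one way that might look. It uses only standard `java.sql` APIs, but apart from `assigned_node` the column names (`partition_name`, `status`), the status strings, and the class and method names are assumptions made for illustration, not the project's actual schema or code.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical worker-side JDBC access to BATCH_PARTITIONS.
final class WorkerPollSketch {

    // Column names other than assigned_node are assumptions.
    static final String POLL_SQL =
            "SELECT partition_name FROM BATCH_PARTITIONS "
          + "WHERE assigned_node = ? AND status = ?";

    static final String COMPLETE_SQL =
            "UPDATE BATCH_PARTITIONS SET status = ? "
          + "WHERE partition_name = ? AND assigned_node = ?";

    // Returns the names of pending partitions assigned to this node.
    static List<String> pendingPartitions(Connection conn, String selfNodeId) throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(POLL_SQL)) {
            ps.setString(1, selfNodeId);
            ps.setString(2, "PENDING");
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("partition_name"));
                }
            }
        }
        return names;
    }

    // Marks one partition as completed; returns true if exactly one row changed.
    static boolean markCompleted(Connection conn, String selfNodeId, String partitionName)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(COMPLETE_SQL)) {
            ps.setString(1, "COMPLETED");
            ps.setString(2, partitionName);
            ps.setString(3, selfNodeId);
            return ps.executeUpdate() == 1;
        }
    }

    private WorkerPollSketch() {}
}
```

Including `assigned_node` in the update's `WHERE` clause means a worker can only complete partitions it actually owns, which keeps the table trustworthy as the single source of truth.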


📊 Visual Overview

Coordination Sequence


🧠 Why This Works

This architecture avoids common pitfalls like:

  • Tight coupling to messaging infrastructure
  • Delayed worker availability
  • Lack of visibility into node state
  • Hard-to-debug coordination failures

Instead, all coordination is transparent, queryable, and resilient to restarts.


🏁 What's Next

In the next part of the series, we'll cover:

🔹 Failure handling strategies
🔹 Retry logic
🔹 Node liveness and rebalancing


📘 Read Part 1 here

🌟 GitHub Repo: jchejarla/spring-batch-db-cluster-partitioning
