Part 2: How Database-Backed Partitioning Works
In Part 1, we discussed the challenges with traditional Spring Batch scaling, especially when relying on Kafka or RabbitMQ. In this part, let's explore how we can simplify distributed coordination using a relational database as the central source of truth.
Key Idea
Rather than broadcasting partition instructions via messaging middleware, the master node writes coordination state into the database. Worker nodes read from the database to discover which partitions they are responsible for, so no messaging layer is required.
Core Coordination Tables
This model relies on three lightweight tables:
- BATCH_NODES: Registers active nodes in the cluster
- BATCH_JOB_COORDINATION: Tracks coordination for each partitioned step
- BATCH_PARTITIONS: Stores partition metadata and execution state (assigned node, status, result)
These tables allow for real-time visibility into job execution without external queues or in-memory state.
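To make the three tables more concrete, here is a minimal sketch of what one row of each might look like as a Java record. Only the table names and the partition fields (assigned node, status, result) come from the description above; every other field, the status values, and the helper methods are assumptions made for illustration and may not match the actual schema in the repository.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical status values; the article only says a partition has a "status".
enum PartitionStatus { PENDING, RUNNING, COMPLETED, FAILED }

// One row of BATCH_NODES. The heartbeat column is an assumption.
record BatchNode(String nodeId, Instant lastHeartbeat) {
    // A node counts as active if its last heartbeat is within the timeout.
    boolean isActive(Instant now, Duration timeout) {
        return !lastHeartbeat.plus(timeout).isBefore(now);
    }
}

// One row of BATCH_JOB_COORDINATION. All columns here are assumptions.
record JobCoordination(long jobExecutionId, String stepName, String masterNodeId) {}

// One row of BATCH_PARTITIONS: assigned node, status, and result are named
// in the article; the identifying columns are assumptions.
record BatchPartition(long jobExecutionId, String stepName, int partitionIndex,
                      String assignedNode, PartitionStatus status, String result) {
    // Returns a copy with a new status and result, mirroring the UPDATE a
    // worker would issue when it finishes its partition.
    BatchPartition withStatus(PartitionStatus newStatus, String newResult) {
        return new BatchPartition(jobExecutionId, stepName, partitionIndex,
                assignedNode, newStatus, newResult);
    }
}
```

In a real deployment these records would be mapped to and from JDBC result sets; they are kept in memory here so the shape of the data is easy to see.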
Execution Flow
- Master node receives the job request
- It queries BATCH_NODES to find all currently active nodes
- Using either:
  - Round-Robin, or
  - Fixed-Node allocation
  the master assigns partitions and stores them in BATCH_PARTITIONS
- Workers poll for tasks where assigned_node = self
- Once complete, they update their partition status
- The master monitors for completion and performs final aggregation, if needed
This works with any JDBC-compatible database and integrates well with Kubernetes, CI/CD workflows, or container-based clusters.
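The assignment and polling steps in the flow above can be sketched in plain Java. This is an illustration under stated assumptions, not the library's implementation: the class and method names are hypothetical, the database is replaced by in-memory maps, and "Fixed-Node" is interpreted here as pinning each partition to a pre-chosen node, which may differ from how the repository defines it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names are illustrative only.
final class PartitionAssigner {

    // Round-robin: partition i goes to node (i mod number of active nodes).
    static Map<Integer, String> roundRobin(List<String> activeNodes, int partitionCount) {
        if (activeNodes.isEmpty()) {
            throw new IllegalStateException("no active nodes registered");
        }
        Map<Integer, String> assignment = new LinkedHashMap<>();
        for (int i = 0; i < partitionCount; i++) {
            assignment.put(i, activeNodes.get(i % activeNodes.size()));
        }
        return assignment;
    }

    // Fixed-node (assumed meaning): the caller pins each partition to a node,
    // and the master only verifies that every pinned node is currently active.
    static Map<Integer, String> fixedNode(Map<Integer, String> pinned, List<String> activeNodes) {
        for (Map.Entry<Integer, String> e : pinned.entrySet()) {
            if (!activeNodes.contains(e.getValue())) {
                throw new IllegalStateException(
                        "partition " + e.getKey() + " is pinned to inactive node " + e.getValue());
            }
        }
        return new LinkedHashMap<>(pinned);
    }

    // Worker side: the in-memory analogue of polling for assigned_node = self.
    static List<Integer> partitionsFor(String self, Map<Integer, String> assignment) {
        List<Integer> mine = new ArrayList<>();
        for (Map.Entry<Integer, String> e : assignment.entrySet()) {
            if (e.getValue().equals(self)) {
                mine.add(e.getKey());
            }
        }
        return mine;
    }
}
```

With two active nodes and five partitions, round-robin gives the first node partitions 0, 2, and 4 and the second node partitions 1 and 3. In the database-backed model the returned map would be written as rows in BATCH_PARTITIONS, and each worker's filter would become a query on the assigned node column.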
Visual Overview
Why This Works
This architecture avoids common pitfalls like:
- Tight coupling to messaging infrastructure
- Delayed worker availability
- Lack of visibility into node state
- Hard-to-debug coordination failures
Instead, all coordination is transparent, queryable, and resilient to restarts.
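As a rough illustration of what "queryable" can mean in practice, the sketch below summarizes partition state the way a GROUP BY over BATCH_PARTITIONS might, and shows the kind of completion check a master could run. The record, the status strings, and the method names are all assumptions for this example; the rows are held in memory rather than read over JDBC.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical, trimmed-down view of a BATCH_PARTITIONS row.
record PartitionRow(int partitionIndex, String assignedNode, String status) {}

final class CoordinationView {

    // In-memory analogue of a query such as:
    //   SELECT status, COUNT(*) FROM BATCH_PARTITIONS GROUP BY status
    // (the column name "status" is an assumption).
    static Map<String, Long> countByStatus(List<PartitionRow> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(PartitionRow::status, TreeMap::new, Collectors.counting()));
    }

    // The master can treat the step as finished once every partition
    // reports COMPLETED; an empty list is not considered finished.
    static boolean allCompleted(List<PartitionRow> rows) {
        return !rows.isEmpty()
                && rows.stream().allMatch(r -> r.status().equals("COMPLETED"));
    }
}
```

Because the state lives in ordinary rows, the same summary is available to any node, or to an operator with a SQL client, at any point during the run.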
What's Next
In the next part of the series, we'll cover:
- Failure handling strategies
- Retry logic
- Node liveness and rebalancing
Read Part 1 here
GitHub Repo: jchejarla/spring-batch-db-cluster-partitioning