Batch vs Streaming
Batch Processing
Batch means collect first, process later.
- Works on large chunks of accumulated data
- High throughput, cheaper, simpler
- Results are not real-time
- Typically minutes, hours, or days delayed
Examples:
- Daily or weekly sales reports
- End-of-day stock portfolio reconciliation
- Monthly billing cycles
- ETL pipelines that refresh a data warehouse
Use batch processing when:
- Data does not need to be acted on immediately
- A few minutes or hours of delay is acceptable
- You’re cleaning, transforming, or aggregating large datasets (a minimal sketch follows this list)
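To make the pattern concrete, here is a minimal batch-style sketch in Python: the whole accumulated dataset is read first, then aggregated in a single pass. The file name and column names are illustrative, not taken from any particular pipeline.

```python
# Minimal batch sketch: collect first, process later.
# "sales.csv" and its columns (date, amount) are hypothetical.
import csv
from collections import defaultdict

def daily_sales_report(path: str) -> dict:
    """Aggregate all accumulated sales rows into per-day totals."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["date"]] += float(row["amount"])
    return dict(totals)

if __name__ == "__main__":
    # Typically kicked off by a scheduler (cron, Airflow, etc.) once a day.
    for day, total in sorted(daily_sales_report("sales.csv").items()):
        print(day, round(total, 2))
```

The defining property is that nothing happens until the data has been collected; the schedule, not the events, drives the processing.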
Stream Processing
Streaming means process events the moment they arrive.
- Low-latency (milliseconds to seconds)
- Continuous, event-by-event processing
- Ideal for real-time analytics and alerting
- Stateful systems maintain event history or running context
Examples:
- Stock price updates
- Fraud detection for credit cards
- Real-time gaming leaderboards
- IoT sensor monitoring
Use streaming when:
- You need instant reactions
- Delays cause risk, loss, or bad UX (a minimal event loop is sketched below)
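The loop below is a minimal sketch of the streaming idea: each event is handled the moment it arrives, and a small piece of running state (a per-user counter here) is kept across events. The generator stands in for a real source such as a Kafka consumer or a socket.

```python
# Minimal streaming sketch: react to each event as it arrives.
# The event source is a stand-in for a real consumer (Kafka, socket, etc.).
import time
from collections import Counter

def event_source():
    """Yield events one at a time, as they 'arrive'."""
    for user in ["alice", "bob", "alice", "alice", "carol"]:
        yield {"user": user, "ts": time.time()}
        time.sleep(0.1)

running_counts = Counter()  # stateful: running context across events

for event in event_source():
    running_counts[event["user"]] += 1       # update state per event
    if running_counts[event["user"]] >= 3:   # react immediately (alerting)
        print(f"alert: {event['user']} is unusually active")
```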
Micro-Batch
Micro-batching groups incoming events into tiny batches and processes each mini-batch as a unit, giving near real-time outputs without true event-by-event streaming.
Micro-batch is neither full streaming nor full batch.
It’s a hybrid model where data is processed in very small batches at very short intervals (usually 100 ms to a few seconds).
In other words: batch processing, but done so frequently that it feels like streaming.
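A minimal sketch of that idea (the window length and event source are illustrative): events are buffered for a short, fixed interval, and each small buffer is then processed as a unit, which is roughly how micro-batch engines such as Spark Structured Streaming behave.

```python
# Micro-batch sketch: group a stream of events into tiny time-based batches.
import time

def micro_batches(event_iter, window_seconds: float = 1.0):
    """Yield lists of events, one list per short time window."""
    batch, deadline = [], time.monotonic() + window_seconds
    for event in event_iter:
        batch.append(event)
        if time.monotonic() >= deadline:   # window closed: emit it as one unit
            yield batch
            batch, deadline = [], time.monotonic() + window_seconds
    if batch:                              # flush the last partial window
        yield batch

# Each tiny batch is then processed like a normal batch job, just very often:
# for batch in micro_batches(source(), window_seconds=1.0):
#     process(batch)
```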
Example: Real-Time vs Micro-Batch
Credit Card Fraud Detection (Real-Time)
Fraud scoring must happen event by event or, at worst, with sub-second latency.
- The bank must decide immediately: approve or decline
- Customer is standing at a checkout counter
- Delay = blocked transaction or fraud slipping through
- Regulatory requirements often demand immediate response
Credit Card Payment Posting (Micro-Batch)
When a customer makes a payment toward their balance (online, app, ACH, etc.), updating the backend systems does not require millisecond-level latency.
Even if the balance updates with a one-minute delay (see the sketch below):
- No fraud risk
- No UX problem
- No operational impact
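A sketch of the contrast (function names, the toy rule, and the flush interval are all hypothetical): fraud scoring answers synchronously, per event, while payment posting just accumulates and is flushed by a short timer.

```python
# Hypothetical contrast between the two paths described above.
import time

def score_transaction(txn: dict) -> str:
    """Real-time path: the answer must come back while the customer waits."""
    return "decline" if txn["amount"] > 5_000 else "approve"   # toy rule

pending_payments: list = []

def record_payment(payment: dict) -> None:
    """Micro-batch path: just buffer the event; a minute of delay is fine."""
    pending_payments.append(payment)

def post_pending_payments() -> None:
    """Runs on a timer (e.g. every 60 s) and posts the whole buffer at once."""
    global pending_payments
    batch, pending_payments = pending_payments, []
    print(f"posted {len(batch)} payments at {time.strftime('%X')}")
```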
+------------------------------+
|          STREAMING           |
|  Event → Process → Output    |
|  Latency: milliseconds       |
+------------------------------+
+------------------------------+
|         MICRO-BATCH          |
|  Tiny windows → Process      |
|  Latency: 0.5–10 seconds     |
+------------------------------+
+------------------------------+
|            BATCH             |
|  Accumulate → Process        |
|  Latency: minutes–hours      |
+------------------------------+
Why Redis Pub/Sub is NOT real streaming
Redis Pub/Sub is often mistaken for “streaming”, but:
- Messages are not persisted
- No replay capability
- No consumer groups
- No fault tolerance
- If a subscriber is offline when a message is published, that message is gone (see the sketch below)
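A minimal sketch with the redis-py client (assuming a Redis server on localhost) shows the fire-and-forget behaviour: anything published before a subscriber attaches, or while it is offline, is simply never delivered and never stored.

```python
# Fire-and-forget behaviour of Redis Pub/Sub (assumes a local Redis server
# and the redis-py client: pip install redis).
import redis

r = redis.Redis(host="localhost", port=6379)

# Published with no subscriber attached: delivered to nobody, never persisted.
r.publish("alerts", "missed forever")

p = r.pubsub()
p.subscribe("alerts")                  # only messages published from now on arrive

r.publish("alerts", "hello, subscriber")

for message in p.listen():             # blocking loop over live messages
    if message["type"] == "message":   # skip the initial 'subscribe' confirmation
        print(message["data"])         # b'hello, subscriber'
        break
```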
Use cases
- Lightweight notifications
- Chat-like message passing
- Ephemeral real-time signals
Not suitable for
- Analytics
- Compliance or auditing
- Durable event logs
- Replaying data
- Multi-consumer systems