Types of Streaming

Stateless Streaming

  • Processes each record independently
  • No memory of previous events
  • Simple transformations and filtering
  • Highly scalable, since records can be spread across workers without coordination

Examples of Stateless

  • Unit conversion (Celsius to Fahrenheit) for each reading
  • Data validation (checking if temperature is within realistic range)
  • Simple transformations (rounding values)
  • Filtering (removing invalid readings)
  • Basic alerting (if current temperature exceeds threshold)
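
The examples above can be sketched as one function that looks only at the current reading. This is a minimal illustration, not a production pipeline: the range limits, the alert threshold, and the names `process_reading` and `process_stream` are assumptions made up for this sketch.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical limits for a "realistic" reading; adjust for the actual sensor.
MIN_C, MAX_C = -90.0, 60.0
ALERT_F = 100.0  # hypothetical alert threshold in Fahrenheit


def process_reading(celsius: float) -> Optional[dict]:
    """Handle one reading using only that reading (no stored state)."""
    if not (MIN_C <= celsius <= MAX_C):
        return None  # validation and filtering: drop unrealistic values
    fahrenheit = round(celsius * 9 / 5 + 32, 1)  # unit conversion and rounding
    return {"fahrenheit": fahrenheit, "alert": fahrenheit > ALERT_F}


def process_stream(readings: Iterable[float]) -> Iterator[dict]:
    """Apply process_reading to each record independently."""
    for celsius in readings:
        result = process_reading(celsius)
        if result is not None:
            yield result
```

Because no call depends on an earlier one, the readings could be split across any number of workers and the results would be the same.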

When to Use:

  • You only need to process current readings
  • Simple transformations are sufficient
  • Horizontal scaling is important
  • Memory resources are limited

Stateful Streaming

  • Maintains state across events
  • Enables complex processing like windowing and aggregations
  • Requires state management strategies (for example, checkpointing so state survives a restart)
  • Good for pattern detection and trend analysis

Examples of Stateful

  • Calculating moving averages of temperature
  • Detecting temperature trends over time
  • Computing daily min/max temperatures
  • Identifying temperature patterns
  • Calculating rate of temperature change
  • Detecting anomalies based on historical patterns
  • Flagging suspicious financial activity that deviates from an account's usual behavior
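
A moving average is perhaps the smallest example of state: each result depends on earlier readings that must be kept between events. The sketch below assumes a fixed-size window of recent readings; the class name `MovingAverage` and the window size are illustrative choices, not part of any particular framework.

```python
from collections import deque


class MovingAverage:
    """Keeps the last `size` readings as state between events."""

    def __init__(self, size: int) -> None:
        # deque with maxlen discards the oldest reading automatically
        self.window = deque(maxlen=size)

    def update(self, value: float) -> float:
        """Add a new reading and return the average of the stored window."""
        self.window.append(value)
        return sum(self.window) / len(self.window)
```

A usage example: with `ma = MovingAverage(3)`, calling `ma.update(...)` once per incoming temperature returns the average of up to the three most recent readings. Unlike the stateless case, the `window` contents must live somewhere between events, which is why stateful engines need a state management strategy.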

When to Use:

  • You need historical context
  • You are analyzing patterns or trends
  • You are computing moving averages
  • You are detecting anomalies
  • Time-window based analysis is required
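
Time-window analysis can be illustrated with one-day windows that track the minimum and maximum temperature. This is a simplified sketch under stated assumptions: events arrive as hypothetical `(timestamp, temperature)` pairs, the function name `daily_min_max` is invented here, and all state is held in an in-memory dictionary, whereas a real engine would also decide when a window is closed and its state can be discarded.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple


def daily_min_max(
    events: Iterable[Tuple[datetime, float]],
) -> Dict[str, Tuple[float, float]]:
    """Group readings into one-day windows and track (min, max) per day."""
    state: Dict[str, Tuple[float, float]] = {}
    for timestamp, temperature in events:
        day = timestamp.date().isoformat()  # window key, e.g. "2025-01-01"
        low, high = state.get(day, (temperature, temperature))
        state[day] = (min(low, temperature), max(high, temperature))
    return state
```

Keying the state by day means a late reading still updates the correct window, at the cost of keeping every day's entry in memory until something removes it.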

Different Ingestion Services

Stream Processing Frameworks:

Structured Streaming (Databricks/Apache Spark)

  • A processing framework for handling streaming data
  • Part of the Apache Spark ecosystem

Message Brokers/Event Streaming Platforms:

Apache Kafka (Open Source)

  • Distributed event streaming platform
  • Self-managed

Amazon MSK

  • Managed Apache Kafka service
  • AWS provisions and operates the Kafka clusters for you

Amazon Kinesis

  • AWS native streaming service
  • Not Kafka-based; it has its own API and tooling

Azure Event Hubs

  • Cloud-native event streaming service
  • Azure’s equivalent to Kafka

Ver 5.5.3
Last change: 2025-10-15