In today’s digitally driven world, processing streaming data in real time is a requirement for business success.
The introduction of 5G networks will only increase the data volume and speed requirements that are already putting pressure on traditional data architectures.
Organizations need to ingest this unprecedented increase in data traffic while also driving actions by making intelligent, dynamic decisions across multiple data streams. Current data stream processing architectures are usually sufficient to act as processing pipelines, but they do not meet the needs of mission-critical applications, which demand low latency and responsive multi-step decisions. In addition, with a projected density of 1 million connected things per sq. km and a prescribed latency in single-digit milliseconds, data and processing are going to be decentralised across several edge data centres, as opposed to the traditional few central hub data centres.
A confluence of incomplete information is coming into play, and traditional choices for processing streaming data, along with many contemporary ones, are going to fail. For interactive low-latency applications and streaming pipelines to coexist, they must use the same data to drive cross-functional consistency.
The top four pieces of incomplete information are:
1. Microservices architecture mandates separation of state and logic.
What’s missing is an understanding of the types of business logic and where each type should live. While the application flow control logic can stay in the application layer, thus making the compute containers truly stateless, the data-driven business logic must exist with the data.
2. Network bandwidth usage efficiency.
When the state is stored in a NoSQL data store and the container instance has to move 10 to 25 kilobytes of data payload per interaction in both directions (i.e. read the object from the store, modify it and send it back to the data store), the application quickly starts to consume large amounts of network bandwidth. In a virtualized or containerised world, network resources are like gold. One should not squander them on frivolous data movements.
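To put rough numbers on this, here is a back-of-the-envelope sketch. The 10 to 25 kilobyte payload comes from the text above. The interaction rate (100,000 per second) and the size of a small request and response sent to logic running next to the data (200 and 100 bytes) are purely illustrative assumptions, not measurements.

```python
def round_trip_mbps(payload_kb, interactions_per_sec):
    """Read-modify-write from the application tier: the full object
    crosses the network twice per interaction (read, then write back)."""
    bytes_per_sec = 2 * payload_kb * 1000 * interactions_per_sec
    return bytes_per_sec * 8 / 1_000_000  # megabits per second


def in_place_mbps(request_bytes, response_bytes, interactions_per_sec):
    """Logic runs where the data lives: only a small request and a small
    response cross the network (sizes are assumed, not measured)."""
    bytes_per_sec = (request_bytes + response_bytes) * interactions_per_sec
    return bytes_per_sec * 8 / 1_000_000  # megabits per second


RATE = 100_000  # assumed interactions per second

low = round_trip_mbps(10, RATE)         # 10 KB objects
high = round_trip_mbps(25, RATE)        # 25 KB objects
local = in_place_mbps(200, 100, RATE)   # assumed 200 B request, 100 B reply

print(f"round trip: {low:,.0f} to {high:,.0f} Mbps; in place: {local:,.0f} Mbps")
```

Under these assumptions the read-modify-write pattern needs between 16,000 and 40,000 Mbps, against 240 Mbps when only the request and the decision travel over the network. The exact figures depend entirely on the assumed rate and message sizes, but the gap shows why it matters where the data-driven logic runs.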
3. Fundamental premise of stream processing.
Stream processing today is based on one of the time windowing concepts: the event time window or the processing time window. This is not truly representative of reality. Organisations need continuous processing of events as they arrive, either individually or contextually. This approach avoids problems such as events being missed because they arrived late, without having to bloat the database while waiting for a known last event that arrives late.
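The difference can be illustrated with a minimal sketch. It is a deliberately simplified model, not the behaviour of any particular streaming engine: the watermark is assumed to be the largest event time seen so far, and all function names and the sample events are hypothetical.

```python
from collections import defaultdict


def windowed_counts(events, window_size, allowed_lateness=0):
    """Count events per (key, tumbling event-time window).

    `events` is a list of (event_time, key) tuples in ARRIVAL order.
    An event is dropped if its window has already closed, i.e. the window
    end plus the allowed lateness is at or behind the watermark.
    """
    counts = defaultdict(int)
    dropped = []
    watermark = float("-inf")  # simplistic: max event time seen so far
    for event_time, key in events:
        watermark = max(watermark, event_time)
        window_start = (event_time // window_size) * window_size
        if window_start + window_size + allowed_lateness <= watermark:
            dropped.append((event_time, key))  # arrived after window closed
            continue
        counts[(key, window_start)] += 1
    return dict(counts), dropped


def continuous_counts(events):
    """Update per-key state as each event arrives; nothing is dropped."""
    state = defaultdict(int)
    for _event_time, key in events:
        state[key] += 1
    return dict(state)


# The last event carries event time 3 but arrives after event time 12.
events = [(1, "sensor-7"), (4, "sensor-7"), (12, "sensor-7"), (3, "sensor-7")]

strict_counts, strict_dropped = windowed_counts(events, window_size=10)
lenient_counts, lenient_dropped = windowed_counts(events, 10, allowed_lateness=5)
per_event = continuous_counts(events)
```

With no allowed lateness, the windowed version drops the late event. Raising `allowed_lateness` keeps it, but only by holding window state open for longer. The per-event version simply folds every arrival into the entity's current state.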
4. Multiple streams of data get cross-polled to build complex events that drive decisions.
An event-driven architecture is a stream of messages, each tied to an event that drives some action. The challenge organisations face is building complex events from multiple streams of data, or handling a single stream of data that drives changes to multiple state machines based on complex business logic.
A smart stream processing architecture allows one to:
- Ingest incoming event data into a state machine
- Build a contextual entity state from multiple streams of ingestion
- Apply a set of business rules to drive decisions
- Iteratively enhance and enrich these rules by incorporating new learnings derived from machine learning initiatives
- Let the decisions propagate to drive actions
- Migrate the contextually completed/processed data to an archival store once it is no longer needed for real-time processing
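The steps above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the platform's actual API: the class, the stream names (`payments`, `logins`, `sessions`), the event fields and the example rule are all hypothetical, and plain in-memory collections stand in for the live store and the archival store.

```python
from dataclasses import dataclass, field


@dataclass
class EntityState:
    """Contextual state for one entity, built from several streams."""
    entity_id: str
    total_spend: float = 0.0
    failed_logins: int = 0
    closed: bool = False
    fired: set = field(default_factory=set)  # rules that already fired


class SmartStreamProcessor:
    def __init__(self, rules):
        self.rules = list(rules)  # (name, predicate) pairs
        self.live = {}            # entity_id -> EntityState (real-time state)
        self.archive = []         # stand-in for an archival store
        self.actions = []         # decisions propagated downstream

    def update_rules(self, rules):
        """Swap in refined rules, e.g. thresholds learned offline."""
        self.rules = list(rules)

    def ingest(self, stream, event):
        # 1. Ingest the event into the entity's state.
        entity_id = event["entity_id"]
        state = self.live.setdefault(entity_id, EntityState(entity_id))
        # 2. Build contextual state from multiple streams.
        if stream == "payments":
            state.total_spend += event["amount"]
        elif stream == "logins" and not event["success"]:
            state.failed_logins += 1
        elif stream == "sessions" and event.get("type") == "close":
            state.closed = True
        # 3. Apply business rules; 4. propagate each decision once.
        for name, predicate in self.rules:
            if name not in state.fired and predicate(state):
                state.fired.add(name)
                self.actions.append((entity_id, name))
        # 5. Migrate completed state out of the real-time store.
        if state.closed:
            self.archive.append(self.live.pop(entity_id))


processor = SmartStreamProcessor(
    [("flag_for_review", lambda s: s.failed_logins >= 2 and s.total_spend > 100)]
)
processor.ingest("logins", {"entity_id": "u1", "success": False})
processor.ingest("logins", {"entity_id": "u1", "success": False})
processor.ingest("payments", {"entity_id": "u1", "amount": 150.0})
processor.ingest("sessions", {"entity_id": "u1", "type": "close"})
```

In this run the rule fires once, on the payment event, because only then does the combined state from the login and payment streams satisfy it. The closing session event then moves the entity from the live store to the archive. Keeping rules as data rather than hard-coded branches is one simple way to let refined rules replace older ones without touching the ingestion path.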
The Smart Stream Processing Architecture consists of one unified environment for ingestion, processing, and storage. This integrated approach, with built-in intelligence, performs the analysis right where the data resides. It utilizes a blazing fast In-Memory Relational Data Processing Platform (IMRDPP) not only to make streaming “smart”, but also to provide linear scale, predictable low latency, strict ACID compliance, and a much lower hardware footprint that can easily be deployed at the edge. With built-in analytical capabilities such as aggregations, filtering, sampling and correlation, along with stored procedures / embedded supervised and unsupervised machine learning, all the essentials of real-time decision-oriented stream processing are available in one integrated platform.
To learn more, including 5 critical benefits of smart stream processing architecture, check out “The Evolution of the Smart Stream Processing Architecture”, our recent whitepaper.
Editor’s note: This is an excerpt of a piece that was originally published by ITProPortal on May 8, 2019. Read it in its original form here.