Datastream Module

Datastream Module is useful for intra- and inter-server communication and asynchronous data processing. It is an important building block for other DataKernel modules.

You can add Datastream module to your project by inserting dependency in pom.xml:


DataStream is:

  • Modern implementation of async reactive streams (unlike streams in Java 8 and traditional thread-based blocking streams)
  • Asynchronous with extremely efficient congestion control, to handle natural imbalance in speed of data sources
  • Composable stream operations (mappers, reducers, filters, sorters, mergers/splitters, compression, serialization)
  • Stream-based network and file I/O on top of Eventloop module
  • Compatibility with CSP module

Datastream has a lot in common with CSP module. Although they both were designed for I/O processing, there are several important distinctions:

  Datastream CSP
Overhead: Extremely low: stream can be started with 1 virtual call, short-circuit evaluation optimizes performance No short-circuit evaluation, overhead is higher
Throughput speed: Extremely fast Fast, but slower than Datastream
Optimized for: Small pieces of data Medium-sized objects, ByteBufs
Programming model: More complicated Simple and convenient

To provide maximum efficiency, our framework widely utilizes combinations of CSP and Datastream. For this purpose, ChannelSupplier, ChannelConsumer, StreamSupplier and StreamConsumer have transformWith() methods and special Transformer interfaces. Using them, you can seamlessly transform channels into other channels or datastreams and vice versa, creating chains of such transformations.

You can explore more Datastream examples here

This module on GitHub repository