Streams: Reduction Operations in Java

The reduce method is a general mechanism for computing a value from a stream. The simplest form takes a binary function and keeps applying it, starting with the first two elements. It’s easy to explain this if the function is the sum:

List<Integer> values = …;

Optional<Integer> sum = values.stream().reduce((x, y) -> x + y);

In this case, the reduce method computes v0 + v1 + v2 + . . . , where vi are the stream elements. The method returns an Optional because there is no valid result if the stream is empty.
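The behavior described above can be tried out with a small program. This is a minimal sketch, not code from the book: the class name OptionalSumDemo and the helper method sum are hypothetical names chosen for illustration, and List.of assumes Java 9 or later.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical demo class; the name is not from the original text.
class OptionalSumDemo {
    // Sums the elements with the one-argument form of reduce.
    static Optional<Integer> sum(List<Integer> values) {
        return values.stream().reduce((x, y) -> x + y);
    }

    public static void main(String[] args) {
        // Non-empty stream: the Optional holds 1 + 2 + 3 + 4.
        System.out.println(sum(List.of(1, 2, 3, 4))); // prints Optional[10]
        // Empty stream: there is nothing to combine, so the Optional is empty.
        System.out.println(sum(List.of()));           // prints Optional.empty
    }
}
```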

More generally, you can use any operation that combines a partial result x with the next value y to yield a new partial result.

Here is another way of looking at reductions. Given a reduction operation op, the reduction yields v0 op v1 op v2 op . . . , where vi op vi+1 denotes the function call op(vi, vi+1). Many operations are useful in practice, such as sum, product, string concatenation, maximum and minimum, and set union or intersection.
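A few of the operations listed above might look as follows when passed to reduce. This is only a sketch: the class ReductionOpsDemo and its method names are hypothetical, and each method uses the one-argument form of reduce, so it returns an Optional.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical demo class; names are chosen for illustration only.
class ReductionOpsDemo {
    // Product: v0 * v1 * v2 * ...
    static Optional<Integer> product(List<Integer> values) {
        return values.stream().reduce((x, y) -> x * y);
    }

    // Maximum: Integer.max serves as the binary operation.
    static Optional<Integer> max(List<Integer> values) {
        return values.stream().reduce(Integer::max);
    }

    // String concatenation: v0 + v1 + v2 + ...
    static Optional<String> concat(List<String> words) {
        return words.stream().reduce((x, y) -> x + y);
    }

    public static void main(String[] args) {
        System.out.println(product(List.of(2, 3, 4)));      // prints Optional[24]
        System.out.println(max(List.of(2, 9, 4)));          // prints Optional[9]
        System.out.println(concat(List.of("a", "b", "c"))); // prints Optional[abc]
    }
}
```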

If you want to use reduction with parallel streams, the operation must be associative: it shouldn’t matter how the elements are grouped when you combine them. In math notation, (x op y) op z must be equal to x op (y op z). An example of an operation that is not associative is subtraction. For example, (6 - 3) - 2 ≠ 6 - (3 - 2): the left side is 1, the right side is 5.
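The associativity requirement can be checked by hand for a given triple of values. The sketch below does this for addition and subtraction, then compares a sequential and a parallel sum. The class AssociativityDemo and the helper groupsAgree are hypothetical names. Checking one triple shows that subtraction fails the requirement, but it does not prove that an operation is associative in general.

```java
import java.util.List;
import java.util.function.BinaryOperator;

// Hypothetical demo class; names are chosen for illustration only.
class AssociativityDemo {
    // Returns true if (x op y) op z equals x op (y op z) for these three values.
    static boolean groupsAgree(BinaryOperator<Integer> op, int x, int y, int z) {
        int left = op.apply(op.apply(x, y), z);
        int right = op.apply(x, op.apply(y, z));
        return left == right;
    }

    public static void main(String[] args) {
        // Addition: (6 + 3) + 2 and 6 + (3 + 2) are both 11.
        System.out.println(groupsAgree((x, y) -> x + y, 6, 3, 2)); // prints true
        // Subtraction: (6 - 3) - 2 is 1, but 6 - (3 - 2) is 5.
        System.out.println(groupsAgree((x, y) -> x - y, 6, 3, 2)); // prints false

        // Because addition is associative, a parallel reduction yields
        // the same result as a sequential one.
        List<Integer> values = List.of(6, 3, 2);
        System.out.println(values.stream().reduce(Integer::sum)
            .equals(values.parallelStream().reduce(Integer::sum))); // prints true
    }
}
```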

Often, there is an identity e such that e op x = x, and that element can be used as the start of the computation. For example, 0 is the identity for addition, and you can use the second form of reduce:

List<Integer> values = . . .;

Integer sum = values.stream().reduce(0, (x, y) -> x + y);

// Computes 0 + v0 + v1 + v2 + . . .

The identity value is returned if the stream is empty, and you no longer need to deal with the Optional class.
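As a small illustration of the identity form, the following sketch sums a non-empty and an empty list. The class IdentitySumDemo and the method sum are hypothetical names, not taken from the book.

```java
import java.util.List;

// Hypothetical demo class; names are chosen for illustration only.
class IdentitySumDemo {
    // Sums the elements, starting from the identity 0.
    static int sum(List<Integer> values) {
        return values.stream().reduce(0, (x, y) -> x + y);
    }

    public static void main(String[] args) {
        System.out.println(sum(List.of(1, 2, 3))); // prints 6
        // Empty stream: the identity value is returned, with no Optional involved.
        System.out.println(sum(List.of()));        // prints 0
    }
}
```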

Now suppose you have a stream of objects and want to form the sum of some property, such as lengths in a stream of strings. You can’t use the simple form of reduce. It requires a function (T, T) -> T, with the same types for the arguments and the result, but in this situation you have two types: The stream elements have type String, and the accumulated result is an integer. There is a form of reduce that can deal with this situation.

First, you supply an “accumulator” function (total, word) -> total + word.length(). That function is called repeatedly, forming the cumulative total. But when the computation is parallelized, there will be multiple computations of this kind, and you need to combine their results. You supply a second function for that purpose. The complete call is

int result = words.reduce(0,
    (total, word) -> total + word.length(),
    (total1, total2) -> total1 + total2);
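The three-argument call can be placed in a complete program like the sketch below. It assumes that the words come from a List<String>. The class TotalLengthDemo and the method names are hypothetical. The parallel variant is included only to show where the combiner function comes into play.

```java
import java.util.List;

// Hypothetical demo class; names are chosen for illustration only.
class TotalLengthDemo {
    // Sums the lengths of all strings with the three-argument form of reduce.
    static int totalLength(List<String> words) {
        return words.stream().reduce(0,
            (total, word) -> total + word.length(),  // accumulator: (Integer, String) -> Integer
            (total1, total2) -> total1 + total2);    // combiner: merges two partial totals
    }

    // Same reduction on a parallel stream; partial totals are merged by the combiner.
    static int totalLengthParallel(List<String> words) {
        return words.parallelStream().reduce(0,
            (total, word) -> total + word.length(),
            (total1, total2) -> total1 + total2);
    }

    public static void main(String[] args) {
        List<String> words = List.of("to", "be", "or", "not");
        System.out.println(totalLength(words));         // prints 9
        System.out.println(totalLengthParallel(words)); // prints 9
    }
}
```

Both variants give the same total because the accumulator and combiner are built on addition, which is associative, and 0 is its identity.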

Source: Horstmann Cay S. (2019), Core Java. Volume II – Advanced Features, Pearson; 11th edition.
