Java 8 Streams
Bhaskar S | 10/24/2014
A typical Java application will invariably use some form of Collection class to iterate over a sequence of data elements and perform operation(s) on each of them.
Imagine we had a Collection with a large number of data elements. Iterating over them and processing them one by one on a single thread can be slow, because only one CPU core does the work. A more efficient approach would be to slice up the large Collection into smaller chunks and process each chunk in a separate thread. However, writing multi-threaded code by hand is complex and error-prone.
What if Java provided an out-of-the-box capability to iterate over and process a large Collection of data elements in parallel, without our having to write a single line of multi-threaded code?
This is exactly what Java 8 Streams does!
In other words, Streams is a new capability in Java 8 that can perform operation(s) on a Collection of data elements, either in sequential or in parallel mode.
Without further ado, let's jump right into some examples to illustrate the power of Streams.
The following is a simple program that filters names beginning with the letter "j" using the traditional Java language constructs:
Executing the program from Listing.1 will generate the following output:
<1> Checking n = alice
<1> Checking n = bob
<1> Checking n = charlie
<1> Checking n = joe
<1> Starts with j: joe
<1> Checking n = john
<1> Starts with j: john
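The traditional approach described above can be sketched roughly as follows. This is only a minimal illustration and not the original Listing.1: the class name LoopFilter and the method namesStartingWith are hypothetical, and the input names are taken from the output shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the original Listing.1)
class LoopFilter {
    // Walks the array with a plain for-each loop and keeps the names
    // that start with the given prefix.
    static List<String> namesStartingWith(String[] names, String prefix) {
        List<String> matches = new ArrayList<>();
        for (String n : names) {
            System.out.println("<1> Checking n = " + n);
            if (n.startsWith(prefix)) {
                System.out.println("<1> Starts with " + prefix + ": " + n);
                matches.add(n);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Names as they appear in the output above
        String[] names = { "alice", "bob", "charlie", "joe", "john" };
        namesStartingWith(names, "j");
    }
}
```

Notice that the iteration and the filtering logic are interleaved in the loop body; we spell out both what to do and how to walk the data.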
The following is the same simple program that filters names beginning with the letter "j", this time using Java 8 Streams:
Executing the program from Listing.2 will generate the following output:
<2> Checking n = alice
<2> Checking n = bob
<2> Checking n = charlie
<2> Checking n = joe
<2> Starts with j: joe
<2> Checking n = john
<2> Starts with j: john
The code in Listing.2 needs a little explanation.
Arrays.stream() :: Returns a sequential stream of elements for the specified array of String as the source
Stream.filter(Predicate) :: Returns a stream consisting of only those elements that match the specified Predicate
Stream.forEach(Consumer) :: Invokes the specified Consumer on each of the elements in the stream
The code in Listing.2 is far more elegant compared to Listing.1.
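To make the three operations above concrete, here is a minimal sketch of how they might be chained. It is not the original Listing.2: the class name StreamFilter and the method forEachNameStartingWith are hypothetical, and the action parameter is our own addition so that the caller decides what happens to each match.

```java
import java.util.Arrays;
import java.util.function.Consumer;

// Hypothetical sketch (not the original Listing.2)
class StreamFilter {
    // Streams over the array, keeps the names that start with the prefix,
    // and hands each surviving name to the supplied action.
    static void forEachNameStartingWith(String[] names, String prefix, Consumer<String> action) {
        Arrays.stream(names)                        // sequential stream over the array
              .filter(n -> {                        // Predicate: keep matching names only
                  System.out.println("<2> Checking n = " + n);
                  return n.startsWith(prefix);
              })
              .forEach(action);                     // Consumer: invoked once per match
    }

    public static void main(String[] args) {
        // Names as they appear in the output above
        String[] names = { "alice", "bob", "charlie", "joe", "john" };
        forEachNameStartingWith(names, "j", n -> System.out.println("<2> Starts with j: " + n));
    }
}
```

The interleaving in the output above (each name is checked and, if it matches, consumed before the next name is checked) reflects that a stream pushes one element at a time through the whole pipeline rather than filtering everything first.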
The following is a simple program which demonstrates the various operations on a stream of numbers such as counting, finding the distinct numbers, finding the sum, average, min, and max:
Executing the program from Listing.3 will generate the following output:
<1> Count of all numbers: 16
<2> Count of all numbers: 0
<1> Count of distinct numbers: 9
<2> Count of distinct numbers: 0
<1> Sum of all numbers: 88
<2> Sum of all numbers: 0
<1> Average number: 5.5
<1> Min number: 1
<1> Max number: 9
The code in Listing.3 needs a little explanation.
Stream.count() :: Returns the count of elements in the stream. The count is zero for an empty stream
Stream.distinct() :: Returns a stream consisting of distinct elements from this stream
Stream.mapToInt(ToIntFunction) :: Returns an IntStream by applying the specified ToIntFunction to each of the elements in this stream. An IntStream is a stream of ints that supports sequential and parallel aggregate operations such as count, sum, average, min, and max
IntStream.sum() :: Returns the sum of all the elements in this stream
IntStream.average() :: Returns an OptionalDouble that contains the arithmetic mean of all the elements in this stream (if not empty)
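Stream.min(Comparator) :: Returns an Optional that contains the minimum element of this stream according to the specified Comparator (empty if the stream is empty)
Stream.max(Comparator) :: Returns an Optional that contains the maximum element of this stream according to the specified Comparator (empty if the stream is empty)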
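The operations above can be tried out with a small sketch like the one below. It is not the original Listing.3: the class NumberStats and its helper methods are hypothetical, and the sample values are an assumption on our part (16 numbers chosen to be consistent with the count, sum, and average reported above).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.OptionalDouble;

// Hypothetical sketch (not the original Listing.3)
class NumberStats {
    static long count(List<Integer> nums) { return nums.stream().count(); }

    static long distinctCount(List<Integer> nums) { return nums.stream().distinct().count(); }

    // mapToInt converts Stream<Integer> into an IntStream so that sum() is available
    static int sum(List<Integer> nums) { return nums.stream().mapToInt(Integer::intValue).sum(); }

    // OptionalDouble is empty when the stream has no elements
    static OptionalDouble average(List<Integer> nums) {
        return nums.stream().mapToInt(Integer::intValue).average();
    }

    // min and max return an Optional, which is empty for an empty stream
    static Optional<Integer> min(List<Integer> nums) { return nums.stream().min(Integer::compare); }

    static Optional<Integer> max(List<Integer> nums) { return nums.stream().max(Integer::compare); }

    public static void main(String[] args) {
        // Assumed sample data, not taken from the original listing
        List<Integer> numbers = Arrays.asList(2, 4, 6, 8, 8, 8, 6, 4, 1, 3, 5, 7, 7, 7, 9, 3);
        List<Integer> empty = Collections.emptyList();

        System.out.println("<1> Count of all numbers: " + count(numbers));
        System.out.println("<2> Count of all numbers: " + count(empty));
        System.out.println("<1> Count of distinct numbers: " + distinctCount(numbers));
        System.out.println("<1> Sum of all numbers: " + sum(numbers));
        average(numbers).ifPresent(a -> System.out.println("<1> Average number: " + a));
        min(numbers).ifPresent(m -> System.out.println("<1> Min number: " + m));
        max(numbers).ifPresent(m -> System.out.println("<1> Max number: " + m));
    }
}
```

Returning Optional and OptionalDouble from the helpers, rather than unwrapping them, keeps the empty-stream case explicit for the caller.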
Continuing our journey on Streams, the following is a simple program that demonstrates the use of operations such as map, flatMap, sorted, and collect on streams:
Executing the program from Listing.4 will generate the following output:
[9, 33, 73, 129, 129, 129, 73, 33]
[1, 3, 5, 7, 7, 7, 9, 3]
[2, 4, 6, 8, 8, 8, 6, 4] :: [1, 3, 5, 7, 7, 7, 9, 3]
2, 4, 6, 8, 8, 8, 6, 4, 1, 3, 5, 7, 7, 7, 9, 3
The code in Listing.4 needs a little explanation.
Stream.map(Function) :: Returns a stream of elements after applying the specified transformation Function to each of the elements in this stream.
map(n -> 2*n*n + 1) takes each element n (which is a number) from the stream and returns twice the square of n, plus 1. For example, n = 2 yields 9
map(Object::toString) takes each element from the stream and converts it into a String
Stream.collect(Collector) :: A terminal operation that returns the result of performing the specified accumulation operation on all elements of this stream.
Collectors is a utility class that provides various common implementations for the Collector interface such as accumulating elements of a stream into a collection such as a List or concatenating elements of a stream into a String
Collectors.toList() :: Returns a Collector that accumulates the elements of a stream into a List
Collectors.joining(CharSequence) :: Returns a Collector that concatenates the elements of a stream, separated by the specified delimiter sequence, into a String
Stream.sorted() :: Returns a stream consisting of the elements of this stream, sorted according to their natural order
Stream.flatMap(Function) :: Returns a stream of elements by converting each element in this stream into a sub-stream and then merging the elements from all the sub-streams.
In our example, flatMap takes elements from the two Lists, numbers1 and numbers2, and returns a stream of all the numbers from flattening and merging the two Lists
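The sketch below shows one way the operations described above might fit together. It is not the original Listing.4: the class MapFlatMapSketch and its method names are hypothetical, while the two input lists are the ones visible in the output above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch (not the original Listing.4)
class MapFlatMapSketch {
    // map: transforms each n into 2*n*n + 1, then collects the results into a List
    static List<Integer> transform(List<Integer> nums) {
        return nums.stream()
                   .map(n -> 2 * n * n + 1)
                   .collect(Collectors.toList());
    }

    // flatMap: turns a stream of two Lists into one stream of their elements,
    // then joins the elements into a comma-separated String
    static String flattenAndJoin(List<Integer> first, List<Integer> second) {
        return Stream.of(first, second)
                     .flatMap(List::stream)
                     .map(Object::toString)
                     .collect(Collectors.joining(", "));
    }

    // sorted: same flattening, but the merged elements come back in natural order
    static List<Integer> flattenAndSort(List<Integer> first, List<Integer> second) {
        return Stream.of(first, second)
                     .flatMap(List::stream)
                     .sorted()
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers1 = Arrays.asList(2, 4, 6, 8, 8, 8, 6, 4);
        List<Integer> numbers2 = Arrays.asList(1, 3, 5, 7, 7, 7, 9, 3);

        System.out.println(transform(numbers1));
        System.out.println(flattenAndJoin(numbers1, numbers2));
        System.out.println(flattenAndSort(numbers1, numbers2));
    }
}
```

Note that map keeps a one-to-one relationship between input and output elements, whereas flatMap may produce zero, one, or many output elements per input element.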
Until now, all our examples used only simple built-in types such as strings and numbers. In this example we will demonstrate the use of Streams with a collection of user-defined objects.
The following is a simple grade object that encapsulates a subject and the corresponding grade:
The following is a simple program that finds the average grade and finds the best grade:
Executing the program from Listing.6 will generate the following output:
Average grade: 73.8
Best grade - Math <85.5>
The code in Listing.6 needs a little explanation.
Stream.mapToDouble(ToDoubleFunction) :: Returns a DoubleStream by applying the specified ToDoubleFunction to each of the elements in this stream. A DoubleStream is a stream of doubles that supports sequential and parallel aggregate operations such as count, sum, average, min, and max
Stream.reduce(BinaryOperator) :: Returns an Optional that contains the single element produced by repeatedly applying the specified reduction operator to the elements of this stream (empty if the stream is empty).
In our example, reduce starts with the first two grades, finds the best grade among the two, uses the resulting grade and compares with the next grade and so on until it exhausts all the grades in this stream to find the best grade
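The reduction described above can be sketched as follows. This is not the original Listing.5 or Listing.6: the nested Grade class is a hypothetical stand-in for the grade object, and the sample subjects and values (apart from Math at 85.5, which appears in the output above) are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.OptionalDouble;

// Hypothetical sketch (not the original Listing.5 / Listing.6)
class GradeSketch {
    // Hypothetical stand-in for the grade object: a subject and its grade
    static final class Grade {
        final String subject;
        final double grade;

        Grade(String subject, double grade) {
            this.subject = subject;
            this.grade = grade;
        }

        String getSubject() { return subject; }
        double getGrade() { return grade; }

        @Override
        public String toString() { return subject + " <" + grade + ">"; }
    }

    // mapToDouble extracts the numeric grade so that average() is available
    static OptionalDouble average(List<Grade> grades) {
        return grades.stream().mapToDouble(Grade::getGrade).average();
    }

    // reduce keeps the better of each pair of grades until one grade remains
    static Optional<Grade> best(List<Grade> grades) {
        return grades.stream().reduce((a, b) -> a.getGrade() >= b.getGrade() ? a : b);
    }

    public static void main(String[] args) {
        // Assumed sample data
        List<Grade> grades = Arrays.asList(
                new Grade("Math", 85.5),
                new Grade("Physics", 70.0),
                new Grade("History", 66.0));

        average(grades).ifPresent(a -> System.out.printf("Average grade: %.1f%n", a));
        best(grades).ifPresent(g -> System.out.println("Best grade - " + g));
    }
}
```

Because reduce has no starting value here, it returns an Optional; for an empty list of grades there is simply no best grade to report.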
Now for the most exciting part. We introduced Streams as a way to perform operation(s) on a Collection of data elements, either in sequential or in parallel mode.
Up until now, all our examples were executed in a sequential fashion. In this example, we will demonstrate the use of Streams in both sequential and parallel mode without writing a single line of multi-threaded code.
Excited ???
The following is a simple object that encapsulates a number and its corresponding factors:
The following is a simple program that finds the factors for all the numbers in a list in sequential mode:
Executing the program from Listing.8 will generate the following output:
main: factors for num = 976534
Num: 976534, Factors: [1, 2, 13, 23, 26, 46, 71, 142, 299, 529, 598, 923, 1058, 1633, 1846, 3266, 6877, 13754, 21229, 37559, 42458, 75118, 976534]
main: factors for num = 1267543
Num: 1267543, Factors: [1, 47, 149, 181, 7003, 8507, 26969, 1267543]
main: factors for num = 1543087
Num: 1543087, Factors: [1, 7, 13, 31, 91, 217, 403, 547, 2821, 3829, 7111, 16957, 49777, 118699, 220441, 1543087]
main: factors for num = 845987
Num: 845987, Factors: [1, 845987]
main: factors for num = 1720728
Num: 1720728, Factors: [1, 2, 3, 4, 6, 8, 9, 12, 18, 24, 36, 72, 23899, 47798, 71697, 95596, 143394, 191192, 215091, 286788, 430182, 573576, 1720728]
main: factors for num = 590630
Num: 590630, Factors: [1, 2, 5, 10, 59063, 118126, 590630]
main: factors for num = 1935609
Num: 1935609, Factors: [1, 3, 13, 31, 39, 93, 403, 1209, 1601, 4803, 20813, 49631, 62439, 148893, 645203, 1935609]
main: factors for num = 1390765
Num: 1390765, Factors: [1, 5, 349, 797, 1745, 3985, 278153, 1390765]
main: factors for num = 320609
Num: 320609, Factors: [1, 320609]
main: factors for num = 1123001
Num: 1123001, Factors: [1, 11, 121, 9281, 102091, 1123001]
Total time (ms): 147
The following is the same simple program that finds the factors for all the numbers in a list, except that it executes in parallel mode with a small code change:
The only change is the addition of the operation parallel() to the stream.
Executing the program from Listing.9 will generate the following output:
main: factors for num = 1935609
ForkJoinPool.commonPool-worker-1: factors for num = 1543087
ForkJoinPool.commonPool-worker-2: factors for num = 320609
ForkJoinPool.commonPool-worker-3: factors for num = 1390765
Num: 320609, Factors: [1, 320609]
ForkJoinPool.commonPool-worker-2: factors for num = 1123001
Num: 1123001, Factors: [1, 11, 121, 9281, 102091, 1123001]
ForkJoinPool.commonPool-worker-2: factors for num = 1267543
Num: 1390765, Factors: [1, 5, 349, 797, 1745, 3985, 278153, 1390765]
ForkJoinPool.commonPool-worker-3: factors for num = 1720728
Num: 1935609, Factors: [1, 3, 13, 31, 39, 93, 403, 1209, 1601, 4803, 20813, 49631, 62439, 148893, 645203, 1935609]
main: factors for num = 590630
Num: 1267543, Factors: [1, 47, 149, 181, 7003, 8507, 26969, 1267543]
ForkJoinPool.commonPool-worker-2: factors for num = 976534
Num: 1543087, Factors: [1, 7, 13, 31, 91, 217, 403, 547, 2821, 3829, 7111, 16957, 49777, 118699, 220441, 1543087]
ForkJoinPool.commonPool-worker-1: factors for num = 845987
Num: 590630, Factors: [1, 2, 5, 10, 59063, 118126, 590630]
Num: 976534, Factors: [1, 2, 13, 23, 26, 46, 71, 142, 299, 529, 598, 923, 1058, 1633, 1846, 3266, 6877, 13754, 21229, 37559, 42458, 75118, 976534]
Num: 845987, Factors: [1, 845987]
Num: 1720728, Factors: [1, 2, 3, 4, 6, 8, 9, 12, 18, 24, 36, 72, 23899, 47798, 71697, 95596, 143394, 191192, 215091, 286788, 430182, 573576, 1720728]
Total time (ms): 112
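Notice from Output.7 that Streams automatically handles the parallelism under the hood using the common Fork-Join pool: the work is spread across the main thread and the ForkJoinPool.commonPool-worker threads, and the results no longer arrive in the order of the input list.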
We did not have to write a single line of multi-threaded code, and on this run the total time dropped from 147 ms to 112 ms. Ain't this powerful!
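The sequential and parallel variants discussed above can be compared with a sketch like the following. It is not the original Listing.8 or Listing.9: the class ParallelFactors and its methods are hypothetical, and the factors helper uses plain trial division of our own choosing, so its lists may not match the output above entry for entry. The input numbers are the ones that appear in that output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch (not the original Listing.8 / Listing.9)
class ParallelFactors {
    // Hypothetical helper: trial division, collecting every divisor of n in ascending order
    static List<Integer> factors(int n) {
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            if (n % i == 0) {
                result.add(i);
            }
        }
        return result;
    }

    // Computes the factors of every number; the only difference between the
    // two modes is the call to parallel() on the stream.
    static List<List<Integer>> allFactors(List<Integer> nums, boolean parallel) {
        Stream<Integer> stream = nums.stream();
        if (parallel) {
            stream = stream.parallel();
        }
        // collect(toList()) keeps the encounter order of the source list
        // even in parallel mode, unlike forEach
        return stream.map(ParallelFactors::factors).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(976534, 1267543, 1543087, 845987, 1720728,
                590630, 1935609, 1390765, 320609, 1123001);

        for (boolean parallel : new boolean[] { false, true }) {
            long start = System.currentTimeMillis();
            List<List<Integer>> results = allFactors(numbers, parallel);
            long elapsed = System.currentTimeMillis() - start;
            System.out.println((parallel ? "parallel" : "sequential")
                    + ": " + results.size() + " results, total time (ms): " + elapsed);
        }
    }
}
```

Timings will vary with the number of available cores and JVM warm-up, so a single run like this should be read as an indication rather than a benchmark. The sketch is safe to parallelize because factors() touches no shared mutable state.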