Effective Java TL;DR - Lambdas and Streams

Lambdas and streams are awesome and powerful, but with that power comes responsibility. Just because they are modern and cool doesn’t mean that you should rewrite everything to use them.

Check the rest of the series here.

Item 42: Prefer lambdas to anonymous classes

Java treats interfaces with a single abstract method, known as functional interfaces, with special care. Instances of these interfaces can be created with anonymous classes, of course, but also with lambda expressions (lambdas for short). The rule is: don’t use anonymous classes to instantiate functional interfaces; use lambdas instead. Note that an interface with a single abstract method is still a functional interface even if it also has several default methods.
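As a rough illustration of the rule, here is the same comparator written both ways. The class and method names are made up for this sketch, and sorting words by length is just an assumed example task.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class LambdaVsAnonymousClass {

    // Old style: an anonymous class instantiating the functional interface Comparator.
    static List<String> sortByLengthAnonymous(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        Collections.sort(sorted, new Comparator<String>() {
            @Override
            public int compare(String first, String second) {
                return Integer.compare(first.length(), second.length());
            }
        });
        return sorted;
    }

    // Preferred: a lambda; the parameter types are inferred by the compiler.
    static List<String> sortByLengthLambda(List<String> words) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort((first, second) -> Integer.compare(first.length(), second.length()));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "fig", "apple");
        System.out.println(sortByLengthAnonymous(words)); // [fig, apple, banana]
        System.out.println(sortByLengthLambda(words));    // [fig, apple, banana]
    }
}
```

Both methods return the same result; the lambda version simply drops the boilerplate around the one line that matters.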

“Omit the types of all lambda parameters unless their presence makes your program clearer.” Specify a type only when the compiler can’t infer it. What lets you omit parameter types is a compiler feature called type inference (JLS chapter 18). It also enables the diamond operator and the var keyword.

“Unlike methods and classes, lambdas lack names and documentation; if a computation isn’t self-explanatory, or exceeds a few lines, don’t put it in a lambda.”

Lambdas can’t help when you need to instantiate an abstract class or an interface with multiple abstract methods. Also, an anonymous class can obtain a reference to itself through this, whereas inside a lambda the this keyword refers to the enclosing instance.
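The difference in what this means can be seen in a small sketch. ThisDemo and its two methods are hypothetical names chosen for the example.

```java
import java.util.function.Supplier;

// Hypothetical demo class showing what 'this' refers to in each construct.
class ThisDemo {

    // Inside a lambda, 'this' is the enclosing ThisDemo instance.
    Supplier<Object> fromLambda() {
        return () -> this;
    }

    // Inside an anonymous class, 'this' is the anonymous instance itself.
    Supplier<Object> fromAnonymousClass() {
        return new Supplier<Object>() {
            @Override
            public Object get() {
                return this;
            }
        };
    }

    public static void main(String[] args) {
        ThisDemo demo = new ThisDemo();
        System.out.println(demo.fromLambda().get() == demo);   // true

        Supplier<Object> anonymous = demo.fromAnonymousClass();
        System.out.println(anonymous.get() == anonymous);      // true
    }
}
```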

One important thing to note: lambdas are not temporary instantiations of an anonymous class; they are implemented using the invokedynamic bytecode instruction (learn more about it here).

Item 43: Prefer method references to lambdas

If you like how short and sweet lambdas can be (lambdas longer than three or four lines could and should be refactored), then you will like how method references can make that code even shorter and sweeter. Generally, method references should be preferred over lambdas, but there are situations where a lambda produces clearer code. For example, use a lambda when its parameter names provide good documentation of what is happening. Likewise, when the method lives in the same class as the lambda and the class name is long, calling the method from a lambda will be shorter than the method reference.

There are four types of method references:

Static – reference to a static method: SomeClass::staticMethodName
Bound – reference to an instance method of a particular object: someObjectInstance::instanceMethodName
Unbound – reference to an instance method of an arbitrary object of a particular type: SomeClass::instanceMethodName
Constructor – reference to a constructor (class or array): SomeClass::new or int[]::new
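To make the kinds listed above concrete, the following sketch pairs each method reference with its lambda equivalent in a comment. The class and constant names are invented for the example; the referenced methods are standard JDK ones.

```java
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical holder class; each constant shows one kind of method reference.
class MethodReferenceKinds {

    // Static: same as text -> Integer.parseInt(text)
    static final Function<String, Integer> PARSE = Integer::parseInt;

    // Bound: the receiver "hello" is fixed when the reference is created;
    // same as prefix -> "hello".startsWith(prefix)
    static final Predicate<String> HELLO_STARTS_WITH = "hello"::startsWith;

    // Unbound: the receiver is supplied as the first argument;
    // same as text -> text.toUpperCase()
    static final Function<String, String> TO_UPPER = String::toUpperCase;

    // Class constructor: same as () -> new StringBuilder()
    static final Supplier<StringBuilder> NEW_BUILDER = StringBuilder::new;

    // Array constructor: same as length -> new int[length]
    static final IntFunction<int[]> NEW_INT_ARRAY = int[]::new;

    public static void main(String[] args) {
        System.out.println(PARSE.apply("42"));              // 42
        System.out.println(HELLO_STARTS_WITH.test("he"));   // true
        System.out.println(TO_UPPER.apply("abc"));          // ABC
        System.out.println(NEW_BUILDER.get().length());     // 0
        System.out.println(NEW_INT_ARRAY.apply(3).length);  // 3
    }
}
```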

In simple terms: “Where method references are shorter and clearer, use them; where they aren’t, stick with lambdas.”

Item 44: Favor the use of standard functional interfaces

One of the core packages that you should be familiar with is java.util.function. The large collection of functional interfaces it provides (forty-three, to be exact) will cover most of your cases.
“If one of the standard functional interfaces does the job, you should generally use it in preference to a purpose-built functional interface.”

There are six basic interfaces you should know: UnaryOperator<T>, BinaryOperator<T>, Predicate<T>, Function<T, R>, Supplier<T> and Consumer<T>. The rest of the interfaces can easily be derived from these, and you can scan the whole package here.
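A quick sketch of the six basic interfaces in use, with the shape of each function noted in a comment. The class, constants and the appendTo helper are hypothetical names for this example.

```java
import java.util.function.BinaryOperator;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

// Hypothetical demo class; one member per basic functional interface.
class BasicFunctionalInterfaces {

    static final UnaryOperator<String> TRIM = String::trim;          // T -> T
    static final BinaryOperator<Integer> ADD = Integer::sum;         // (T, T) -> T
    static final Predicate<String> IS_EMPTY = String::isEmpty;       // T -> boolean
    static final Function<String, Integer> LENGTH = String::length;  // T -> R
    static final Supplier<String> GREETING = () -> "hello";          // () -> T

    // T -> void: the returned consumer appends each accepted string to the given sink.
    static Consumer<String> appendTo(StringBuilder sink) {
        return sink::append;
    }

    public static void main(String[] args) {
        System.out.println(TRIM.apply("  padded  "));   // padded
        System.out.println(ADD.apply(2, 3));            // 5
        System.out.println(IS_EMPTY.test(""));          // true
        System.out.println(LENGTH.apply("four"));       // 4
        System.out.println(GREETING.get());             // hello

        StringBuilder sink = new StringBuilder();
        appendTo(sink).accept("consumed");
        System.out.println(sink);                       // consumed
    }
}
```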

You should write your own functional interface, even when a standard one could be used, if a purpose-built interface would benefit from a descriptive name or could carry useful custom default methods. “Always annotate your functional interfaces with the @FunctionalInterface annotation.”
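As one possible illustration, here is a purpose-built interface that earns its keep through a descriptive name and a custom default method. PriceRule, its then method and the two sample rules are all hypothetical; a plain LongUnaryOperator would have the same shape.

```java
// Hypothetical purpose-built functional interface.
// The annotation makes the compiler reject a second abstract method.
@FunctionalInterface
interface PriceRule {

    long apply(long priceInCents);

    // Custom default method: run this rule first, then the next one.
    default PriceRule then(PriceRule next) {
        return price -> next.apply(this.apply(price));
    }
}

// Hypothetical demo class using the interface with lambdas.
class PriceRuleDemo {

    static final PriceRule TEN_PERCENT_OFF = price -> price - price / 10;
    static final PriceRule FLAT_SHIPPING = price -> price + 500;

    static long finalPrice(long priceInCents) {
        return TEN_PERCENT_OFF.then(FLAT_SHIPPING).apply(priceInCents);
    }

    public static void main(String[] args) {
        System.out.println(finalPrice(2000)); // 2300
    }
}
```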

Item 45: Use streams judiciously

The Stream API, introduced with Java 8, is a fluent API built around a stream (a finite or infinite sequence of data elements), zero or more intermediate operations and one terminal operation. A source stream followed by its intermediate operations and a terminal operation is known as a stream pipeline. The pipeline is evaluated lazily, meaning that data processing doesn’t start until the terminal operation is invoked. An important thing to note is that streams are not just an API, but a paradigm based on functional programming.
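Lazy evaluation can be observed with a small sketch. The class name and the log list are assumptions made for the demo, and peek is used here only as a debugging aid to record when elements actually flow through the pipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class that records the order of events in a log.
class LazyPipeline {

    static List<String> run() {
        List<String> log = new ArrayList<>();

        // Building the pipeline processes nothing yet.
        Stream<Integer> evens = Stream.of(1, 2, 3, 4)
                .peek(number -> log.add("saw " + number))
                .filter(number -> number % 2 == 0);
        log.add("pipeline built");

        // The terminal operation triggers the actual processing.
        List<Integer> result = evens.collect(Collectors.toList());
        log.add("evens " + result);
        return log;
    }

    public static void main(String[] args) {
        // "pipeline built" is logged before any "saw ..." entry.
        run().forEach(System.out::println);
    }
}
```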

Streams are a great tool, but that does not mean that you should stream everything. “Overusing streams makes programs hard to read and maintain.” Trying to squeeze everything into a single expression produces a giant stream pipeline that can really hurt the eyes. Be smart about them and resist the urge to convert all your loops into streams. Some things you can do with them really nicely and elegantly (combine sequences of elements, filter them or search for an element), but some things you can’t do at all from inside a lambda, like modifying local variables or returning from the enclosing method.

“In the absence of explicit types, careful naming of lambda parameters is essential to the readability of stream pipelines.” Avoid single-letter parameter names; typing the whole name doesn’t hurt at all. Besides this, use helper methods instead of multiline lambdas to greatly improve readability.
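A small sketch of both pieces of advice, assuming the task is grouping words that are anagrams of each other. AnagramGroups and its methods are illustrative names; the sorting logic sits in a named helper instead of a multiline lambda, and the lambda parameter is called word rather than w.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class for readable pipelines.
class AnagramGroups {

    // Helper method: gives the detail a name and keeps the pipeline short.
    static String alphabetize(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Words with the same sorted letters end up in the same group.
    static Map<String, List<String>> groupAnagrams(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(word -> alphabetize(word)));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stale", "least", "steal", "cat", "act");
        System.out.println(groupAnagrams(words).get("aelst")); // [stale, least, steal]
    }
}
```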

Generally, when you are not sure whether to use streams or loops, try both implementations and see which one reads better. If you are still in doubt, iteration is probably the safer choice.

Item 46: Prefer side-effect-free functions in streams

The wrong way to use streams is to have the functions you pass to stream operations depend on mutable state or update state themselves. That uses streams merely as an API, not as a (functional) paradigm.

The right way is to strive to use pure functions as the function objects passed to stream operations. This means that the result of each lambda in the stream pipeline should depend only on its input: it should neither depend on external mutable state nor update any state. Violations typically show up in the forEach operation, where the computation is done inside the operation itself, much as it would be in a classic for-each loop.

“The forEach operation should be used only to report the result of a stream computation, not to perform the computation.”
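The contrast might look like this for a word-frequency count. WordFrequency and its method names are invented for the sketch; both methods return the same map, but only the second one keeps the computation inside the stream pipeline.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class contrasting the two styles.
class WordFrequency {

    // Discouraged: forEach performs the computation by mutating external state.
    static Map<String, Long> countWithForEach(List<String> words) {
        Map<String, Long> frequencies = new HashMap<>();
        words.stream().forEach(word -> frequencies.merge(word.toLowerCase(), 1L, Long::sum));
        return frequencies;
    }

    // Preferred: side-effect-free functions plus a collector do the work.
    static Map<String, Long> countWithCollector(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(String::toLowerCase, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "A", "b");
        System.out.println(countWithForEach(words).equals(countWithCollector(words))); // true
    }
}
```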

Besides side-effect-free functions, to use streams properly you should know the Collectors API. A Collector, the type returned by the Collectors factory methods, encapsulates a reduction strategy: it combines the stream elements into a single object, typically some sort of collection. The most important Collectors factories are toList, toSet, toMap, groupingBy and joining.
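A short tour of four of these factories, as a sketch over an assumed list of fruit names. The class and method names are hypothetical. Note that the two-argument toMap throws on duplicate keys, which is why the three-argument form with a merge function is used here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; FRUITS is sample data made up for the example.
class CollectorsTour {

    static final List<String> FRUITS = Arrays.asList("apple", "fig", "banana", "fig");

    // toList: gather the mapped elements in encounter order.
    static List<String> upperCased() {
        return FRUITS.stream().map(String::toUpperCase).collect(Collectors.toList());
    }

    // toSet: duplicates collapse.
    static Set<Integer> distinctLengths() {
        return FRUITS.stream().map(String::length).collect(Collectors.toSet());
    }

    // toMap with a merge function: count how often each fruit occurs.
    static Map<String, Integer> occurrences() {
        return FRUITS.stream()
                .collect(Collectors.toMap(Function.identity(), fruit -> 1, Integer::sum));
    }

    // joining: concatenate with a delimiter.
    static String commaSeparated() {
        return FRUITS.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(upperCased());      // [APPLE, FIG, BANANA, FIG]
        System.out.println(distinctLengths());
        System.out.println(occurrences());
        System.out.println(commaSeparated());  // apple, fig, banana, fig
    }
}
```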

Item 47: Prefer Collection to Stream as a return type

Return a Stream only when you are sure that the method’s result is going to be used in a stream pipeline, and return an Iterable when you know that the result is only going to be used for iteration.

“Collection or an appropriate subtype is generally the best return type for a public, sequence-returning method.”

Besides a Collection, you can also return an array, because it works nicely for iteration with Arrays.asList and for streaming with Stream.of.
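The point of these return types is that the caller can choose freely between iterating and streaming. A minimal sketch, with made-up method names and sample data:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; squaresUpTo and names are illustrative sequence-returning methods.
class SequenceReturnTypes {

    // Returning a List (a Collection subtype) serves both kinds of callers.
    static List<Integer> squaresUpTo(int n) {
        List<Integer> squares = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            squares.add(i * i);
        }
        return squares;
    }

    // Caller that iterates.
    static int sumByIteration(int n) {
        int total = 0;
        for (int square : squaresUpTo(n)) {
            total += square;
        }
        return total;
    }

    // Caller that streams.
    static int sumByStream(int n) {
        return squaresUpTo(n).stream().mapToInt(Integer::intValue).sum();
    }

    // An array return value works too.
    static String[] names() {
        return new String[] {"Ada", "Grace"};
    }

    public static void main(String[] args) {
        System.out.println(sumByIteration(3)); // 14
        System.out.println(sumByStream(3));    // 14

        for (String name : Arrays.asList(names())) {   // array used for iteration
            System.out.println(name);
        }
        System.out.println(Stream.of(names()).count()); // array used as a stream: 2
    }
}
```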

When the sequence can hold more than Integer.MAX_VALUE elements, the largest value the Collection size method can report (because it returns an int), you will be better off with a Stream or an Iterable as the return type.

Item 48: Use caution when making streams parallel

With the Streams API, Java 8 also provided the ability to process a stream pipeline in parallel using a fork-join execution model. Simply by calling the Stream parallel method (or parallelStream on a Collection), you run the pipeline on a fork-join pool, and it can execute much faster (in the best case by a factor approaching the number of CPU cores, which is roughly the default parallelism of the common pool).

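But achieving this takes more than just calling the parallel method. You need an appropriate stream source, you should avoid intermediate operations that hinder the fork-join algorithm (such as limit), and you should pick a terminal operation that parallelizes well: reductions such as reduce, min, max, count and sum, or the short-circuiting operations anyMatch, allMatch and noneMatch. Mutable reductions performed by collect are poor candidates, because combining collections is costly.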

“As a rule, performance gains from parallelism are best on streams over ArrayList, HashMap, HashSet, and ConcurrentHashMap instances; arrays; int ranges; and long ranges.” The reason is that these data structures can be split cheaply and accurately by the Stream spliterator (returned by the spliterator method). They also provide good locality of reference, meaning that their elements are stored close to one another in memory, which is very important for parallelizing bulk operations.
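A long range feeding a count reduction is the kind of pipeline that tends to parallelize well. The sketch below counts primes that way; the class name, method name and the parallel flag are assumptions for the example, and any real speedup depends on the machine and should be measured.

```java
import java.math.BigInteger;
import java.util.stream.LongStream;

// Hypothetical demo class: the same pipeline, run sequentially or in parallel.
class ParallelPrimes {

    static long countPrimes(long upTo, boolean parallel) {
        // A long range is a source that splits cheaply and evenly.
        LongStream candidates = LongStream.rangeClosed(2, upTo);
        if (parallel) {
            candidates = candidates.parallel();
        }
        return candidates
                .mapToObj(BigInteger::valueOf)
                .filter(candidate -> candidate.isProbablePrime(50))
                .count();   // count is a reduction, which combines partial results cheaply
    }

    public static void main(String[] args) {
        // Both calls must agree; only the elapsed time may differ.
        System.out.println(countPrimes(100, false));
        System.out.println(countPrimes(100, true));
    }
}
```

The filter is a pure function and the terminal operation is a simple reduction, so switching to parallel changes only how the work is scheduled, not the result.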

Parallelizing streams is purely a performance optimization, and as such it should be used only when your measurements tell you that it is needed. After you do it, make sure that the code still runs correctly, and measure again to see whether it actually got faster or slower. Also, an important thing to know is that all your parallel stream executions will by default share the same common fork-join pool, which means that one misbehaving pipeline can hurt the performance of all the well-behaved ones.

“Not only can parallelizing a stream lead to poor performance, including liveness failures; it can lead to incorrect results and unpredictable behavior (safety failures).”
