Stream Reduction Operations

In this post I give an overview, with examples, of the Collectors class, which provides very useful functionality that can be applied to a stream. The functionality is quite similar to some features of SQL: for instance, it lets you average, group, partition, sum and much more. To use the stream reduction API you need to be familiar with the Collector interface and the Collectors class.


Collectors is a utility class that contains factory methods, which you should use most of the time, whereas Collector is an interface that you can implement when you need a more specific reduction operation.
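To give a feeling for the Collector interface, here is a minimal sketch of a hand-made collector built with the Collector.of factory method. The class name CustomCollectorSketch and the helpers pipeJoiner and joinTitles are my own names for illustration only; in real code Collectors.joining already covers this case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;
import java.util.stream.Collector;

final class CustomCollectorSketch {

    // A hand-written collector that joins strings with " | ".
    // Collector.of takes: supplier, accumulator, combiner, finisher.
    static Collector<String, StringJoiner, String> pipeJoiner() {
        return Collector.of(
                () -> new StringJoiner(" | "), // supplier: creates the mutable container
                StringJoiner::add,             // accumulator: folds one element into it
                StringJoiner::merge,           // combiner: merges two partial containers
                StringJoiner::toString);       // finisher: turns the container into the result
    }

    static String joinTitles(List<String> titles) {
        return titles.stream().collect(pipeJoiner());
    }

    public static void main(String[] args) {
        System.out.println(joinTitles(Arrays.asList("Titanic", "Goodfellas", "Karate Kid")));
        // → Titanic | Goodfellas | Karate Kid
    }
}
```

The combiner only matters for parallel streams, where partial results from different threads have to be merged.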

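For the examples below I decided to use a collection of movies. The later snippets assume that the list is stored in a variable named movies, e.g. List<Movie> movies = getMovies();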

private static List<Movie> getMovies() {
    // The order matters: the printed outputs in this post (skip(7), joining, grouping)
    // correspond to English, Russian, Hindi, German.
    List<Movie> movies = new ArrayList<>(10);

    // Eng
    movies.add(new Movie("Karate Kid", GENRE.SPORT, "John G. Avildsen", 1984, LANG.ENGLISH));
    movies.add(new Movie("Titanic", GENRE.DRAMA, "James Cameron", 1997, LANG.ENGLISH));
    movies.add(new Movie("Goodfellas", GENRE.CRIME, "Martin Scorsese", 1990, LANG.ENGLISH));

    // Rus
    movies.add(new Movie("Брестская крепость", GENRE.DRAMA, "Александр Котт", 2010, LANG.RUSSIAN));
    movies.add(new Movie("Операция Ы", GENRE.COMEDY, "Леонид Гайдай", 1965, LANG.RUSSIAN));
    movies.add(new Movie("Брат", GENRE.CRIME, "Алексей Балабанов", 1997, LANG.RUSSIAN));

    // Hindi
    movies.add(new Movie("3 Idiots", GENRE.COMEDY, "Rajkumar Hirani", 2009, LANG.HINDI));
    movies.add(new Movie("Special Chabbis", GENRE.DRAMA, "Neeraj Pandey", 2013, LANG.HINDI));
    movies.add(new Movie("Lagaan: Once Upon a Time in India", GENRE.SPORT, "Ashutosh Gowariker", 2001, LANG.HINDI));

    // German
    movies.add(new Movie("Knockin' on Heaven's Door", GENRE.CRIME, "Thomas Jahn", 1997, LANG.GERMAN));

    return movies;
}

For this collection you will need a class that represents a movie and two enumerations: one for the genre and another for the language.

enum GENRE { SPORT, CRIME, DRAMA, COMEDY }

enum LANG { ENGLISH, RUSSIAN, HINDI, GERMAN }

final class Movie {

    private final String title;
    private final GENRE genre;
    private final String director;
    private final int year;
    private final LANG lang;

    public Movie(String title, GENRE genre, String director, int year, LANG lang) {
        this.title = title;
        this.genre = genre;
        this.director = director;
        this.year = year;
        this.lang = lang;
    }

    public String getTitle() { return title; }

    public GENRE getGenre() { return genre; }

    public String getDirector() { return director; }

    public int getYear() { return year; }

    public LANG getLang() { return lang; }

    @Override
    public String toString() {
        return "Movie {" +
                "title='" + title + '\'' +
                ", genre=" + genre +
                ", director='" + director + '\'' +
                ", year=" + year +
                ", lang=" + lang +
                '}';
    }
    
    // equals and hashCode methods omitted...
    
}

OK, we are ready to continue our journey into stream reduction operations, so let’s start with the Collectors class.

Collectors

Converting

A very common operation is converting a stream into a collection. You have several options:

You can convert the stream to a List (in practice an ArrayList, although the API does not guarantee the concrete type):

List<Movie> arrayList = movies.stream().collect(Collectors.toList());

You can convert the stream to a Set (in practice a HashSet, again without a guarantee on the concrete type):

Set<Movie> set = movies.stream().collect(Collectors.toSet());

You can convert the stream to a Map (in practice a HashMap):

Map<String, Integer> map;
map = movies.stream()
            .skip(7)
            .collect(
                 Collectors.toMap(Movie::getDirector, Movie::getYear)
            );

System.out.println(map);

// Output:
// {Thomas Jahn=1997, Ashutosh Gowariker=2001, Neeraj Pandey=2013}

There is one interesting detail about the toMap method. When you reduce a stream to a Map, it’s quite likely that the stream contains two elements with the same key. To resolve the collision you can use the overloaded version of the method that accepts a merge function; otherwise you will get an IllegalStateException.

Example with merge function:

Map<LANG, Integer> map;

map = movies.stream()
            .skip(7)
            .collect(Collectors.toMap(
                    Movie::getLang,
                    Movie::getYear,
                    // merge function: receives the two colliding values (years) and keeps the first
                    (year1, year2) -> {
                        System.out.println("Hello Collision");
                        return year1;
                    }
                )
            );

System.out.println(map);

// Output:
// Hello Collision
// {HINDI=2013, GERMAN=1997}
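For comparison, here is a small sketch of what happens without a merge function. It uses plain strings instead of movies to stay self-contained; the class name DuplicateKeySketch and the method failsOnDuplicateKey are hypothetical names for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DuplicateKeySketch {

    // Returns true if the two-argument toMap rejects the input with IllegalStateException.
    // Assumes every word is non-empty.
    static boolean failsOnDuplicateKey(List<String> words) {
        try {
            // Key = first letter of the word; two words with the same first letter collide.
            words.stream().collect(Collectors.toMap(w -> w.substring(0, 1), w -> w));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsOnDuplicateKey(Arrays.asList("Titanic", "Taxi")));       // → true
        System.out.println(failsOnDuplicateKey(Arrays.asList("Titanic", "Goodfellas"))); // → false
    }
}
```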

If you need to convert the stream to a specific type of collection, such as a TreeSet, you can use toCollection as in the example below:

Set<String> set;

set = movies.stream()
        .skip(7)
        .map(movie -> movie.getTitle())
        .collect(Collectors.toCollection(TreeSet::new));

System.out.println(set);

// Output:
// [Knockin' on Heaven's Door, Lagaan: Once Upon a Time in India, Special Chabbis]
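For a TreeMap there is no toCollection equivalent, but the four-argument overload of toMap accepts a map supplier. Below is a minimal sketch with plain strings; TreeMapSketch and lengthsByWord are names I made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class TreeMapSketch {

    // Maps each word to its length; the TreeMap keeps the keys sorted.
    static TreeMap<String, Integer> lengthsByWord(List<String> words) {
        return words.stream().collect(Collectors.toMap(
                w -> w,            // key mapper
                String::length,    // value mapper
                (a, b) -> a,       // merge function: keep the first value on duplicate keys
                TreeMap::new));    // map supplier: decides the concrete Map type
    }

    public static void main(String[] args) {
        System.out.println(lengthsByWord(Arrays.asList("Titanic", "Goodfellas", "Karate Kid")));
        // → {Goodfellas=10, Karate Kid=10, Titanic=7}
    }
}
```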

Averaging

Use one of the averaging methods to get the arithmetic mean.

int avg = movies.stream()
        .collect(Collectors.averagingInt(Movie::getYear)).intValue();

System.out.println(avg);

// Output:
// 1996
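Note that averagingInt itself produces a Double; the intValue() call above is what drops the fractional part. The sketch below, with the hypothetical names AveragingSketch and averageYear, keeps the fraction and also shows the result for an empty stream.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

final class AveragingSketch {

    // averagingInt produces a Double, so the fractional part is preserved here.
    static double averageYear(List<Integer> years) {
        return years.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(averageYear(Arrays.asList(2000, 2001)));        // → 2000.5
        System.out.println(averageYear(Collections.<Integer>emptyList())); // → 0.0
    }
}
```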

Maximizing/Minimizing

There are two simple methods for getting the min and max values from a stream: minBy and maxBy. Both return an Optional, because the stream may be empty. The example below shows how to get the earliest year among all movies.

int minYear = movies.stream()
       .map(Movie::getYear)
       .collect(Collectors.minBy(Comparator.naturalOrder())).get();

System.out.println(minYear);

// Output:
// 1965
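The same idea works on whole objects if you pass a comparator on one of their fields. Here is a sketch with maxBy that also handles the empty case without calling get(). The Film class is a stripped-down stand-in for Movie, and MaxBySketch, sample and newestTitle are hypothetical names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class MaxBySketch {

    // Stripped-down stand-in for the Movie class (for this sketch only).
    static final class Film {
        final String title;
        final int year;
        Film(String title, int year) { this.title = title; this.year = year; }
    }

    static List<Film> sample() {
        return Arrays.asList(
                new Film("Titanic", 1997),
                new Film("Goodfellas", 1990),
                new Film("3 Idiots", 2009));
    }

    // Returns the title of the newest film, or "none" for an empty list.
    static String newestTitle(List<Film> films) {
        Comparator<Film> byYear = Comparator.comparingInt(f -> f.year);
        Optional<Film> newest = films.stream().collect(Collectors.maxBy(byYear));
        return newest.map(f -> f.title).orElse("none");
    }

    public static void main(String[] args) {
        System.out.println(newestTitle(sample()));                      // → 3 Idiots
        System.out.println(newestTitle(Collections.<Film>emptyList())); // → none
    }
}
```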

Grouping

Grouping allows you to create a Map in which elements are grouped by a specific key. In my example I group movie titles by genre:

Map<GENRE, List<String>> moviesByGenre;

moviesByGenre = movies.stream().collect(
        Collectors.groupingBy(Movie::getGenre, Collectors.mapping(Movie::getTitle, Collectors.toList()))
);

System.out.println(moviesByGenre);

// Output:
//{
// COMEDY=[Операция Ы, 3 Idiots], 
// CRIME=[Goodfellas, Брат, Knockin' on Heaven's Door], 
// SPORT=[Karate Kid, Lagaan: Once Upon a Time in India], 
// DRAMA=[Titanic, Брестская крепость, Special Chabbis]
//}
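The downstream collector does not have to be a list. As a sketch, the three-argument overload of groupingBy below counts elements per group and uses a TreeMap so the keys come out sorted. It groups plain release years by decade; GroupCountSketch and countByDecade are names I chose for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class GroupCountSketch {

    // Counts how many years fall into each decade.
    static Map<Integer, Long> countByDecade(List<Integer> years) {
        return years.stream().collect(Collectors.groupingBy(
                (Integer y) -> y / 10 * 10, // classifier: 1997 -> 1990
                TreeMap::new,               // map factory: keys are kept sorted
                Collectors.counting()));    // downstream: number of elements per group
    }

    public static void main(String[] args) {
        System.out.println(countByDecade(Arrays.asList(1990, 1997, 1984, 1997, 2001)));
        // → {1980=1, 1990=3, 2000=1}
    }
}
```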

Joining

It’s a very common situation that you need to join the elements of a collection into a single string. The joining collector makes that possible. It has several overloaded versions, but we will take a look at the fancy one, which takes a delimiter, a prefix and a suffix. My goal is to concatenate all directors into one comma-separated string.

String str;

str = movies.stream()
            .map(Movie::getDirector)
            .collect(Collectors.joining(", ", "[", "]"));

System.out.println(str);

// Output: 
// [John G. Avildsen, James Cameron, Martin Scorsese, Александр Котт, Леонид Гайдай, Алексей Балабанов, Rajkumar Hirani, Neeraj Pandey, Ashutosh Gowariker, Thomas Jahn]
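For completeness, a short sketch of all three joining overloads side by side. JoiningSketch and its method names are only illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class JoiningSketch {

    // joining(): plain concatenation, no delimiter.
    static String concat(List<String> parts) {
        return parts.stream().collect(Collectors.joining());
    }

    // joining(delimiter): elements separated by the delimiter.
    static String commaSeparated(List<String> parts) {
        return parts.stream().collect(Collectors.joining(", "));
    }

    // joining(delimiter, prefix, suffix): the version used in the article.
    static String bracketed(List<String> parts) {
        return parts.stream().collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("James Cameron", "Martin Scorsese");
        System.out.println(concat(names));         // → James CameronMartin Scorsese
        System.out.println(commaSeparated(names)); // → James Cameron, Martin Scorsese
        System.out.println(bracketed(names));      // → [James Cameron, Martin Scorsese]
    }
}
```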

Partitioning

Partitioning is a special case of grouping. The difference is that it uses a Predicate to test values, so all elements end up in exactly two groups, keyed by true and false. Let’s say I need to split my movies into two groups: the first one holds the movies that are older than 15 years, and the rest of the movies go into the other partition.

Map<Boolean, List<String>> map;

map = movies.stream()
       .collect(
          Collectors.partitioningBy(
             m -> (LocalDate.now().getYear() - m.getYear()) > 15,
             Collectors.mapping(
                           Movie::getTitle, 
                           Collectors.toList()
             )
          )
       );

System.out.println(map);
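The output of the snippet above depends on the current date, so here is a deterministic sketch of the same idea that takes the reference year as a parameter and works on plain years. PartitionSketch and splitByAge are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitionSketch {

    // Splits release years into "older than 15 years" (true) and the rest (false),
    // measured against an explicit reference year instead of the system clock.
    static Map<Boolean, List<Integer>> splitByAge(List<Integer> years, int referenceYear) {
        return years.stream()
                .collect(Collectors.partitioningBy(y -> referenceYear - y > 15));
    }

    public static void main(String[] args) {
        // With 2016 as the reference year, 1965 and 1997 are older than 15 years.
        System.out.println(splitByAge(Arrays.asList(1965, 1997, 2009, 2013), 2016));
    }
}
```

Unlike groupingBy, the resulting map contains both keys even when one of the partitions is empty, so map.get(true) and map.get(false) never return null.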

Reducing

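It works the same way as the reduce method of the Stream class, but you can also use it as a downstream collector in a “multi-level” reduction. Let’s say I need a Map where the key is a genre and the value is the oldest movie in that genre.

Comparator<Movie> minYear = Comparator.comparing(Movie::getYear);

Map<GENRE, Optional<Movie>> oldestMovieInGenre;

oldestMovieInGenre = movies.stream()
                .collect(
                    Collectors.groupingBy(
                        Movie::getGenre,
                        // reducing expects a BinaryOperator, hence BinaryOperator.minBy
                        // (Collectors.minBy returns a Collector and would not fit here)
                        Collectors.reducing(BinaryOperator.minBy(minYear))
                    )
                );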

System.out.println(oldestMovieInGenre);

// Output:
//{
// COMEDY=Optional[Movie {title='Операция Ы', genre=COMEDY, director='Леонид Гайдай', year=1965, lang=RUSSIAN}],
// CRIME=Optional[Movie {title='Goodfellas', genre=CRIME, director='Martin Scorsese', year=1990, lang=ENGLISH}],
// DRAMA=Optional[Movie {title='Titanic', genre=DRAMA, director='James Cameron', year=1997, lang=ENGLISH}],
// SPORT=Optional[Movie {title='Karate Kid', genre=SPORT, director='John G. Avildsen', year=1984, lang=ENGLISH}]
//}
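If the Optional wrappers in the result are a nuisance, one alternative (not the approach used above) is toMap with BinaryOperator.minBy as the merge function, which yields the movie itself as the value. A minimal sketch, where Film is a stripped-down stand-in for Movie with a String genre, and OldestPerGenreSketch, sample and oldestPerGenre are hypothetical names:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

final class OldestPerGenreSketch {

    // Stripped-down stand-in for the Movie class (for this sketch only).
    static final class Film {
        final String title;
        final String genre;
        final int year;
        Film(String title, String genre, int year) {
            this.title = title;
            this.genre = genre;
            this.year = year;
        }
    }

    static List<Film> sample() {
        return Arrays.asList(
                new Film("Titanic", "DRAMA", 1997),
                new Film("Special Chabbis", "DRAMA", 2013),
                new Film("Goodfellas", "CRIME", 1990),
                new Film("Knockin' on Heaven's Door", "CRIME", 1997));
    }

    // Key = genre, value = the film itself; on a key collision keep the older film.
    static Map<String, Film> oldestPerGenre(List<Film> films) {
        Comparator<Film> byYear = Comparator.comparingInt(f -> f.year);
        return films.stream().collect(Collectors.toMap(
                (Film f) -> f.genre,
                Function.identity(),
                BinaryOperator.minBy(byYear)));
    }

    public static void main(String[] args) {
        oldestPerGenre(sample())
                .forEach((genre, film) -> System.out.println(genre + " -> " + film.title));
    }
}
```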

Summarizing

If you are interested in more exhaustive information such as min, max, count, etc., use one of the summarizingXXXX methods, which return a summary statistics object.

IntSummaryStatistics stats = movies.stream()
        .collect(
                Collectors.summarizingInt(Movie::getYear)
        );

System.out.println(stats);

// Output:
// IntSummaryStatistics{count=10, sum=19963, min=1965, average=1996,300000, max=2013}
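Instead of printing the whole object you can read the individual values through its getters. A small sketch on plain years, with the hypothetical names SummarySketch and stats:

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.Collectors;

final class SummarySketch {

    // Collects count, sum, min, max and average in a single pass.
    static IntSummaryStatistics stats(List<Integer> years) {
        return years.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(Arrays.asList(1965, 2013));
        System.out.println(s.getCount());   // → 2
        System.out.println(s.getSum());     // → 3978
        System.out.println(s.getMin());     // → 1965
        System.out.println(s.getMax());     // → 2013
        System.out.println(s.getAverage()); // → 1989.0
    }
}
```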

Summing

Sums a numeric property of the stream elements. It is most useful as a downstream collector in a “multi-level” reduction, but it also works on its own:

int sum = movies.stream()
        .collect(
                Collectors.summingInt(Movie::getYear)
        );

System.out.println(sum);

// Output:
// 19963
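To show the “multi-level” case, here is a sketch that uses summingInt as the downstream collector of groupingBy. Adding up release years per genre is not meaningful in itself; it only demonstrates the mechanics with the same field as above. Film is a stripped-down stand-in for Movie, and GroupSumSketch, sample and yearSumByGenre are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupSumSketch {

    // Stripped-down stand-in for the Movie class (for this sketch only).
    static final class Film {
        final String genre;
        final int year;
        Film(String genre, int year) { this.genre = genre; this.year = year; }
    }

    static List<Film> sample() {
        return Arrays.asList(
                new Film("DRAMA", 1997),
                new Film("DRAMA", 2010),
                new Film("CRIME", 1990));
    }

    // First level: group by genre. Second level: sum the years inside each group.
    static Map<String, Integer> yearSumByGenre(List<Film> films) {
        return films.stream().collect(Collectors.groupingBy(
                (Film f) -> f.genre,
                Collectors.summingInt((Film f) -> f.year)));
    }

    public static void main(String[] args) {
        Map<String, Integer> sums = yearSumByGenre(sample());
        System.out.println(sums.get("DRAMA")); // → 4007
        System.out.println(sums.get("CRIME")); // → 1990
    }
}
```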
