Count characters in a piece of text as a one-liner using the Java 8 Stream API
A WordCount app is arguably the most commonly used example of the MapReduce principle (see any Hadoop or Spark tutorial). I wanted to do a similar thing in plain old Java as “a one-liner” (kind of). My objective was to explore the power and expressiveness of the Stream API and lambda expressions.
Here is what I wanted to achieve:
Input: "Hello world!"
Result:
' ': 1
'!': 1
'd': 1
'e': 1
'h': 1
'l': 3
'o': 2
'r': 1
'w': 1
Essentially, the program takes the input text, converts it to lowercase, counts the occurrences of every character, and prints the unique characters in sorted order along with their counts.
Without further ado, here is how I went about solving the problem:
    import java.util.List;
    import java.util.stream.Collectors;

    public class CharCounter {

        public List<Count> count(String text) {
            return text
                    .toLowerCase()
                    .chars()
                    .distinct()
                    // For each distinct character, rescan the text and count its occurrences.
                    .mapToObj(i -> new Count((char) i,
                            text
                                    .toLowerCase()
                                    .chars()
                                    .filter(j -> j == i).count()))
                    .sorted()
                    .collect(Collectors.toList());
        }

        // Java (still!) lacks Tuples, so I created my own.
        static class Count implements Comparable<Count> {
            char key;   // a character
            long count; // the character count

            Count(char key, long count) {
                this.key = key;
                this.count = count;
            }

            // pretty print, in the format of the sample output above
            @Override
            public String toString() {
                return "'" + key + "': " + count;
            }

            // Overridden methods must be public; sorts by character code.
            @Override
            public int compareTo(Count o) {
                return this.key - o.key;
            }
        }

        public static void main(String[] args) {
            new CharCounter()
                    .count("Hello World!")
                    .stream() // Yeah, it's a list and thus 'streamable'
                    .forEach(System.out::println);
        }
    }
Hope it makes sense. You might argue that I could have used something like a Map.Entry instead of a custom Count object. However, I find a small dedicated class much cleaner. The complete example is available on GitHub.
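For comparison, here is a minimal sketch of the Map-based alternative mentioned above. It is not the approach used in this post: it swaps the custom Count class for Collectors.groupingBy with Collectors.counting, and collects into a TreeMap so the keys come out sorted. The class name GroupingCharCounter is made up for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch.
class GroupingCharCounter {

    // Counts every character of the lower-cased text in a single pass.
    // The TreeMap keeps the entries sorted by character.
    static Map<Character, Long> count(String text) {
        return text
                .toLowerCase()
                .chars()
                .mapToObj(i -> (char) i) // box each int code unit as a Character
                .collect(Collectors.groupingBy(
                        c -> c,
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        count("Hello World!")
                .forEach((key, n) -> System.out.println("'" + key + "': " + n));
    }
}
```

The trade-off is roughly this: the Map version walks the text once instead of once per distinct character, while the Count version keeps the result as a list of small, self-describing objects.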
Thanks for reading and definitely let me know your thoughts on how this could be done differently.