Question

What is the best way to convert a long string into a data structure that holds words and their counts?

I would call .split(" ") to split on spaces and presumably build an ArrayList, then go through the ArrayList and add each item to a HashMap or Multiset. I'm not sure what the best way to do this is, or whether it can be done directly with some sort of HashMap without making an ArrayList first.

Thanks!

Solution 2

import java.util.HashMap;
import java.util.Map;

public class Test {
    private static Map<String, Integer> count = new HashMap<String, Integer>();

    public static void main(String[] args) {
        addToCountMap("This is my test string and it contains Test and test and string and some more");
        addToCountMap("This is my test string and it contains Test and test and string and some more");
        addToCountMap("This is my test string and it contains Test and test and string and some more");
        addToCountMap("This is my test string and it contains Test and test and string and some more");
        addToCountMap("This is my test string and it contains Test and test and string and some more");

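        // Merging the map with itself doubles every count; normally you would
        // pass in a second map here, e.g. one built from another text.
        mergeWithCountMap(count);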

        System.out.println(count);
    }

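    // Splits the text on whitespace and increments the count of each word.
    // Counting is case-sensitive, so "Test" and "test" are separate keys.
    private static void addToCountMap(String test) {
        String[] split = test.split("\\s+");
        for (String string : split) {
            if (string.isEmpty()) {
                continue; // leading whitespace yields one empty token
            }
            Integer previous = count.get(string);
            count.put(string, previous == null ? 1 : previous + 1);
        }
    }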

    private static void mergeWithCountMap(Map<String, Integer> mapToMerge) {
        for (String string : mapToMerge.keySet()) {
            if (!count.containsKey(string)) {
                count.put(string, 0);
            }
            count.put(string, count.get(string) + mapToMerge.get(string));
        }
    }
}
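
The question also asks whether the intermediate list can be skipped. It can: split already returns a plain array, and a Scanner avoids even that by handing out one whitespace-separated token at a time. Below is a minimal sketch, assuming Java 8 or later for Map.merge; the class and method names (ScannerWordCount, countWords) are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class ScannerWordCount {
    // Hypothetical helper: counts whitespace-separated tokens without
    // building an intermediate array or list.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        // Scanner's default delimiter is whitespace, so runs of spaces,
        // tabs and newlines never produce empty tokens.
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNext()) {
                // merge inserts 1 for a new word, otherwise adds 1 to the old count
                counts.merge(scanner.next(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("to be or not to be");
        System.out.println(counts.get("to")); // prints 2
        System.out.println(counts.get("or")); // prints 1
    }
}
```

Whether this beats split is mostly a matter of taste for short strings; for very long input it avoids holding all tokens in memory at once.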

OTHER TIPS

If you're referring to a Guava Multiset, this takes just one line:

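// CharMatcher.whitespace() replaces the older CharMatcher.WHITESPACE constant,
// which recent Guava releases no longer provide.
Multiset<String> counts = HashMultiset.create(
  Splitter.on(CharMatcher.whitespace()).omitEmptyStrings()
    .split(string));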
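
If adding Guava is not an option, a comparable single-expression version can be sketched with plain JDK streams. This assumes Java 8 or later; WordCountSketch and countWords are hypothetical names, and the TreeMap is only there to keep the keys sorted when printed.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordCountSketch {
    // Hypothetical helper: groups identical words and counts each group.
    static Map<String, Long> countWords(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                // an empty or all-whitespace input yields one empty token; drop it
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countWords(
                "This is my test string and it contains Test and test and string and some more"));
    }
}
```

Unlike a Multiset, looking up a word that never occurred returns null here rather than 0, so callers may want getOrDefault(word, 0L).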
Licensed under: CC-BY-SA with attribution