Question

I want to count all the letters from an input URL, without distinguishing between uppercase and lowercase letters. The total number of a's should be stored as an integer in total[0], the total number of b's in total[1], and so on.

Any idea how I can achieve this using InputStream?

    public static int[] letterFrequency(String url) throws IOException {
        InputStream inn= new BufferedInputStream((new URL(url)).openStream());
        char[] c= {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'æ', 'ø', 'å'};
        int[] total= new int[29];

        for(int i= 0; i< c.length; i++)   {
            int counter= 0;
            while(inn.available()!= 0)  {
                if(inn.read()== c[i])
                    counter++;
            }

            total[i]= counter;
        }
        return total;
    }

EDIT:

Thank you for all the answers! You are great!! ;)


Solution

Don't use an InputStream for this. Streams are meant for reading bytes; use a Reader if you want characters. Reading bytes might work for ASCII, but a single character can take up to 4 bytes (in UTF-8, for example), and the text may be in a different encoding.

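import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;

public static int[] letterFrequency(String url) throws IOException {
    char[] c = {
            'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
            'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't',
            'u', 'v', 'w', 'x', 'y', 'z', 'æ', 'ø', 'å'
    };
    String chars = new String(c);

    int[] total = new int[c.length];
    // Decode the bytes as UTF-8 so that read() returns characters, not raw bytes.
    // try-with-resources closes the reader (and the underlying stream) when done.
    try (Reader inn = new InputStreamReader(new BufferedInputStream(new URL(url).openStream()), "UTF-8")) {
        int read;
        while ((read = inn.read()) != -1) {
            read = Character.toLowerCase(read);   // fold uppercase onto lowercase
            int index = chars.indexOf(read);      // position in the alphabet, or -1
            if (index != -1) {
                total[index]++;
            }
        }
    }
    return total;
}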
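
To make the byte-versus-character point concrete, here is a minimal sketch that reads the same UTF-8 data once through an InputStream and once through a Reader. The class name, method names, and sample word are made up for illustration, and the non-ASCII letters are written as Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class ByteVsCharDemo {
    /** Number of values an InputStream yields for the data: one per byte. */
    static int bytesRead(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    /** Number of values a UTF-8 Reader yields for the data: one per char. */
    static int charsRead(byte[] data) throws IOException {
        Reader in = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8);
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // "blåbær": \u00e5 is å and \u00e6 is æ, each encoded as two bytes in UTF-8.
        byte[] data = "bl\u00e5b\u00e6r".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytesRead(data)); // prints 8
        System.out.println(charsRead(data)); // prints 6
    }
}
```

Comparing a raw byte against 'æ' or 'å' can therefore never match in UTF-8 input, which is why the Reader version above is needed for those letters.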

Other tips

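If the whole input has already been read into a String (called totalInput here), you could do something like this: remove every occurrence of a letter and compare the lengths.

String lower = totalInput.toLowerCase();   // String has no replaceIgnoreCase, so fold case first
int aCnt = lower.length() - lower.replace("a", "").length();
int bCnt = lower.length() - lower.replace("b", "").length();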
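
The same length-difference idea can be applied to every letter in a loop instead of one variable per letter. This is only a sketch: it assumes the full text is already in a String, and the class name and the LETTERS constant are illustrative, with æ, ø and å written as Unicode escapes.

```java
import java.util.Locale;

class ReplaceCount {
    // a-z followed by æ (\u00e6), ø (\u00f8) and å (\u00e5), in the order the question uses.
    static final String LETTERS = "abcdefghijklmnopqrstuvwxyz\u00e6\u00f8\u00e5";

    /** total[i] is the case-insensitive count of LETTERS.charAt(i) in totalInput. */
    static int[] letterFrequency(String totalInput) {
        String lower = totalInput.toLowerCase(Locale.ROOT);
        int[] total = new int[LETTERS.length()];
        for (int i = 0; i < LETTERS.length(); i++) {
            String letter = String.valueOf(LETTERS.charAt(i));
            // Removing every occurrence shortens the string by exactly the count.
            total[i] = lower.length() - lower.replace(letter, "").length();
        }
        return total;
    }
}
```

Note that this scans the text once per letter, so for large inputs a single pass with an index lookup is likely cheaper.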

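Here's a solution using a Map, which counts every letter that occurs without needing a fixed alphabet:

import java.io.*;
import java.net.URL;
import java.util.*;

public static Map<Character, Integer> letterFrequency(String url) throws IOException {
    Map<Character, Integer> m = new HashMap<Character, Integer>();
    // Read the content behind the URL, not the characters of the URL string itself.
    try (Reader in = new BufferedReader(new InputStreamReader(new URL(url).openStream(), "UTF-8"))) {
        int read;
        while ((read = in.read()) != -1) {
            char a = Character.toLowerCase((char) read);   // ignore case
            if (Character.isLetter(a)) {
                Integer freq = m.get(a);
                m.put(a, (freq == null) ? 1 : freq + 1);
            }
        }
    }
    return m;
}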
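
If the counts are still needed in the int[] layout from the question, the map can be converted afterwards. A rough sketch, where the class name, the toTotals helper, and the LETTERS constant are all hypothetical; it folds case itself, so it works whether or not the map keys were already lowercased.

```java
import java.util.HashMap;
import java.util.Map;

class MapToArray {
    // a-z followed by æ (\u00e6), ø (\u00f8) and å (\u00e5), as in the question.
    static final String LETTERS = "abcdefghijklmnopqrstuvwxyz\u00e6\u00f8\u00e5";

    /** Converts per-character counts into total[0] = a's, total[1] = b's, and so on. */
    static int[] toTotals(Map<Character, Integer> freq) {
        int[] total = new int[LETTERS.length()];
        for (Map.Entry<Character, Integer> e : freq.entrySet()) {
            int index = LETTERS.indexOf(Character.toLowerCase(e.getKey()));
            if (index != -1) {
                // Upper- and lowercase keys land on the same slot.
                total[index] += e.getValue();
            }
        }
        return total;
    }
}
```

Characters outside the alphabet (digits, punctuation, other letters) are simply skipped.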

What's the encoding of the characters? Not all encodings use 1 byte per character.

Assuming this is not a problem, and going by the first sentence of the question ("I want to count all the letters from an input url"), just create an array with one counter for each of the 256 values a byte can take, and count into it, like this:

int[] b = new int[256];   // one counter for each possible byte value
int r;
while ((r = inn.read()) != -1) {   // read() returns -1 at end of stream
    b[r]++;
}

This gives the count for each byte value, for example:

b['a'] = number of a's
b['A'] = number of A's

To make the count case-insensitive, add the two together:

b['a'] + b['A']
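
Put together, the byte-counting approach might look like the sketch below. It assumes an encoding in which a-z and A-Z are single ASCII bytes, and it covers only those 26 letters, because the byte values of æ, ø and å depend on the encoding. The class and method names are illustrative.

```java
import java.io.IOException;
import java.io.InputStream;

class ByteCount {
    /** Counts how often each byte value (0-255) occurs in the stream. */
    static int[] countBytes(InputStream in) throws IOException {
        int[] b = new int[256];
        int r;
        while ((r = in.read()) != -1) {   // -1 signals end of stream
            b[r]++;
        }
        return b;
    }

    /** Case-insensitive totals for a-z: total[0] = a's, ..., total[25] = z's. */
    static int[] foldCase(int[] b) {
        int[] total = new int[26];
        for (char c = 'a'; c <= 'z'; c++) {
            total[c - 'a'] = b[c] + b[Character.toUpperCase(c)];
        }
        return total;
    }
}
```

Unlike the loop in the question, this reads the stream exactly once and relies on the -1 return value rather than available() to detect the end.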

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow