Question

I am trying to iterate through a text file and count all of its characters, including \n newline characters and anything else. I can only read through the file once. I am also recording letter frequency, the number of lines, the number of words, and so on. I can't quite figure out where to count the total number of characters (see the code below). I know I need to do it before I use the StringTokenizer (which I have to use, by the way). I have tried multiple ways but just can't figure it out. Any help would be appreciated. Thanks in advance.

Note: my variable numChars only counts alphabetic characters (a, b, c, etc.).

Edit: posting the class variables to make more sense of the code.

private final int NUMCHARS = 26;
private int[] characters = new int[NUMCHARS];
private final int WORDLENGTH = 23;
private int[] wordLengthCount = new int[WORDLENGTH];
private int numChars = 0;
private int numWords = 0;
private int numLines = 0;
private int numTotalChars = 0;
DecimalFormat df = new DecimalFormat("#.##");

public void countLetters(Scanner scan) {
    char current;
    //int word;
    String token1;

    while (scan.hasNext()) {

        String line = scan.nextLine().toLowerCase();
        numLines++;

        StringTokenizer token = new StringTokenizer(line,
            " , .;:'\"&!?-_\n\t12345678910[]{}()@#$%^*/+-");
        for (int w = 0; w < token.countTokens(); w++) {
            numWords++;
        }

        while (token.hasMoreTokens()) {
            token1 = token.nextToken();
            if (token1.length() >= wordLengthCount.length) {
                wordLengthCount[wordLengthCount.length - 1]++;
            } else {
                wordLengthCount[token1.length() - 1]++;

            }

        }
        for (int ch = 0; ch < line.length(); ch++) {
            current = line.charAt(ch);
            if (current >= 'a' && current <= 'z') {
                characters[current - 'a']++;
                numChars++;

            }
        }
    }
}

Solution

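Use String.toCharArray() on each line. Note that nextLine() returns the line without its line terminator, so the terminator is not part of this count. Something like: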

while (scan.hasNext()) {
    String line = scan.nextLine();
    numberchars += line.toCharArray().length;
    // ...
}

An alternative is to use String.length() directly:

while (scan.hasNext()) {
    String line = scan.nextLine();
    numberchars += line.length();
    // ...
}
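
Since the question also wants newline characters counted, and Scanner.nextLine() returns each line without its terminator, one way to adapt the loops above is to add one per line read. The following is a minimal sketch rather than a definitive fix: the class and method names are hypothetical, it assumes every line (including the last) ends in a single '\n', and it uses hasNextLine() instead of hasNext() so that trailing blank lines are not skipped.

```java
import java.util.Scanner;

// Hypothetical helper, not part of the original code.
final class LineCharCounter {

    /**
     * Counts all characters seen by the scanner, line by line.
     * ASSUMPTION: each line ends with exactly one '\n' that nextLine() removed.
     */
    static int countChars(Scanner scan) {
        int numTotalChars = 0;
        while (scan.hasNextLine()) {
            String line = scan.nextLine();   // terminator is not included
            numTotalChars += line.length();  // visible characters on this line
            numTotalChars++;                 // the assumed '\n' that was stripped
            // ... tokenize `line` with StringTokenizer here, as before
        }
        return numTotalChars;
    }
}
```

With Windows-style \r\n endings, or a final line that has no terminator, this count would be off by the difference; reading the file character by character avoids having to guess.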

Using a BufferedReader, you can count every character, line terminators included, like this:

BufferedReader reader = new BufferedReader(
    new InputStreamReader(
        new FileInputStream(file), charsetName));
int charCount = 0;
while (reader.read() > -1) {
    charCount++;
}
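
Because the question allows only one pass over the file, the same read loop can update several tallies at once. Here is a hedged sketch under these assumptions: the class name StreamCharCounter and the int[] result layout are made up for illustration, only '\n' is treated as a line break, and only ASCII letters count as letters. IOException is rethrown unchecked just to keep the call site short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of the original answer.
final class StreamCharCounter {

    /** Returns {totalChars, newlineCount, letterCount} for the whole input. */
    static int[] count(Reader source) {
        int total = 0;
        int newlines = 0;
        int letters = 0;
        // try-with-resources closes the reader when the loop is done
        try (BufferedReader reader = new BufferedReader(source)) {
            int c;
            while ((c = reader.read()) != -1) {
                total++;                       // every char, including '\n' and '\r'
                if (c == '\n') {
                    newlines++;
                }
                boolean lower = c >= 'a' && c <= 'z';
                boolean upper = c >= 'A' && c <= 'Z';
                if (lower || upper) {
                    letters++;                 // ASCII letters only (assumption)
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] {total, newlines, letters};
    }
}
```

A call such as StreamCharCounter.count(new InputStreamReader(new FileInputStream(file), charsetName)) would mirror the reader set up above, with file and charsetName supplied by the caller.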

OTHER TIPS

I would read the file character by character with a BufferedReader and use a Guava Multiset to count the characters:

BufferedReader rdr = Files.newBufferedReader(path, charSet);
HashMultiset<Character> ms = HashMultiset.create();
for (int c; (c = rdr.read()) != -1;) {
    ms.add((char) c);
}
for (Multiset.Entry<Character> e : ms.entrySet()) {
    char c = e.getElement();
    int n = e.getCount();
}
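
If pulling in Guava is not an option, a plain java.util.Map can play the same role as the Multiset. This is a substitute technique, not the code from the answer above: the class name CharFrequency is hypothetical, and a TreeMap is chosen only so that the entries iterate in character order.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical JDK-only stand-in for the Guava Multiset approach.
final class CharFrequency {

    /** Reads the input once and returns how often each char occurred. */
    static Map<Character, Integer> count(Reader source) {
        Map<Character, Integer> counts = new TreeMap<>();
        try (Reader reader = source) {
            int c;
            while ((c = reader.read()) != -1) {
                // insert 1 for a new char, otherwise add 1 to the stored count
                counts.merge((char) c, 1, Integer::sum);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }
}
```

For a file, pass in a buffered reader such as the one from Files.newBufferedReader(path, charSet) so that single-character reads stay cheap. Summing the map's values gives the total character count, and counts.getOrDefault('e', 0) gives the frequency of one character.
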
Licensed under: CC-BY-SA with attribution