Question

I'm writing a little calculator application and use a simple tokenizer to parse the user input into a list for subsequent processing.

The method looks like this:

public LinkedList<String> tokenize(String infix) throws Exception 
{
    LinkedList<String> tokens = new LinkedList<String>();
    StringBuilder operand = new StringBuilder("");

    char current;

    int index = 0;
    while(index <= infix.length() -1) {
        current = infix.charAt(index);

        if(!isOperator(current)) {
            operand.append(current);
        } else {
            // Add the operand stack
            tokens.add(operand.toString());
            operand = new StringBuilder("");
            // Add the operator
            tokens.add(Character.toString(current));
        }
        index++;
    }
    // Add the trailing operand
    tokens.add(operand.toString());
    return tokens;
}

The test I've set up for this method, looks like this:

public void testTokenizer() throws Exception 
{   
    LinkedList<String> list = parser.tokenize("35+35");
    assertTrue(list.get(0) == "35" &&
        list.get(1) == "+"  &&
        list.get(2) == "35");
}

However, this fails because the tokenizer seems to add whitespace to the tokens. For example, printing the list tokenized from the string "35+35" gives me:

[35, +, 35]

What's going on here?


Solution

This is caused by the way the String representation of the List is built when List#toString is called, which happens implicitly when you print the list. Apart from the enclosing square brackets, it is essentially implemented as

firstElement + ", " + secondElement + ", " + ....

So the spaces are not part of the elements; they exist only in the String representation of the List itself.

EDIT: You may also verify this by printing something like

System.out.println(">"+list.get(1)+"<");

This will print

>+<

and not

> +<
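
The same check can be run on every element at once. This is only a minimal sketch: the class name ListToStringDemo and the helper sampleTokens() are made up for illustration, and the list is filled by hand with the three tokens from the question instead of calling the real tokenizer.

```java
import java.util.LinkedList;
import java.util.List;

public class ListToStringDemo {
    // Hypothetical helper: builds the three tokens expected for "35+35" by hand.
    static List<String> sampleTokens() {
        List<String> tokens = new LinkedList<>();
        tokens.add("35");
        tokens.add("+");
        tokens.add("35");
        return tokens;
    }

    public static void main(String[] args) {
        List<String> tokens = sampleTokens();
        // The ", " separators are produced by toString(), not stored in the elements.
        System.out.println(tokens); // prints [35, +, 35]
        for (String token : tokens) {
            // The delimiters and the length expose any leading or trailing whitespace.
            System.out.println(">" + token + "< length=" + token.length());
        }
    }
}
```

If an element really contained a space, it would show up between the > and < markers and in the reported length.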

OTHER TIPS

The tokenizer is not adding space to the tokens. It is most likely being done by the LinkedList's toString() method when you display the list:

The string representation consists of a list of the collection's elements in the order they are returned by its iterator, enclosed in square brackets ("[]"). Adjacent elements are separated by the characters ", " (comma and space).

The assertion doesn't work because it uses ==, which compares object references rather than contents. E.g., list.get(0) == "35" should be list.get(0).equals("35"). The tokens are String objects created at run time rather than compile-time constants, so == won't do what you want for comparisons.
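
The difference can be seen in isolation with a small sketch. The class name StringCompareDemo and the helper runtimeToken() are made up for illustration; the helper builds its String through a StringBuilder, as the tokenizer in the question does for operands.

```java
public class StringCompareDemo {
    // Hypothetical helper: builds "35" at run time via StringBuilder,
    // so the result is a new String object, not the literal "35".
    static String runtimeToken() {
        StringBuilder operand = new StringBuilder();
        operand.append('3');
        operand.append('5');
        return operand.toString();
    }

    public static void main(String[] args) {
        String token = runtimeToken();
        // == compares references: the token and the literal are different objects.
        System.out.println(token == "35");      // prints false
        // equals compares the character sequences.
        System.out.println(token.equals("35")); // prints true
    }
}
```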

Your test fails because it's comparing strings improperly.

If you change it to use the .equals method as you should, this test passes.

@Test
public void testTokenizer() throws Exception
{
    LinkedList<String> list = parser.tokenize("35+35");
    assertTrue(list.get(0).equals("35")
            && list.get(1).equals("+")
            && list.get(2).equals("35"));
}

The tokenizer is not adding whitespace.
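
As a possible variation, the whole list can be compared in one step, because List#equals compares the elements pairwise with equals. The sketch below is only an illustration: TokenListCompareDemo, tokens() and matchesExpected() are invented names, and tokens() stands in for the result of parser.tokenize("35+35") by building the Strings at run time.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class TokenListCompareDemo {
    // Hypothetical stand-in for parser.tokenize("35+35"): the Strings are
    // created at run time, so none of them is the same object as a literal.
    static List<String> tokens() {
        List<String> tokens = new LinkedList<>();
        tokens.add(new StringBuilder("35").toString());
        tokens.add(Character.toString('+'));
        tokens.add(new StringBuilder("35").toString());
        return tokens;
    }

    // List#equals checks size and order, and compares elements with equals.
    static boolean matchesExpected(List<String> actual) {
        return actual.equals(Arrays.asList("35", "+", "35"));
    }

    public static void main(String[] args) {
        System.out.println(matchesExpected(tokens())); // prints true
    }
}
```

In a JUnit test the same idea would read assertEquals(Arrays.asList("35", "+", "35"), list), which also reports the actual list contents when the comparison fails.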

Licensed under: CC-BY-SA with attribution