Question

I'm having trouble understanding how tokens are counted in the StringTokenizer class with constructors that have multiple arguments.

String line = "This is a sample sentence, how many tokens are inside this sentence?";

new StringTokenizer(String str)
new StringTokenizer(String str, String delims )

If I use the first StringTokenizer constructor with a single argument of line and write a sample program, the result is "thisisasamplesentencehowmanytokensareinside?" and 12 tokens. It returns the whole sentence without any spaces. I understand how this works.

If I use the second constructor with two arguments, (line, ","), my test program prints "this is a test sentence how many tokens are in this sentence? " with the spaces between words but no comma, and only 2 tokens. I thought it would treat BOTH the spaces and the comma as token separators, but it counts everything before the comma as 1 token and everything after the comma as 1 token. This part is confusing me.

My problem is that I don't understand how tokens are separated by delimiters when using constructors that take multiple arguments, such as (line, " ,"). Am I missing or misunderstanding something?

Solution

If no delimiter is specified, it uses the default delimiter set, which is " \t\n\r\f": the space character, the tab character, the newline character, the carriage-return character, and the form-feed character.

Refer to this link:

http://docs.oracle.com/javase/7/docs/api/java/util/StringTokenizer.html#StringTokenizer(java.lang.String)
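As a minimal sketch of the default behaviour, the following lists the tokens produced by the one-argument constructor for the sentence from the question. The class name DefaultDelimiterDemo and the helper tokens are made up for this illustration. Since the comma is not in the default delimiter set, it should stay attached to the word before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original question or answer.
public class DefaultDelimiterDemo {

    // Collects every token produced by the one-argument constructor,
    // which splits on the default set " \t\n\r\f" only.
    static List<String> tokens(String text) {
        StringTokenizer st = new StringTokenizer(text);
        List<String> result = new ArrayList<>();
        while (st.hasMoreTokens()) {
            result.add(st.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "This is a sample sentence, how many tokens are inside this sentence?";
        List<String> result = tokens(line);
        // Print the number of tokens, then the tokens themselves.
        System.out.println(result.size());
        System.out.println(result);
    }
}
```

Printing the list rather than concatenating the tokens makes it easier to see where each token starts and ends.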

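If a delimiter string is specified, only the characters in that string are treated as delimiters, and the default set is no longer applied. Each character of the string is a separate delimiter, so "," splits on commas only, while " ," splits on both spaces and commas.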
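To compare delimiter sets directly, here is a small sketch that counts tokens for the sentence from the question using the two-argument constructor. The class name DelimiterSetDemo and the helper count are invented for this example; only StringTokenizer and countTokens come from the JDK.

```java
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original question or answer.
public class DelimiterSetDemo {

    // Counts tokens when every character in delims acts as a delimiter.
    static int count(String text, String delims) {
        return new StringTokenizer(text, delims).countTokens();
    }

    public static void main(String[] args) {
        String line = "This is a sample sentence, how many tokens are inside this sentence?";

        // Comma only: spaces are ordinary characters here.
        System.out.println(count(line, ","));   // prints 2

        // Space and comma: either character ends a token.
        System.out.println(count(line, " ,"));  // prints 12
    }
}
```

Adjacent delimiters, such as the comma followed by a space, do not produce empty tokens, which is why the second count is 12 rather than 13.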

Licensed under: CC-BY-SA with attribution