Question

I am using fscanf to read a file which has lines like
Number <-whitespace-> string <-whitespace-> optional_3rd_column

I want to extract the number and the string from each line, but ignore the third column if it exists.

Example Data:
12 foo something
03 bar
24 something #randomcomment

I want to extract 12, foo; 03, bar; and 24, something, while ignoring the trailing "something" and "#randomcomment".

I currently have something like

while (scanf("%d %s %*s", &num, word) >= 2)
{
    /* assign stuff */
}

However this does not work with lines with no 3rd column. How can I make it ignore everything after the 2nd string?


Solution

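It would appear to me that the simplest solution is to call scanf("%d %s", &num, word) and then fgets() to eat the rest of the line. (Assuming word is a char array, it is passed without the &, and giving %s a field width guards against overflowing it.)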
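
Here is a minimal sketch of that approach, not a definitive implementation. The helper name read_record and the buffer sizes (64 and 128) are my own assumptions, and it reads from a FILE * with fscanf() as in the question's title rather than from stdin. The fgets() call sits in a loop so that a remainder longer than the scratch buffer is still consumed up to the newline.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one "number word [rest...]" record from `in`.
   Returns 1 on success, 0 on end of input or a malformed line. */
int read_record(FILE *in, int *num, char word[64])
{
    char rest[128];  /* scratch buffer for the ignored tail of the line */

    assert(in != NULL);

    /* %63s leaves room for the terminating '\0' in a 64-byte buffer. */
    if (fscanf(in, "%d %63s", num, word) != 2)
        return 0;

    /* Eat whatever is left on the line, in chunks, until the newline. */
    while (fgets(rest, sizeof rest, in) != NULL) {
        if (strchr(rest, '\n') != NULL)
            break;
    }
    return 1;
}
```

You would call read_record(fp, &num, word) in a while loop and stop when it returns 0.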

OTHER TIPS

The problem is that the %*s is eating the number on the next line when there's no third column, and then the next %d fails because the next token is not a number. To fix it without using fgets() followed by sscanf(), you can use the character class (scanset) specifier:

while (scanf("%d %s%*[^\n]", &num, word) == 2)
{
    /* assign stuff */
}

The [^\n] says to match as many characters as possible that aren't newlines, and the * suppresses assignment as before. If nothing follows the second column, the %*[^\n] simply fails to match, but scanf() still returns 2 because both assignments have already been made. Also note that you can't put a space between the %s and the %*[^\n]: that space in the format string would match the newline, causing the %*[^\n] to match the entire subsequent line, which is not what you want.
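
As a rough, self-contained sketch of the same format string in use: the function name read_records, the array layout and the 64-byte word size are assumptions for illustration, and it reads from a FILE * with fscanf() rather than from stdin. Note that a malformed line ends the loop, because the conversion count is then no longer 2.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to `max` records of the form
   "number word [ignored rest of line]" and returns how many were read. */
int read_records(FILE *in, int nums[], char words[][64], int max)
{
    int n = 0;

    assert(in != NULL);

    /* %*[^\n] discards the rest of the line without storing it.
       There is deliberately no space between %63s and %*[^\n]. */
    while (n < max &&
           fscanf(in, "%d %63s%*[^\n]", &nums[n], words[n]) == 2) {
        n++;
    }
    return n;
}
```

The newline itself is left in the stream; the next %d skips it along with any other leading whitespace.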

Use fgets() to read a line at a time and then use sscanf() to look for the two columns you are interested in. This is more robust, and you don't have to do anything special to ignore trailing data.

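I often use fgets() followed by an sscanf() on the string you just, er, gots. (Use fgets() rather than gets(): gets() cannot limit how much it reads, and it was removed from the language in C11.)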

Bonus: you can separate the test for end-of-input from the parsing.
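
To illustrate the line-at-a-time approach and that separation, here is a hedged sketch. The names parse_line and count_records and the buffer sizes are hypothetical, and it assumes no line is longer than 255 characters (a longer line would be split across two fgets() calls).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parses the first two columns of one line.
   Anything after the second column is simply never looked at. */
int parse_line(const char *line, int *num, char word[64])
{
    assert(line != NULL);
    return sscanf(line, "%d %63s", num, word) == 2;
}

/* Hypothetical helper: counts the well-formed lines in `in`.
   A bad line is skipped instead of ending the loop. */
int count_records(FILE *in)
{
    char line[256];
    char word[64];
    int num;
    int count = 0;

    while (fgets(line, sizeof line, in) != NULL) {  /* end-of-input test */
        if (parse_line(line, &num, word))           /* parsing test */
            count++;
    }
    return count;
}
```

Because fgets() decides when the input ends and sscanf() only decides whether a line is valid, one malformed line no longer stops the whole read.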

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow