If performance is not your main concern, the following is a general-purpose tokenizer for your input format. Whether it is a workable solution depends, of course, on what you actually want to do with the input.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Parses the accumulated characters as an int, prints it, and clears the buffer.
static void handle_number_string(std::string& literal) {
    if (!literal.empty()) {
        std::istringstream iss {literal};
        int value;
        if (iss >> value) {
            std::clog << "<" << value << ">";
        } else {
            // TODO: Handle malformed integer literal
        }
        literal.clear();
    }
}

int main(int argc, char** argv) {
    for (int i = 1; i < argc; i++) {
        std::string aux;
        std::ifstream istr {argv[i]};
        std::clog << argv[i] << ": ";
        int next;
        // Stop at end of file so that the EOF marker is never stored in aux.
        while ((next = istr.get()) != std::char_traits<char>::eof()) {
            switch (next) {
            case ' ':
                handle_number_string(aux);
                std::clog << "<SPC>";
                break;
            case '\n':
                handle_number_string(aux);
                std::clog << "<EOL>";
                break;
            default:
                aux.push_back(static_cast<char>(next));
            }
        }
        // Handle case that the last line was not terminated with '\n'.
        handle_number_string(aux);
        std::clog << std::endl;
    }
    return 0;
}
Addendum: I'd only do this if I absolutely had to. Handling all possibilities (multiple spaces, non-breaking spaces, tabs, \r\n, …) correctly will be a lot of work. If what you actually want to handle are the logical tokens "field separator" and "end of line", manually parsing whitespace seems to be the wrong way to go. It would be a pity if your program failed just because a user had justified the columns in the input file (thus using a variable number of spaces).
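
To illustrate the alternative, here is a minimal sketch that leaves whitespace handling to the standard library: std::getline finds the end of each line, and operator>> skips whatever whitespace separates the fields. The helper name parse_rows and the choice to return a vector of rows are my own assumptions for illustration, not something your input format dictates.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the tokenizer above): reads one row of
// integers per input line and lets operator>> deal with the whitespace.
std::vector<std::vector<int>> parse_rows(std::istream& in) {
    std::vector<std::vector<int>> rows;
    std::string line;
    while (std::getline(in, line)) {       // '\n' acts as the end-of-line token
        std::istringstream fields {line};
        std::vector<int> row;
        int value;
        while (fields >> value) {          // skips spaces, tabs and a trailing '\r'
            row.push_back(value);
        }
        // Assumption: a malformed field silently ends the row. Real code
        // should check fields.eof() here and report the error instead.
        rows.push_back(row);
    }
    return rows;
}
```

With this approach, justified columns, tabs and Windows line endings need no special cases, although the distinction between one and several separators is lost. Non-breaking spaces are not treated as whitespace here and would still need separate handling.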