Question

I would like to peek at the next characters of a QTextStream that reads from a QFile, in order to write an efficient tokenizer.

However, I can't find a satisfactory way to do so.

QFile f("test.txt");
f.open(QIODevice::WriteOnly);
f.write("Hello world\nHello universe\n");
f.close();

f.open(QIODevice::ReadOnly);
QTextStream s(&f);
int i = 0;
while (!s.atEnd()) {
  ++i;
  qDebug() << "Peek" << i << s.device()->peek(3);
  QString v;
  s >> v;
  qDebug() << "Word" << i << v;
}

Gives the following output:

Peek 1 "Hel" # it works only the first time
Word 1 "Hello" 
Peek 2 "" 
Word 2 "world" 
Peek 3 "" 
Word 3 "Hello" 
Peek 4 "" 
Word 4 "universe" 
Peek 5 "" 
Word 5 ""

I tried several approaches, including QTextStream::pos() and QTextStream::seek(). That works better, but pos() is buggy (it returns -1 when the file is too big).

Does anyone have a solution to this recurring problem? Thank you in advance.


Solution

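You peek from the QIODevice but then read from the QTextStream, which is why peek works only once: the stream reads ahead from the device into its own buffer, so after the first read the device has nothing left to peek at. Try reading from the device directly: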

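// Peek and read on the same object (the device), so both see the same position.
while (!s.device()->atEnd()) {
  ++i;
  qDebug() << "Peek" << i << s.device()->peek(3);
  QByteArray v = s.device()->readLine();  // reads a whole line, not a word
  qDebug() << "Line" << i << v;
}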

Unfortunately, QIODevice does not support reading single words, so you would have to do it yourself with a combination of peek and read.
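
For what it's worth, here is a minimal sketch of that peek-and-read combination. It is not Qt's own API: readWord and StringDevice are hypothetical names made up for this example. readWord is a template that only assumes the device offers atEnd(), peek(n) and read(n) the way QIODevice does, and StringDevice is an in-memory stand-in so the logic can be tried without Qt. The function works on raw bytes, so decoding (for example with QString::fromUtf8) is left to the caller.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical in-memory stand-in for a QIODevice. It mimics only the three
// calls used below: atEnd(), peek(n) and read(n).
struct StringDevice {
    std::string data;
    std::size_t pos = 0;

    explicit StringDevice(std::string s) : data(std::move(s)) {}

    bool atEnd() const { return pos >= data.size(); }
    std::string peek(std::size_t n) const { return data.substr(pos, n); }
    std::string read(std::size_t n) {
        std::string out = data.substr(pos, n);
        pos += out.size();
        return out;
    }
};

// Reads one whitespace-delimited word using only peek and read.
// Leading whitespace is skipped; the delimiter after the word is left
// in the device. Returns an empty string when no word is left.
template <typename Device>
std::string readWord(Device &dev)
{
    std::string word;
    while (!dev.atEnd()) {
        const auto chunk = dev.peek(64);   // look ahead, nothing consumed yet
        if (chunk.size() == 0)
            break;

        decltype(chunk.size()) used = 0;   // bytes of this chunk to consume
        bool finished = false;
        for (; used < chunk.size(); ++used) {
            const unsigned char c = static_cast<unsigned char>(chunk[used]);
            if (std::isspace(c)) {
                if (!word.empty()) {       // delimiter that ends the word
                    finished = true;
                    break;
                }
                // otherwise this is leading whitespace: skip it
            } else {
                word += static_cast<char>(c);
            }
        }

        const auto consumed = dev.read(used);  // consume only what was examined
        assert(consumed.size() == used);
        if (finished)
            break;
    }
    return word;
}
```

With a real device the call would presumably look like readWord(*s.device()) or readWord(f), as long as nothing else (such as a QTextStream) reads from the same device in between. Because every read goes through the device, a peek between two words keeps returning data.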

OTHER TIPS

Try disabling Unicode auto-detection with QTextStream::setAutoDetectUnicode(false). The detection may read ahead from the device, which could cause your problem.

Also set a codec, just in case.

Add s.device()->pos() and s.device()->bytesAvailable() to the log output to verify this.


I've checked the QTextStream code. It appears to always cache as much data as possible, and there is no way to disable this behavior. I expected it to use peek on the device, but it only reads greedily. The bottom line is that you can't use QTextStream and peek at the device at the same time.

Licensed under: CC-BY-SA with attribution