Question

I'm reviving an old MATLAB script that uses "[d h v c t] = textread(fn,'%s %*s %s %f %s %s');" to import data. I want to replace textread with textscan, as that seems to be recommended.

My problem (with both the old and the new function) is that my fourth column of data, the floating-point value, has some gaps in it. Because whitespace is my delimiter, MATLAB tries to read the fifth column, which contains letters, as a floating-point value and therefore gives me an error.

Any suggestions on how to make it automatically skip lines without a value? I have about 100 files that need to be updated periodically, so manual methods are too time-consuming. My data looks like this, but over a long period of time:

31/12/1991 @ 00:00:00 Q25 T2
01/01/1992 @ 00:00:00 Q25 T2
02/01/1992 @ 00:00:00 24.451330 Q25 T2
03/01/1992 @ 00:00:00 24.674587 Q25 T2
04/01/1992 @ 00:00:00 25.264880 Q25 T2

Thanks

Solution

Okay, this is a bit of a hack, but it works. textscan can be so much faster than other methods that it is often worth it to play around a bit if your data has particular constraints.

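fid = fopen('test.txt');
% %*s skips the '@' column; 'TreatAsEmpty','Q' lets %f return NaN where it meets a 'Q'
t = textscan(fid,'%s%*s%s%f%s%s','TreatAsEmpty','Q');
fclose(fid);
t{:}   % display each output column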

You'll see that t{3} is a 5-by-1 array with the default NaN for the empty values. However, you still need to do one more thing as t{4} is missing the leading 'Q' for the first two elements. There are probably several ways to accomplish this, but here's an easy one-liner that uses isnan to index into the rows where the 'Q' needs to be added:

t{4}(isnan(t{3})) = cellfun(@(c)['Q' c],t{4}(isnan(t{3})),'UniformOutput',false);
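If you'd rather avoid the anonymous function, strcat should do the same job, since it accepts a character vector together with a cell array and returns a cell array. A minimal sketch follows; vals and codes are made-up stand-ins for t{3} and t{4}, not output from a real file:

```matlab
% Hypothetical stand-ins for t{3} and t{4} as described above.
vals  = [NaN; NaN; 24.451330];
codes = {'25'; '25'; 'Q25'};

missing = isnan(vals);                          % rows where the value was absent
codes(missing) = strcat('Q', codes(missing));   % put the stripped 'Q' back
```

Storing the isnan mask in a variable also avoids computing it twice.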


How does using the 'TreatAsEmpty' parameter work?

The fourth column (the third non-skipped column) is a numeric field, and 'TreatAsEmpty' applies only when scanning numeric fields ('%f'). On rows with a missing value, the string 'Q25' lands in that field and is broken into the number NaN and the string '25', effectively adding a column. On complete rows, the 'Q25' elements in the fifth column don't matter because they're scanned as strings. So it should be fine if the letter 'Q' appears elsewhere in the data.
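To watch the split happen without touching a file, you can pass the text straight to textscan, which accepts a character vector in place of a file identifier. This is only a sketch built from two of the sample rows in the question; the variable names txt, values and codes are my own:

```matlab
% Two sample rows from the question: one with a gap, one complete.
txt = sprintf('%s\n%s', ...
    '01/01/1992 @ 00:00:00 Q25 T2', ...
    '02/01/1992 @ 00:00:00 24.451330 Q25 T2');

% Same format and option as above, applied to text instead of a file.
t = textscan(txt,'%s%*s%s%f%s%s','TreatAsEmpty','Q');

values = t{3};   % numeric column
codes  = t{4};   % code column, before the 'Q' is restored
```

If the option behaves as described, values should hold NaN for the first row and codes should hold '25' there, while the second row keeps its number and its full 'Q25'.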

Licensed under: CC-BY-SA with attribution