Importing data into MATLAB from a poorly formatted file
12-06-2021
Question
I have a large set of text files (tab delimited data) I need to parse. They are mostly well formatted. However, there are randomly interspersed rows that include erroneous characters, like what is shown below. The location of the bad rows is different in each file, but the characters added are always the same.
1 3
2 873
3 46
23 99798
23 1
353 79
"23 ," 967
35 8028
253 615
"235 ," 3924
345 188
345 579
345 419
56 16835
23 449
importdata(filename) imports all of the data up to the first badly formatted line, then ignores the rest of the file. I think I could do what I am trying to do with a combination of fopen and textscan, but I can't seem to get the right combination of arguments to make it work.
Solution
Have a go at using the textread function with the %q format string, which reads a double-quoted field as a single string and drops the quotes. Assuming the test data in the question is saved as test.txt:
>> [a, b] = textread('test.txt', '%q %q');
>> a'
ans =
Columns 1 through 9
'1' '2' '3' '23' '23' '353' '23 ,' '35' '253'
Columns 10 through 15
'235 ,' '345' '345' '345' '56' '23'
>> b'
ans =
Columns 1 through 9
'3' '873' '46' '99798' '1' '79' '967' '8028' '615'
Columns 10 through 15
'3924' '188' '579' '419' '16835' '449'
Then you can use str2double to convert the strings to numbers. It accepts commas as thousands separators, so this also removes the trailing commas in a. For example:
>> str2double(a)'
ans =
Columns 1 through 13
1 2 3 23 23 353 23 35 253 235 345 345 345
Columns 14 through 15
56 23
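Since the question mentions fopen and textscan, and the MATLAB documentation lists textread as not recommended in favour of textscan, here is a rough sketch of the same %q idea with textscan. The scratch file name and the three sample rows are only there so the snippet runs on its own; they are my assumptions, not part of the original data set. I have not tried this on the real files, so treat it as a starting point.

```matlab
% Write a few rows from the question to a scratch file (hypothetical name)
% so that the sketch is self-contained.
fname = fullfile(tempdir, 'test_bad_rows.txt');
fid = fopen(fname, 'w');
fprintf(fid, '1\t3\n"23 ,"\t967\n35\t8028\n');
fclose(fid);

% Read both columns as (possibly quoted) strings, as in the textread call above.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
c = textscan(fid, '%q %q');   % c{1} and c{2} are cell arrays of strings
fclose(fid);

% Convert to numbers; str2double should turn '23 ,' into 23, as shown above.
a = str2double(c{1});
b = str2double(c{2});
data = [a, b];                % one numeric row per line of the file
```

Any row that str2double cannot interpret comes back as NaN, so any(isnan(data(:))) is a quick check for bad rows that follow a different pattern.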