Python equivalent of Matlab textscan

https://stackoverflow.com/questions/13125447

15-07-2021

Question

I'm porting some Matlab code to Python. I'm relatively new to Python and am unsure what the Python equivalent of Matlab's textscan function is. Any help would be greatly appreciated.

Solution

If you're translating Matlab to Python, I'll assume you're already using NumPy.

In that case, you can use np.loadtxt (if no values are missing) or np.genfromtxt (if there are missing values: I'm not sure whether Matlab's textscan handles that).
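As a rough illustration of the difference, here is a minimal sketch of np.genfromtxt reading data with an empty field. The sample text is made up for this example, and io.StringIO stands in for a real file.

```python
import io

import numpy as np

# Hypothetical sample data: the second row has an empty middle field.
text = "1.0,2.0,3.0\n4.0,,6.0\n"

# genfromtxt tolerates the empty field and stores nan for it.
data = np.genfromtxt(io.StringIO(text), delimiter=",")

# filling_values substitutes a chosen default for missing fields instead.
filled = np.genfromtxt(io.StringIO(text), delimiter=",", filling_values=0.0)

print(data)
print(filled)
```

np.loadtxt would be expected to raise an error on the same input, because it cannot convert the empty field to a float.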

Give us a few more details if you need more help!

Other tips

Example of conversion of MATLAB's textscan to Python + NumPy's np.loadtxt:

Let our data file results.csv contain:

0.6236,sym2,1,5,10,10
0.6044,sym2,2,5,10,10
0.548,sym2,3,5,10,10
0.6238,sym2,4,5,10,10
0.6411,sym2,5,5,10,10
0.7105,sym2,6,5,10,10
0.6942,sym2,7,5,10,10
0.6625,sym2,8,5,10,10
0.6531,sym2,9,5,10,10

Matlab code:

fileID = fopen('results.csv');
d = textscan(fileID,'%f %s %d %d %d %d', 'delimiter',',');
fclose(fileID);

Python + NumPy code:

import numpy as np

# Read the same file as the Matlab code; 'with' closes it automatically.
with open('results.csv', 'r') as fd:
    d = np.loadtxt(fd,
                   delimiter=',',
                   dtype={'names': ('col1', 'col2', 'col3', 'col4', 'col5', 'col6'),
                          'formats': ('float', 'S4', 'i4', 'i4', 'i4', 'i4')})

For more info on types, see Data type objects (dtype).
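The result of np.loadtxt with a structured dtype is a one-dimensional structured array, so each column is accessed by its field name, loosely like indexing the cell array that textscan returns. The sketch below assumes the first three rows of the sample data and uses io.StringIO in place of the file. It also uses 'U4' (Unicode strings) rather than 'S4' (bytes), which is a choice made for this example only.

```python
import io

import numpy as np

# First three rows of the sample data, held in memory instead of a file.
sample = io.StringIO(
    "0.6236,sym2,1,5,10,10\n"
    "0.6044,sym2,2,5,10,10\n"
    "0.548,sym2,3,5,10,10\n"
)

d = np.loadtxt(sample,
               delimiter=',',
               dtype={'names': ('col1', 'col2', 'col3', 'col4', 'col5', 'col6'),
                      'formats': ('f8', 'U4', 'i4', 'i4', 'i4', 'i4')})

# Each field name selects one whole column as an ordinary array.
scores = d['col1']
labels = d['col2']
print(scores)
print(labels)
```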

You could look into NumPy and py2mat. If my understanding of textscan() is correct, you could also just use open() and parse the lines yourself.
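A minimal sketch of that plain-Python route, using only the standard library. The converter tuple and variable names are illustrative, and io.StringIO stands in for the file object that open('results.csv', newline='') would return.

```python
import csv
import io

# Two rows of the sample data; a file object from open() works the same way.
sample = io.StringIO(
    "0.6236,sym2,1,5,10,10\n"
    "0.6044,sym2,2,5,10,10\n"
)

# One converter per column, mirroring the '%f %s %d %d %d %d' format string.
converters = (float, str, int, int, int, int)

rows = [
    tuple(convert(field) for convert, field in zip(converters, row))
    for row in csv.reader(sample)
]

# Transpose rows into columns, similar to textscan's one-cell-per-column output.
columns = list(zip(*rows))
print(columns[0])
print(columns[1])
```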

If your data is more complicated than simple delimited text, for example if other, useless bits of text are mixed in, then you can use NumPy's fromregex function to replace textscan. fromregex reads values based on a regular expression, with each group (a part surrounded by ()) becoming one value.

So for example say you have lines like this:

field1 is 1, field 2 is 5 to 6.6
field1 is 2, field 2 is 7 to 0.1

And you want to get the value numbers (not the field names):

[[1, 5, 6.6],
 [2, 7, 0.1]]

You can do

data = np.fromregex('temp.txt', r'field1 is ([\d\.]+), field 2 is ([\d\.]+) to ([\d\.]+)', dtype='float')

The [\d\.]+ matches any unsigned number, including decimal places, and the () tells NumPy to use that result as a value. You can also specify more complicated dtypes, such as giving different columns different types, as well as specifying column names to get a structured array. That is covered in the documentation.
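The NumPy documentation describes the dtype argument of fromregex as a structured data type, so the sketch below uses named fields and then converts the result to a plain 2-D array. The field names are invented for this example, and io.StringIO stands in for temp.txt.

```python
import io

import numpy as np

# The two sample lines, held in memory instead of temp.txt.
sample = io.StringIO(
    "field1 is 1, field 2 is 5 to 6.6\n"
    "field1 is 2, field 2 is 7 to 0.1\n"
)

pattern = r"field1 is ([\d\.]+), field 2 is ([\d\.]+) to ([\d\.]+)"

# One named float field per regex group (names are illustrative).
fields = [("field1", "f8"), ("low", "f8"), ("high", "f8")]

records = np.fromregex(sample, pattern, fields)

# Convert the structured records to an ordinary 2-D float array.
values = np.array(records.tolist())
print(values)
```

Individual columns stay available by name on the structured result, for example records["low"].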

However, it is more complicated than other approaches when dealing with simple delimited or fixed-width data.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow