Convert 2-d array of strings to float in python_removing scientific notation

https://stackoverflow.com/questions/20279013

06-08-2022

Question

How can I keep the original decimal format, instead of scientific notation, when I convert an array of strings with numpy's astype? Here's my code:

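import csv
import numpy as np
from numpy import array, where

with open('outfile.csv', 'r') as infile:
    reader = csv.reader(infile)
    reader_list = list(reader)
    reader_array = array(reader_list)
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    x = reader_array[:, 5].astype(float)

    # original array:
    print(reader_array[:, 5])

    # converted to float
    print(x)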

#original array:
['-0.00041955436132607246' '-0.00036612800229292086'  '0.00022313364860991641' ..., '73.418371245304215' '73.417384428365267'  '73.416718169781149'] 

#converted to float
[ -4.19554361e-04  -3.66128002e-04   2.23133649e-04 ...,   7.34183712e+01    7.34173844e+01   7.34167182e+01]

To be more specific, I want to convert the array of strings to floats while keeping the same format as the original array, and then do some analysis on it:

#find row number of max value in column 1: (This piece works fine)
max_index = where(reader_array[:,1] == max(reader_array[:,1]))

#take last element in column 5: (This one is also fine)
total_ = (reader_array[(len(reader_array[:,5])-1),5])

#find row number where element in column 5 is equal to 0.1*total_: (here's the problem!)
# (renamed from 0.1_index, which is not a valid Python identifier)
index_0_1 = where((reader_array[:,5]) == (total_)*0.1)

So I think that changing the strings to floats, but with the same format as the original array, would let me multiply array members by another float (0.1 here).

Please note that the value (0.1*total_) may not match any of the row values in column 5, which is a separate problem I still have to solve. But I cannot make progress until I can compare the rows with (0.1*total_).

I would appreciate a hint on how to approach this.

Solution

You're intrinsically limited by the fact that floating-point numbers are stored in IEEE 754 binary format. They have finite precision (roughly 15 to 17 significant decimal digits for a 64-bit float), so you can't expect them to be exactly the same as a string representation.
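To illustrate the point, here is a minimal sketch using two of the strings from the question. It assumes NumPy's print options are what you want to change: as far as I can tell, the scientific notation in the question is only how the array is displayed, not how the values are stored.

```python
import numpy as np

# Two sample strings copied from column 5 in the question.
strings = np.array(['-0.00041955436132607246', '73.418371245304215'])
values = strings.astype(float)

# Scientific notation is a display choice; suppress=True asks NumPy to
# print small numbers in fixed-point form. The stored values do not change.
np.set_printoptions(suppress=True, precision=8)
print(values)

# A float64 holds a finite number of significant digits, so converting it
# back to text is not guaranteed to reproduce the original string.
print(repr(float(values[1])))
```

The conversion back to text is only for display or output; the analysis itself should run on the float array.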

However, in your case, the more pressing issue is that you want to compare a string to a float, so of course they are going to be different. Python is dynamically but strongly typed.
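A small sketch of what that means in practice. The `column` array here is a made-up stand-in for column 5 of the CSV, not the real data; the idea is to convert once and then do every comparison and multiplication on floats.

```python
import numpy as np

# Hypothetical stand-in for reader_array[:, 5] from the question.
column = np.array(['0.5', '7.25', '72.5'])

# A string never compares equal to a float, even if they look alike.
string_equals_float = ('0.5' == 0.5)
print(string_equals_float)

# Convert the column once, then work only with floats.
values = column.astype(float)
total = values[-1]      # last element of the column, now a float
target = 0.1 * total    # float arithmetic works as expected
print(target)
```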

Given both the above points, you need to better define your problem. Why do you need to compare with an array of strings? (what does this even mean!?)

Can you test for closeness rather than equality once you have your data types sorted out (e.g. using numpy.isclose)?
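For example, something along these lines might work. The sample values and the tolerance `rtol=1e-9` are assumptions for illustration only; the fallback with `argmin` is one possible way to handle the case where no row matches 0.1*total_, not necessarily the right one for your data.

```python
import numpy as np

# Hypothetical float column, standing in for reader_array[:, 5].astype(float).
values = np.array([0.0, 2.5, 7.3, 30.0, 73.4])
target = 0.1 * values[-1]

# Rows that match the target within a tolerance (rtol is an assumed choice).
close_rows = np.where(np.isclose(values, target, rtol=1e-9))[0]

# If no row is that close, fall back to the row nearest the target.
nearest_row = int(np.argmin(np.abs(values - target)))
print(close_rows, nearest_row)
```

Here no value is within tolerance of the target, so `close_rows` is empty and `nearest_row` points at the closest entry instead.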

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow