Extract float numbers from data file

https://stackoverflow.com/questions/21452347

04-10-2022

Question

I'm trying to extract the values (floats) from my data file. I only want the first number on each line; the second one is the error. (E.g. in xo @ 9.95322254`_0.00108217853, 9.953... is the value and 0.0010... is the error.)

Here is my code:

import sys
import re

inf = sys.argv[1]
out = sys.argv[2]
f = inf
outf = open(out, 'w')
intensity = []

with open(inf) as f:    
    pattern = re.compile(r"[^-\d]*([\-]{0,1}\d+\.\d+)[^-\d]*")  

    for line in f:
        f.split("\n")
        match = pattern.match(line)
        if match:
            intensity.append(match.group(0))


for k in range(len(intensity)):
    outf.write(intensity[k])

But it doesn't work: the output file is empty. The lines in the data file look like:

xo_Is 
xo  @  9.95322254`_0.00108217853
SPVII_to_PVII_Peak_type
PVII_m(@, 1.61879`_0.08117)
PVII_h(@, 0.11649`_0.00216)
I @  0.101760618`_0.00190314017

On each line, the first number is the value I want to extract and the second one is the error.

Solution

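You were almost there, but your code contains errors that prevent it from running. f is a file object, which has no split method, so f.split("\n") raises an AttributeError on the first line and nothing is ever written; iterating over the file already yields one line at a time, so the call is not needed. In addition, match.group(0) returns the whole matched text rather than just the captured number, and the values are written without a newline between them. The following works: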

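import re
import sys

inf, out = sys.argv[1], sys.argv[2]  # input and output file names
# Skip leading non-numeric characters, then capture the first float on the line.
pattern = re.compile(r"[^-\d]*(-?\d+\.\d+)[^-\d]*")

with open(inf) as f, open(out, 'w') as outf:
    for line in f:
        match = pattern.match(line)
        if match:
            # group(1) is the captured number; group(0) would be the whole match.
            outf.write(match.group(1) + '\n')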
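
To try the approach without a data file, the same pattern can be run over the sample lines from the question held in a list. This is only a sketch: extract_values and SAMPLE_LINES are hypothetical names introduced here for illustration, and converting the matches to float is an assumption about what the values are used for afterwards.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r"[^-\d]*(-?\d+\.\d+)[^-\d]*")

# Sample lines copied from the question.
SAMPLE_LINES = [
    "xo_Is",
    "xo  @  9.95322254`_0.00108217853",
    "SPVII_to_PVII_Peak_type",
    "PVII_m(@, 1.61879`_0.08117)",
    "PVII_h(@, 0.11649`_0.00216)",
    "I @  0.101760618`_0.00190314017",
]


def extract_values(lines):
    """Return the first float of every line that contains one (hypothetical helper)."""
    values = []
    for line in lines:
        match = pattern.match(line)
        if match:
            # group(1) holds only the captured number.
            values.append(float(match.group(1)))
    return values


print(extract_values(SAMPLE_LINES))
```

Lines without a decimal number, such as xo_Is, are skipped. Note that the pattern stops at the first digit it meets, so a parameter name that itself contains a digit would not match; none of the sample lines have that form.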

Other suggestions

I think you should first test your pattern on a simple string instead of a file. That will show where the error is: in the pattern or in the code that parses the file. The pattern looks good. Also, in most languages I know, group(0) is the entire match, so to get your number you need group(1). And are you sure that f.split('\n') needs to be inside the for loop?
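
A minimal sketch of that kind of check, using one line from the question as the test string. The variable names are illustrative only; the point is the difference between group(0) and group(1).

```python
import re

pattern = re.compile(r"[^-\d]*(-?\d+\.\d+)[^-\d]*")

# One sample line from the question, used as a plain test string.
line = "xo  @  9.95322254`_0.00108217853"

match = pattern.match(line)
whole = match.group(0)  # everything the pattern consumed, including the prefix
value = match.group(1)  # only the captured number

print(repr(whole))
print(repr(value))
```

Here group(0) also contains the text before the number and the separator characters after it, which is why the original code would not have written clean values even once it ran.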

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow