Question

I have a file that looks like this:

1   0.1951  0.1766  0.1943  0.1488  
0.1594  0.2486  0.2044  0.2013  0.1859  
0.1559  0.1761  0.1666  0.1737  0.1595  
0.1940  1   0.2398  0.1894  0.1532  
0.1749  0.2397  1   0.1654  0.1622  
0.1940  0.1895  0.1659  1   0.1384  
0.1489  0.1547  0.1648  0.1390  1   
0.1840  0.2472  0.2256  0.2281  0.1878  

It was created on Windows, so every line ends with an irritating \r character, but I am running my shell command on Linux.

In Python I could read the file and call line.strip('\r') while looping through its lines. But I have to use the shell to run the loop, and the '\r' keeps appearing.

Is there any way to remove it inside the while loop? I am trying to loop over this script in the shell: https://github.com/alvations/meanie/blob/master/amgm.py:

# -*- coding: utf-8 -*-

import math, operator

def arithmetic_mean(x): # x is a list of values.
  """ Returns the arithmetic mean given a list of values. """
  return sum(x)/len(x)

def geometric_mean(x): # x is a list of values
  """ Returns the geometric mean given a list of values. """
  return math.pow(reduce(operator.mul, x, 1), 1/float(len(x)))

def arigeo_mean(x, threshold = 1e-10): # x is a list of values
  arith = arithmetic_mean(x)
  geo = geometric_mean(x)
  while math.fabs(arith - geo) > threshold: 
    [arith,geo] = [(arith + geo) / 2.0, math.sqrt(arith * geo)]
  return arith

def main(means):
  print means
  means = map(float,means)
  print "arithmetic mean = ", arithmetic_mean(means)
  print "geometric mean = ", geometric_mean(means)
  print "arithmetic-geometric mean = ", arigeo_mean(means)

if __name__ == '__main__':
  import sys
  if len(sys.argv) < 2:
    sys.stderr.write('Usage: python %s mean1 mean2 mean3 ... \n' % sys.argv[0])
    sys.exit(1)
  main(sys.argv[1:])

So I tried the following shell line to iterate through my text file:

alvas@ubi:~/git/meanie$ while read line; do python amgm.py $line; done < out.tab

and got these errors:

['1', '0.1951', '0.1766', '0.1943', '0.1488', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1594', '0.2486', '0.2044', '0.2013', '0.1859', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1559', '0.1761', '0.1666', '0.1737', '0.1595', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1940', '1', '0.2398', '0.1894', '0.1532', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1749', '0.2397', '1', '0.1654', '0.1622', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1940', '0.1895', '0.1659', '1', '0.1384', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1489', '0.1547', '0.1648', '0.1390', '1', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 
['0.1840', '0.2472', '0.2256', '0.2281', '0.1878', '\r']
Traceback (most recent call last):
  File "amgm.py", line 32, in <module>
    main(sys.argv[1:])
  File "amgm.py", line 22, in main
    means = map(float,means)
ValueError: could not convert string to float: 

Note: I'm NOT allowed to change the python script though =(


Solution

You should be able to strip any trailing whitespace from the line before passing it to your python program.

Consider one of the suggestions here.

If you're running bash, the second answer there doesn't involve invoking an additional program. If you're not, look at the sed option, which works even if your shell doesn't support parameter expansion.
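As a minimal sketch of the parameter-expansion route, the loop below strips a single trailing carriage return from each line before using it. The temporary sample file stands in for out.tab, and the `set --` line stands in for the real `python amgm.py $line` call; both are assumptions made so the snippet runs on its own. It sticks to POSIX syntax, and in bash the `cr` helper variable can be replaced by writing `${line%$'\r'}` directly.

```shell
# Build a small CRLF sample so the sketch is self-contained (stand-in for out.tab).
sample=$(mktemp)
printf '1   0.1951  0.1766  \r\n0.1594  0.2486  0.2044  \r\n' > "$sample"

cr=$(printf '\r')             # a literal carriage return
cleaned=""
while IFS= read -r line; do
  line=${line%"$cr"}          # drop one trailing carriage return, if present
  # In the real loop this would be: python amgm.py $line
  set -- $line                # unquoted on purpose: split into fields like the original loop
  cleaned="$cleaned$#:$*;"    # record the field count and the fields
done < "$sample"
rm -f "$sample"
echo "$cleaned"
```

Each line now yields three arguments instead of four, so the stray '\r' argument that broke float() is gone.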

Another option would be to use the dos2unix (or a similar) utility, which converts Windows line endings (\r\n) to Unix line endings (\n).

Alternatively, you could get crazy and always trim off the last character, irrespective of what it is, with a bash parameter expansion like the one described here. But I would consider this the least favorable option.
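A short sketch of why the blind trim is the least favorable choice: `${var%?}` removes whatever the last character happens to be, so a line that already has Unix line endings loses a real digit. The variable names here are illustrative only.

```shell
cr=$(printf '\r')             # a literal carriage return
crlf_line="0.1878$cr"         # value as read from a Windows-style line
lf_line="0.1878"              # same value from a Unix-style line

trimmed_crlf=${crlf_line%?}   # removes the carriage return, as intended
trimmed_lf=${lf_line%?}       # removes the final digit instead
echo "$trimmed_crlf $trimmed_lf"
```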

Other answers

To work within your constraints, I would convert the file from the Windows format to the Unix format with the dos2unix tool.

alvas@ubi:~/git/meanie$ dos2unix /my/file/path.txt

If you gain the ability to edit the Python script, I would open the file in Python using universal newlines mode.

# 'rU' enables universal newlines in Python 2; parse() is a placeholder.
with open('/file/path.txt', 'rU') as f:
    for line in f:
        parse(line)

dos2unix is popular but it's not installed everywhere. A more portable solution would be

tr -d '\r'
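Wired into the loop from the question, the tr approach might look like the sketch below. The temporary sample file stands in for out.tab and the `echo` stands in for `python amgm.py $line`; both are assumptions so that the snippet is self-contained.

```shell
# Build a small CRLF sample (stand-in for out.tab).
sample=$(mktemp)
printf '0.1940  1   0.2398  \r\n0.1749  0.2397  1   \r\n' > "$sample"

# Strip every carriage return before the loop ever sees the data.
result=$(tr -d '\r' < "$sample" | while read -r line; do
  set -- $line                # split into fields, as the unquoted $line would
  echo "$# fields: $*"        # real loop: python amgm.py $line
done)
rm -f "$sample"
echo "$result"
```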

If you need to preserve carriage returns in the middle of lines (unlikely for text files) you could use the following, assuming a sed that understands the \r escape, as GNU sed does:

sed 's/\r$//'

These should both work fine for the simple case where you do not need to distinguish between text and binary files.
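To see the difference between the two commands, this sketch feeds both the same input, which has one carriage return in the middle of the line and one at the end, and counts the bytes that survive. It assumes GNU sed, since the `\r` escape may not be recognized by other sed implementations.

```shell
# Input bytes: a, CR, b, CR, LF (5 bytes in total).
tr_bytes=$(printf 'a\rb\r\n' | tr -d '\r' | wc -c | tr -d ' ')
sed_bytes=$(printf 'a\rb\r\n' | sed 's/\r$//' | wc -c | tr -d ' ')

# tr removes both carriage returns; sed removes only the one before the newline.
echo "tr keeps $tr_bytes bytes, sed keeps $sed_bytes bytes"
```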

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow