Using sed through subprocess.call in Python to perform in-file replacements

https://stackoverflow.com/questions/23274325

09-07-2023

Question

I've got a column in one file that I'd like to replace with a column in another file. I'm trying to use sed to do this within python, but I'm not sure I'm doing it correctly. Maybe the code will make things more clear:

for line in infile1.readlines()[1:]:
    element = re.split("\t", line)
    IID.append(element[1])
    FID.append(element[0])

os.chdir(binary_dir)

for files in os.walk(binary_dir):
    for file in files:
        for name in file:
            if name.endswith(".fam"):
                infile2 = open(name, 'r+')

for line in infile2.readlines():
    parts = re.split(" ", line)
    Part1.append(parts[0])
    Part2.append(parts[1])

for i in range(len(Part2)):
    if Part2[i] in IID:
        regex = '"s/\.*' + Part2[i] + '/' + Part1[i] + ' ' + Part2[i] + '/"' + ' ' + phenotype
        print regex
        subprocess.call(["sed", "-i.orig", regex], shell=True)

The system appears to hang during the sed call: it sits there for quite some time without doing anything. This is what print regex outputs:

"s/\.*131006/201335658-01 131006/" /Users/user1/Desktop/phenotypes2

Thanks for your help, and let me know if you need further clarification!

Solution

You don't need sed if you have Python and the re module. Here is an example of how to use re to replace a given pattern in a string.

>>> import re
>>> line = "abc def ghi"
>>> new_line = re.sub("abc", "123", line)
>>> new_line
'123 def ghi'
>>>

Of course, this is only one way to do it in Python. Since the text being replaced here is a fixed string, str.replace() would probably do the job too.
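As a rough illustration of the str.replace() route, here is a minimal Python 3 sketch. The function name prepend_family_id and the sample line are hypothetical and not from the question; the IDs are the ones shown in the question's print output. It does a plain substring replacement, so it does not reproduce the leading `\.*` part of the original sed pattern.

```python
def prepend_family_id(line, family_id, individual_id):
    """Return `line` with the first `individual_id` replaced by
    "<family_id> <individual_id>" (plain substring match, no regex)."""
    # count=1 limits the change to the first occurrence on the line.
    return line.replace(individual_id, family_id + " " + individual_id, 1)


# Hypothetical sample line; the real phenotype file layout may differ.
sample = "131006\t0.42\n"
updated = prepend_family_id(sample, "201335658-01", "131006")
```

Lines that do not contain the ID are returned unchanged, so the function can be applied to every line of the file.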

Other tips

The first issue is that shell=True is used together with a list argument. Either drop shell=True or pass a single string (the complete shell command) instead:

from subprocess import check_call

check_call(["sed", "-i.orig", regex])

Otherwise, the remaining list items ('-i.orig' and regex) are passed to /bin/sh instead of sed.
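For the other option, a single command string with shell=True, a sketch along these lines may help (Python 3). The helper name build_shell_command is hypothetical; shlex.quote is used so the shell sees the sed script as one word. The script and path are the ones printed in the question.

```python
import shlex


def build_shell_command(sed_script, path, backup_suffix=".orig"):
    """Build one shell command string that runs sed in place on `path`."""
    # shlex.quote protects spaces, backslashes and '*' from the shell.
    return "sed -i{} {} {}".format(
        backup_suffix, shlex.quote(sed_script), shlex.quote(path)
    )


command = build_shell_command(
    r"s/\.*131006/201335658-01 131006/",
    "/Users/user1/Desktop/phenotypes2",
)
# To run it: subprocess.check_call(command, shell=True)
```

The list form without shell=True is usually simpler, because no quoting is needed at all.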

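The second issue is that sed is given no input file, so it waits for data on stdin, which is why it appears to hang. The file name must be passed as its own argument; in the question it is embedded in the same string as the sed script.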
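Putting both points together, the argument list could be built roughly as follows (Python 3). The helper name sed_argv is hypothetical, and the sketch assumes that neither the pattern nor the replacement contains a "/" that would need escaping.

```python
def sed_argv(pattern, replacement, path, backup_suffix=".orig"):
    """Return an argument list for an in-place sed substitution.

    The sed script and the file name are separate list items, and no
    shell-style quotes are placed around the script.
    """
    script = "s/{}/{}/".format(pattern, replacement)
    return ["sed", "-i" + backup_suffix, script, path]


argv = sed_argv(
    r"\.*131006",
    "201335658-01 131006",
    "/Users/user1/Desktop/phenotypes2",
)
# To run it: subprocess.check_call(argv)   (no shell=True)
```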

If you want to change files in place, you could use the fileinput module:

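#!/usr/bin/env python
import fileinput
import re

# If files is None, input comes from sys.argv[1:] (or stdin if that is empty)
files = ['/Users/user1/Desktop/phenotypes2']
# backup='.orig' keeps a copy of each original file, like sed -i.orig
for line in fileinput.input(files, backup='.orig', inplace=True):
    print re.sub(r'\.*131006', '201335658-01 131006', line),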

With inplace=True, fileinput.input() redirects stdout to the file currently being processed, so whatever print writes becomes the new content of that file.

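The trailing comma (Python 2 print statement) sets sys.stdout.softspace and stops print from appending its own newline; each line already ends with one, so this avoids duplicate newlines.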
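On Python 3, where print is a function, the same loop might look like the sketch below. The wrapper function replace_in_place is a hypothetical name added for illustration; the fileinput.input() arguments are the same ones used above, and end="" takes the place of the trailing comma.

```python
import fileinput
import re


def replace_in_place(path, pattern, replacement, backup_suffix=".orig"):
    """Rewrite `path` in place, keeping a copy at path + backup_suffix."""
    regex = re.compile(pattern)
    # With inplace=True, stdout is redirected into the file being read,
    # so print() writes the new contents of that file.
    with fileinput.input(files=[path], inplace=True, backup=backup_suffix) as lines:
        for line in lines:
            # end="" because each line already carries its own newline.
            print(regex.sub(replacement, line), end="")
```

A call such as replace_in_place('/Users/user1/Desktop/phenotypes2', r'\.*131006', '201335658-01 131006') would then mirror the sed command from the question.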

License: CC-BY-SA with attribution

Not affiliated with StackOverflow