Python - ignore lines in a file

https://stackoverflow.com/questions/1931527

20-09-2019

Question

How does one ignore lines in a file?

Example:

If you know that the first lines in a file begin with, say, a or b and the remaining lines end with c, how does one parse the file so that lines beginning with a or b are ignored and lines ending with c are converted to a nested list?

What I have so far:

fname = raw_input('Enter file name: ')

z = open(fname, 'r')

#I tried this but it converts all lines to a nested list

z_list = [i.strip().split() for i in z]

I am guessing that I need a for loop.

for line in z:
    if line[0] == 'a':
        pass
    if line[0] == 'b':
        pass
    if line[-1] == 'c':
        list_1 = [line.strip().split()]

The above is the general idea but I am an expert at making dead code! How does one render it undead?

Thanks, Seafoid.

Solution

startswith can take a tuple of strings to match, so you can do this:

[line.strip().split() for line in z if not line.startswith(('a', 'b'))]

This works even if a and b are words or sentences, not just single characters. If there may be lines that don't start with a or b but also don't end with c, you can exclude them by extending the list comprehension. Lines read from a file keep their trailing newline, so strip it before testing the ending:

[
    line.strip().split()
    for line in z
    if line.rstrip().endswith('c') and not line.startswith(('a', 'b'))
]
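
To try the comprehension without a file on disk, here is a minimal Python 3 sketch. The sample lines are made up for illustration, and io.StringIO stands in for the open file object z from the question.

```python
import io

# io.StringIO stands in for the open file object `z`; the sample lines are invented.
z = io.StringIO(
    "a header line\n"
    "b another header\n"
    "1 2 c\n"
    "3 4 x\n"
    "5 6 c\n"
)

skip_prefixes = ("a", "b")  # str.startswith accepts a tuple of prefixes

rows = [
    line.split()
    for line in z
    # rstrip() drops the trailing newline so endswith() sees the last real character
    if line.rstrip().endswith("c") and not line.startswith(skip_prefixes)
]

print(rows)
```

Only the two lines that end with c and do not start with a or b survive; the line ending in x is dropped by the endswith test.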

OTHER TIPS

One very general approach is to "filter" the file by removing some lines:

import itertools
# Python 2 only: in Python 3, itertools.ifilter is gone and the built-in filter() is lazy
zlist = [l.strip().split() for l in itertools.ifilter(lambda line: line[0] not in 'ab', z)]

You can use itertools.ifilter any time you want to "selectively filter" an iterable, getting another iterable that contains only the items satisfying some predicate, which is why I say this approach is very general. itertools has a lot of great, fast tools for dealing with iterables in myriad ways, and is well worth studying.

A similar but syntactically simpler approach, which suffices in your case (and which therefore I would recommend due to the virtue of simplicity), is to do the "filtering" with an if clause in the listcomp:

zlist = [l.strip().split() for l in z if l[0] not in 'ab']
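
For readers on Python 3, where itertools.ifilter no longer exists, the same two approaches can be sketched as below. The sample text and the is_header helper are illustrative assumptions, not part of the original answer; itertools.filterfalse is used because the predicate here describes the lines to drop.

```python
import io
from itertools import filterfalse

# Invented sample data; io.StringIO plays the role of the open file.
text = "a skip me\nb skip me too\n1 2 c\n3 4 c\n"


def is_header(line):
    """Hypothetical predicate: True for lines whose first character is 'a' or 'b'."""
    return line[:1] in ("a", "b")


# filterfalse keeps the items for which the predicate is false
via_itertools = [l.strip().split() for l in filterfalse(is_header, io.StringIO(text))]

# the same result with an if clause in the list comprehension
via_listcomp = [l.strip().split() for l in io.StringIO(text) if not is_header(l)]

print(via_itertools == via_listcomp)
```

Both forms produce the same nested list; the comprehension is shorter, while the itertools version stays lazy until the list is built.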

You can add if conditions to list comprehensions.

z_list = [i.strip().split() for i in z if i.rstrip().endswith('c')]

z_list = [i.strip().split() for i in z if i[0] != 'a' and i[0] != 'b']

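One way to do it is to replace 'pass' with 'continue', which skips to the next line in the file without doing anything. You will also need to create list_1 before the loop and append each parsed line to it, stripping the trailing newline first so that the check for a final c can succeed:

if line.rstrip().endswith('c'):
    list_1.append(line.strip().split())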
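
Put together, the loop might look like the following Python 3 sketch. The sample lines are invented, and io.StringIO stands in for the file object z; the blank-line check is an added precaution, not something the question asked for.

```python
import io

# Invented sample data standing in for the open file `z`.
z = io.StringIO("a header\nb header\n1 2 c\n\n3 4 c\n")

list_1 = []  # must exist before the loop so append() has a list to add to
for line in z:
    stripped = line.strip()
    if not stripped:
        continue  # blank line: nothing to parse
    if stripped.startswith(("a", "b")):
        continue  # jump to the next line instead of falling through
    if stripped.endswith("c"):
        list_1.append(stripped.split())

print(list_1)
```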

with open("file") as f:
    for line in f:
        li = line.strip()
        # skip blank lines; keep lines that end in "c" and do not start with "a" or "b"
        if li and li[0] not in ("a", "b") and li[-1] == "c":
            print(line.rstrip())

For those interested in the solution.

And also, another question!

Example file format:

c this is a comment
p m 1468 1 267
260 32 0
8 1 0

Code:

fname = raw_input('Please enter the name of file: ')

z = open(fname, 'r')

required_list = [line.strip().split() for line in z if not line.startswith(('c', 'p'))]

print required_list

Output:

[['260', '32', '0'], ['8', '1', '0']]

Any suggestions on how to convert the strings in the lists to integers and perform arithmetic operations?

Pseudocode to illustrate:

#for the second item in each sublist
     #if sum is greater than the first number in the second line of the file
         #pass
     #else
         #abort/raise error
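
One possible reading of the pseudocode above, as a Python 3 sketch. It rests on several assumptions: "sum" is taken to mean the sum of the second item across all data lines, "first number in second line" is taken to be 1468 from the p line, and the check helper is hypothetical. io.StringIO stands in for the example file.

```python
import io

# Stand-in for the example file shown above.
sample = io.StringIO(
    "c this is a comment\n"
    "p m 1468 1 267\n"
    "260 32 0\n"
    "8 1 0\n"
)

header = None  # integers found on the "p" line
rows = []      # data lines converted to lists of int

for line in sample:
    if not line.strip() or line.startswith("c"):
        continue  # blank line or comment
    fields = line.split()
    if line.startswith("p"):
        # keep only the numeric fields: "p m 1468 1 267" -> [1468, 1, 267]
        header = [int(tok) for tok in fields if tok.isdigit()]
        continue
    rows.append([int(tok) for tok in fields])

second_item_sum = sum(row[1] for row in rows)  # assumed meaning of "sum"
threshold = header[0]                          # assumed "first number in second line"


def check(total, limit):
    """Hypothetical helper: raise ValueError unless total is greater than limit."""
    if total <= limit:
        raise ValueError("sum %d does not exceed %d" % (total, limit))


try:
    check(second_item_sum, threshold)
except ValueError as err:
    print("check failed:", err)
```

Calling int() on each token is what turns the strings into numbers; once the sublists hold integers, sum() and ordinary comparisons work directly. If the intended comparison is different, only the two assignments before check need to change.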

Cheers folks for your suggestions so far, Seafoid.

@Nadia, my day seems a little more worthwhile now! I spent hours (days even) trying to crack this solo! Thanks!

Licensed under: CC-BY-SA with attribution
