Question

I have downloaded a large file containing movie genres from IMDb. The file is so big that my computer crashes if I try to print out its entire contents.

I need to extract the genres for some of the movies. To do that, I made a list in Python called movie (it is referred to as names in the code below).

This list contains the movie names, including the year, in the same string. Here is an example:

['The Shawshank Redemption (1994)\n',
 'The Godfather (1972)\n',
 'The Godfather: Part II (1974)\n',
 'The Dark Knight (2008)\n',
 'Pulp Fiction (1994)\n',
 ...]

I need some for loops that, for every line in the big file, check whether one of the movie names from my list appears in that line and, if it does, append the line to a new list called genrelist.

The result would be a new list containing each movie name together with its genre.

So far I have tried:

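filegenre = open("GenreMod.list", "r")
lines = filegenre.readlines()  # reads the whole file into memory
genrelist = []

for line in lines:
    for item in names:
        # keep every line that contains one of the movie names
        if item in line:
            genrelist.append(line)

print genrelist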

But this only finds the last name in the list names. For example, if I search with the list pasted above, I only get the lines containing 'Pulp Fiction (1994)' and none of the others.

Have I made a mistake in the code?


Solution

You need to keep the output file open. As written, your file only receives the last loop iteration.

with open("genrelist.ext", "w") as outfile:
    pass  # do stuff

Never mind, you are creating a list, not a file. Poor Python skills here.
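
Since the question builds a list rather than a file, a more likely explanation is the trailing "\n" on the names. If names was read from a file, every entry except possibly the last ends in a newline, and such an entry can only match when the title sits at the very end of a line. The sketch below illustrates this with made-up data; the real layout of GenreMod.list is not shown in the question, so the tab-separated lines and the helper find_genres are assumptions.

```python
# Hypothetical sample data: title, tabs, genre is an assumed layout.
names = ['The Godfather (1972)\n', 'Pulp Fiction (1994)']  # last entry lacks "\n"
lines = ['The Godfather (1972)\t\t\tCrime\n',
         'Pulp Fiction (1994)\t\t\tCrime\n',
         'Toy Story (1995)\t\t\tAnimation\n']


def find_genres(names, lines):
    """Return the lines that contain any of the given titles."""
    titles = [name.strip() for name in names]  # drop "\n" and stray spaces
    genrelist = []
    for line in lines:
        for title in titles:
            if title in line:
                genrelist.append(line.strip())
                break  # do not add the same line twice
    return genrelist


# Same test as in the question: only the entry without "\n" matches.
unstripped = [line for line in lines for item in names if item in line]
# With the names stripped first, both wanted titles are found.
stripped = find_genres(names, lines)

print(unstripped)
print(stripped)
```

If this is indeed the cause, stripping each name once before the loops should be enough to make the original nested loops work.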

OTHER TIPS

You can use a list comprehension.

item = "Pulp Fiction"
with open("GenreMod.list", "r") as filegenre:
  print [line.strip() for line in filegenre if item in line]

If a file is opened in text mode (as it is here), iterating over the file object yields its contents one line at a time.

The list comprehension loops over these lines, keeps only those that contain item, and stores each kept line in the resulting list with its surrounding whitespace stripped (which removes the trailing "\n").
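
The comprehension above checks a single title. A possible way to extend it to a whole list of titles is to combine it with any(), as in the sketch below. The function name genres_for, the temporary sample file, and its tab-separated layout are assumptions made for illustration; the titles are stripped first because the ones in the question end with "\n".

```python
import os
import tempfile


def genres_for(titles, path):
    """Collect the lines of `path` that mention any title in `titles`."""
    wanted = [t.strip() for t in titles]  # titles in the question end with "\n"
    with open(path, "r") as filegenre:
        # The file is read lazily, one line at a time,
        # so it never has to fit in memory all at once.
        return [line.strip() for line in filegenre
                if any(title in line for title in wanted)]


# Hypothetical stand-in for GenreMod.list (assumed layout: title, tabs, genre).
sample = ("The Dark Knight (2008)\t\tAction\n"
          "Toy Story (1995)\t\tAnimation\n"
          "Pulp Fiction (1994)\t\tCrime\n")
with tempfile.NamedTemporaryFile("w", suffix=".list", delete=False) as tmp:
    tmp.write(sample)

result = genres_for(['The Dark Knight (2008)\n', 'Pulp Fiction (1994)\n'], tmp.name)
os.remove(tmp.name)  # clean up the temporary file
print(result)
```

Because the file is never loaded with readlines(), this approach should also cope better with a very large input file.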

Licensed under: CC-BY-SA with attribution