Question
I am very new to Python. In a Python script I need to check whether an input string is present in the set 'titles', which I load from newline-separated strings in the file 'titles'. This consumes a huge amount of memory. I chose a set because there is an if inputstring in titles: check later on.
Line #    Mem usage    Increment   Line Contents
================================================
     1    6.160 MiB    0.000 MiB   @profile
     2                             def loadtitles():
     3  515.387 MiB  509.227 MiB       titles = open('titles').read().split()
     4  602.555 MiB   87.168 MiB       titles = set(titles)
Q1. Is there any other object type that stores this large data more memory-efficiently?
One solution I can come up with is to load the file as a single string; that consumes exactly as much memory as the file size, which is 100% optimal memory consumption.
Line #    Mem usage    Increment   Line Contents
================================================
     1    6.160 MiB    0.000 MiB   @profile
     2                             def loadtitles():
     3  217.363 MiB  211.203 MiB       titles = open('titles').read()
Then I can do if inputstring + '\n' in titles:
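A minimal sketch of that approach (variable names from the question; has_title is a hypothetical helper, and the leading newline guard is an addition, needed so a short query cannot match the tail of a longer title):

# Pad the haystack once so the first and last lines are also newline-delimited.
titles = '\n' + open('titles').read() + '\n'

def has_title(inputstring):
    # '\n' on both sides prevents 'Foo' from matching inside 'BarFoo'.
    return ('\n' + inputstring + '\n') in titles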
Q2. Is there a faster alternative to this?
Solution
You can either:
- use a key/value store if you look up lots of keys.
- iterate over the file line by line and check each key's existence if there are only a few keys to look up.
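A minimal sketch of both options using only the standard library (the database file name titles.db and the function names are assumptions, not from the question):

import dbm

def build_title_db(titles_path='titles', db_path='titles.db'):
    # One-time step: copy every title into an on-disk key/value store,
    # so later lookups never load the whole file into memory.
    with dbm.open(db_path, 'n') as db, open(titles_path) as f:
        for line in f:
            title = line.strip()
            if title:
                db[title] = '1'  # the value is irrelevant; only the key matters

def title_in_db(inputstring, db_path='titles.db'):
    # Many lookups: each one is a cheap on-disk hash probe.
    with dbm.open(db_path, 'r') as db:
        return inputstring in db

def title_in_file(inputstring, titles_path='titles'):
    # Few lookups: scan line by line with near-zero memory use.
    with open(titles_path) as f:
        return any(line.rstrip('\n') == inputstring for line in f)

Opening the database on every call is only for brevity; with many lookups you would open it once and reuse the handle.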
Other tips
Iterating over the file (processing it line by line) instead of reading its full contents will reduce memory consumption (combined with a set comprehension):
def loadtitles():
    with open('titles') as f:
        # Build the set one line at a time; the full file is never in memory at once.
        return {word for line in f for word in line.split()}
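A usage sketch (the lookup is the same membership test as in the question; 'some title' is a placeholder):

titles = loadtitles()
# Set membership is an average-case O(1) hash lookup, so repeated
# checks stay fast no matter how many titles were loaded.
if 'some title' in titles:
    print('found')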