Question

I have an issue that I am hoping you can help me with. I am writing a program that pulls some strings from HTML pages and adds them to a list. I am pulling data from 50-some pages. When I run the program, it takes between 45 and 55 seconds to gather the data. Not bad, but I need it to take somewhere on the order of 15-20 seconds.

So here is my question: my computer has an 800MHz processor (yeah, I know, it's four years old) and I am about to get a new computer. Will a faster processor help with this? If so, what processor speed should I look for to reach my target time? Is the running time more related to processor speed or connection speed (my internet connection is definitely fast enough for this application)? Can it be sped up?

Thanks!

Addition:

Here is the code used.

This function creates the list of lists that stores the data:

def makesobjlist(objs, length):
    # Row 0 holds the list of objects; each later row is [obj, 0, 0, ...].
    sets = [objs]
    for obj in objs:
        # One object followed by `length` zero placeholders.
        sets.append([obj] + [0] * length)
    return sets

The following function then updates the list of lists:

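def update(objslist):
    # Skip row 0, which holds the list of objects rather than data.
    for i in range(1, len(objslist)):
        objlist = objslist[i]
        # getdata() (not shown) returns the newest value for this object.
        objlist.append(getdata(objlist[0]))
        # Drop the oldest value so the row keeps a fixed length.
        del(objlist[1])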

Solution

Python supports threading, multiple processes and queues.

You may gain some speed by having multiple workers do the job instead of a single worker that has to wait for each page in turn. Basically, you divide the work among several workers (threads or processes), each handling some of the pages. If most of the time is spent waiting for responses, then while one worker waits the others keep going, which is usually much faster than fetching the pages one after another.
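As a rough illustration of the idea, here is a minimal sketch using a thread pool from the standard library's concurrent.futures module. It assumes the time goes into waiting on the network rather than into parsing. The getdata() below is only a hypothetical stand-in (it sleeps instead of fetching a page), and fetch_all() and the max_workers value are names and numbers chosen for the example, not taken from the question.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def getdata(obj):
    """Hypothetical stand-in for the question's getdata(): pretends to fetch a page."""
    time.sleep(0.05)  # simulates time spent waiting on the network
    return len(obj)   # placeholder for the scraped value


def fetch_all(objs, max_workers=10):
    """Call getdata() for every obj concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(getdata, objs))


def update(objslist, max_workers=10):
    """Same effect as the question's update(), but the fetches overlap in time."""
    rows = objslist[1:]  # row 0 is the list of objects, not data
    values = fetch_all([row[0] for row in rows], max_workers)
    for row, value in zip(rows, values):
        row.append(value)  # newest value goes on the end
        del row[1]         # oldest value (right after the object) is dropped
```

With around 50 pages and 10 workers, the waiting overlaps, so the total time should be closer to the time of the slowest few requests than to the sum of all of them. How much you actually gain depends on the server and the connection, so it is worth timing before and after.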

Similar posts:

Threading in python using queue

Multiprocessing vs Threading Python

Other tips

del(objlist[1])

If objlist can be long (more than a few dozen items), this line scales badly: deleting the item at index 1 shifts every later item down by one position, so it takes time proportional to the length of the list. You should refactor the code to avoid that. For example, you could arrange for the item to remove to be the last item of the list instead of the item at index 1; del objlist[-1] is always a constant-time operation.
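Another way to get constant-time removal, sketched here under the assumption that each row is just a fixed-size window of the most recent values, is collections.deque with maxlen. This swaps the list-of-lists layout for a dict of deques, which is a different structure from the one in the question; make_history() and record() are hypothetical names used only for this example.

```python
from collections import deque


def make_history(objs, length):
    """Map each obj to a fixed-size window holding its last `length` values."""
    return {obj: deque([0] * length, maxlen=length) for obj in objs}


def record(history, obj, value):
    """Append the newest value; a full deque drops its oldest item automatically."""
    history[obj].append(value)


history = make_history(["a", "b"], 3)
record(history, "a", 7)
record(history, "a", 9)
print(list(history["a"]))  # [0, 7, 9]
```

Because the object is the dictionary key, no item has to sit at index 0 of the row, and both the append and the implicit removal of the oldest value are constant-time.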

Licensed under: CC-BY-SA with attribution