Question

I'm trying to automate a tedious task: updating the statuses of our change requests (CRs/bugs). Since the database schema is extremely complex, I've resorted to issuing a curl command to the web server to download a dump of the CRs along with their associated statuses. I was doing this with an os.system() call, but I've decided to make it more Pythonic and use pycurl. The problem, I think, is that when I write the downloaded CSV to disk, the file is not yet complete when I go to access it (right after the c.perform()). I'm led to believe this because the error is a list index out of range, yet when I open the file myself all the data appears to be there. Here's the code snippet (inside the find_bugs method I split each line and index into the relevant column of each row; that's where the list index comes in):

import pycurl

f = open(cr_file, 'w+')

c = pycurl.Curl()
c.setopt(c.URL, csv_url)
c.setopt(c.WRITEFUNCTION, f.write)
c.setopt(c.HTTPHEADER, headers)
c.perform()

with open(cr_file, 'r') as f:
    ids = find_bugs(f.readlines())

Question: How do I write to disk with pycurl when I need to access the file immediately after the download completes?


Solution

Until the first file object is flushed or closed, its buffered content may not yet have been written to the file on disk:

>>> f = open('text.csv', 'w+')
>>> f.write('asdf')
>>>
>>> f2 = open('text.csv', 'r')
>>> f2.read()
''
>>> f2.close()

After the first file object is closed:

>>> f.close()
>>> f2 = open('text.csv', 'r')
>>> f2.read()
'asdf'
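
The same applies to flushing, which is mentioned above but not shown. A minimal sketch with the same toy file: an explicit flush() makes the buffered data visible to a second reader without closing the writer.

>>> f = open('text.csv', 'w+')
>>> f.write('asdf')
>>> f.flush()
>>> f2 = open('text.csv', 'r')
>>> f2.read()
'asdf'
>>> f2.close()
>>> f.close()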

Because you open the file in w+ mode (read and write), you can seek back to the start and reuse the same file object to read the content:

import pycurl

with open(cr_file, 'w+') as f:
    c = pycurl.Curl()
    c.setopt(c.URL, csv_url)
    c.setopt(c.WRITEFUNCTION, f.write)
    c.setopt(c.HTTPHEADER, headers)
    c.perform()
    c.close()

    # Rewind to the beginning so the data just written can be read back
    # through the same file object; the with block closes the file afterwards.
    f.seek(0)
    ids = find_bugs(f.readlines())
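
If you don't actually need the CSV on disk, a variant is to buffer the response in memory with io.BytesIO. This is a sketch reusing csv_url, headers, and find_bugs from the question; the utf-8 encoding is an assumption about the server's output.

import io
import pycurl

# Collect the response body in an in-memory buffer; pycurl hands the
# WRITEDATA object raw bytes via its write() method.
buf = io.BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, csv_url)
c.setopt(c.WRITEDATA, buf)
c.setopt(c.HTTPHEADER, headers)
c.perform()
c.close()

# Decode the bytes (utf-8 assumed) and split into lines, keeping line
# endings so the input matches what f.readlines() would produce.
ids = find_bugs(buf.getvalue().decode('utf-8').splitlines(True))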