Question

I'm using the below code to check server response codes. Instead of manually entering the URLs, I'd like python to check a CSV (data.csv) and then export the results to a new CSV (new_data.csv). Does anyone know how to write this?

Thanks for your time!

import urllib2
for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()

# Prints: 200 for a reachable page, or the error code (e.g. 404)

UPDATE:

import csv

out=open("urls.csv","rb")
data=csv.reader(out)
data=[row for row in data]
out.close()

print data

import urllib2
for url in ["http://stackoverflow.com/", "http://stackoverflow.com/questions/"]:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()

OUTPUT:

[['link'], ['link'], ['link'], ['link'], ['link'], ['link']]

200

200

UPDATE:

import csv

with open("urls.csv", 'r') as csvfile:
    urls = [row[0] for row in csv.reader(csvfile)]

import urllib2
for url in urls:
    try:
        connection = urllib2.urlopen(url)
        print connection.getcode()
        connection.close()
    except urllib2.HTTPError, e:
        print e.getcode()

Solution

I think you have your clue in your print data output: [['link'], ['link'], ['link'], ['link'], ['link'], ['link']]. The line data=[row for row in data] gives you a list of lists, one single-element list per CSV row. That is why you cannot simply write for url in data: each item is a row (a list), not a URL string. Take the first column of each row instead, as your final update does with row[0].
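To make the list-of-lists behaviour concrete, here is a minimal Python 3 sketch (the question's code is Python 2, but csv.reader behaves the same way); the sample URLs are placeholders:

```python
import csv
import io

# csv.reader yields one list per row, even when each row has a single
# column -- so iterating the rows directly gives lists, not URL strings.
sample = io.StringIO("http://stackoverflow.com/\nhttp://stackoverflow.com/questions/\n")
rows = [row for row in csv.reader(sample)]
print(rows)   # a list of single-element lists, like the [['link'], ...] output

# Taking the first column of each row gives a flat list of URL strings.
urls = [row[0] for row in rows]
print(urls)
```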

BTW, you will find the whole thing less confusing if you put some thought into naming: reading input from a file handle called 'out', and reassigning data to a new value derived from data, both make the code harder to follow.
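Putting the pieces together, a Python 3 sketch of the full task (read URLs from data.csv, check each status code, write the results to new_data.csv, the filenames from the question) might look like this. The function names, the opener parameter, and the one-URL-per-row input layout are my assumptions, not part of the original code:

```python
import csv
import urllib.request
import urllib.error

def check_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url; an HTTPError carries a code too."""
    try:
        with opener(url) as connection:
            return connection.getcode()
    except urllib.error.HTTPError as e:
        return e.code

def check_csv(in_path="data.csv", out_path="new_data.csv",
              opener=urllib.request.urlopen):
    # Read one URL per row from the first column of the input CSV,
    # skipping any blank rows.
    with open(in_path, newline="") as infile:
        urls = [row[0] for row in csv.reader(infile) if row]
    # Write each URL alongside its status code to the output CSV.
    with open(out_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["url", "status"])
        for url in urls:
            writer.writerow([url, check_status(url, opener)])

if __name__ == "__main__":
    check_csv()
```

The opener parameter is only there so the network call can be swapped out; the defaults reproduce the question's urllib2.urlopen behaviour in Python 3.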

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow