In Python, how do I use urllib to see if a website returns 404 or 200?

https://stackoverflow.com/questions/1726402

19-09-2019
Question

How do I get the HTTP status code of a response using urllib?

Solution

The getcode() method (added in Python 2.6) returns the HTTP status code that was sent with the response, or None if the URL is not an HTTP URL.

>>> import urllib
>>> a=urllib.urlopen('http://www.google.com/asdfsf')
>>> a.getcode()
404
>>> a=urllib.urlopen('http://www.google.com/')
>>> a.getcode()
200

Other tips

You can use urllib2 as well:

import urllib2

req = urllib2.Request('http://www.python.org/fish.html')
try:
    resp = urllib2.urlopen(req)
except urllib2.HTTPError as e:
    if e.code == 404:
        pass  # do something...
    else:
        pass  # ...
except urllib2.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    pass  # ...
else:
    # 200
    body = resp.read()

Note that HTTPError is a subclass of URLError that stores the HTTP status code.
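
Because of this inheritance, the order of the except clauses matters: an except URLError clause placed first would also catch every HTTPError. The following is a minimal Python 3 sketch of that relationship, where the same classes live in urllib.error rather than urllib2. The helper name classify is hypothetical and exists only for illustration.

```python
import urllib.error

# HTTPError derives from URLError, so this prints True.
print(issubclass(urllib.error.HTTPError, urllib.error.URLError))


def classify(exc):
    """Label an exception the way an except chain would, most specific class first.

    classify is a hypothetical helper used only for illustration.
    """
    if isinstance(exc, urllib.error.HTTPError):
        return 'http'   # the server answered; exc.code holds the status
    if isinstance(exc, urllib.error.URLError):
        return 'url'    # no HTTP response; exc.reason holds the cause
    return 'other'
```

Swapping the two isinstance checks would label every HTTPError as 'url', which is the same mistake as listing except URLError before except HTTPError.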

For Python 3:

import urllib.request, urllib.error

url = 'http://www.google.com/asdfsf'
try:
    conn = urllib.request.urlopen(url)
except urllib.error.HTTPError as e:
    # Return code error (e.g. 404, 501, ...)
    # ...
    print('HTTPError: {}'.format(e.code))
except urllib.error.URLError as e:
    # Not an HTTP-specific error (e.g. connection refused)
    # ...
    print('URLError: {}'.format(e.reason))
else:
    # 200
    # ...
    print('good')
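
For reuse, the Python 3 pattern above can be folded into a small function that returns the status code in both the success and the error case. This is a minimal sketch, not a definitive implementation: the name get_status_code and the 10-second default timeout are assumptions chosen for illustration, and exceptions outside urllib.error (for example a timeout while waiting for the response) are deliberately left unhandled.

```python
import urllib.error
import urllib.request


def get_status_code(url, timeout=10):
    """Return the HTTP status code for url, or None if no HTTP response arrived.

    get_status_code is a hypothetical helper name, not part of urllib.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.getcode()  # success, e.g. 200
    except urllib.error.HTTPError as e:
        return e.code  # error status sent by the server, e.g. 404 or 501
    except urllib.error.URLError:
        return None  # not HTTP-specific, e.g. connection refused
```

Under these assumptions, get_status_code(url) with the url from the example above should return 404 if the server responds as it did there.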

import urllib2

try:
    fileHandle = urllib2.urlopen('http://www.python.org/fish.html')
    data = fileHandle.read()
    fileHandle.close()
except urllib2.URLError as e:
    print 'you got an error with the code', e

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow