What does this function do in Python involving urllib2 and BeautifulSoup?

https://stackoverflow.com/questions/991967

13-09-2019

Question

I asked a question earlier about retrieving high scores from an HTML page, and another user gave me the following code to help. I am new to Python and BeautifulSoup, so I am trying to go through some other code piece by piece. I understand most of it, but I don't get what this piece is or what its function is:

    def parse_string(el):
       text = ''.join(el.findAll(text=True))
       return text.strip()

Here is the whole code:

    from urllib2 import urlopen
    from BeautifulSoup import BeautifulSoup
    import sys

    URL = "http://hiscore.runescape.com/hiscorepersonal.ws?user1=" + sys.argv[1]

    # Grab page html, create BeautifulSoup object
    html = urlopen(URL).read()
    soup = BeautifulSoup(html)

    # Grab the <table id="mini_player"> element
    scores = soup.find('table', {'id':'mini_player'})

    # Get a list of all the <tr>s in the table, skip the header row
    rows = scores.findAll('tr')[1:]

    # Helper function to return concatenation of all character data in an element
    def parse_string(el):
       text = ''.join(el.findAll(text=True))
       return text.strip()

    for row in rows:

       # Get all the text from the <td>s
       data = map(parse_string, row.findAll('td'))

       # Skip the first td, which is an image
       data = data[1:]

       # Do something with the data...
       print data

Solution

el.findAll(text=True) returns all the text contained within an element and its sub-elements. By text I mean everything that is not inside a tag; so in <b>hello</b>, "hello" would be the text but <b> and </b> would not.

The function therefore joins together all the text found beneath the given element and strips whitespace from the front and back.
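
To make that behaviour concrete, here is a minimal sketch of the same idea using only Python 3's standard-library html.parser instead of BeautifulSoup. The names TextCollector and parse_string_stdlib are hypothetical, introduced purely for illustration, and the sample table cell is made up; this is not how BeautifulSoup is implemented, only an approximation of what "join every text node, then strip" means.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: records every text node and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of character data found between tags.
        self.chunks.append(data)


def parse_string_stdlib(html_fragment):
    """Rough stand-in for parse_string: join all text nodes, then strip."""
    collector = TextCollector()
    collector.feed(html_fragment)
    collector.close()
    text = ''.join(collector.chunks)
    return text.strip()


# Made-up table cell with nested tags and stray whitespace.
cell = '<td>  <a href="#">Attack</a> <b>99</b>\n</td>'
print(repr(parse_string_stdlib(cell)))  # 'Attack 99'
```

The tags contribute nothing to the result. Only the character data between them is kept, and the whitespace at the very start and end is trimmed, while the single space between the two inner elements survives.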

Here is a link to the findAll documentation: http://www.crummy.com/software/BeautifulSoup/documentation.html#arg-text

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow