Try moving your second loop and your return line out of the first loop, so that no redundant iteration happens and the final list is properly returned. Something like the following:
from lxml import html
import requests as rq

def first_page_links(links):
    recipe_links = []
    recipe_html = []
    # First pass: fetch and parse every page.
    for link in links:
        r = rq.get(link)
        recipe_html.append(html.fromstring(r.text))
    # Second pass: runs once, after all pages have been parsed.
    for rhtml in recipe_html:
        recipe_links.append(rhtml.xpath('//*[@id="content"]/ul/li/a/@href'))
    # Return only after both loops have finished.
    return recipe_links
Let us know if this works.
EDIT:
Consider the following:
y_list = []
final_list = []
for x in x_list:
    y_list.append(x)
    for y in y_list:
        final_list.append(y)
This is your function, simplified. Assuming x_list holds 3 URLs, here is what happens:
1. x1 is appended to y_list.
2. y_list is processed with only x1 so far, so x1 alone is appended to final_list.
3. final_list now contains: [x1].
4. x2 is appended to y_list.
5. y_list now contains x1 and x2. Both are processed and appended to final_list.
6. final_list now contains: [x1, x1, x2].
7. x3 is appended to y_list.
8. y_list now contains x1, x2, and x3.
9. See where this is going? :)
Since your second loop, which processes the items in the first list, is inside the first loop, which adds incrementally to the first list, the second loop will process your first list on every iteration of the first loop. This makes it a redundant iteration.
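If you want to see the difference for yourself, here is a small sketch that runs both shapes of the loop on three placeholder strings. The function names `nested_version` and `sequential_version` are made up for this illustration, and the strings only stand in for URLs; nothing is fetched.

```python
def nested_version(x_list):
    """Mirror the original shape: the second loop sits inside the first."""
    y_list = []
    final_list = []
    for x in x_list:
        y_list.append(x)
        for y in y_list:  # re-walks everything collected so far
            final_list.append(y)
    return final_list


def sequential_version(x_list):
    """The fix: the second loop runs once, after the first has finished."""
    y_list = []
    final_list = []
    for x in x_list:
        y_list.append(x)
    for y in y_list:
        final_list.append(y)
    return final_list


urls = ["x1", "x2", "x3"]  # placeholder strings standing in for URLs
print(nested_version(urls))      # ['x1', 'x1', 'x2', 'x1', 'x2', 'x3']
print(sequential_version(urls))  # ['x1', 'x2', 'x3']
```

With three inputs the nested shape appends 1 + 2 + 3 = 6 items instead of 3, and the gap grows with every extra URL.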
There are many ways to do what you wanted, but if you're just appending to lists and need a single pass over each, the fix above is all that's needed.
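As one example of those other ways, the intermediate list can be dropped entirely by handling each item as soon as it is seen. This is only a sketch on the simplified form: `single_pass` and its `process` parameter are hypothetical names, with `process` standing in for whatever you do per item (in your case, fetching the page, parsing it, and running the XPath query).

```python
def single_pass(x_list, process):
    """Apply `process` to each item exactly once, in one loop."""
    final_list = []
    for x in x_list:
        final_list.append(process(x))
    return final_list


# Stand-in for the real per-item work; here it just tags the string.
def fake_process(x):
    return "links from " + x


print(single_pass(["x1", "x2", "x3"], fake_process))
# ['links from x1', 'links from x2', 'links from x3']
```

Whether this reads better than two separate loops is a matter of taste; the result has one entry per input either way.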