I'd do it something like:

    require 'cgi'
    require 'nokogiri'
    require 'open-uri'

    def main
      @job_section_url = find_job_section_url()
      write_header_to_file()

      groups_urls_array = describe_groups(@job_section_url)

      subcategories_urls_array = []
      groups_urls_array.each do |group_url|
        subcategories_urls_array << describe_groups(group_url)
      end

      subcategories_urls_array.flatten
    end

    # Parse a page into an array of groups URLs.
    def describe_groups(job_section_url)
      # @looking_for_a_job_string = '%D0%98%D1%89%D1%83+%D1%80%D0%B0%D0%B1%D0%BE%D1%82%D1%83'
      # URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.
      doc = Nokogiri::HTML(URI.open(job_section_url, 'Cookie' => 'city=3'))

      doc.css('.group_title')[0..-2].map { |a|
        URI.join(
          DOMAIN_URL,
          CGI.escape(a['href']).gsub('%2F', '/')
        ).to_s
      }
    end

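To see what the URL-building step inside `describe_groups` does on its own, here is a rough sketch that runs without Nokogiri or a network connection. `EXAMPLE_DOMAIN_URL` and `absolute_url` are names I made up for illustration; they stand in for `DOMAIN_URL` and the body of the `map` block.

```ruby
require 'cgi'
require 'uri'

# Hypothetical base URL, standing in for DOMAIN_URL in the code above.
EXAMPLE_DOMAIN_URL = 'http://example.com'

# Escape an href, put the path separators back, then resolve the
# result against the base URL.
def absolute_url(href, base = EXAMPLE_DOMAIN_URL)
  escaped = CGI.escape(href).gsub('%2F', '/')
  URI.join(base, escaped).to_s
end

puts absolute_url('/jobs/a b')         # prints http://example.com/jobs/a+b
puts absolute_url('/jobs/Ищу работу')
```

`CGI.escape` also escapes `?`, `=` and `%`, so this approach assumes the hrefs are plain, not-yet-encoded paths without a query string.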
Here are things to note:

- `main` is a poor choice for a method name. This isn't C, so use something descriptive and mnemonic.
- Comment your code prior to the definition of the method, not inside it using `=begin`/`=end`. Rdoc will find and parse the leading comment into decent documentation. Using `#` is idiomatic Ruby, and while `=begin` is supported, it's rarely used; in practice it mostly shows up in discussions like this.
- Use more white-space. It's free, doesn't slow the application, and makes it a lot easier for your brain to read over time.
- Use `[]` to initialize empty arrays and similarly use `{}` for hashes. They're shorter and you'll see them a lot more often than `Array.new` or `Hash.new`, except for when the block forms are being used.
- `subcategories_urls_array.flatten` returns the flattened array. `flatten!` returns `nil` if no sub-arrays existed, which is probably not what you want and is most likely a bug.
- In `describe_groups` I DRY'd the code up by removing intermediate variables that didn't do anything useful. `doc.css('.group_title')[0..-2].map` will return an array, relieving you of having to push elements onto an array and then return it. Because it's the last thing that happens in the method, Ruby will automatically use that value as the method's return value.
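
A few of the points above are easier to see by running them. This is only a small sketch with made-up data; `upcase_all` and the variable names are hypothetical and have nothing to do with the scraper itself.

```ruby
# flatten always returns an array; flatten! returns nil when it changed nothing.
already_flat = [1, 2, 3]
p already_flat.flatten    # => [1, 2, 3]
p already_flat.flatten!   # => nil

nested = [[1, 2], [3]]
p nested.flatten          # => [1, 2, 3]
p nested.flatten!         # => [1, 2, 3]

# Array.new and Hash.new are worth reaching for when you need the block forms.
squares = Array.new(3) { |i| i * i }
p squares                 # => [0, 1, 4]

urls_by_group = Hash.new { |hash, key| hash[key] = [] }
urls_by_group[:jobs] << '/jobs/1'
p urls_by_group[:jobs]    # => ["/jobs/1"]

# map builds the array, and the method returns it implicitly.
def upcase_all(words)
  words.map { |word| word.upcase }
end

p upcase_all(%w[a b])     # => ["A", "B"]
```

The `nil` from `flatten!` is the one that bites: if it is the last expression in a method, the caller gets `nil` instead of an array whenever there was nothing to flatten.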