Question

How do I rewrite the code that populates @subcategories_urls_array so that it works recursively, along the lines of "Fibonacci sequence in Ruby (recursion)"?

Do I use a condition to check if there are no more '.group_title' CSS-selectors to stop the recursion, or could it be done with a flag variable that counts cycles?

def main
=begin
=end
  @job_section_url = find_job_section_url()
  write_header_to_file()
  @groups_urls_array = describe_groups(@job_section_url)
  @subcategories_urls_array = Array.new
  @groups_urls_array.each do |group_url|
    @subcategories_urls_array << describe_groups(group_url)
  end #each
  @subcategories_urls_array.flatten!

end #main

def describe_groups(job_section_url)
=begin
Parse a page into an array of groups URLs.
=end
  # @looking_for_a_job_string = '%D0%98%D1%89%D1%83+%D1%80%D0%B0%D0%B1%D0%BE%D1%82%D1%83'
  @groups_urls_array = Array.new
  @page = open(job_section_url, 'Cookie' => 'city=3')
  @doc = Nokogiri::HTML(@page)
  @nodeset = @doc.css('.group_title')[0..-2]
  @nodeset.each do |a|
    @group_url = CGI.escape(a['href']).gsub('%2F', '/')
    @group_url = URI.join(DOMAIN_URL, @group_url).to_s
    @groups_urls_array << @group_url
  end #each

  @groups_urls_array
end #describe_groups

And do I really need to implement this?


Solution

I'd do it something like:

def main

  @job_section_url = find_job_section_url()

  write_header_to_file()

  groups_urls_array = describe_groups(@job_section_url)

  subcategories_urls_array = []

  groups_urls_array.each do |group_url|
    subcategories_urls_array << describe_groups(group_url)
  end

  subcategories_urls_array.flatten

end 

# Parse a page into an array of groups URLs.
def describe_groups(job_section_url)

  # @looking_for_a_job_string = '%D0%98%D1%89%D1%83+%D1%80%D0%B0%D0%B1%D0%BE%D1%82%D1%83'
  # URI.open comes from open-uri; on Ruby 3.0+ a bare Kernel#open no longer fetches URLs.
  doc = Nokogiri::HTML(URI.open(job_section_url, 'Cookie' => 'city=3'))

  doc.css('.group_title')[0..-2].map { |a|
    URI.join(
      DOMAIN_URL,
      CGI.escape(a['href']).gsub('%2F', '/')
    ).to_s
  } 

end 

Here are things to note:

  • main is a poor choice for a method name. This isn't C, so use something descriptive and mnemonic.
  • Comment your code before the definition of the method, not inside it using =begin/=end. RDoc will find and parse the leading comment into decent documentation. Using # is idiomatic Ruby; =begin is supported but rarely used, and mostly shows up in discussions like this one.
  • Use more white-space. It's free, doesn't slow the application, and makes it a lot easier for your brain to read over time.
  • Use [] to initialize empty arrays and similarly use {} for hashes. They're shorter and you'll see them a lot more often than Array.new or Hash.new, except for when the block forms are being used.
  • subcategories_urls_array.flatten returns the flattened array. flatten! modifies the array in place but returns nil if no sub-arrays existed, so relying on its return value is probably not what you want and is most likely a bug.
  • In describe_groups I DRY'd the code up by removing intermediate variables that didn't do anything useful.
  • doc.css('.group_title')[0..-2].map will return an array, relieving you of having to push elements onto an array and returning it. Because it's the last thing that happens in the method, Ruby will automatically use the returned value as the method's return value.
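The difference between flatten and flatten!, and the way map builds the result array for you, can be seen in a small sketch. The sample arrays and the example.com base URL are made up for illustration and are not part of the original scraper.

```ruby
nested = [['a', 'b'], ['c']]   # hypothetical data with sub-arrays
flat   = ['a', 'b', 'c']       # hypothetical data without sub-arrays

# flatten always returns an array, whether or not there was anything to do.
flattened_nested = nested.flatten   # => ["a", "b", "c"]
flattened_flat   = flat.flatten     # => ["a", "b", "c"]

# flatten! returns nil when there was nothing to flatten.
bang_on_flat = flat.flatten!        # => nil

# map collects the block's results into a new array, so there is
# no need to create an empty array and push onto it.
urls = ['/it', '/sales'].map { |path| "http://example.com#{path}" }

p flattened_nested, flattened_flat, bang_on_flat, urls
```

Because bang_on_flat is nil here, code that uses the result of flatten! as a return value would hand nil to its caller whenever the array happened to be flat already.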
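As for the recursion itself, which the rewrite above does not attempt: if the category tree can go deeper than two levels, one possible approach is sketched below. It is only a sketch under assumptions. FAKE_SITE, child_urls and collect_urls are hypothetical names, and the hash stands in for the real pages so the example runs without network access; in the real scraper child_urls would simply call describe_groups. It uses both stop conditions from the question: the recursion ends on its own when a page yields no more links, and a depth counter plus a set of already-seen URLs guard against pages that link back to each other.

```ruby
require 'set'

# Hypothetical stand-in for the site: each URL maps to the group links on that page.
FAKE_SITE = {
  '/jobs'              => ['/jobs/it', '/jobs/sales'],
  '/jobs/it'           => ['/jobs/it/ruby', '/jobs/it/java'],
  '/jobs/sales'        => ['/jobs/sales/retail'],
  '/jobs/it/ruby'      => [],
  '/jobs/it/java'      => ['/jobs/it'], # links back up, to exercise the cycle guard
  '/jobs/sales/retail' => []
}.freeze

# In the real scraper this would be describe_groups(url).
def child_urls(url)
  FAKE_SITE.fetch(url, [])
end

# Collect every URL reachable from url, depth-first.
# Stops when a page has no links, when max_depth is used up,
# or when a URL has already been visited.
def collect_urls(url, max_depth: 5, seen: Set.new([url]))
  return [] if max_depth.zero?

  child_urls(url).flat_map do |child|
    next [] unless seen.add?(child) # add? returns nil if child was already seen

    [child] + collect_urls(child, max_depth: max_depth - 1, seen: seen)
  end
end

all_urls = collect_urls('/jobs')
p all_urls
```

flat_map is used so that each level's results are merged as they are returned, which avoids a separate flatten step. Whether you need any of this depends on the site: if it only ever has two levels, the two calls to describe_groups in the rewrite above are enough.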
License: CC-BY-SA with attribution
Not affiliated with StackOverflow