Access all loaded resources with Capybara or similar

https://stackoverflow.com/questions/12615646

04-07-2021

Question

I’m looking for an easy way to access and list all resources (scripts, images, stylesheets, etc.) that a web page loads once the document has started loading, using a headless browser. I'm interested in each file's URL, status code, type, and so on.

Think of a way to programmatically access the information the Network tab (developer tools) gives you:

[Screenshot: the Network tab of the browser's developer tools]

Does anyone know of a Ruby library that can help with this or, even better, a way to achieve it using Capybara (or capybara-webkit)?

Update

It seems that Poltergeist has a method called network_traffic which does what I’m after. Haven’t had the time to research it yet, though. I'll report back once I do.

Solution

As mentioned in the update, there seems to be a way to do this with Poltergeist (a Capybara driver). Here’s a quick and very “hackish” experiment:

require 'rubygems'
require 'capybara'
require 'capybara/poltergeist'

driver = Capybara::Poltergeist::Driver.new({})
port   = Capybara::Poltergeist::Util.find_available_port
server = Capybara::Poltergeist::Server.new(port, 30)
client = Capybara::Poltergeist::Client.start(port,
  :path              => driver.options[:phantomjs],
  :window_size       => driver.options[:window_size],
  :phantomjs_options => driver.phantomjs_options
)

browser = Capybara::Poltergeist::Browser.new(server, client, nil)
browser.visit('http://www.google.com/')

browser.network_traffic.each do |request|
  # sorry, quick and dirty to see what we get:
  request.response_parts.uniq(&:url).each do |response|
    puts "#{response.url}: #{response.status}"
  end
end

=> 

http://www.google.com/: 200
http://ssl.gstatic.com/gb/images/b_8d5afc09.png: 200
http://www.google.com/images/srpr/logo1w.png: 200
http://www.google.com/images/srpr/nav_logo80.png: 200
http://www.google.com/xjs/_/js/hp/sb_he,pcc/rt=j/ver=FaiMBboaDLc.en_US./d=1/sv=1/rs=AItRSTMKxoHomLOW7ITf6OnfIEr5jQCEtA: 200

This, however, is very slow and, of course, far from usable. I’m planning to dig deeper into Poltergeist to see whether the same can be done at a lower level.
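The nested loop in the experiment could be pulled into a small helper that returns data instead of printing it. The following is only a sketch: `summarize_traffic`, `FakeRequest` and `FakeResponse` are hypothetical names, and the structs merely stand in for the objects that `network_traffic` yields, so the snippet runs without PhantomJS. It relies only on the `response_parts`, `url` and `status` calls used above. Unlike the original loop, it removes duplicate URLs across all requests rather than per request.

```ruby
# Hypothetical helper (not part of Poltergeist): flattens a traffic list into
# one { url:, status: } hash per unique response URL.
def summarize_traffic(traffic)
  traffic
    .flat_map(&:response_parts)  # all response parts of all requests
    .uniq(&:url)                 # keep the first part seen for each URL
    .map { |response| { url: response.url, status: response.status } }
end

# Stand-ins for the request/response objects, assumed to expose the same
# methods the experiment above calls on them.
FakeResponse = Struct.new(:url, :status)
FakeRequest  = Struct.new(:response_parts)

traffic = [
  FakeRequest.new([FakeResponse.new('http://example.com/', 200),
                   FakeResponse.new('http://example.com/', 200)]),
  FakeRequest.new([FakeResponse.new('http://example.com/missing.png', 404)])
]

summarize_traffic(traffic).each { |row| puts "#{row[:url]}: #{row[:status]}" }
# prints:
#   http://example.com/: 200
#   http://example.com/missing.png: 404
```

With a real browser object, the same helper would presumably be called as `summarize_traffic(browser.network_traffic)`.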

Other tips

It seems odd that you'd want this information during a Capybara test. It's good practice to write your UI tests to mirror actual user behaviour.

Consider a button that uses AJAX to update a block of text on the page. You could click the button, check that the request happened, and inspect the return value. But you'd be better off testing it as a user would: click the button, wait until the block of text changes, then confirm that it now displays the expected text.

If you really want to capture the network traffic, I'd have your test set up a transparent HTTP proxy, connect through that, and examine the request logs after the fact.

My team uses a similar approach to simulate disconnection from the Internet during Capybara tests. The Firefox profile we use is configured to point to a transparent proxy that is started at the beginning of each feature. That way we can write scenarios like:

Given I am online
 When I do something
  And I am offline
 Then something doesn't break

... where the "I am online" and "I am offline" steps just switch the proxy on and off.
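To make the proxy idea concrete, here is a minimal sketch using only Ruby's standard library. It is not the setup described above: `RecordingProxy` is a hypothetical class, it acts as an explicit forward proxy rather than a truly transparent one, and it handles only plain-HTTP GET requests, one connection at a time (HTTPS `CONNECT` requests are simply answered with 503). It records every requested URL in `log` and answers 503 while `online` is false.

```ruby
require 'socket'
require 'net/http'
require 'uri'

# Hypothetical helper, not the answer's actual setup: a tiny forward proxy for
# plain-HTTP GET requests that logs each requested URL and can go "offline".
class RecordingProxy
  attr_reader :log, :port
  attr_accessor :online

  def initialize
    @server = TCPServer.new('127.0.0.1', 0) # port 0 lets the OS pick a free port
    @port   = @server.addr[1]
    @log    = []
    @online = true
  end

  # Serves one connection at a time on a background thread.
  def start
    @thread = Thread.new { loop { handle(@server.accept) } }
    self
  end

  def stop
    @thread&.kill
    @server.close
  end

  private

  def handle(client)
    request_line = client.gets or return
    verb, target = request_line.split(' ')
    loop do # skip the request headers
      line = client.gets
      break if line.nil? || line == "\r\n"
    end
    log << { method: verb, url: target, online: online }

    code, message, type, body =
      if online && verb == 'GET'
        forward(target)
      else
        ['503', 'Service Unavailable', 'text/plain', '']
      end
    client.write("HTTP/1.1 #{code} #{message}\r\nContent-Type: #{type}\r\n" \
                 "Content-Length: #{body.bytesize}\r\nConnection: close\r\n\r\n")
    client.write(body)
  rescue SystemCallError, IOError
    # The client went away mid-request; nothing useful to send back.
  ensure
    client.close
  end

  # Fetches the target directly (third argument nil = no upstream proxy).
  def forward(target)
    uri = URI(target)
    res = Net::HTTP.new(uri.host, uri.port, nil).get(uri.request_uri)
    [res.code, res.message, res['Content-Type'].to_s, res.body.to_s]
  rescue StandardError
    ['502', 'Bad Gateway', 'text/plain', '']
  end
end
```

A test run could start it once with `proxy = RecordingProxy.new.start`, point the browser profile at `127.0.0.1` and `proxy.port`, have the online/offline steps set `proxy.online = true` or `false`, and read `proxy.log` afterwards to see what was requested.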

Building on @polarblau's response

You can set a debug breakpoint in your test code and run:

page.driver.network_traffic.each { |request| request.response_parts.uniq(&:url).each { |response| puts "#{response.url}: #{response.status}" }}

The difference is that you don't need to start a new browser; you can inspect what the page under test has already loaded.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow