Question
While scraping a single order page in the admin part of my OpenCart shop (full HTML code can be found here: http://pastebin.com/SaLc5jHu) for the customer's email address, I get the following as the email address value:
[email protected]
/* <![CDATA[ */
(function(){try{var s,a,i,j,r,c,l,b=document.getElementsByTagName("script");l=b[b.length-1].previousSibling;a=l.getAttribute('data-cfemail');if(a){s='';r=parseInt(a.substr(0,2),16);for(j=2;a.length-j;j+=2){c=parseInt(a.substr(j,2),16)^r;s+=String.fromCharCode(c);}s=document.createTextNode(s);l.parentNode.replaceChild(s,l);}}catch(e){}})();
/* ]]> */
Here's the code:
require 'mechanize'
a = Mechanize.new
a.get('http://exampleshop.nl/admin/') do |page|
  # Select the login form
  login_form = page.forms.first
  # Insert the username and password
  login_form.username = 'username'
  login_form.password = 'password'
  # Submit the login information
  dashboard_page = a.submit(login_form, login_form.buttons.first)
  # Check if the login was successful
  puts check_1 = dashboard_page.title == 'Dashboard' ? "CHECK 1 DASHBOARD SUCCESS" : "CHECK 1 DASHBOARD FAIL"
  # Visit the orders index page to scrape some standard information
  orders_page = a.click(dashboard_page.link_with(:text => /Bestellingen/))
  # pp orders_page # => http://pastebin.com/L3zASer6
  # Check if the visit was successful
  puts check_2 = orders_page.title == 'Bestellingen' ? "CHECK 2 ORDERS SUCCESS" : "CHECK 2 ORDERS FAIL"
  # Search for all #singleOrder table rows and put them in variable all_single_orders
  all_single_orders = orders_page.search("#singleOrder")
  # Scrape the needed information (the actual save to database is omitted)
  all_single_orders.each do |order|
    # Set links for each order
    order_link = order.at_css("a")['href'] # Assuming first link in row
    order_id = order.search("#orderId").text # => 259
    order_status = order.search("#orderStatus").text # => Bestelling ontvangen
    order_amount = order.search("#orderAmount").text # => € 41,94
    # Visit a single order page to fetch more detailed information
    single_order_page = orders_page.link_with(:href => order_link).click
    # Fetch more information
    puts first_name = single_order_page.search(".firstName").text
    puts last_name = single_order_page.search(".lastName").text
    puts email = single_order_page.search(".email").text # => [email protected] /* <![CDATA[ */...
    puts postal_code = single_order_page.search(".postalCode").text
    puts address = single_order_page.search(".address").text
    puts product_quantity = single_order_page.search(".orderQuantity").text
  end
end
Any ideas? I'm using Ruby 2.0.0 and Mechanize 2.7.3 and have CloudFlare set up.
Working now. To get this to work, simply disable the ScrapeShield E-mail obfuscation option inside CloudFlare's Apps panel (https://www.cloudflare.com/cloudflare-apps).
Solution
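It was not working because a CloudFlare app called ScrapeShield was activated.
Its E-mail obfuscation option replaces each address in the HTML with the
placeholder and inline script shown in the question. A browser runs that
script to restore the address, but Mechanize does not execute JavaScript,
so the scraper only ever sees the placeholder.
To get this to work, simply disable the ScrapeShield E-mail obfuscation
option inside the Apps panel (https://www.cloudflare.com/cloudflare-apps).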
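
If disabling the option is not possible, the address could probably also be decoded on the scraper side. The sketch below ports the inline script from the question to Ruby: the first hex pair of the data-cfemail attribute is an XOR key, and every following pair is one character XORed with that key. The method name decode_cfemail and the sample hex string are made up for illustration, and the selector in the commented usage is a guess at the markup, since the actual HTML is only on the pastebin.

```ruby
# Hypothetical helper: decodes the value of a data-cfemail attribute,
# following the same steps as the inline script in the question.
def decode_cfemail(encoded)
  # Split the hex string into bytes ("10" => 16, "71" => 113, ...)
  bytes = encoded.scan(/../).map { |pair| pair.to_i(16) }
  # The first byte is the XOR key for all remaining bytes
  key = bytes.shift
  # XOR each byte with the key and build the string from the code points
  bytes.map { |byte| byte ^ key }.pack('U*')
end

# Made-up sample: "a@b.nl" encoded with key 0x10
puts decode_cfemail('107150723e7e7c') # => a@b.nl

# Possible use inside the scraper (assumes the obfuscated element sits
# inside the .email element and carries the data-cfemail attribute):
#   node  = single_order_page.at('.email [data-cfemail]')
#   email = decode_cfemail(node['data-cfemail']) if node
```

This keeps the CloudFlare setting untouched, but it depends on the obfuscation format staying the same, so disabling the option remains the simpler fix.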