If you want to find something specific and fine-tune your searches, you could build a basic web crawler that downloads a page's HTML and searches it for the text you expect to find. You'll need a rough idea of how the site's pages are laid out, but in .NET you can use WebClient to download the HTML as a string like so...
// requires: using System.Collections.Generic; using System.Net; using System.Text;
// arguments could be passed into a method that wraps all this;
// we're just setting them inline for now
var html = string.Empty;
var uri = "http://www.returbilen.se/category.html";
var query = new StringBuilder();
var args = new Dictionary<string, string>
{
{ "SHOW", "new" },
{ "anl", "1" }
};
// loop through the arguments to build your query string,
// using a counter because an unordered Dictionary has no
// index and I'm loath to order query strings
var count = 0;
foreach (var arg in args)
{
count++;
query.AppendFormat("{0}={1}{2}", arg.Key, arg.Value, count < args.Count
? "&" : string.Empty);
}
// now fetch your HTML as a string
using (var wc = new WebClient())
{
html = wc.DownloadString(string.Format("{0}?{1}", uri, query.ToString()));
}
After this you can use the HtmlAgilityPack to parse the nodes and find what you want. However, you could also do something similar with a simple PHP script that loads the HTML based on the criteria you specify, then checks whether your search term exists...
// same argument setup as before and this could also be passed
// into a basic function call, same looping logic, etc.
$uri = 'http://www.returbilen.se/category.html?';
$query = '';
$args = array(
'SHOW' => 'new',
'anl' => '1'
);
$count = 0;
foreach ($args as $k => $v) {
$count++;
$query .= $k . '=' . $v;
if ($count < count($args)) {
$query .= '&';
}
}
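// now load the HTML with the PHP Simple HTML DOM Parser;
// file_get_html() comes from that library, so include it first
// (adjust the path to wherever you saved it)
require_once 'simple_html_dom.php';
$html = file_get_html($uri . $query);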
// now loop through the nodes to find the product you want
// making sure your search is case-insensitive
foreach ($html->find('div.product') as $product) {
$name = $product->find('div.name', 0); // first match, or null if none
if ($name && strpos(strtolower($name->plaintext), 'volvo') !== false) {
// do whatever you wish with the result
}
}
After you set this script up, you can just place it in a WAMP folder and schedule a job to call it at a given time, then open whatever report file it generates when it's done. Or you can create a page that calls it like so, assuming the script returns its results as JSON...
$.getJSON('searchsite.php', function (data) {
// parse results into Knockout or add via jQuery
});
... and check whether you have any hits for the Volvo, or just scrape the whole product detail using the PHP DOM parser. You can also build something similar in .NET, but you'd need to create a Web API project or a web service and return your results as JSON.
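For the client side, here is a rough sketch of how the "any hits for the Volvo" check could look once the JSON comes back. It assumes searchsite.php returns an array of objects with a name property; that shape, the findHits helper, and the sample data are all made up for illustration and aren't part of the script above.

```javascript
// Hypothetical response shape: an array of { name, url } objects.
// searchsite.php would have to emit this format for the sketch to apply.
function findHits(products, term) {
  const needle = term.toLowerCase();
  return products.filter(function (product) {
    // Skip entries without a usable name, then compare case-insensitively
    return typeof product.name === 'string' &&
      product.name.toLowerCase().indexOf(needle) !== -1;
  });
}

// Sample data standing in for the JSON payload (invented for illustration)
const sample = [
  { name: 'Volvo V70 2.4', url: '/product/1' },
  { name: 'Saab 9-5', url: '/product/2' },
  { name: 'VOLVO S60', url: '/product/3' }
];

const hits = findHits(sample, 'volvo');
console.log(hits.length + ' hit(s)'); // prints "2 hit(s)"
```

Inside the $.getJSON callback you would call findHits(data, 'volvo') instead of using the sample array, then hand the result to Knockout or append it with jQuery.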