Question

Is there a way to parse Google Shopping results with TFHpple, without using the (deprecated) Google API, simply by fetching a URL such as https://www.google.com/search?hl=en&tbm=shop&q=AudiR8 ?

I have tried many different tags in the XPath query:

...
myCar = @"Audi R8";
myURL = [NSString stringWithFormat:@"https://www.google.com/search?hl=en&tbm=shop&q=%@",myCar];
NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];
TFHpple *xpath = [[TFHpple alloc] initWithHTMLData:htmlData];
// Use XPath to search for elements
NSArray *elements = [xpath searchWithXPathQuery:@"//html//body"]; // <-- tags tried here
...

but nothing works: the console always shows the same message, UNABLE TO PARSE.


Solution

I found several problems and eventually solved all of them. First, the URL must be percent-encoded by adding:

myURL = [myURL stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
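
On newer SDKs, stringByAddingPercentEscapesUsingEncoding: is deprecated (since iOS 9 / OS X 10.11). A minimal sketch of one alternative, assuming you build the URL with NSURLComponents so that the query value is percent-encoded for you. The helper name GShoppingURLForQuery is hypothetical and not part of the original code:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): builds the Google
// Shopping search URL and lets NSURLComponents percent-encode the query.
static NSURL *GShoppingURLForQuery(NSString *query) {
    NSURLComponents *components =
        [NSURLComponents componentsWithString:@"https://www.google.com/search"];
    components.queryItems = @[
        [NSURLQueryItem queryItemWithName:@"hl" value:@"en"],
        [NSURLQueryItem queryItemWithName:@"tbm" value:@"shop"],
        [NSURLQueryItem queryItemWithName:@"q" value:query]
    ];
    return components.URL;
}
```

Calling GShoppingURLForQuery(@"Audi R8") should give a URL whose search term has the space encoded as %20, ready to pass to the download call.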

Next, in the original (and current) TFHpple code, specifically XPathQuery.m, the parsing phase crashes whenever nodeContent or raw is nil, because a nil object cannot be stored in the result dictionary. To prevent the crash I replaced

[resultForNode setObject:currentNodeContent forKey:@"nodeContent"];

with the following (note that both [resultForNode ... lines need this guard):

if (currentNodeContent != nil)
    [resultForNode setObject:currentNodeContent forKey:@"nodeContent"];

and:

[resultForNode setObject:rawContent forKey:@"raw"];

with:

if (rawContent != nil)
    [resultForNode setObject:rawContent forKey:@"raw"];
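
The guard is needed because -[NSMutableDictionary setObject:forKey:] raises NSInvalidArgumentException when the object is nil. The same check could be factored into a small helper. This is only a sketch, and the name SetObjectIfNotNil is hypothetical, not something TFHpple provides:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: stores the object only when it is not nil, so a
// missing node content or raw string is skipped instead of raising.
static void SetObjectIfNotNil(NSMutableDictionary *dict, id object, NSString *key) {
    if (object != nil) {
        [dict setObject:object forKey:key];
    }
}
```

With it, the two patched calls in XPathQuery.m would read SetObjectIfNotNil(resultForNode, currentNodeContent, @"nodeContent"); and SetObjectIfNotNil(resultForNode, rawContent, @"raw");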

Note also that, because the HTML Google serves is quite complex, I decided to use these XPath queries:

...
        NSArray *elementsImages = [NSArray new];
        NSArray *elementsPrices = [NSArray new];
        elementsImages = [xpath searchWithXPathQuery:@"//html//*[@class=\"psliimg\"]"];
        elementsPrices = [xpath searchWithXPathQuery:@"//html//*[@class=\"psliprice\"]"];
...

Another problem appears when you use a for or while loop to retrieve several HTML pages. If you use:

NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];

calling initWithContentsOfURL: many times inside the loop often fails to fetch the page correctly (and the debug console prints the familiar UNABLE TO PARSE), so I decided to replace it with:

// Send a synchronous request
NSURLRequest * urlRequest = [NSURLRequest requestWithURL:[NSURL URLWithString:myURL]];
NSURLResponse * response = nil;
NSError * error = nil;
NSData * data = [NSURLConnection sendSynchronousRequest:urlRequest
                                          returningResponse:&response
                                                      error:&error];

if (error == nil)
{
    // Parse data here
}
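
Note that +[NSURLConnection sendSynchronousRequest:returningResponse:error:] is deprecated as of iOS 9 / OS X 10.11. A rough sketch of the same blocking fetch on top of NSURLSession, assuming ARC and assuming it is only ever called from a background queue; FetchDataBlocking is a hypothetical name, not part of the original code:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: fetches a URL with NSURLSession and blocks the calling
// thread until the request finishes. Do not call it on the main thread.
static NSData *FetchDataBlocking(NSURL *url, NSError **outError) {
    __block NSData *result = nil;
    __block NSError *taskError = nil;
    dispatch_semaphore_t done = dispatch_semaphore_create(0);

    NSURLSessionDataTask *task =
        [[NSURLSession sharedSession] dataTaskWithURL:url
                                    completionHandler:^(NSData *data,
                                                        NSURLResponse *response,
                                                        NSError *error) {
            result = data;
            taskError = error;
            dispatch_semaphore_signal(done); // wake the waiting caller
        }];
    [task resume];

    // Wait here until the completion handler has run.
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    if (outError != NULL) {
        *outError = taskError;
    }
    return result;
}
```

As in the snippet above, the returned data would be parsed only when the error is nil.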

If you don't want to block while this loop runs (it is made of synchronous requests), call the enclosing method on a background queue, so your view controller doesn't freeze while waiting for the parser:

dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
dispatch_async(queue, ^{
    // Run the Google Shopping parser loop off the main thread
    [self GShoppingParser];
});

OTHER TIPS

Can you try changing the line below

NSData *htmlData = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];

to

NSData *data = [[NSData alloc] initWithContentsOfURL:[NSURL URLWithString:myURL]];

and also the line below

TFHpple *xpath = [[TFHpple alloc] initWithHTMLData:htmlData];

to

TFHpple *xpathParser = [[TFHpple alloc] initWithHTMLData:data]; 

Let me know if this helps; if not, there is one more line that you may need to change in your code.

Happy coding!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow