Question

I'm using Hpple to run XPath queries in an iOS app that needs to do some basic web scraping of a few sites. Right now everything works well, but I wanted to see if there's a more elegant way of doing what I'm doing. Currently I use XPath to find a specific div class in a page. Within that div (which is basically a post) there can be any number of children: some contain text directly, and others have text buried in another set of children.

Right now I'm using repeated for loops to check whether a node's tagName is "text". If so, I append its content to a string; if not, I check whether there is another level of children that needs to be scanned. I have 4 levels of the same search so far. I was wondering if there is some method I can run that repeats the same search whenever the count of children at the current level is greater than 0. Below is the code for how I'm doing this now:

 for (TFHppleElement *element in searchNodes) {
    //If a Text Node is found add it to the String, if not search again with next layer
    if ([element.tagName isEqualToString:@"text"]) {
        [bigString appendString:element.content];
    }
    //1. First layer Scan
    if (element.children.count > 0) {
        for (TFHppleElement *nextStep in element.children) { 
            if ([nextStep.tagName isEqualToString:@"text"]) {
                [bigString appendString:nextStep.content];
            }
            
            //2. Second layer Scan
            if (nextStep.children.count > 0) {
                for (TFHppleElement *child in nextStep.children) { 
                    if ([child.tagName isEqualToString:@"text"]) {
                        [bigString appendString:child.content];
                        
                    }
                    
                    //3. Third Layer Scan
                    if (child.children.count > 0) {
                        for (TFHppleElement *children in child.children) { 
                            if ([children.tagName isEqualToString:@"text"]){
                                [bigString appendString:children.content];
                            }
                            
                            //4. Fourth Layer Scan
                            if (children.children.count > 0) {
                                for (TFHppleElement *newchild in children.children){
                                    if ([newchild.tagName isEqualToString:@"text"]) {
                                        [bigString appendString:newchild.content];
                                        
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

I would like to build some kind of method that I can pass the initial NSArray to, which checks for additional elements and then performs the search again on the next array, all while continuing to build up an NSMutableString that ends up with all the text from every search. If that isn't possible, what I have now seems to work fine; I just wanted to see if there was a cleaner way of doing this.


Solution

I think what you want is recursion. You can write a recursive method that you pass an element to, have it modify some NSMutableString outside itself (an instance variable, maybe?), then call itself with its own children if it can. For example (uncompiled, untested):

@property (nonatomic, retain) NSMutableString * bigString;
// snip
@synthesize bigString;
// snip - assume bigString gets initialized somewhere

- (void)checkElement:(TFHppleElement *)elem {
    // Append this node's content if it is a text node
    if ([elem.tagName isEqualToString:@"text"]) {
        [bigString appendString:elem.content];
    }

    // Recurse into each child; this handles any depth of nesting
    if (elem.children.count > 0) {
        for (TFHppleElement * child in elem.children) {
            [self checkElement:child];
        }
    }
}
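
If you would rather not keep the string in an instance variable, the same recursion can take the accumulator as a parameter, with a small wrapper that accepts the initial NSArray as the question asks. This is only a sketch (uncompiled, untested): the function names AppendTextFromElement and TextFromElements are made up for illustration, and it assumes the Hpple header is importable as "TFHpple.h" in your project.

```objectivec
#import <Foundation/Foundation.h>
#import "TFHpple.h"  // assumption: Hpple is available under this header name

// Hypothetical helper: appends the content of every "text" node found in
// `element` and its descendants to `output`, visiting a node before its children.
static void AppendTextFromElement(TFHppleElement *element, NSMutableString *output) {
    // appendString: raises an exception when passed nil, so guard the content
    if ([element.tagName isEqualToString:@"text"] && element.content != nil) {
        [output appendString:element.content];
    }
    // Fast enumeration over a nil or empty array does nothing,
    // so no explicit count check is needed before recursing
    for (TFHppleElement *child in element.children) {
        AppendTextFromElement(child, output);
    }
}

// Hypothetical entry point: takes the array returned by the XPath search
// and returns all the collected text as one string.
static NSString *TextFromElements(NSArray *elements) {
    NSMutableString *output = [NSMutableString string];
    for (TFHppleElement *element in elements) {
        AppendTextFromElement(element, output);
    }
    return output;
}
```

With that in place, the whole nested loop from the question would reduce to something like NSString *allText = TextFromElements(searchNodes);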
Licensed under: CC-BY-SA with attribution