Question

I am trying to parse a really small subset of HTML markup.

PKSequence *parser = [PKSequence sequence];
[parser add:[PKSymbol symbolWithString:@"<title>"]];
PKWord *word = [PKWord word];
[word setAssembler:self selector:@selector(workOnWordAssembly:)];
[parser add:word];
[parser add:[PKSymbol symbolWithString:@"</title>"]];

PKAssembly *result = [parser bestMatchFor:[PKTokenAssembly assemblyWithString:@"<title>teeest</title>"]];

- (void)workOnWordAssembly:(PKAssembly *)a {
    NSLog(@"We entered this");
}

But workOnWordAssembly: is never called.


Solution

Developer of ParseKit here. Make sure you are using the head of trunk on Google Code.

  1. Assembler callbacks now take two arguments: the parser that matched and the assembly (see parser:didMatchWord: below).
  2. By default, the string <title> is not tokenized as a single Symbol token. Instead it yields three tokens: a < Symbol token, a title Word token, and a > Symbol token. You can configure the tokenizer to change this behavior, however.
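
To see how a string is split under the default settings, you can print the tokens one by one. This is a minimal sketch, assuming ParseKit is linked and imported via <ParseKit/ParseKit.h>; LogTokens is a hypothetical helper name, not part of ParseKit.

```objc
#import <Foundation/Foundation.h>
#import <ParseKit/ParseKit.h>

// Hypothetical helper: logs each token the default tokenizer produces for s.
static void LogTokens(NSString *s) {
    PKTokenizer *t = [PKTokenizer tokenizerWithString:s];
    PKToken *eof = [PKToken EOFToken];
    PKToken *tok = nil;

    // nextToken returns the shared EOF token once the input is exhausted.
    while ((tok = [t nextToken]) != eof) {
        NSLog(@"token: %@", tok.stringValue);
    }
}
```

Calling LogTokens(@"<title>") with an unconfigured tokenizer should log the separate tokens described in point 2 rather than a single <title> symbol.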

Please read the ParseKit documentation, particularly the tokenization docs, to understand how tokenization in ParseKit works.


Here's what's missing to accomplish your basic task above. However, I'm not sure this is the best approach for a real-world task. Reading the docs mentioned above should help explain why.

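// Register the tags as multi-character symbols so that each one
// is tokenized as a single Symbol token.
PKTokenizer *t = [PKTokenizer tokenizerWithString:@"<title>foobar</title>"];
[t.symbolState add:@"<title>"];
[t.symbolState add:@"</title>"];

// Build the assembly from the configured tokenizer, not from the raw string.
PKAssembly *a = [PKTokenAssembly assemblyWithTokenizer:t];

// Grammar: "<title>" Word "</title>"
PKSequence *p = [PKSequence sequence];

[p add:[PKSymbol symbolWithString:@"<title>"]];

PKWord *word = [PKWord word];
// The assembler selector takes two arguments: the parser and the assembly.
[word setAssembler:self selector:@selector(parser:didMatchWord:)];
[p add:word];

[p add:[PKSymbol symbolWithString:@"</title>"]];

PKAssembly *result = [p bestMatchFor:a];

// Called whenever the Word parser matches a token.
- (void)parser:(PKParser *)p didMatchWord:(PKAssembly *)a {
    NSLog(@"%s %@", __PRETTY_FUNCTION__, a);
}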
Licensed under: CC-BY-SA with attribution