In case anyone is looking for something similar: in the end I used the TFHpple library to separate the images from the text elements in the HTML data, and then I change the font attributes of the attributed string like this:
NSString *contentString = [self parseHTMLdata:bodyString];
NSDictionary *options = @{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                          NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)};
NSMutableAttributedString *content = [[NSMutableAttributedString alloc] initWithData:[contentString dataUsingEncoding:NSUTF8StringEncoding]
                                                                             options:options
                                                                  documentAttributes:nil
                                                                               error:nil];
// prepare new format: walk the string one attribute run at a time
NSRange effectiveRange = NSMakeRange(0, 0);
NSDictionary *attributes;
while (NSMaxRange(effectiveRange) < [content length]) {
    attributes = [content attributesAtIndex:NSMaxRange(effectiveRange) effectiveRange:&effectiveRange];
    UIFont *font = attributes[NSFontAttributeName];
    if (font.pointSize == 18.0f) {
        // the HTML importer rendered the h2 elements at 18 pt
        [content addAttribute:NSFontAttributeName value:self.headlineFont range:effectiveRange];
    } else {
        [content addAttribute:NSFontAttributeName value:self.bodyFont range:effectiveRange];
    }
}
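For what it's worth, the same font swap can also be written with `enumerateAttribute:inRange:options:usingBlock:`, which hands you each font run directly instead of tracking `effectiveRange` by hand. This is only a sketch under the same assumption as above (headings come out of the HTML import at 18 pt); `ReplaceFonts` and its parameter names are made up for illustration and are not part of my project.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: replaces every font run in `content` with either
// `headlineFont` (runs whose size is about 18 pt) or `bodyFont` (all other runs).
static void ReplaceFonts(NSMutableAttributedString *content,
                         UIFont *headlineFont,
                         UIFont *bodyFont)
{
    NSRange fullRange = NSMakeRange(0, content.length);
    [content enumerateAttribute:NSFontAttributeName
                        inRange:fullRange
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        UIFont *font = value;
        // Compare with a small tolerance rather than ==, since pointSize is a CGFloat.
        BOOL isHeadline = (font != nil) && fabs(font.pointSize - 18.0) < 0.5;
        // Changing attributes inside the range passed to the block is permitted.
        [content addAttribute:NSFontAttributeName
                        value:(isHeadline ? headlineFont : bodyFont)
                        range:range];
    }];
}
```

It would be called as `ReplaceFonts(content, self.headlineFont, self.bodyFont);` right after the attributed string is created.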
And the hpple part:
- (NSString *)parseHTMLdata:(NSString *)content
{
    NSData *data = [content dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];
    NSString *xpathQueryString = @"//body";
    NSArray *elements = [[[parser searchWithXPathQuery:xpathQueryString] firstObject] children];
    NSMutableString *textContent = [[NSMutableString alloc] init];
    for (TFHppleElement *element in elements) {
        if ([[element tagName] isEqualToString:@"h2"] || [[element tagName] isEqualToString:@"p"]) {
            if ([[[element firstChild] tagName] isEqualToString:@"a"]) {
                // image element, just save it in array
            } else {
                // pure h2 or p element
                [textContent appendString:[element raw]];
            }
        }
    }
    return textContent;
}
Checking the font size in the attributes may seem fragile. If it causes problems, I can dig deeper into the paragraph style, which holds the heading/body tags.