I need to convert HTML data that consists of <h2>..</h2>, <p>..</p> and <a href=".."><img ..></a> elements into an NSAttributedString with proper formatting. I want to map <h2> to UIFontTextStyleHeadline1 and <p> to UIFontTextStyleBody, and store the image links. The output should be an attributed string containing only the heading and body elements; I will handle the images separately.

So far, I have this code:

NSMutableAttributedString *content = [[NSMutableAttributedString alloc] 
         initWithData:[[post objectForKey:@"content"] 
    dataUsingEncoding:NSUTF8StringEncoding] 
              options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                   NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]}
   documentAttributes:nil error:nil];

which produces output like this:

Heading
{
    NSColor = "UIDeviceRGBColorSpace 0 0 0 1";
    NSFont = "<UICTFont: 0xd47bc00> font-family: \"TimesNewRomanPS-BoldMT\"; font-weight: bold; font-style: normal; font-size: 18.00pt";
    NSKern = 0;
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 14.94, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 2";
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1";
    NSStrokeWidth = 0;
}{
    NSAttachment = "<NSTextAttachment: 0xd486590>";
    NSColor = "UIDeviceRGBColorSpace 0 0 0.933333 1";
    NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt";
    NSKern = 0;
    NSLink = "http://www.placeholder.com/image.jpg";
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0";
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0.933333 1";
    NSStrokeWidth = 0;
}
Body text, body text, body text. Body text, body text, body text.
{
    NSColor = "UIDeviceRGBColorSpace 0 0 0 1";
    NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt";
    NSKern = 0;
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0";
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1";
    NSStrokeWidth = 0;
}

I am new to NSAttributedString and am looking for an efficient way to convert these attributes into the standard fonts mentioned above. Thank you.


Solution

In case anybody is looking for something similar: in the end I used the TFHpple library to separate the images from the text elements in the HTML data, and then I change the font attributes of the attributed string like this:

NSString *contentString = [self parseHTMLdata:bodyString];

NSMutableAttributedString *content = [[NSMutableAttributedString alloc] initWithData:[contentString dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: [NSNumber numberWithInt:NSUTF8StringEncoding]} documentAttributes:nil error:nil];

// Walk each attribute run and replace the HTML importer's font with our own.
NSRange effectiveRange = NSMakeRange(0, 0);

while (NSMaxRange(effectiveRange) < [content length]) {

    // Fetch the attributes of the run that starts where the previous one ended.
    NSDictionary *attributes = [content attributesAtIndex:NSMaxRange(effectiveRange) effectiveRange:&effectiveRange];

    UIFont *font = [attributes objectForKey:NSFontAttributeName];

    // The importer renders <h2> at 18 pt and <p> at 12 pt (see the dump in the question).
    if (font.pointSize == 18.0f) {

        [content addAttribute:NSFontAttributeName value:self.headlineFont range:effectiveRange];

    } else {

        [content addAttribute:NSFontAttributeName value:self.bodyFont range:effectiveRange];
    }
}
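
The same font swap can probably also be written with enumerateAttribute:inRange:options:usingBlock: instead of a manual while loop. The following is only a sketch under assumptions: the function name ApplyTextStyleFonts is hypothetical, and it takes the fonts from preferredFontForTextStyle: (UIFontTextStyleHeadline and UIFontTextStyleBody) rather than from the self.headlineFont and self.bodyFont properties used above. It relies on the same 18 pt heuristic.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper, not part of the original answer.
// Replaces every font run in `content` with a text-style font,
// treating 18 pt runs as headings (the same heuristic as above).
static void ApplyTextStyleFonts(NSMutableAttributedString *content)
{
    UIFont *headlineFont = [UIFont preferredFontForTextStyle:UIFontTextStyleHeadline];
    UIFont *bodyFont = [UIFont preferredFontForTextStyle:UIFontTextStyleBody];

    [content beginEditing];
    [content enumerateAttribute:NSFontAttributeName
                        inRange:NSMakeRange(0, content.length)
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        UIFont *font = value;

        // Compare with a small tolerance instead of exact float equality.
        BOOL isHeading = (font != nil && fabs(font.pointSize - 18.0) < 0.01);

        // Changing an attribute inside the enumerated range is allowed
        // because the string's length does not change.
        [content addAttribute:NSFontAttributeName
                        value:(isHeading ? headlineFont : bodyFont)
                        range:range];
    }];
    [content endEditing];
}
```

Calling ApplyTextStyleFonts(content) right after creating the attributed string would stand in for the loop; runs without a font attribute are simply treated as body text.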

And the hpple part:

- (NSString *)parseHTMLdata:(NSString *)content
{
    NSData *data = [content dataUsingEncoding:NSUTF8StringEncoding];

    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];

    NSString *xpathQueryString = @"//body";

    NSArray *elements = [[[parser searchWithXPathQuery:xpathQueryString] firstObject] children];

    NSMutableString *textContent = [[NSMutableString alloc] init];

    for (TFHppleElement *element in elements) {

        if ([[element tagName] isEqualToString:@"h2"] || [[element tagName] isEqualToString:@"p"]) {

            if ([[[element firstChild] tagName] isEqualToString:@"a"]) {

                // image element, just save it in array
            } else {

                // pure h2 or p element
                [textContent appendString:[element raw]];
            }
        }
    }

    return textContent;
}

Checking the font size in the attributes may seem fragile. If it causes problems, I can dig deeper into the paragraph style, which records the header level (HeaderLevel 2 for the heading and HeaderLevel 0 for the body text in the dump in the question).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow