Convert an attributed string to "simple" tagged HTML

I want to convert an NSAttributedString to HTML like this:

This is a <i>string</i> with some <b>simple</b> <i><b>html</b></i> tags in it.

Unfortunately, Apple's built-in conversion generates verbose, CSS-based HTML. (Example below for reference.)

So how can I generate simple tagged HTML from an NSAttributedString?

I wrote a very verbose, fragile function to do it, which is a poor solution.

func simpleTagStyle(fromNSAttributedString att: NSAttributedString)->String {

    // verbose, fragile solution

    // essentially, iterate all the attribute ranges in the attString
    // make a note of what style they are, bold italic etc
    // (totally ignore any not of interest to us)
    // then basically get the plain string, and munge it for those ranges.
    // be careful with the annoying "multiple attribute" case
    // (an alternative would be to repeatedly munge out attributed ranges
    // one by one until there are none left.)

    let rangeAll = NSRange(location: 0, length: att.length)

    // make a note of all of the ranges of bold/italic
    // (use a tuple to remember which is which)
    var allBlocks: [(NSRange, String)] = []

    att.enumerateAttribute(
        .font,   // Swift 4+ spelling of NSFontAttributeName
        in: rangeAll,
        options: .longestEffectiveRangeNotRequired
        )
            { value, range, stop in

            handler: if let font = value as? UIFont {

                let b = font.fontDescriptor.symbolicTraits.contains(.traitBold)
                let i = font.fontDescriptor.symbolicTraits.contains(.traitItalic)

                if b && i {
                    allBlocks.append( (range, "bolditalic") )
                    break handler   // take care not to duplicate
                }

                if b {
                    allBlocks.append( (range, "bold") )
                    break handler
                }

                if i {
                    allBlocks.append( (range, "italic") )
                    break handler
                }
            }

        }

    // traverse those backwards and munge away

    var plainString = att.string

    for oneBlock in allBlocks.reversed() {

        // NSRange counts UTF-16 units, so go through NSString
        // instead of relying on a custom NSRange-to-Range helper
        let ns = plainString as NSString
        let w = ns.substring(with: oneBlock.0)

        if oneBlock.1 == "bolditalic" {
            plainString = ns.replacingCharacters(in: oneBlock.0, with: "<b><i>" + w + "</i></b>")
        }

        if oneBlock.1 == "bold" {
            plainString = ns.replacingCharacters(in: oneBlock.0, with: "<b>" + w + "</b>")
        }

        if oneBlock.1 == "italic" {
            plainString = ns.replacingCharacters(in: oneBlock.0, with: "<i>" + w + "</i>")
        }

    }

    return plainString
}
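
The "traverse backwards" step can be tried on its own, without UIKit. This is only a minimal sketch: the `blocks` array is hard-coded here and stands in for the ranges that the enumeration above would collect, and the tag names are passed directly instead of the "bold"/"italic" labels.

```swift
import Foundation

// Hypothetical input: (range, tag) pairs in ascending order, as the
// font enumeration would produce them. Ranges are in UTF-16 units.
let blocks: [(NSRange, String)] = [
    (NSRange(location: 10, length: 6), "i"),   // "string"
    (NSRange(location: 27, length: 6), "b"),   // "simple"
]

var plain = "This is a string with some simple tags in it."

// Apply the last range first: inserting tags near the end of the string
// does not shift the offsets of ranges that start earlier.
for (range, tag) in blocks.reversed() {
    let ns = NSString(string: plain)
    let inner = ns.substring(with: range)
    plain = ns.replacingCharacters(in: range, with: "<\(tag)>\(inner)</\(tag)>")
}

print(plain)
// This is a <i>string</i> with some <b>simple</b> tags in it.
```

Going front to back instead would require adding the length of every inserted tag to all later ranges, which is the bookkeeping the reversed loop avoids.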

For reference, here's how to use Apple's built-in conversion, which unfortunately generates full-on CSS:

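let x: NSAttributedString = ... // your attributed string
var resultHtmlText = ""
do {

    let r = NSRange(location: 0, length: x.length)
    // Swift 4+ spelling of NSDocumentTypeDocumentAttribute / NSHTMLTextDocumentType
    let att: [NSAttributedString.DocumentAttributeKey: Any] =
        [.documentType: NSAttributedString.DocumentType.html]

    let d = try x.data(from: r, documentAttributes: att)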

    if let h = String(data: d, encoding: .utf8) {
        resultHtmlText = h
    }
}
catch {
    print("utterly failed to convert to html!!! \n>\(x)<\n")
}
print(resultHtmlText)

Example output:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Content-Style-Type" content="text/css">
<title></title>
<meta name="Generator" content="Cocoa HTML Writer">
<style type="text/css">
p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px 'Some Font'}
span.s1 {font-family: 'SomeFont-ItalicOrWhatever'; font-weight: normal; font-style: normal; font-size: 14.00pt}
span.s2 {font-family: 'SomeFont-SemiboldItalic'; font-weight: bold; font-style: italic; font-size: 14.00pt}
</style>
</head>
<body>
<p class="p1"><span class="s1">So, </span><span class="s2">here is</span><span class="s1"> some</span> stuff</p>
</body>
</html>

Fattie asked Dec 23 '22 17:12


2 Answers

See the documentation of enumerateAttribute:inRange:options:usingBlock:, especially the Discussion section, which states:

If this method is sent to an instance of NSMutableAttributedString, mutation (deletion, addition, or change) is allowed, as long as it is within the range provided to the block; after a mutation, the enumeration continues with the range immediately following the processed range, after the length of the processed range is adjusted for the mutation. (The enumerator basically assumes any change in length occurs in the specified range.) For example, if block is called with a range starting at location N, and the block deletes all the characters in the supplied range, the next call will also pass N as the index of the range.

In other words, inside the closure/block you can delete or replace characters within the range you are given. The enumerator keeps track of where that range ends; once you have made your modification, it adjusts that position for the change in length, so the next iteration starts right after the modified text. So you don't have to collect all the ranges in an array and apply the changes afterwards in reverse order to keep the ranges valid. Don't bother with that; the method already does it for you.

I'm not a Swift developer; I'm more of an Objective-C one. So my Swift code may not follow all the "Swift rules" and may be a little ugly (optionals and unwrapping done badly, missing if let, etc.).

Here is my solution:

func attrStrSimpleTag() -> Void {

    let htmlStr = "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\"> <html> <head> <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"> <meta http-equiv=\"Content-Style-Type\" content=\"text/css\"> <title></title> <meta name=\"Generator\" content=\"Cocoa HTML Writer\"> <style type=\"text/css\"> p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px 'Some Font'} span.s1 {font-family: 'SomeFont-ItalicOrWhatever'; font-weight: normal; font-style: normal; font-size: 14.00pt} span.s2 {font-family: 'SomeFont-SemiboldItalic'; font-weight: bold; font-style: italic; font-size: 14.00pt} </style> </head> <body> <p class=\"p1\"><span class=\"s1\">So, </span><span class=\"s2\">here is</span><span class=\"s1\"> some</span> stuff</p> </body></html>"
    // Swift 4+ spellings: .documentType / .html and .font replace the old global constants
    let attr = try! NSMutableAttributedString(data: htmlStr.data(using: .utf8)!,
                                              options: [.documentType: NSAttributedString.DocumentType.html],
                                              documentAttributes: nil)
    print("Attr: \(attr)")
    attr.enumerateAttribute(.font, in: NSRange(location: 0, length: attr.length), options: []) { (value, range, stop) in
        if let font = value as? UIFont {
            print("font found:\(font)")
            let isBold = font.fontDescriptor.symbolicTraits.contains(.traitBold)
            let isItalic = font.fontDescriptor.symbolicTraits.contains(.traitItalic)
            let occurence = attr.attributedSubstring(from: range).string
            let replacement = self.formattedString(initialString: occurence, bold: isBold, italic: isItalic)
            attr.replaceCharacters(in: range, with: replacement)
        }
    }

    let taggedString = attr.string
    print("taggedString: \(taggedString)")

}

func formattedString(initialString:String, bold: Bool, italic: Bool) -> String {
    var retString = initialString
    if bold {
        retString = "<b>".appending(retString)
        retString.append("</b>")
    }
    if italic
    {
        retString = "<i>".appending(retString)
        retString.append("</i>")
    }

    return retString
}

Output (of the last print; the other two are just for debugging):

$> taggedString: So, <i><b>here is</b></i> some stuff

Edit: Objective-C version (quickly written, so it may have some issues).

-(void)attrStrSimpleTag
{
    NSString *htmlStr = @"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01//EN\" \"http://www.w3.org/TR/html4/strict.dtd\"> <html> <head> <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\"> <meta http-equiv=\"Content-Style-Type\" content=\"text/css\"> <title></title> <meta name=\"Generator\" content=\"Cocoa HTML Writer\"> <style type=\"text/css\"> p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px 'Some Font'} span.s1 {font-family: 'SomeFont-ItalicOrWhatever'; font-weight: normal; font-style: normal; font-size: 14.00pt} span.s2 {font-family: 'SomeFont-SemiboldItalic'; font-weight: bold; font-style: italic; font-size: 14.00pt} </style> </head> <body> <p class=\"p1\"><span class=\"s1\">So, </span><span class=\"s2\">here is</span><span class=\"s1\"> some</span> stuff</p> </body></html>";
    NSMutableAttributedString *attr = [[NSMutableAttributedString alloc] initWithData:[htmlStr dataUsingEncoding:NSUTF8StringEncoding]
                                                                              options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType}
                                                                   documentAttributes:nil
                                                                                error:nil];
    NSLog(@"Attr: %@", attr);

    [attr enumerateAttribute:NSFontAttributeName inRange:NSMakeRange(0, [attr length]) options:0 usingBlock:^(id  _Nullable value, NSRange range, BOOL * _Nonnull stop) {
        UIFont *font = (UIFont *)value;
        NSLog(@"Font found: %@", font);
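        // Compare the masked value against 0 instead of assigning a bitmask to a BOOL
        UIFontDescriptorSymbolicTraits traits = [[font fontDescriptor] symbolicTraits];
        BOOL isBold = (traits & UIFontDescriptorTraitBold) != 0;
        BOOL isItalic = (traits & UIFontDescriptorTraitItalic) != 0;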
        NSString *occurence = [[attr attributedSubstringFromRange:range] string];
        NSString *replacement = [self formattedStringWithString:occurence isBold:isBold andItalic:isItalic];
        [attr replaceCharactersInRange:range withString:replacement];
    }];

    NSString *taggedString = [attr string];
    NSLog(@"taggedString: %@", taggedString);
}


-(NSString *)formattedStringWithString:(NSString *)string isBold:(BOOL)isBold andItalic:(BOOL)isItalic
{
    NSString *retString = string;
    if (isBold)
    {
        retString = [NSString stringWithFormat:@"<b>%@</b>", retString];
    }
    if (isItalic)
    {
        retString = [NSString stringWithFormat:@"<i>%@</i>", retString];
    }
    return retString;
}

Edit Jan. 2020:
Updated the code for Swift 5, made it easier to modify, and added support for two more effects (underline/strikethrough).

// MARK: In one loop
extension NSMutableAttributedString {
    func htmlSimpleTagString() -> String {
        enumerateAttributes(in: fullRange(), options: []) { (attributes, range, pointeeStop) in
            let occurence = self.attributedSubstring(from: range).string
            var replacement: String = occurence
            if let font = attributes[.font] as? UIFont {
                replacement = self.font(initialString: replacement, fromFont: font)
            }
            if let underline = attributes[.underlineStyle] as? Int {
                replacement = self.underline(text: replacement, fromStyle: underline)
            }
            if let striked = attributes[.strikethroughStyle] as? Int {
                replacement = self.strikethrough(text: replacement, fromStyle: striked)
            }
            self.replaceCharacters(in: range, with: replacement)
        }
        return self.string
    }
}

// MARK: In multiple loops
extension NSMutableAttributedString {
    func htmlSimpleTagString(options: [NSAttributedString.Key]) -> String {
        if options.contains(.underlineStyle) {
            enumerateAttribute(.underlineStyle, in: fullRange(), options: []) { (value, range, pointeeStop) in
                let occurence = self.attributedSubstring(from: range).string
                guard let style = value as? Int else { return }
                // Swift 4.2+ renamed .styleSingle to .single
                if NSUnderlineStyle(rawValue: style) == NSUnderlineStyle.single {
                    let replacement = self.underline(text: occurence, fromStyle: style)
                    self.replaceCharacters(in: range, with: replacement)
                }
            }
        }
        if options.contains(.strikethroughStyle) {
            enumerateAttribute(.strikethroughStyle, in: fullRange(), options: []) { (value, range, pointeeStop) in
                let occurence = self.attributedSubstring(from: range).string
                guard let style = value as? Int else { return }
                let replacement = self.strikethrough(text: occurence, fromStyle: style)
                self.replaceCharacters(in: range, with: replacement)
            }
        }
        if options.contains(.font) {
            enumerateAttribute(.font, in: fullRange(), options: []) { (value, range, pointeeStop) in
                let occurence = self.attributedSubstring(from: range).string
                guard let font = value as? UIFont else { return }
                let replacement = self.font(initialString: occurence, fromFont: font)
                self.replaceCharacters(in: range, with: replacement)
            }
        }
        return self.string

    }
}

//MARK: Replacing
extension NSMutableAttributedString {

    func font(initialString: String, fromFont font: UIFont) -> String {
        let isBold = font.fontDescriptor.symbolicTraits.contains(.traitBold)
        let isItalic = font.fontDescriptor.symbolicTraits.contains(.traitItalic)
        var retString = initialString
        if isBold {
            retString = "<b>" + retString + "</b>"
        }
        if isItalic {
            retString = "<i>" + retString + "</i>"
        }
        return retString
    }

    func underline(text: String, fromStyle style: Int) -> String {
        return "<u>" + text + "</u>"
    }

    func strikethrough(text: String, fromStyle style: Int) -> String {
        return "<s>" + text + "</s>"
    }
}

//MARK: Utility
extension NSAttributedString {
    func fullRange() -> NSRange {
        return NSRange(location: 0, length: self.length)
    }
}

Simple HTML to test with mixed tags: "This is <i>ITALIC</i> with some <b>BOLD</b> <b><i>BOLDandITALIC</b></i> <b>BOLD<u>UNDERLINEandBOLD</b>RESTUNDERLINE</u> in it."

The solution offers two approaches: one does everything in a single loop, the other uses one loop per attribute. With mixed tags, though, the result of the second can look strange. Run both on the sample above to compare the output.
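
To see why the single-loop variant keeps tags well nested, here is a minimal sketch that leaves UIKit out. `StyledRun` and `taggedHTML(for:)` are hypothetical names, not part of the code above; each run stands in for one range reported by `enumerateAttributes`, within which every attribute is constant. Because each run is wrapped and closed on its own, tags never overlap:

```swift
// Hypothetical model of one attribute run; in the real code these flags
// would come from the font traits and the underline attribute.
struct StyledRun {
    var text: String
    var bold = false
    var italic = false
    var underline = false
}

// Wrap every run separately, always in the same order (b innermost,
// then i, then u), so each run closes all of its tags before the next opens.
func taggedHTML(for runs: [StyledRun]) -> String {
    return runs.map { run -> String in
        var html = run.text
        if run.bold { html = "<b>\(html)</b>" }
        if run.italic { html = "<i>\(html)</i>" }
        if run.underline { html = "<u>\(html)</u>" }
        return html
    }.joined()
}

// The overlapping bold/underline part of the test sample, split into runs.
let runs = [
    StyledRun(text: "BOLD", bold: true),
    StyledRun(text: "UNDERLINEandBOLD", bold: true, underline: true),
    StyledRun(text: "RESTUNDERLINE", underline: true),
]
let html = taggedHTML(for: runs)
print(html)
// <b>BOLD</b><u><b>UNDERLINEandBOLD</b></u><u>RESTUNDERLINE</u>
```

The multiple-loop variant instead wraps whole ranges one attribute at a time, so a later pass can end up wrapping text that already contains tags from an earlier pass, which is most likely where the odd nesting comes from.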

Larme answered Dec 31 '22 12:12


I have a good way to convert an NSAttributedString into a simple HTML string.

1) Take a UIWebView and a UITextView.

2) Load your attributed string into the web view.

[webView loadHTMLString:[yourAttributedString stringByReplacingOccurrencesOfString:@"\n" withString:@"<br/>"] baseURL:nil];

3) Read the HTML string back from the UIWebView.

NSString *simpleHtmlString = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];

Parth Patel answered Dec 31 '22 12:12