Insert HTML into user string and render it in React (and avoid XSS threat)

Question

A user supplies a string to our React app, and it is displayed to other users. I want to search for certain characters and replace them with some HTML. For example, if I searched for the word "special", I would turn it into:

My <span class="special-formatting">special</span> word in a user string

Previously I performed this replacement and then inserted the result into the DOM with dangerouslySetInnerHTML. This, of course, means users can type whatever HTML/JavaScript they please right into the app and have it rendered for everyone to see.

I tried escaping the HTML characters to their entities, but dangerouslySetInnerHTML appears to render the entities as the characters they stand for rather than as the literal entity text. (EDIT: see below, this was the actual solution)

Is there any way to convert their message to a pure string, still preserving the display of those special characters, while also inserting my own HTML into the string? I'd like to avoid running a script after each string is inserted into the DOM.

Here's some more info regarding the current flow. All examples are trimmed down to show only the relevant code.

The user text is submitted to the database with this function:

handleSubmit(event) {
    event.preventDefault();

    var messageText = this.state.messageValue;

    // The bold font is missing some common characters; this is a fake way of making the normal font look bold
    if (this.state.bold == true) {
        messageText = messageText.replace(/\'/g, "<span class='bold-apostrophe'>'</span>");
        messageText = messageText.replace(/\"/g, "<span class='bold-quote'>&quot;</span>");
        messageText = messageText.replace(/\?/g, "<span class='bold-question'>?</span>");
        messageText = messageText.replace(/\*/g, "<span class='bold-asterisk'>*</span>");
        messageText = messageText.replace(/\+/g, "<span class='bold-plus'>+</span>");
        messageText = messageText.replace(/\./g, "<span class='bold-period'>.</span>");
        messageText = messageText.replace(/\,/g, "<span class='bold-comma'>,</span>");
    }

    Messages.insert({
        text: messageText,
        createdAt: new Date(),
        userId: user._id,
        bold: this.state.bold,
    });
}

So the replacements themselves work without issue. However, at this point the messageText string could still contain undesired, user-supplied HTML.

Then, our main app with the message list tries to render all the user messages:

render() {
    return (
        <div ref="messagesList">
            {this.renderMessages()}
        </div>
    );
}

renderMessages() {
    // Copy the array before reversing so the messages prop is not mutated
    return [].concat(this.props.messages).reverse().map((message) => {
        return <Message
            key={message._id}
            message={message} />;
    });
}

Message.jsx is where I apply the final touches to the message string (certain changes I don't want saved into the database of messages) and insert it into an element to return:

export default class Message extends React.Component {
    render() {

        var processedMessageText = this.props.message.text;

        //another find and replace to insert images for :image_name: strings, similar to how Discord inputs its emoji
        processedMessageText = processedMessageText.replace(/:([\w]+):/g, function (text) {
            text = text.replace(/:/g, "");
            if (text.indexOf("_s") !== -1) {
                text = text.replace(/_s/g, "");
                text = "<img class='small-smiley' src='/smileys/small/" + text + ".png'>";
                return text;
            }
            else {
                text = "<img class='smiley' src='/smileys/" + text + ".png'>";
                return text;
            }
        });

        return (
            <div>
                <div className='username'>{this.props.message.username}: </div>
                <div className='text' dangerouslySetInnerHTML={{ __html: processedMessageText }}></div>
            </div>
        );
    }
}

So again, if the user includes malicious HTML in their input string, it will travel through all of this and get output to the message list, which is real bad. I'm hoping there's some way I can perform these desired HTML insertions to their string, while also not rendering the HTML that they potentially input as actual HTML. I would also still like to show characters commonly used in HTML, like angle brackets (<>), so I want to avoid outright stripping their input string of common HTML characters.

EDIT: Since the accepted answer doesn't have much detail, I'll post what I ended up doing here. I HTML-encoded the characters suggested by OWASP before adding my own HTML and rendering the result as an HTML element's content. I wanted to avoid using another library, so I just did this:

messageText = messageText.replace(/\&/g, "&amp;");
messageText = messageText.replace(/</g, "&lt;");
messageText = messageText.replace(/>/g, "&gt;");
messageText = messageText.replace(/\//g, "&#x2F;");
messageText = messageText.replace(/\'/g, "&#x27;");
messageText = messageText.replace(/\"/g, "&quot;");

After doing so I was no longer able to insert anything malicious, and tested using various test strings from OWASP without issue.
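The six chained replacements above depend on the ampersand being handled first; otherwise the "&" in entities produced by earlier steps would be escaped a second time. As a rough sketch of one way to sidestep the ordering question, the same six characters can be escaped in a single pass. The names escapeHtml and HTML_ESCAPES are hypothetical and not part of the original code.

```javascript
// Hypothetical helper (not from the original post): escapes the same six
// characters as the chained replacements above, but in one pass, so no
// replacement can re-escape the output of another.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  "/": "&#x2F;",
  "'": "&#x27;",
  '"': "&quot;",
};

function escapeHtml(text) {
  // Each matched character is looked up in the table exactly once.
  return String(text).replace(/[&<>\/'"]/g, (ch) => HTML_ESCAPES[ch]);
}
```

One thing to keep in mind with either version: once apostrophes and double quotes have been turned into entities, any later search/replace that looks for the raw ' or " characters (such as the bold punctuation replacements) will no longer find them and would need to match the entities instead.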

ebessa · Accepted Answer

The problem began when you injected HTML into the user's input text before saving it to the database. That makes things harder, though not by much, because now you have to sanitize it.

As a remedy, you can use dompurify or sanitize-html to remove any HTML except the HTML you've injected. Here's an example using dompurify:

import DOMPurify from "dompurify";

const dangerousString =
"<img onError='alert(\"h4ck3r\")' src='will throw error' /><span class='bold-apostrophe'>'</span>";

<div
  dangerouslySetInnerHTML={{
    __html: DOMPurify.sanitize(dangerousString, {
      ALLOWED_TAGS: ["span"],
      ALLOWED_ATTR: ["class"]
    })
  }}
/>

Keep in mind that sanitizer libraries need to be updated as frequently as possible, as hackers are constantly finding creative ways to bypass them.
The previous statement implies that you may still get XSS'ed. The only way to avoid it is to stop tampering with strings by adding HTML before you save them to the database, so you can use a solution like the one presented by Ferrybig to add special formatting on the fly instead of using dangerouslySetInnerHTML.
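To illustrate what formatting "on the fly" could look like, here is a minimal sketch that stores the raw user text and, at render time, splits it into plain-text and smiley segments. This is not Ferrybig's code; the function name tokenizeMessage and the segment shape are assumptions made for illustration, and the "_s" small-smiley variant from the question is left out for brevity.

```javascript
// Hypothetical helper: splits a raw message into plain-text and smiley
// segments so a component can render them as children instead of
// building an HTML string.
function tokenizeMessage(text) {
  const pattern = /:(\w+):/g; // same :image_name: pattern as in the question
  const segments = [];
  let lastIndex = 0;
  let match;

  while ((match = pattern.exec(text)) !== null) {
    // Keep any plain text that precedes this smiley.
    if (match.index > lastIndex) {
      segments.push({ type: "text", value: text.slice(lastIndex, match.index) });
    }
    segments.push({ type: "smiley", name: match[1] });
    lastIndex = pattern.lastIndex;
  }

  // Keep whatever plain text follows the last smiley.
  if (lastIndex < text.length) {
    segments.push({ type: "text", value: text.slice(lastIndex) });
  }
  return segments;
}
```

In Message.jsx, the segments could then be mapped to children: a "text" segment is passed as an ordinary string, which React escapes when rendering, and a "smiley" segment becomes an img element whose src is built from the name. Because no user text is ever parsed as HTML, dangerouslySetInnerHTML is not needed on this path.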

Nicholas Carey · Answer

Couldn't you just:

1. HTML-encode the tainted string from the user.
2. Do your search/replace and insert your HTML.
3. Then do the dangerouslySetInnerHTML().

That should safely escape whatever the user entered and leave your inserted HTML element alone, no?
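The order of those steps can be sketched roughly as follows, using the smiley replacement from the question as the inserted HTML. The function name renderMessageHtml is hypothetical, and the escaping here covers only a minimal set of characters; treat it as an illustration of the ordering rather than a vetted sanitizer.

```javascript
// Hypothetical sketch of the escape-first ordering described above.
function renderMessageHtml(userText) {
  // Step 1: HTML-encode the tainted user string ("&" first, so entities
  // created by the later replacements are not escaped again).
  const safeText = userText
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#x27;");

  // Step 2: insert trusted markup. The name is limited to \w characters,
  // so it cannot break out of the src attribute.
  return safeText.replace(/:(\w+):/g, (match, name) =>
    "<img class='smiley' src='/smileys/" + name + ".png'>");
}
```

Step 3 is then to pass the returned string as the __html value for dangerouslySetInnerHTML. Any markup the user typed arrives as entities and is displayed as text, while the img tags added in step 2 are rendered as elements.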
