Reading the scary doc, I know that if I provide the wrong arguments to dangerouslySetInnerHTML(), my trousers are down for XSS. What must I do upstream of this function call to be sure that I can use it safely? Look for and strip <script> tags from user input? What else?
When to use dangerouslySetInnerHTML: a common case is populating a <div> element with content produced by a rich text editor. Imagine you have a webpage where people can submit comments, and you allow them to use a rich text editor.
Dynamically rendering an HTML string in React, even a benign one, requires the dangerouslySetInnerHTML prop. That is not a naming mistake: the prop is dangerous, and using it carelessly will create XSS vulnerabilities in your application.
Syntax and need for dangerouslySetInnerHTML: in vanilla JS, you insert HTML into a web page by assigning a string to an element's innerHTML property. React's equivalent is the dangerouslySetInnerHTML prop, which takes an object with an __html key. Why is it called dangerouslySetInnerHTML? The name is a deliberate warning: inserting markup you have not vetted can create XSS vulnerabilities in your application.
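To make the syntax concrete, here is a minimal sketch of the value the prop expects, next to an escaped version of the same string. The helper names `createMarkup` and `escapeHtml` are hypothetical, chosen for illustration; only the `{ __html: string }` shape comes from React itself.

```typescript
// Shape of the value React expects for the dangerouslySetInnerHTML prop.
type InnerHtml = { __html: string };

// Hypothetical helper: wraps a markup string in that shape.
function createMarkup(html: string): InnerHtml {
  return { __html: html };
}

// Hypothetical helper: escapes HTML metacharacters so the browser
// shows the string as literal text instead of parsing it as markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const comment = '<img src=x onerror="alert(1)">';

// Passed to the prop as-is, this would be parsed as live HTML (unsafe).
const raw = createMarkup(comment);

// Escaped first, the same string is rendered as inert text.
const inert = createMarkup(escapeHtml(comment));

// In JSX the value would be used like this:
//   <div dangerouslySetInnerHTML={raw} />
```

Escaping everything is the blunt option: it is safe, but it also throws away any formatting the user intended.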
If the application or site accepts input from multiple users, you should be concerned. There is one alternative to dangerouslySetInnerHTML: setting innerHTML on the underlying DOM element directly with vanilla JS. This carries the same XSS risk, though, because the browser parses the string as HTML either way.
CAVEAT: I am not a security expert; the following summarizes the best understanding I have accumulated as a working layman.
The best way to be sure your "dangerous" inner HTML is safe is to make sure you only ever set it to HTML that you have generated yourself. In other words, you never display any content that has come from an outside source. That probably sounds too strict, but there's a workaround: if you want to include "tainted" content in your dangerous HTML, you can parse the tainted content and re-generate it. The basic idea is that your parser only recognizes legitimate inputs, and ignores everything else. It then takes the parsed input, and generates safe outputs.
For example, let's say we have the following rules:
Notice you're not blacklisting things like script tags, because you might not know everything that needs to be blacklisted. Instead, you're whitelisting certain specific things that you know are safe, and ignoring everything else. Once you're done parsing the input, you've got a list of known-safe strings and styled strings, and it's relatively straightforward to generate safe HTML output with embedded tags for styling.
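As a rough sketch of the parse-and-regenerate idea, the code below assumes a hypothetical rule set: only the bare tags `b`, `i`, `em` and `strong` (no attributes) are recognized, and everything else is treated as plain text. Instead of silently dropping unrecognized input, this version escapes it so it shows up as literal text; dropping it would be equally valid. All names here (`parse`, `render`, `sanitize`, `ALLOWED_TAGS`) are illustrative, not part of any library.

```typescript
type Token =
  | { kind: "text"; value: string }
  | { kind: "open" | "close"; tag: string };

// Assumed whitelist for this sketch: simple styling tags only.
const ALLOWED_TAGS: readonly string[] = ["b", "i", "em", "strong"];

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// The parser only recognizes attribute-free whitelisted tags;
// anything else (including <script> or <b onclick=...>) stays text.
function parse(input: string): Token[] {
  const tokens: Token[] = [];
  const tagPattern = /<(\/?)([a-zA-Z]+)>/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = tagPattern.exec(input)) !== null) {
    const tag = m[2].toLowerCase();
    if (ALLOWED_TAGS.indexOf(tag) === -1) continue;
    if (m.index > last) {
      tokens.push({ kind: "text", value: input.slice(last, m.index) });
    }
    tokens.push({ kind: m[1] ? "close" : "open", tag });
    last = m.index + m[0].length;
  }
  if (last < input.length) {
    tokens.push({ kind: "text", value: input.slice(last) });
  }
  return tokens;
}

// The output is generated from the tokens, never copied from the input.
function render(tokens: Token[]): string {
  const stack: string[] = [];
  let out = "";
  for (const t of tokens) {
    if (t.kind === "text") {
      out += escapeHtml(t.value);
    } else if (t.kind === "open") {
      stack.push(t.tag);
      out += `<${t.tag}>`;
    } else if (stack[stack.length - 1] === t.tag) {
      stack.pop();
      out += `</${t.tag}>`;
    }
    // A closing tag that does not match the innermost open tag is dropped.
  }
  // Close anything the user left open.
  while (stack.length > 0) out += `</${stack.pop()}>`;
  return out;
}

function sanitize(input: string): string {
  return render(parse(input));
}
```

With these assumed rules, `sanitize("Hello <b>world</b><script>alert(1)</script>")` keeps the bold tag and turns the script tag into escaped text. The result is what you would pass as the `__html` value.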
Links and image tags are more difficult to handle safely, since any link/image could lead to malware, or to an innocuous-looking site that redirects to malware after a day or so. The best way I know of to be safe with images is to require them to be uploaded to a server equipped with virus scanners (which are not 100% foolproof either). For links, the best approach I can think of is to be sure that the actual link text is displayed along with the text you're linking. But I would still use the same approach: write a parser that knows how to parse safe URLs (for links or images), and does NOT know how to parse unsafe URLs, then regenerate the link/image from the parsed data. That's still a lot riskier than just displaying styled text, but if you need links/images, that's the best way I know of.
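The same approach for links might look like the sketch below. It assumes that only absolute `http:` and `https:` URLs without embedded credentials count as "safe to parse"; that policy, and the names `parseSafeUrl` and `renderLink`, are assumptions made for illustration. It relies on the standard `URL` class available in browsers and Node. Note that this only rejects dangerous URL schemes such as `javascript:`; it cannot tell you whether the destination site is trustworthy.

```typescript
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Returns a parsed URL only when it fits the assumed whitelist;
// anything the parser does not understand yields null.
function parseSafeUrl(raw: string): URL | null {
  let url: URL;
  try {
    url = new URL(raw.trim()); // throws on relative or malformed input
  } catch {
    return null;
  }
  if (url.protocol !== "https:" && url.protocol !== "http:") return null;
  if (url.username !== "" || url.password !== "") return null;
  return url;
}

// Regenerates the link from the parsed URL and shows the real
// destination next to the label, so the label cannot disguise it.
function renderLink(label: string, raw: string): string {
  const url = parseSafeUrl(raw);
  if (url === null) return escapeHtml(label); // fall back to plain text
  const href = escapeHtml(url.href);
  return `<a href="${href}" rel="noopener noreferrer nofollow">${escapeHtml(label)} (${href})</a>`;
}
```

A `javascript:alert(1)` URL parses successfully but fails the protocol check, so `renderLink` falls back to emitting just the escaped label.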