Using encodeURI() vs. escape() for utf-8 strings in JavaScript

Tags: javascript, encode, escaping, unicode, utf-8

I am handling utf-8 strings in JavaScript and need to escape them.

Both escape() / unescape() and encodeURI() / decodeURI() work in my browser.

escape()

> var hello = "안녕하세요"
> var hello_escaped = escape(hello)
> hello_escaped
"%uC548%uB155%uD558%uC138%uC694"
> var hello_unescaped = unescape(hello_escaped)
> hello_unescaped
"안녕하세요"

encodeURI()

> var hello = "안녕하세요"
> var hello_encoded = encodeURI(hello)
> hello_encoded
"%EC%95%88%EB%85%95%ED%95%98%EC%84%B8%EC%9A%94"
> var hello_decoded = decodeURI(hello_encoded)
> hello_decoded
"안녕하세요"

However, Mozilla says that escape() is deprecated.

Although encodeURI() and decodeURI() work with the above utf-8 string, the docs (as well as the function names themselves) tell me that these methods are for URIs; I do not see utf-8 strings mentioned anywhere.

Simply put, is it okay to use encodeURI() and decodeURI() for utf-8 strings?

395

asked Jul 28 '14 19:07

SeanPlusPlus

2 Answers

Hi!

When it comes to escape and unescape, I live by two rules:

1. Avoid them when you easily can.
2. Otherwise, use them.

Avoiding them when you easily can:

As mentioned in the question, both escape and unescape have been deprecated. In general, one should avoid using deprecated functions.

So, if encodeURIComponent or encodeURI does the trick for you, you should use that instead of escape.
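As a minimal sketch of that advice (the variable names are illustrative only), the string from the question round-trips through encodeURIComponent() and decodeURIComponent(), and the two encoders differ only in how they treat URI delimiters:

```javascript
// Round-trip a non-ASCII string without escape()/unescape().
const hello = "안녕하세요";
const encoded = encodeURIComponent(hello);
console.log(encoded); // "%EC%95%88%EB%85%95%ED%95%98%EC%84%B8%EC%9A%94"

const decoded = decodeURIComponent(encoded);
console.log(decoded === hello); // true

// encodeURI() keeps delimiters such as = and &, encodeURIComponent() escapes them.
const pairs = "a=1&b=2";
const asUri = encodeURI(pairs);
const asComponent = encodeURIComponent(pairs);
console.log(asUri);       // "a=1&b=2"
console.log(asComponent); // "a%3D1%26b%3D2"
```

Both functions percent-encode the UTF-8 bytes of each character, which is why the output matches the encodeURI() result shown in the question.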

Using them when you can't easily avoid them:

Browsers will, as far as possible, strive to achieve backwards compatibility. All major browsers have already implemented escape and unescape; why would they un-implement them?

Browsers would have to redefine escape and unescape only if a new specification required them to do so. But wait! The people who write specifications are quite smart. They, too, are interested in not breaking backwards compatibility!

I realize that the above argument is weak. But trust me: when it comes to browsers, deprecated stuff works. This even includes deprecated HTML tags like <xmp> and <center>.

Using `escape` and `unescape`:

So naturally, the next question is, when would one use escape or unescape?

Recently, while working on CloudBrave, I had to deal with utf8, latin1, and conversions between the two.

After reading a bunch of blog posts, I realized how simple this was:

var utf8_to_latin1 = function (s) {
    return unescape(encodeURIComponent(s));
};
var latin1_to_utf8 = function (s) {
    return decodeURIComponent(escape(s));
};

Without escape and unescape, these inter-conversions are rather involved. By not avoiding escape and unescape, life becomes simpler.
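For comparison, here is a rough sketch of the same two conversions written without escape and unescape. It assumes an environment that provides the TextEncoder and TextDecoder globals (modern browsers and recent Node.js), and the function names are hypothetical, not from any library:

```javascript
// Sketch: string -> "binary string" whose char codes are the UTF-8 bytes.
function utf8ToByteString(s) {
  const bytes = new TextEncoder().encode(s); // Uint8Array of UTF-8 bytes
  let out = "";
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// Sketch: the reverse direction, treating each char code as one byte.
function byteStringToUtf8(b) {
  const bytes = Uint8Array.from(b, (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}

const sample = "안녕하세요";
const asBytes = utf8ToByteString(sample);
console.log(asBytes.length); // 15 (five characters, three UTF-8 bytes each)
console.log(byteStringToUtf8(asBytes) === sample); // true
```

It is noticeably longer than the two one-liners above, which is the trade-off being described.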

Hope this helps.

114

answered Sep 24 '22 07:09

Sumukh Barve

It is never okay to use encodeURI() or encodeURIComponent(). Let's try it out:

console.log(encodeURIComponent('@#*'));

Input: @#*. Output: %40%23*. So what exactly happened to the * character? Why wasn't it converted? Imagine this: you ask a user which file to delete, and their response is *. Server-side, you pass that through encodeURIComponent() and then run rm on the result. Bad news: because encodeURIComponent() leaves * untouched, you just ran rm * and deleted all the files.

Use fixedEncodeURI() when encoding a complete URL (i.e., all of example.com?arg=val), as defined and further explained in the MDN encodeURI() documentation:
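The * is not the only character that passes through. A quick check (a sketch; the sample punctuation string is just an illustrative selection) lists which of these characters encodeURIComponent() returns unchanged:

```javascript
// Filter a set of ASCII punctuation down to what encodeURIComponent() leaves as-is.
const punctuation = "!'()*-_.~@#$&+,/:;=?[]";
const untouched = [...punctuation]
  .filter((ch) => encodeURIComponent(ch) === ch)
  .join("");
console.log(untouched); // "!'()*-_.~"
```

Of these, ! ' ( ) and * are the ones the fixed version further down takes care of.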

function fixedEncodeURI(str) {
    return encodeURI(str).replace(/%5B/g, '[').replace(/%5D/g, ']');
}

Or, you may need to use fixedEncodeURIComponent() when encoding part of a URL (i.e., the arg or the val in example.com?arg=val), as defined and further explained in the MDN encodeURIComponent() documentation:

function fixedEncodeURIComponent(str) {
    return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
        return '%' + c.charCodeAt(0).toString(16);
    });
}

If you are unable to distinguish them based on the above description, I always like to simplify it with:

fixedEncodeURI(): will not encode +@?=:#;,$& to their percent-encoded equivalents (as & and + are common URL operators).
fixedEncodeURIComponent(): will encode +@?=:#;,$& to their percent-encoded equivalents.
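The difference can be checked directly. The sketch below repeats the two helper definitions so that it runs on its own; the sample strings are made up for illustration:

```javascript
function fixedEncodeURI(str) {
  return encodeURI(str).replace(/%5B/g, "[").replace(/%5D/g, "]");
}

function fixedEncodeURIComponent(str) {
  return encodeURIComponent(str).replace(/[!'()*]/g, function (c) {
    return "%" + c.charCodeAt(0).toString(16);
  });
}

const operators = "+@?=:#;,$&";
console.log(fixedEncodeURI(operators));          // "+@?=:#;,$&"
console.log(fixedEncodeURIComponent(operators)); // "%2B%40%3F%3D%3A%23%3B%2C%24%26"

// A whole URL keeps its structure; a single component is escaped entirely.
const wholeUrl = fixedEncodeURI("http://example.com/?arg=a b&x=[1]");
const onePart = fixedEncodeURIComponent("a b&c=(*)");
console.log(wholeUrl); // "http://example.com/?arg=a%20b&x=[1]"
console.log(onePart);  // "a%20b%26c%3D%28%2a%29"
```

Note that toString(16) yields lowercase hex digits (%2a rather than %2A); percent-encoded octets are case-insensitive, so both forms decode to the same character.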

answered Sep 21 '22 07:09

HoldOffHunger
