Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this %2B string being urldecoded?

[This may not be precisely a programming question, but it's a puzzle that may best be answered by programmers. I tried it first on the Pro Webmasters site, to overwhelming silence]

We have an email address verification process on our website. The site first generates an appropriate key as a string

mykey

It then encodes that key as a bunch of bytes

&$dac~ʌ����!

It then base64 encodes that bunch of bytes

JiRkYWN+yoyIhIQ==

Since this key is going to be given as a querystring value of a URL that is to be placed in an HTML email, we need to first URLEncode it then HTMLEncode the result, giving us (there's no effect of HTMLEncoding in the example case, but I can't be bothered to rework the example)

JiRkYWN%2ByoyIhIQ%3D%3D

This is then embedded in HTML that is sent as part of an email, something like:

click <a href="http://myapp/verify?key=JiRkYWN%2ByoyIhIQ%3D%3D">here</a>. 
Or paste <b>http://myapp/verify?key=JiRkYWN%2ByoyIhIQ%3D%3D</b> into your browser.

When the receiving user clicks on the link, the site receives the request, extracts the value of the querystring 'key' parameter, base64 decodes it, decrypts it, and does the appropriate thing in terms of the site logic.

However on occasion we have users who report that their clicking is ineffective. One such user forwarded us the email he had been sent, and on inspection the HTML had been transformed into (to put it in terms of the example above)

click <a href="http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D">here</a>
Or paste <b>http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D</b> into your browser.

That is, the %2B string - but none of the other percentage encoded strings - had been converted into a plus. (It's definitely leaving us with the right values - I've looked at the appropriate SMTP logs).

key=JiRkYWN%2ByoyIhIQ%3D%3D
key=JiRkYWN+yoyIhIQ%3D%3D

So I think that there are a couple of possibilities:

  1. There's something I'm doing that's stupid, that I can't see, or

  2. Some mail clients convert %2b strings to plus signs, perhaps to try to cope with the problem of people mistakenly URLEncoding plus signs

In case of 1 - what is it? In case of 2 - is there a standard, known way of dealing with this kind of scenario?

Many thanks for any help

like image 825
Yellowfog Avatar asked Sep 06 '13 08:09

Yellowfog


2 Answers

The problem lies at this step

on inspection the HTML had been transformed into (to put it in terms of the example above)

click <a href="http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D">here</a>
Or paste <b>http://myapp/verify?key=JiRkYWN+yoyIhIQ%3D%3D</b> into
your browser.

That is, the %2B string - but none of the other percentage encoded strings - had been converted into a plus

Your application at "the other end" must be missing a step of unescaping. Regardless of if there is a %2B or a + a function like perls uri_unescape returns consistent answers

DB<9> use URI::Escape;
DB<10> x uri_unescape("JiRkYWN+yoyIhIQ%3D%3D")
0  'JiRkYWN+yoyIhIQ=='
DB<11> x uri_unescape("JiRkYWN%2ByoyIhIQ%3D%3D")
0  'JiRkYWN+yoyIhIQ=='

Here is what should be happening. All I'm showing are the steps. I'm using perl in a debugger. Step 54 encodes the string to base64. Step 55 shows how the base64 encoded string could be made into a uri escaped parameter. Steps 56 and 57 are what the client end should be doing to decode.

One possible work around is to ensure that your base64 "key" does not contain any plus signs!

  DB<53> $key="AB~"
  DB<54> x encode_base64($key)
0  'QUJ+
'
  DB<55> x uri_escape('QUJ+') 
0  'QUJ%2B'
  DB<56> x uri_unescape('QUJ%2B')
0  'QUJ+'
  DB<57> $result=decode_base64('QUJ+')
  DB<58> x $result
0  'AB~'
like image 155
Vorsprung Avatar answered Oct 29 '22 05:10

Vorsprung


What may be happening here is that the URLDecode is turning the %2b into a +, which is being interpreted as a space character in the URL. I was able to overcome a similar problem by first urldecoding the string, then using a replace function to replace spaces in the decoded string with + characters, and then decrypting the "fixed" string.

like image 31
rcoopman Avatar answered Oct 29 '22 05:10

rcoopman