I'm using Rails 3.0.3 with REE (Ruby 1.8.7) and gem 'mysql2', '0.2.6'.
My project has a search feature that works over GET: people can either type the URL directly or fill in a form, which then generates the URL.
Example:
I want to search for:
origin city: "Århus, Denmark" and destination city: "Asunción, Paraguay"
Both contain a special character, "Å" and "ó", so this URL is generated when someone clicks the search button:
?&origin=%C5rhus%2C%20Denmark&destination=Asunci%F3n%2C%20Paraguay
Problem:
When I search for those cities, the parameters are not unescaped the way I want (I tried CGI, URI, and even some gems).
In the console, I can see that ActiveRecord received the query like this:
Parameters: {"destination"=>"Asunci�n, Paraguay", "origin"=>"�rhus, Denmark", "sort"=>"newest"}
City Load (0.1ms) SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = '�rhus') ORDER BY cities.name ASC
City Load (6.8ms) SELECT `cities`.* FROM `cities` WHERE (`cities`.`name` = 'Asunci�n, Paraguay') ORDER BY cities.name ASC
Conclusion: the cities can't be found :(
But I found an interesting thing:
When I introduce an error in the file associated with this function, the error page shows this:
Request
Parameters:
{"destination"=>"Asunción, Paraguay", "origin"=>"Århus, Denmark", "sort"=>"newest"}
it's a valid one!
Question:
Do you guys have an idea how to solve this? Thanks in advance :)
You're right, it looks like you have an encoding problem somewhere. The 0xC5 byte is "Å" in ISO-8859-1 (AKA Latin-1); in UTF-8 it would appear as %C3%85 in the URL.
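To see the two encodings side by side, here is a minimal Ruby sketch that prints the percent-escaped bytes of "Å" in each one. It assumes Ruby 1.9 or later (where strings carry an encoding), and percent_bytes is a hypothetical helper name made up for this illustration, not a library function. It escapes every byte, so it is only meant for inspecting single characters.

```ruby
# Hypothetical helper: transcode +str+ to +encoding+ and
# percent-escape every resulting byte.
def percent_bytes(str, encoding)
  str.encode(encoding).bytes.map { |b| format('%%%02X', b) }.join
end

a_ring = "\u00C5" # the character "Å"

puts percent_bytes(a_ring, 'iso-8859-1') # %C5
puts percent_bytes(a_ring, 'utf-8')      # %C3%85
```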
I suspect that you're using JavaScript on the client side and that your JavaScript is using the old escape function to build the URL. escape has some issues with non-ASCII characters. If this is the case, then you should upgrade your JavaScript to use encodeURIComponent instead. Have a look at this little demo and you'll see what I'm talking about:
http://jsfiddle.net/ambiguous/U5A3k/
If you can't change the client-side script then you can do it the hard way in Ruby using force_encoding and encode:
>> s = CGI.unescape('%C5rhus%2C%20Denmark')
=> "\xC5rhus, Denmark"
>> s.encoding
=> #<Encoding:UTF-8>
>> s.force_encoding('iso-8859-1')
=> "\xC5rhus, Denmark"
>> s.encoding
=> #<Encoding:ISO-8859-1>
>> s.encode!('utf-8')
=> "Århus, Denmark"
>> s.encoding
=> #<Encoding:UTF-8>
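As an alternative sketch, URI.decode_www_form_component from the standard library accepts the source encoding as a second argument, so the unescape and relabel steps collapse into one call. This assumes Ruby 1.9.2 or later, and latin1_query_to_utf8 is a made-up name for illustration. Note that this method also turns "+" into a space.

```ruby
require 'uri'

# Hypothetical helper: unescape a percent-encoded value whose bytes
# are assumed to be ISO-8859-1, then transcode the result to UTF-8.
def latin1_query_to_utf8(escaped)
  URI.decode_www_form_component(escaped, Encoding::ISO_8859_1).encode('utf-8')
end

puts latin1_query_to_utf8('%C5rhus%2C%20Denmark')
puts latin1_query_to_utf8('Asunci%F3n%2C%20Paraguay')
```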
You should get something like "\xC5rhus, Denmark" from params, and you could unmangle that with:
s = params[:whatever].force_encoding('iso-8859-1').encode('utf-8')
Dealing with this on the server side should be a last resort, though: if your client-side code is sending back incorrectly encoded data, you'll be left with a pile of guesswork on the server to figure out which encoding was actually used to build the URL.
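If the server really does have to guess, one common heuristic is to keep the value when it is already valid UTF-8 and only fall back to Latin-1 otherwise. The sketch below assumes Ruby 1.9 or later (force_encoding does not exist on 1.8.7) and assumes Latin-1 is the only other encoding clients send; normalize_to_utf8 is a hypothetical helper, not part of Rails.

```ruby
# Hypothetical helper: return +raw+ as UTF-8, guessing between
# UTF-8 and ISO-8859-1. Works on a copy so the caller's string
# is left untouched.
def normalize_to_utf8(raw)
  s = raw.dup.force_encoding('utf-8')
  return s if s.valid_encoding?

  # Assumption: anything that is not valid UTF-8 is Latin-1.
  s.force_encoding('iso-8859-1').encode('utf-8')
end

puts normalize_to_utf8("\xC5rhus, Denmark")
puts normalize_to_utf8("Asunci\u00F3n, Paraguay")
```

The guess can still be wrong: some Latin-1 byte sequences happen to be valid UTF-8 too, which is why fixing the client remains the better option.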