So I'm trying to use a Rails URL helper (page_url) to create URLs that contain special characters, including ampersands. Most cases work like you'd expect them to:
(rdb:1) page_url('foo', :host => 'host')
"http://host/pages/foo"
(rdb:1) page_url('foo_%_bar', :host => 'host')
"http://host/pages/foo_%25_bar"
But for some odd reason, ampersands are not escaped:
(rdb:1) page_url('foo_&_bar', :host => 'host')
"http://host/pages/foo_&_bar"
And if I pre-escape them, they get corrupted:
(rdb:1) page_url('foo_%26_bar', :host => 'host')
"http://host/pages/foo_%2526_bar"
CGI::escape, on the other hand, escapes them fine:
(rdb:1) CGI::escape('foo_&_bar')
"foo_%26_bar"
What's going on, and how do I work around this? (With something nicer than gsub('&', '%26'), that is.)
I can't tell you a nicer way to deal with it - but I can explain why it's happening.
Ampersands are not invalid characters for a URL. Otherwise you'd have problems with: "http://host/pages/foo?bar=baz&style=foo_style" or whatever.
Edit: Digging deeper into the source code, it looks like Rails uses CGI.escape only on parameters.
The URL-generating helpers use url_for under the covers, which eventually calls http://apidock.com/rails/ActionController/Routing/Route/generate. That in turn calls code deep in the private methods of the source, but eventually ends up calling URI.escape (first look in actionpack/lib/action_controller/routing/route.rb, then in actionpack/lib/action_controller/routing/segments.rb).
The end result is that on the URL path itself, Rails uses URI.escape, which notably does not escape ampersands at all:
>> CGI.escape('/my_foo_&_bar')
=> "%2Fmy_foo_%26_bar"
>> URI.escape('/my_foo_&_bar')
=> "/my_foo_&_bar"
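To make the difference concrete, here is a rough sketch of the two escaping policies. The escape_path_segment helper and the PATH_SAFE pattern are made up for illustration and only approximate what a path escaper does (they are not the Rails source): characters that RFC 3986 allows in a path segment, which includes the ampersand, are left alone, while CGI.escape treats the string as a query value and encodes the ampersand.

```ruby
require 'cgi'

# Hypothetical approximation of a path-segment escaper (not Rails' code):
# keep unreserved characters, sub-delimiters (including "&"), ":" and "@",
# and percent-encode everything else byte by byte.
PATH_SAFE = /[^A-Za-z0-9\-._~!$&'()*+,;=:@]/

def escape_path_segment(segment)
  segment.gsub(PATH_SAFE) do |ch|
    ch.bytes.map { |b| format('%%%02X', b) }.join
  end
end

escape_path_segment('foo_&_bar')  # => "foo_&_bar"   (ampersand is legal in a path)
escape_path_segment('foo_%_bar')  # => "foo_%25_bar" (a bare percent sign is not)
CGI.escape('foo_&_bar')           # => "foo_%26_bar" (query-value escaping)
```

This also shows why pre-escaping backfires: the percent sign in %26 is itself escaped again to %2526.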
There's currently nothing you can do about this without submitting an actual feature request to the Rails team.
...unless you have the option of not using ampersands in your URLs. You can always gsub them out yourself for all URLs:
# Replace every ampersand with an underscore before the URL is generated.
def my_clean_url(the_url)
  the_url.gsub(/&/, '_')
end
>> my_clean_url('/my_foo_&_bar')
=> "/my_foo___bar"
page_url(my_clean_url('/my_foo_&_bar'))
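If the ampersand has to survive rather than be replaced, one possible variation is to escape it after the URL has been generated, so the helper never sees a percent sign it could double-escape. This is only a sketch: escape_path_ampersands is a hypothetical name, and it assumes the generated URL parses with Ruby's standard URI library. It touches only the path, so ampersands separating query parameters stay intact.

```ruby
require 'uri'

# Hypothetical post-processing step: percent-encode ampersands in the path
# only, leaving the query string (where "&" is a separator) untouched.
def escape_path_ampersands(url)
  uri = URI.parse(url)
  uri.path = uri.path.gsub('&', '%26')
  uri.to_s
end

escape_path_ampersands('http://host/pages/foo_&_bar?a=1&b=2')
# => "http://host/pages/foo_%26_bar?a=1&b=2"
```

In the example above it would wrap the helper call, e.g. escape_path_ampersands(page_url('foo_&_bar', :host => 'host')).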
For all those who are trying to just encode anything other than a-z, A-Z, 0-9 and underscore:
URI.encode(string, /\W/)
Say you have some content which may contain e.g. ampersands and you want to use this content as the body parameter for a mailto link: without /\W/, the ampersand (which is a safe URI character) would not be encoded and would therefore partially break the link.
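A small sketch of the mailto case. Note that URI.encode was removed in Ruby 3.0, so this example swaps in ERB::Util.url_encode from the standard library, which likewise encodes the ampersand and writes spaces as %20. Its safe set is slightly different from /\W/ (it also leaves "-", "." and "~" alone). The mailto_link helper and the address are made up for illustration.

```ruby
require 'erb'

# Hypothetical helper: build a mailto link whose subject and body are fully
# percent-encoded, so an "&" inside the body cannot be mistaken for a
# parameter separator.
def mailto_link(address, subject, body)
  query = "subject=#{ERB::Util.url_encode(subject)}" \
          "&body=#{ERB::Util.url_encode(body)}"
  "mailto:#{address}?#{query}"
end

mailto_link('someone@example.com', 'Hi', 'Tom & Jerry')
# => "mailto:someone@example.com?subject=Hi&body=Tom%20%26%20Jerry"
```

Without the encoding, everything after "Tom " would be read as a separate (and meaningless) parameter.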