Right now we're using the sanitize gem: https://github.com/rgrove/sanitize
The problem is that if you enter "hello & world", sanitize saves it in the DB as:
hello &amp; world
How can you whitelist the &? We want sanitize to remove all possibly malicious HTML and JS/script tags, but we're OK with allowing the ampersand.
Ideas? Thanks
Sanitize always escapes special characters in its output as HTML entities so that the result is valid HTML/XHTML.
The best way I can determine is to filter the output:
Sanitize.fragment("hello & world").gsub('&amp;','&') #=> "hello & world"
Use the strip_tags() method instead.
http://api.rubyonrails.org/classes/ActionView/Helpers/SanitizeHelper.html#method-i-sanitize
UnixMonkey's answer is what we ended up doing.
def remove_markup(html_str)
  # Strip the markup first, then turn selected entities back into plain characters.
  marked_up = Sanitize.clean html_str
  ESCAPE_SEQUENCES.each do |esc_seq, ascii_seq|
    marked_up = marked_up.gsub('&' + esc_seq + ';', ascii_seq.chr)
  end
  marked_up
end
Where ESCAPE_SEQUENCES was a list of pairs (entity name, character code) for the characters we didn't want escaped.
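For illustration, ESCAPE_SEQUENCES might look something like the sketch below. The entries shown are assumptions, not the list we actually used, and unescape_selected is a hypothetical helper name. It takes text that has already been sanitized and does the replacement in a single pass, so something like "&amp;quot;" is only decoded once.

```ruby
# Assumed shape: entity name => character code. Only these entities get decoded.
ESCAPE_SEQUENCES = { 'amp' => 38, 'quot' => 34 }.freeze

# Matches "&amp;" or "&quot;" and captures the entity name.
ENTITY_PATTERN = /&(#{ESCAPE_SEQUENCES.keys.join('|')});/

# Hypothetical helper: expects text that has ALREADY been sanitized.
def unescape_selected(sanitized)
  sanitized.gsub(ENTITY_PATTERN) { ESCAPE_SEQUENCES[Regexp.last_match(1)].chr }
end

puts unescape_selected('hello &amp; world &lt;b&gt;')
```

Entities that are not in the list, such as &lt; and &gt;, stay escaped.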
As of Rails 4.2, #strip_tags does not unencode HTML special chars:
strip_tags("fun & co")
=> "fun &amp; co"
Otherwise you'd get the following:
strip_tags("&lt;script&gt;")
=> "<script>"
If you only want the ampersand, I'd suggest filtering the output as @Unixmonkey suggested and limiting the replacement to &amp; only:
strip_tags("<bold>Hello & World</bold>").gsub(/&amp;/, "&")
=> "Hello & World"
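To see why it's worth decoding only the ampersand, here is a small sketch that doesn't need Rails. It uses CGI.unescapeHTML from Ruby's standard library as a stand-in for "decode everything"; the sample string is made up and represents text that has already been through a sanitizer.

```ruby
require 'cgi'

# Made-up sample of already-sanitized text.
sanitized = '&lt;script&gt; fun &amp; co'

# Decoding every entity turns the escaped markup back into a real tag.
decode_all = CGI.unescapeHTML(sanitized)

# Decoding only the ampersand leaves the angle brackets escaped.
decode_amp_only = sanitized.gsub('&amp;', '&')

puts decode_all
puts decode_amp_only
```

The first result contains a literal <script> again, while the second keeps it as &lt;script&gt;.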