I develop websites for corporate clients, so I see characters like ®, ™, etc. a whole lot. Sometimes I paste in huge blocks of copy, which might even contain pretty quotes (“ ”) or other strange characters from word processors.
So, my question is this: Does anyone know of a vim plugin or script that can, in one fell swoop, convert all these characters to html entities?
I think this covers all the bases of the entities it would be nice to have: http://web.forret.com/tools/charmap.asp
So, the characters above would be replaced with &reg;, &trade;, &ldquo;, &rdquo;, etc.
I tried the htmlspecialchars vimball (http://www.vim.org/scripts/script.php?script_id=2377), but no dice. It only performs its replacement like the PHP htmlspecialchars function, replacing HTML-conflicting characters, and doesn't cover any additional special characters:
replace(/&/g, "&amp;").replace(/>/g, "&gt;").replace(/</g, "&lt;").replace(/"/g, "&quot;");
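To make the gap concrete, here is a minimal Python sketch of the same four-character replacement. Python is only my choice for illustration, and the function name escape_special_chars is hypothetical; this is not the vimball's actual code.

```python
def escape_special_chars(text: str) -> str:
    """Escape only the four HTML-conflicting characters, like the snippet above."""
    # The ampersand goes first so the entities inserted by the later
    # replacements are not escaped a second time.
    return (
        text.replace("&", "&amp;")
        .replace(">", "&gt;")
        .replace("<", "&lt;")
        .replace('"', "&quot;")
    )
```

Passing it a string such as Acme® & Co™ turns the ampersand into &amp; but leaves ® and ™ exactly as they were, which is the behaviour I'm trying to get past.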
I would recommend Tim Pope's unimpaired plugin. It provides commands to encode and decode html entities, using the mappings [x and ]x respectively.
Perl is better for this sort of thing. Paste your file into vim and run this:
:%!perl -p -i -e 'BEGIN { use Encode; } $_=Encode::decode_utf8($_) unless Encode::is_utf8($_); $_=Encode::encode("ascii", $_, Encode::FB_HTMLCREF);'
Or even better:
:%!perl -p -i -e 'BEGIN { use HTML::Entities; use Encode; } $_=Encode::decode_utf8($_) unless Encode::is_utf8($_); $_=Encode::encode("ascii", $_, sub{HTML::Entities::encode_entities(chr shift)});'
(HTML::Entities is a part of HTML::Parser on my system)
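If the Perl modules are not installed, roughly the same idea can be sketched in Python using only the standard library. This is an assumption-laden sketch, not an equivalent of the one-liners above: the function name encode_entities, the --filter flag and the file name entities.py are all hypothetical, and the input is assumed to be UTF-8. Like the second Perl command, it leaves ASCII alone (including &, < and >), uses a named entity where html.entities knows one, and falls back to a decimal reference otherwise.

```python
import sys
from html.entities import codepoint2name


def encode_entities(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric HTML entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII passes through untouched
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. U+00AE -> &reg;
        else:
            out.append(f"&#{cp};")  # no HTML 4 name known: decimal reference
    return "".join(out)


# Hypothetical "--filter" flag: only then read stdin and write stdout.
if __name__ == "__main__" and sys.argv[1:] == ["--filter"]:
    # Decode the raw bytes ourselves so the locale's encoding does not matter.
    data = sys.stdin.buffer.read().decode("utf-8")
    sys.stdout.write(encode_entities(data))
```

Saved as entities.py, it could be run from vim the same way as the Perl versions:

:%!python3 entities.py --filter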