How can I make Builder not encode 'śćż' and other such characters? I want 'całość' to be printed literally in the XML document. Example:
xml.instruct! :xml, :version => '1.0', :encoding => 'utf-8'
xml.Trader( :'xmlns:xsi' => "http://www.w3.org/2001/XMLSchema-instance",
            :'xmlns:xsd' => "http://www.w3.org/2001/XMLSchema") do
  xml.Informacje do
    xml.RodzajPaczki 'całość'
    xml.Program 'mine'
    xml.WersjaProgramu '1.0'
  end
end
Output:
<?xml version="1.0" encoding="utf-8"?>
<Trader xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Informacje>
<RodzajPaczki>ca&#322;o&#347;&#263;</RodzajPaczki>
<Program>mine</Program>
<WersjaProgramu>1.0</WersjaProgramu>
</Informacje>
</Trader>
ca&#322;o&#347;&#263; should be całość.
I saw a pseudo-solution like xml.RodzajPaczki {|t| t << 'całość' }, but it does not work correctly: it outdents 'całość' to the left side of the document.
Here is what is happening. By default, Builder escapes non-ASCII characters such as the ones in całość. You also mentioned one possible way to work around it:
xml.RodzajPaczki {|t| t << 'całość' }
Unfortunately, when you pass a block to the RodzajPaczki element, Builder assumes that there will be some inner XML, so it adds a newline and applies the indent. In our case there is only inner text and no XML, so we get unsightly output like:
<RodzajPaczki>
całość </RodzajPaczki>
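To make the default escaping concrete, here is a rough sketch of how non-ASCII characters end up as numeric character references. The helper name numeric_escape is hypothetical and this is not Builder's actual implementation; it only mimics the visible effect on text like całość.

```ruby
# Hypothetical helper (not part of Builder): replace every non-ASCII
# character with a decimal numeric character reference, leaving ASCII as is.
def numeric_escape(text)
  text.each_char.map { |ch|
    ch.ord > 127 ? format('&#%d;', ch.ord) : ch
  }.join
end

puts numeric_escape('całość')
# prints: ca&#322;o&#347;&#263;
```

Both forms are equivalent to an XML parser; the difference only matters when the document has to be readable as raw text.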
There is an easy way and a harder way to fix this. First the easy way.
Configure Indent To Be Zero
Construct the builder with :indent => 0 (this is also the default when :indent is omitted). Then the fix from above, xml.RodzajPaczki {|t| t << 'całość' }, works as expected, but the output will not be pretty-printed; it will in fact be all on one line:
<?xml version="1.0" encoding="UTF-8"?><Trader xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><Informacje><RodzajPaczki>całość</RodzajPaczki><Program>mine</Program><WersjaProgramu>1.0</WersjaProgramu></Informacje></Trader>
You can run this through an external pretty printer if you want it nicely formatted.
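One possible pretty printer is REXML, which ships with Ruby (as a bundled gem in recent versions). The sketch below assumes REXML is available; the helper name pretty_xml and the shortened sample document are made up for illustration. Setting compact keeps text-only elements on a single line, which avoids the stray whitespace problem described above.

```ruby
require 'rexml/document'

# Hypothetical helper: re-indent a single-line XML string with REXML.
def pretty_xml(xml, indent = 2)
  doc = REXML::Document.new(xml)
  formatter = REXML::Formatters::Pretty.new(indent)
  formatter.compact = true # keep <tag>text</tag> on one line
  out = +''
  formatter.write(doc, out)
  out
end

# Shortened stand-in for the single-line output shown above.
single_line = '<?xml version="1.0" encoding="UTF-8"?><Trader><Informacje>' \
              '<RodzajPaczki>całość</RodzajPaczki><Program>mine</Program>' \
              '</Informacje></Trader>'

puts pretty_xml(single_line)
```

Note that REXML may format the XML declaration slightly differently from Builder (for example, in its choice of quote characters), so compare the result against whatever consumes the file.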
If you simply must have pretty-printed output with no escaping, we need to patch Builder slightly. This is the harder way to fix the issue.
Patching Builder
We need to patch the initializer of our XmlMarkup object to accept an extra option, :escape. At the same time we patch the XmlBase object to take this new option as a parameter. We default the new option to true to maintain the default behaviour. We then patch the text! method on XmlBase to use the new option to decide whether or not to escape text. Here is what it looks like:
module Builder
  class XmlBase
    def initialize(indent=0, initial=0, encoding='utf-8', escape=true)
      @indent = indent
      @level = initial
      @encoding = encoding.downcase
      @escape = escape
    end

    def text!(text)
      if @escape
        _text(_escape(text))
      else
        _text(text)
      end
    end
  end

  class XmlMarkup
    def initialize(options={})
      indent = options[:indent] || 0
      margin = options[:margin] || 0
      encoding = options[:encoding] || 'utf-8'
      escape = options[:escape]
      if escape == nil
        escape = true
      end
      super(indent, margin, encoding, escape)
      @target = options[:target] || ""
    end
  end
end
We can now use our newly patched builder in the following way (notice that when we construct the XmlMarkup object we pass in our new :escape option with a value of false):
xml = Builder::XmlMarkup.new(:target=>STDOUT, :indent=>3, :encoding => 'utf-8', :escape => false)
xml.instruct! :xml, :version => '1.0', :encoding => 'UTF-8'
xml.Trader(:'xmlns:xsi' => "http://www.w3.org/2001/XMLSchema-instance", :'xmlns:xsd' => "http://www.w3.org/2001/XMLSchema") do
  xml.Informacje do
    xml.RodzajPaczki('całość')
    xml.Program('mine')
    xml.WersjaProgramu('1.0')
  end
end
The output is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<Trader xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
   <Informacje>
      <RodzajPaczki>całość</RodzajPaczki>
      <Program>mine</Program>
      <WersjaProgramu>1.0</WersjaProgramu>
   </Informacje>
</Trader>
As desired, the text is not escaped. Note that the patch applies this non-escaping behaviour to all text, including markup characters such as & and <, because text! now skips _escape entirely. If you want only some of the text left unescaped while the rest is still escaped, you'll need to patch Builder to a much greater extent.
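Since the patched builder no longer escapes any text, values that may contain markup characters would have to be escaped by hand before being passed in. A minimal sketch of such a helper follows; the name escape_markup_only is hypothetical and not part of Builder. It escapes only &, < and > and leaves non-ASCII characters untouched.

```ruby
# Hypothetical helper (not part of Builder): escape XML markup characters
# only, so that non-ASCII text such as 'całość' is written literally.
MARKUP_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' }.freeze

def escape_markup_only(text)
  text.gsub(/[&<>]/, MARKUP_ESCAPES)
end

puts escape_markup_only('całość & <reszta>')
# prints: całość &amp; &lt;reszta&gt;
```

With the patch above in place, it could be applied per element, for example xml.RodzajPaczki(escape_markup_only(value)), where value stands for whatever untrusted text is being written.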