 

What is the benefit of using YAML?

When someone suggests saving data to an external file by writing inspect output and loading it back with eval, many people criticize the idea and recommend YAML instead. What is the problem with writing the output of inspect, and why is YAML preferable? For human readability, I think Ruby's inspect or pp format is superior to YAML.

sawa asked Jan 18 '23 15:01



2 Answers

Assuming nothing overrides inspect, of what use is this?

#<Foo:0xa34feb8 @bar="wat">

When compared to this:

--- !ruby/object:Foo
bar: wat

YAML is more likely to produce useful output under non-trivial circumstances. It's also portable, and can be used as a more reliable way of sending serialized data between disparate systems.
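To make the comparison concrete, here is a minimal sketch of the two outputs and of reading the YAML back. The Foo class with a bar accessor is a stand-in for the one above, and the permitted_classes: keyword assumes a reasonably recent Ruby (2.6 or later).

```ruby
require "yaml"

# Stand-in for the Foo class from the answer; the attribute name is assumed.
class Foo
  attr_accessor :bar
end

foo = Foo.new
foo.bar = "wat"

# Default inspect includes a memory address, so it cannot be turned back
# into an object.
inspected = foo.inspect

# YAML.dump writes the class name and instance variables as plain text.
yaml_text = YAML.dump(foo)

# safe_load rebuilds the object only because Foo is explicitly permitted.
restored = YAML.safe_load(yaml_text, permitted_classes: [Foo])

puts inspected
puts yaml_text
puts restored.bar
```

The restored object is a new Foo with the same instance variables, which the inspect string alone cannot give you.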

Dave Newton answered Jan 25 '23 15:01



Security is the main concern. Because eval will run any code passed to it, a malicious hacker could inject code into your data file and take control of your program. This may not matter for small scripts you write for yourself, but on a Ruby on Rails server it will. Imagine we have the following code:

# Open for writing; File.new with no mode opens the file read-only.
File.open("foobar.txt", "w") do |f|
  f.puts Foo.new.tap { |foo| foo.bar = "bork" }.inspect
end

Assuming that Foo has not overridden inspect, this would write something like this:

#<Foo:0xff456a5 @bar="bork">

Obviously, this is not a Ruby expression that rebuilds the object. Yet eval throws no error and just returns nil, because the leading # turns the whole line into a comment. That only makes this a worse idea, as a variable that you expected to be a Foo is now silently nil (hence the infamous NoMethodError: undefined method 'bar' for nil:NilClass error).
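A short sketch of that failure mode, using the string from above. The call to bar is only an illustration of what later code might try to do with the "loaded" object.

```ruby
# The text that the default inspect wrote to the file (copied from above).
dumped = '#<Foo:0xff456a5 @bar="bork">'

# Ruby parses everything after "#" as a comment, so eval sees an empty program.
result = eval(dumped)
p result

# Any later use of the "loaded" object then fails far from the real cause.
begin
  result.bar
rescue NoMethodError => e
  puts "#{e.class}: #{e.message}"
end
```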

The larger problem is security. Say your code saves data to a file, foo.txt, containing an inspected hash that maps bars to their respective bazes. A different program, which needs these mappings to plan the FOO convention, reads and evals this file. Imagine that the file is somewhere a hacker could access it (which, when you think about it, is almost anywhere). If these programs run on a RoR server that also stores all the foos' financial data, the hacker could cause mass chaos. Injected code in foo.txt that, say, downloaded and installed malware, but still left the original hash at the end, would execute unnoticed. Even if you eval the data with $SAFE=4, the hacker can still damage the stability of the foo planning program by throwing errors and the like. (Current Ruby no longer offers even that: $SAFE level 4 was removed in Ruby 2.1, and the $SAFE mechanism itself in 3.0.)
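The scenario can be sketched safely in a temporary directory. The mappings and the file name are hypothetical, and the "payload" here only sets a global flag so that it is visible that foreign code ran.

```ruby
require "tmpdir"

# Hypothetical bar-to-baz mappings, standing in for the Hash described above.
mappings = { "bar1" => "baz1", "bar2" => "baz2" }

$payload_ran = false
loaded = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "foo.txt")

  # The writing program saves the inspected Hash.
  File.write(path, mappings.inspect)

  # Someone with write access prepends code and leaves the Hash at the end.
  # This payload only flips a flag; real injected code could do anything.
  File.write(path, "$payload_ran = true\n" + File.read(path))

  # The reading program evals the file and gets the Hash it expected.
  loaded = eval(File.read(path))
end

p loaded == mappings
p $payload_ran
```

The reading program receives exactly the data it expected, so nothing in its own behavior reveals that the extra line was executed.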

All in all, although the inspect-eval approach works for basic classes such as Hash, String, Array, etc., it depends on each class giving an exact syntactic representation of itself. For most, if not all, applications it is a bad idea to use inspect-eval. YAML is preferred because it has a defined syntax for data, meaning that executable code mixed in would cause errors or be read as plain text, rather than being mindlessly executed. (For files from untrusted sources, use YAML.safe_load, since a full load can still instantiate arbitrary Ruby objects.) Also, many developers use inspect for debugging, and would not expect an object to give a file dump of itself.
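For contrast, a minimal sketch of the same kind of data going through YAML. The load_mappings helper and the sample mappings are hypothetical names for illustration; YAML.safe_load and the two Psych error classes are part of Ruby's standard library.

```ruby
require "yaml"

# Hypothetical bar-to-baz mappings, the same kind of data as in the eval scenario.
mappings = { "bar1" => "baz1", "bar2" => "baz2" }
yaml_text = YAML.dump(mappings)

# Returns the parsed data, or nil if the text is malformed or names a class
# that was not permitted. Nothing in the text is ever run as Ruby.
def load_mappings(text)
  YAML.safe_load(text)
rescue Psych::SyntaxError, Psych::DisallowedClass => e
  warn "rejected: #{e.class}"
  nil
end

$payload_ran = false
tampered = "$payload_ran = true\n" + yaml_text

p load_mappings(yaml_text) == mappings
load_mappings(tampered)
p $payload_ran
p load_mappings("--- !ruby/object:Foo\nbar: wat\n")
```

Whatever the parser makes of the tampered text, it treats it as data or rejects it; the flag is never set. The last call shows that safe_load also refuses to build objects of classes it was not told to allow.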

Another benefit of YAML is that it serializes complex objects easily. A complex object tree of foos and bars is easy to handle with YAML, but with inspect it would get very complicated. In the final analysis, this can be thought of as the JSON problem: executable code in data gets executed because eval was used. inspect may be fine for small utilities for yourself, but never in production code, or in code open to the great, mean, wide world.

Linuxios answered Jan 25 '23 13:01
