Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to ensure files are closed when reference is held by a different object

Tags:

ruby

nokogiri

In Ruby, when the reference to an open file is handed to another object, as in the following code, do I need to wrap the other object reference in a "begin/ensure" block, to ensure the unmanaged resources get closed, or is there another way?

@doc = Nokogiri::XML(File.open("shows.xml"))

@doc.xpath("//character")
# => ["<character>Al Bundy</character>",
#    "<character>Bud Bundy</character>",
#    "<character>Marcy Darcy</character>",
#    "<character>Larry Appleton</character>",
#    "<character>Balki Bartokomous</character>",
#    "<character>John "Hannibal" Smith</character>",
#    "<character>Templeton "Face" Peck</character>",
#    "<character>"B.A." Baracus</character>",
#    "<character>"Howling Mad" Murdock</character>"]
like image 968
user2029783 Avatar asked Jan 14 '23 02:01

user2029783


2 Answers

In general, it's OK to not worry about leaving a single file open, if nothing else will be attempting to write to it, and your program isn't a long-running application. Ruby will close the file as it shuts down and exits. (I suspect the OS will do it too if it sees the file is open, but that'd be hard to test without going into a low-level debugger or digging into the OS's code.)

If you're worried about it though, I'd recommend using the block form of File.open because it automatically closes the file when your code exits the block:

require 'nokogiri'

doc = ''
File.open('./test.html', 'r') do |fi|
  doc = Nokogiri::HTML(fi)
end

puts doc.to_html

Because I was curious and have always wondered, I ran a little test. I saved some HTML to a file called "test.html" and ran this in IRB:

test.rb(main):001:0> require 'nokogiri'
=> true
test.rb(main):002:0> page = File.open('test.html', 'r')
=> #<File:test.html>
test.rb(main):003:0> page.eof?
=> false
test.rb(main):004:0> page.closed?
=> false
test.rb(main):005:0> doc = Nokogiri::HTML(page)
=> #<Nokogiri::HTML::Document:0x3fc10149bc98 name="document" children=[#<Nokogiri::XML::DTD:0x3fc10149b6f8 name="html">, #<Nokogiri::XML::Element:0x3fc10149ef60 name="html" children=[#<Nokogiri::XML::Element:0x3fc10149ed58 name="head" children=[#<Nokogiri::XML::Element:0x3fc10149eb50 name="title" children=[#<Nokogiri::XML::Text:0x3fc10149e948 "Example Domain">]>, #<Nokogiri::XML::Element:0x3fc10149e740 name="meta" attributes=[#<Nokogiri::XML::Attr:0x3fc10149e6dc name="charset" value="utf-8">]>, #<Nokogiri::XML::Element:0x3fc10149e218 name="meta" attributes=[#<Nokogiri::XML::Attr:0x3fc10149e1b4 name="http-equiv" value="Content-type">, #<Nokogiri::XML::Attr:0x3fc10149e1a0 name="content" value="text/html; charset=utf-8">]>, #<Nokogiri::XML::Element:0x3fc10149da98 name="meta" attributes=[#<Nokogiri::XML::Attr:0x3fc10149da34 name="name" value="viewport">, #<Nokogiri::XML::Attr:0x3fc10149da20 name="content" value="width=device-width, initial-scale=1">]>, #<Nokogiri::XML::Element:0x3fc10149d318 name="style" attributes=[#<Nokogiri::XML::Attr:0x3fc10149d2b4 name="type" value="text/css">] children=[#<Nokogiri::XML::CDATA:0x3fc1014a0dd8 "\n\tbody {\n\t\tbackground-color: #f0f0f2;\n\t\tmargin: 0;\n\t\tpadding: 0;\n\t\tfont-family: \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n\n\t}\n\tdiv {\n\t\twidth: 600px;\n\t\tmargin: 5em auto;\n\t\tpadding: 3em;\n\t\tbackground-color: #fff;\n\t\tborder-radius: 1em;\n\t}\n\ta:link, a:visited {\n\t\tcolor: #38488f;\n\t\ttext-decoration: none;\n\t}\n\t@media (max-width: 600px) {\n\t\tbody {\n\t\t\tbackground-color: #fff;\n\t\t}\n\t\tdiv {\n\t\t\twidth: auto;\n\t\t\tmargin: 0 auto;\n\t\t\tborder-radius: 0;\n\t\t\tpadding: 1em;\n\t\t}\n\t}\n\t">]>]>, #<Nokogiri::XML::Element:0x3fc1014a0aa4 name="body" children=[#<Nokogiri::XML::Text:0x3fc1014a089c "\n">, #<Nokogiri::XML::Element:0x3fc1014a07c0 name="div" children=[#<Nokogiri::XML::Text:0x3fc1014a05b8 "\n\t">, #<Nokogiri::XML::Element:0x3fc1014a04dc name="h1" children=[#<Nokogiri::XML::Text:0x3fc1014a02d4 "Example Domain">]>, #<Nokogiri::XML::Text:0x3fc1014a00cc "\n\t">, #<Nokogiri::XML::Element:0x3fc10149fff0 name="p" children=[#<Nokogiri::XML::Text:0x3fc10149fde8 "This domain is established to be used for illustrative examples in documents. You do not need to\n\t\tcoordinate or ask for permission to use this domain in examples, and it is not available for\n\t\tregistration.">]>, #<Nokogiri::XML::Text:0x3fc10149fbe0 "\n\t">, #<Nokogiri::XML::Element:0x3fc10149fb04 name="p" children=[#<Nokogiri::XML::Element:0x3fc10149f8fc name="a" attributes=[#<Nokogiri::XML::Attr:0x3fc10149f898 name="href" value="http://www.iana.org/domains/special">] children=[#<Nokogiri::XML::Text:0x3fc10149f3d4 "More information...">]>]>, #<Nokogiri::XML::Text:0x3fc1014a309c "\n">]>, #<Nokogiri::XML::Text:0x3fc1014a2e94 "\n">]>]>]>
test.rb(main):006:0> page.eof?
=> true
test.rb(main):007:0> page.closed?
=> false
test.rb(main):008:0> page.close
=> nil
test.rb(main):009:0> page.closed?
=> true

So, in other words, Nokogiri does not close open files.

like image 82
the Tin Man Avatar answered Jan 19 '23 18:01

the Tin Man


I believe that Nokogiri will close the file as soon as it's read, but how about being safe and replacing File.open with File.read


Nope, as the Tin Man points out, Nokogiri does not close the file handle after reading it. The proper way to deal with it is still:

doc = Nokogiri::XML File.read("shows.xml")
like image 29
pguardiario Avatar answered Jan 19 '23 17:01

pguardiario