I have written a program that parses a text file and downloads data in parallel. When the download method runs in 9 or fewer threads, the program runs without errors. But when it runs in 10 or more threads, the program throws "`initialize': getaddrinfo: Name or service not known (SocketError)". I tried several ways of running the work in parallel, but the same problem occurs. I took the URL that was passed to the 'open' method (open-uri) when the "Name or service not known" error happened, pasted it into a browser, and confirmed that the URL is valid and returns correct data. Here's part of the code.
jobs = []
aps = []
....
# jobs are pushed into jobs[]
....
max_thread = 15
loop do
  ary_threads = []
  max_thread.times do |i|
    break if jobs.size == 0
    job = jobs.pop
    ary_threads << Thread.start {
      begin
        request(job[0], job[1]).each do |ap| # open(url) is called inside the "request" method
          aps.push(ap)
        end
      end
    }
  end
  ary_threads.each { |th| th.join }
  break if jobs.size == 0
end
And the error is:
/usr/lib/ruby/1.9.1/net/http.rb:762:in `initialize': getaddrinfo: Name or service not known (SocketError)
from /usr/lib/ruby/1.9.1/net/http.rb:762:in `open'
from /usr/lib/ruby/1.9.1/net/http.rb:762:in `block in connect'
from /usr/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
from /usr/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
from /usr/lib/ruby/1.9.1/net/http.rb:762:in `connect'
from /usr/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
from /usr/lib/ruby/1.9.1/net/http.rb:744:in `start'
from /usr/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'
from /usr/lib/ruby/1.9.1/open-uri.rb:775:in `buffer_open'
from /usr/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
from /usr/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
from /usr/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
from /usr/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
from /var/lib/gems/1.9.1/gems/open-uri-cached-0.0.5/lib/open-uri/cached.rb:10:in `open_uri'
from /usr/lib/ruby/1.9.1/open-uri.rb:677:in `open'
from /usr/lib/ruby/1.9.1/open-uri.rb:33:in `open'
from Test1.rb:42:in `request'
from Test1.rb:77:in `block (3 levels) in <main>'
Why does this happen? Has anyone encountered a similar problem? Please help me!
Three hours after first asking the question, I found a temporary workaround. If I wrap the 'open' call in the 'request' method in 'begin ~ rescue ~ retry ~ end', the error does not happen the second time 'open' is called. Here's the code.
begin
  response = open(url)
rescue Exception
  puts url
  puts "retrying"
  retry
end
After the exception is caught and the URL and "retrying" are displayed, the same URL and "retrying" are not displayed again, and the program works correctly :) But I still can't figure out what causes this problem.
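A somewhat safer variant of the workaround above might look like the sketch below. It is only an illustration: `with_retries`, `max_attempts`, and `delay` are hypothetical names, not part of open-uri or the original program. It rescues only `SocketError` (the class shown in the stack trace) instead of every `Exception`, and it gives up after a fixed number of attempts so a URL that really cannot be resolved does not loop forever.

```ruby
require 'socket' # defines SocketError

# Hypothetical helper (not part of open-uri): runs the given block and
# retries it a limited number of times when a SocketError is raised.
def with_retries(max_attempts = 3, delay = 0.5)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue SocketError => e
    # Give up and re-raise the original error once the limit is reached.
    raise if attempts >= max_attempts
    warn "attempt #{attempts} failed (#{e.message}), retrying"
    sleep delay
    retry
  end
end

# Assumed usage inside the question's "request" method:
#   response = with_retries { open(url) }
```

This does not explain why the lookup fails in the first place; it only bounds the retry loop and avoids swallowing unrelated exceptions.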
I think it might be caused by a race condition between threads. Try performing the operations atomically by guarding them with a mutex lock.
@mutex = Mutex.new
@mutex.synchronize do
  ...
  ary_threads << Thread.start {
    begin
      request(job[0], job[1]).each do |ap| # open(url) is called inside the "request" method
        aps.push(ap)
      end
    end
  }
  ...
end
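For a mutex to protect shared state, the lock generally has to be taken inside each thread, around the access to the shared object, rather than around the code that starts the threads. The following is a minimal sketch of that idea, not a confirmed fix for the getaddrinfo error. The `request` method here is a made-up stand-in that returns dummy data instead of calling open(url), the jobs are invented, and `aps_lock` is a hypothetical name. It also uses a thread-safe `Queue` so that workers can take jobs without touching a shared array.

```ruby
require 'thread' # Queue and Mutex (already loaded on current Rubies)

# Stand-in for the question's request(job[0], job[1]); the real method
# calls open(url). This fake version just echoes its arguments.
def request(a, b)
  [[a, b]]
end

jobs = Queue.new
10.times { |i| jobs << ["host#{i}", i] } # made-up jobs for illustration

aps = []
aps_lock = Mutex.new # guards every write to aps
max_thread = 4

workers = Array.new(max_thread) do
  Thread.new do
    loop do
      begin
        job = jobs.pop(true) # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break                # no jobs left, so this worker stops
      end
      results = request(job[0], job[1]) # slow work happens outside the lock
      aps_lock.synchronize { aps.concat(results) }
    end
  end
end
workers.each(&:join)
```

Keeping the network call outside `synchronize` lets the downloads still run in parallel; only the short append to `aps` is serialized.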