We started seeing some strange errors in our logs that normally appear when Ruby isn't compiled properly with OpenSSL. But it's inconsistent...
We're getting errors like:
RuntimeError: Unsupported digest algorithm (SHA256). (also with other digests, like sha1; example error trace)
Faraday::SSLError (SSL_CTX_new: (null)) (example error trace)
We managed to reproduce it when starting unicorn using service unicorn start or systemctl start unicorn. But only with some requests, not all of them: some requests that use OpenSSL under the hood work, others don't.
However, when we start unicorn with /etc/init.d/unicorn start, everything works without a hitch. (To clarify, systemd starts the same /etc/init.d script.)
We tried debugging ENV vars, user permissions, and file/dir ownership, recompiling Ruby, and bootstrapping a new server from scratch... Nothing seems to help.
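Since only some requests fail, it may help to probe OpenSSL from inside the affected worker process itself rather than from a shell. Below is a minimal sketch of such a probe; `probe_openssl` is a hypothetical helper name (not part of unicorn or Ruby), and the digest list is just an assumption based on the errors above.

```ruby
require 'openssl'

# Hypothetical helper: tries to instantiate each digest and an SSL context in
# the current process, and records what worked and what raised.
def probe_openssl(digests = %w[SHA1 SHA256 SHA512])
  report = {
    pid: Process.pid,
    openssl_version: OpenSSL::OPENSSL_VERSION,
    digests: {}
  }

  digests.each do |name|
    begin
      OpenSSL::Digest.new(name).hexdigest('probe')
      report[:digests][name] = :ok
    rescue StandardError => e
      # "Unsupported digest algorithm" surfaces here as a RuntimeError.
      report[:digests][name] = "#{e.class}: #{e.message}"
    end
  end

  begin
    OpenSSL::SSL::SSLContext.new
    report[:ssl_context] = :ok
  rescue StandardError => e
    report[:ssl_context] = "#{e.class}: #{e.message}"
  end

  report
end

p probe_openssl if __FILE__ == $PROGRAM_NAME
```

Logging this report at the start of a request (or from a unicorn after_fork hook) in both the systemd-started and the manually started process should show whether the failure is tied to particular workers or to something that happens earlier in the request.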
In case this helps:
What are we missing? What can we try that we haven't thought of?
/etc/systemd/system
apt (explicitly removed, in case platform came pre-installed) and compiled from scratch. We're currently running 2.3.4 and also tried 2.3.6. Compiled either manually or using ruby-build. No rbenv, nor RVM.
apt (we're running apt-get install -y autoconf bison build-essential libssl-dev libyaml-dev libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3 libgdbm-dev before building ruby)
We're using a scripted/repeatable build process for the VM (using fabric), and this problem is consistent on multiple VMs we bootstrapped on GCloud. We then tried a VM on DigitalOcean with the same bootstrap scripts, and the problem doesn't seem to appear there.
In both cases we picked Ubuntu 16.04 64bit base image, but obviously there are some differences with kernel versions, base installed packages etc...
The problem simply vanished. See my answer below.
@gingerlime I had a similar situation with our Jenkins on GCP. We're using ChefDK 3.1.0 (embedded Ruby 2.5.1p57), and tried other versions as well, with Jenkins running under systemd (Ubuntu 16.04) and upstart (Ubuntu 14.04). We tried both; right now we're on 16.04 with the 4.15.0-1023-gcp kernel, running a few jobs with kitchen-docker, and this problem always emerges in a few situations.
I dug into it and found that this only happens when the Etc.getlogin method gets called (for me, here). The call doesn't return any error; it returns the correct info, with the correct class (String), but once it has been called, the Unsupported digest algorithm error gets raised.
If I start the process manually as the root or jenkins user, this problem doesn't happen. I tried replacing the Etc.getlogin call in several different ways, like using ENV['USER'], a fixed String, or other methods from Etc, like getpwuid, simulating the return class and value of Etc.getlogin, and the error doesn't get raised.
I'm not sure if this is some bug related to the Ruby version and the custom kernel that GCP instances use, but it happens in a situation similar to yours, and for me, Etc.getlogin was the problem. Right now, I've fixed it by using a custom configuration that doesn't call this function, and it's working normally.
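For illustration, the kind of replacement I mean looks roughly like this. It's only a sketch: `current_user_name` is a hypothetical helper name, and it assumes that ENV['USER'] or the passwd entry for the process uid is good enough wherever Etc.getlogin was being used.

```ruby
require 'etc'

# Hypothetical helper: resolves the current user name without ever calling
# Etc.getlogin. Prefers ENV['USER'], then falls back to the passwd entry for
# the uid of the running process.
def current_user_name(env = ENV)
  from_env = env['USER']
  return from_env unless from_env.nil? || from_env.empty?

  entry = Etc.getpwuid(Process.uid) # nil when the uid has no passwd entry
  entry && entry.name
end

puts current_user_name.inspect if __FILE__ == $PROGRAM_NAME
```

Note that services started by an init system often have no USER variable set at all, which is why the getpwuid fallback is there.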
One option is that this isn't an issue of sysVinit vs systemd at all, but you just haven't triggered the issue with your sysVinit script yet.
When you run your sysVinit script through the systemctl command, it goes through a compatibility layer, and there may be a problem there. Your problem would be simpler both for yourself and for us if you reproduced the issue directly with a systemd service file and shared that file.
You mentioned debugging ENV, but didn't mention exactly what you checked in the ENV. This is definitely one place where systemd could make a difference. As seen in man systemd.exec, systemd sets $PATH in the environment to a fixed value:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
If this is not exactly the same as when run directly as a sysVinit script, that could be an issue.
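One way to check this from inside the Ruby process is to compare the PATH it actually sees against the fixed value quoted above. This is a rough sketch; `path_difference` and the constant name are made up for the example.

```ruby
# The fixed PATH value quoted above from man systemd.exec.
SYSTEMD_DEFAULT_PATH =
  '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'.freeze

# Hypothetical helper: reports which entries appear in only one of two
# PATH strings, and whether the two lists are identical including order.
def path_difference(path_a, path_b)
  a = path_a.to_s.split(':')
  b = path_b.to_s.split(':')
  { only_in_first: a - b, only_in_second: b - a, same_order: a == b }
end

if __FILE__ == $PROGRAM_NAME
  p path_difference(ENV['PATH'], SYSTEMD_DEFAULT_PATH)
end
```

Running this once from the systemd-started process and once from the manually started one should make any PATH difference obvious. The same comparison could be extended to the rest of ENV.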
I would also check for all your copies of SSL on the system. Do you have more than one? Where? Do you have more than one copy of the ruby openssl module loaded?
locate -r lib/.*libssl.*so
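locate shows what is on disk; to see which copies a given Ruby process has actually loaded, you could inspect its memory map. The following is a Linux-only sketch, assuming /proc is available; `mapped_ssl_libraries` is a hypothetical helper name.

```ruby
require 'openssl'

# Hypothetical helper: extracts the distinct libssl/libcrypto paths from
# lines in /proc/<pid>/maps format (address perms offset dev inode path).
def mapped_ssl_libraries(maps_lines)
  maps_lines
    .map { |line| line.split(' ', 6)[5] }   # sixth field is the path, if any
    .compact
    .map(&:strip)
    .select { |path| File.basename(path) =~ /\Alib(ssl|crypto)\.so/ }
    .uniq
end

if __FILE__ == $PROGRAM_NAME
  maps = "/proc/#{Process.pid}/maps"
  puts mapped_ssl_libraries(File.readlines(maps)) if File.exist?(maps)
end
```

If this prints more than one libssl or libcrypto path, or a different path depending on how the process was started, that would be a strong hint.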
Also see the answer to the FAQ: Why do things behave differently under systemd?