
How do I drive Ansible programmatically and concurrently?

I would like to use Ansible to execute a simple job on several remote nodes concurrently. The actual job involves grepping some log files and then post-processing the results on my local host (which has software not available on the remote nodes).

The command line ansible tools don't seem well-suited to this use case because they mix together ansible-generated formatting with the output of the remotely executed command. The Python API seems like it should be capable of this though, since it exposes the output unmodified (apart from some potential unicode mangling that shouldn't be relevant here).

A simplified version of the Python program I've come up with looks like this:

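from sys import argv

# Imported explicitly rather than relying on ansible.runner to pull it in.
import ansible.inventory
import ansible.runner

# Run the command module on every host in the inventory, with up to 10 forks.
runner = ansible.runner.Runner(
    pattern='*', forks=10,
    module_name="command",
    module_args=(
        """
        sleep 10
        """),
    inventory=ansible.inventory.Inventory(argv[1]),
)
results = runner.run()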

Here, sleep 10 stands in for the actual log grepping command - the idea is just to simulate a command that's not going to complete immediately.
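For the local post-processing step, a minimal sketch follows. It assumes the dictionary returned by runner.run() has a "contacted" key (hosts that ran the module, each with "stdout" and "rc") and a "dark" key (hosts that could not be reached). The host names and outputs in sample_results are made up for illustration, and collect_output is a hypothetical helper, not part of Ansible.

```python
# Hypothetical sample, shaped like the dictionary runner.run() is assumed to return.
sample_results = {
    "contacted": {
        "web1.example.com": {"rc": 0, "stdout": "ERROR disk full\nERROR timeout", "stderr": ""},
        "web2.example.com": {"rc": 1, "stdout": "", "stderr": ""},
    },
    "dark": {
        "web3.example.com": {"failed": True, "msg": "SSH connection timed out"},
    },
}


def collect_output(results):
    """Return (stdout lines per contacted host, sorted list of unreachable hosts)."""
    lines_by_host = {}
    for host, outcome in results.get("contacted", {}).items():
        # The raw command output is available without any ansible formatting.
        lines_by_host[host] = outcome.get("stdout", "").splitlines()
    unreachable = sorted(results.get("dark", {}))
    return lines_by_host, unreachable


lines_by_host, unreachable = collect_output(sample_results)
for host in sorted(lines_by_host):
    print(host, len(lines_by_host[host]), "matching lines")
print("unreachable:", unreachable)
```

Keeping unreachable hosts separate makes it harder to mistake "no matches" for "never ran".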

However, upon running this, I observe that the amount of time taken seems proportional to the number of hosts in my inventory. Here are the timing results against inventories with 2, 5, and 9 hosts respectively:

exarkun@top:/tmp$ time python howlong.py two-hosts.inventory
real    0m24.285s
user    0m0.216s
sys     0m0.120s
exarkun@top:/tmp$ time python howlong.py five-hosts.inventory
real    0m55.120s
user    0m0.224s
sys     0m0.160s
exarkun@top:/tmp$ time python howlong.py nine-hosts.inventory
real    1m57.272s
user    0m0.360s
sys     0m0.284s
exarkun@top:/tmp$
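As a quick check of the "proportional" claim, the wall-clock times above can be divided by the host count. This is only arithmetic on the three measurements shown; the variable names are illustrative.

```python
# Host count -> wall-clock seconds, taken from the "real" lines above.
timings = {2: 24.285, 5: 55.120, 9: 117.272}

# Seconds of wall-clock time per host for each inventory.
per_host = {hosts: seconds / hosts for hosts, seconds in timings.items()}

for hosts in sorted(per_host):
    print("%d hosts: %.2f s per host" % (hosts, per_host[hosts]))
```

Each inventory works out to roughly 11 to 13 seconds per host, which is what one would expect if the 10-second sleeps ran one after another with a little per-host overhead.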

Some other random observations:

  • ansible all --forks=10 -i five-hosts.inventory -m command -a "sleep 10" exhibits the same behavior
  • ansible all -c local --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently (but only works for local-only connections, of course)
  • ansible all -c paramiko --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently

Perhaps this suggests the problem is with the ssh transport and has nothing to do with using ansible via the Python API as opposed to from the command line.

What is wrong here that prevents the default transport from taking only around ten seconds regardless of the number of hosts in my inventory?

Jean-Paul Calderone asked Jul 30 '13 23:07


2 Answers

Some investigation reveals that ansible looks for the hosts in my inventory in ~/.ssh/known_hosts. My configuration has HashKnownHosts enabled, and ansible is never able to find the host entries it is looking for because it doesn't understand the hashed known_hosts entry format.
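To illustrate why a plain-text lookup misses hashed entries, here is a small sketch. It assumes the format OpenSSH writes when HashKnownHosts is on: the host field is |1|, then the base64 salt, then the base64 HMAC-SHA1 of the host name keyed by that salt. The functions hash_host and hashed_entry_matches are hypothetical helpers for this illustration, not ansible's code.

```python
import base64
import hashlib
import hmac
import os


def hash_host(hostname, salt):
    """Build a hashed host field: |1|base64(salt)|base64(HMAC-SHA1(salt, hostname))."""
    digest = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return "|1|{}|{}".format(
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    )


def hashed_entry_matches(host_field, hostname):
    """Return True if the hashed host field was generated for hostname."""
    _, magic, salt_b64, digest_b64 = host_field.split("|")
    if magic != "1":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hmac.new(salt, hostname.encode("utf-8"), hashlib.sha1).digest()
    return hmac.compare_digest(actual, expected)


salt = os.urandom(20)
field = hash_host("web1.example.com", salt)

# A plain substring search never finds the name in a hashed field.
naive_hit = "web1.example.com" in field
# Recomputing the HMAC with the stored salt does.
real_hit = hashed_entry_matches(field, "web1.example.com")
print(naive_hit, real_hit)
```

Finding a host therefore means recomputing the HMAC for every hashed line, which a reader that only compares host names as text will not do.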

Whenever ansible's ssh transport can't find the known hosts entry, it acquires a global lock for the duration of the module's execution. The result of this confluence is that all execution is effectively serialized.
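The serializing effect of such a lock can be reproduced in a toy model. The sketch below is not ansible's implementation: threads stand in for forks, a threading.Lock stands in for the global lock, and a short sleep stands in for the module run. run_workers is a hypothetical helper.

```python
import threading
import time


def run_workers(n_hosts, task_seconds, lock=None):
    """Run one thread per host and return the total wall-clock seconds."""
    def worker():
        if lock is not None:
            # The lock is held for the whole task, as described above.
            with lock:
                time.sleep(task_seconds)
        else:
            time.sleep(task_seconds)

    threads = [threading.Thread(target=worker) for _ in range(n_hosts)]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - start


concurrent = run_workers(5, 0.2)                          # about 0.2 s in total
serialized = run_workers(5, 0.2, lock=threading.Lock())   # about 5 * 0.2 s in total
print("concurrent: %.2f s, serialized: %.2f s" % (concurrent, serialized))
```

With the lock in place the total time grows with the number of workers, matching the pattern in the timings from the question.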

A temporary work-around is to give up some security and disable host key checking by putting host_key_checking = False into the [defaults] section of ~/.ansible.cfg. Another work-around is to use the paramiko transport (but this is incredibly slow, perhaps tens or hundreds of times slower than the ssh transport, for some reason). A third work-around is to let some unhashed entries get added to the known_hosts file for ansible's ssh transport to find.

Jean-Paul Calderone answered Oct 11 '22 10:10


Since you have HashKnownHosts enabled, you should upgrade to the latest version of Ansible. Version 1.3 added support for hashed known_hosts; see the bug tracker and changelog. This should solve your problem without compromising security (the host_key_checking=False workaround) or sacrificing speed (the paramiko workaround).

Jan Gondol answered Oct 11 '22 11:10