Question

I would like to use Ansible to execute a simple job on several remote nodes concurrently. The actual job involves grepping some log files and then post-processing the results on my local host (which has software not available on the remote nodes).

The command line ansible tools don't seem well-suited to this use case because they mix together ansible-generated formatting with the output of the remotely executed command. The Python API seems like it should be capable of this though, since it exposes the output unmodified (apart from some potential unicode mangling that shouldn't be relevant here).

A simplified version of the Python program I've come up with looks like this:

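from sys import argv
import ansible.inventory
import ansible.runner

# Run the "command" module against every host in the inventory file
# given as the first command-line argument, using up to 10 forks.
runner = ansible.runner.Runner(
    pattern='*', forks=10,
    module_name="command",
    module_args=(
        """
        sleep 10
        """),
    inventory=ansible.inventory.Inventory(argv[1]),
)
results = runner.run()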

Here, sleep 10 stands in for the actual log grepping command - the idea is just to simulate a command that's not going to complete immediately.

However, upon running this, I observe that the amount of time taken seems proportional to the number of hosts in my inventory. Here are the timing results against inventories with 2, 5, and 9 hosts respectively:

exarkun@top:/tmp$ time python howlong.py two-hosts.inventory
real    0m24.285s
user    0m0.216s
sys     0m0.120s
exarkun@top:/tmp$ time python howlong.py five-hosts.inventory
real    0m55.120s
user    0m0.224s
sys     0m0.160s
exarkun@top:/tmp$ time python howlong.py nine-hosts.inventory
real    1m57.272s
user    0m0.360s
sys     0m0.284s
exarkun@top:/tmp$
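
As a quick sanity check on the "proportional" claim, the wall-clock times above can be divided by the host count. This is only arithmetic on the three measurements shown; the variable names are illustrative.

```python
# Wall-clock ("real") times in seconds from the three runs above,
# keyed by the number of hosts in the inventory (1m57.272s = 117.272s).
timings = {2: 24.285, 5: 55.120, 9: 117.272}

# If the hosts ran concurrently, total time would stay near 10 seconds.
# Dividing by the host count shows the cost paid per host instead.
per_host = {n: t / n for n, t in timings.items()}

for n in sorted(per_host):
    print("%d hosts: %.1f s per host" % (n, per_host[n]))
```

Each host costs somewhat more than the 10 second sleep, which is what one would expect if the hosts were being processed one at a time rather than in parallel.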

Some other random observations:

  • ansible all --forks=10 -i five-hosts.inventory -m command -a "sleep 10" exhibits the same behavior
  • ansible all -c local --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently (but only works for local-only connections, of course)
  • ansible all -c paramiko --forks=10 -i five-hosts.inventory -m command -a "sleep 10" appears to execute things concurrently

Perhaps this suggests the problem is with the ssh transport and has nothing to do with using ansible via the Python API as opposed to from the command line.

What is wrong here that prevents the default transport from taking only around ten seconds regardless of the number of hosts in my inventory?


Solution

Some investigation reveals that ansible is looking for the hosts in my inventory in ~/.ssh/known_hosts. My configuration has HashKnownHosts enabled. ansible is never able to find the host entries it is looking for because it doesn't understand the hashed known_hosts entry format.
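
To check whether a known_hosts file is affected, one can count how many entries are hashed. The following is a minimal sketch, not part of ansible: it assumes that hashed entries are the ones whose host field starts with `|1|`, which is how OpenSSH writes them, and the function name and sample data are made up for illustration.

```python
def count_known_hosts_entries(text):
    """Return (hashed, plain) entry counts for known_hosts file contents."""
    hashed = plain = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # An entry may begin with a marker such as @cert-authority;
        # in that case the host field is the second field.
        if fields[0].startswith("@") and len(fields) > 1:
            host_field = fields[1]
        else:
            host_field = fields[0]
        if host_field.startswith("|1|"):
            hashed += 1
        else:
            plain += 1
    return hashed, plain


# Hypothetical sample contents; the salts and keys are placeholders.
sample = """\
# comment line
|1|c2FsdA==|aGFzaA== ssh-rsa AAAAplaceholder
node1.example.com,192.0.2.10 ssh-rsa AAAAplaceholder
node2.example.com ssh-ed25519 AAAAplaceholder
"""

hashed, plain = count_known_hosts_entries(sample)
print("hashed: %d, plain: %d" % (hashed, plain))
```

To inspect a real file, pass in the contents of `os.path.expanduser("~/.ssh/known_hosts")`. If every entry is hashed, a tool that only matches plain hostnames will never find its hosts there.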

Whenever ansible's ssh transport can't find the known hosts entry, it acquires a global lock for the duration of the module's execution. The result of this confluence is that all execution is effectively serialized.
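
The effect of such a lock can be reproduced with a small model. This is an illustrative sketch only, not ansible's code: it uses threads and a `threading.Lock` as stand-ins for ansible's forks and its lock, and a short sleep as a stand-in for the remote command.

```python
import threading
import time


def run_on_hosts(hosts, task_seconds, lock=None):
    """Run one simulated task per host, each in its own thread.

    If a lock is given, every worker holds it for the whole task,
    mimicking a global lock held for the module's execution.
    Returns the elapsed wall-clock time in seconds.
    """
    def worker():
        if lock is not None:
            with lock:
                time.sleep(task_seconds)
        else:
            time.sleep(task_seconds)

    threads = [threading.Thread(target=worker) for _ in hosts]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


hosts = ["host%d" % i for i in range(5)]  # hypothetical host names
concurrent = run_on_hosts(hosts, 0.2)
serialized = run_on_hosts(hosts, 0.2, lock=threading.Lock())
print("without lock: %.2f s" % concurrent)
print("with global lock: %.2f s" % serialized)
```

Without the lock the total time stays close to one task's duration; with it, the total grows with the number of hosts, matching the timings in the question.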

A temporary workaround is to give up some security and disable host key checking by putting host_key_checking = False into the [defaults] section of ~/.ansible.cfg. Another workaround is to use the paramiko transport (but this is incredibly slow, perhaps tens or hundreds of times slower than the ssh transport, for some reason). A third is to let some unhashed entries get added to the known_hosts file for ansible's ssh transport to find.

Other tips

Since you have HashKnownHosts enabled, you should upgrade to the latest version of Ansible. Version 1.3 added support for hashed known_hosts entries; see the bug tracker and changelog. This should solve your problem without compromising security (the host_key_checking = False workaround) or sacrificing speed (the paramiko workaround).

With the Ansible 2.0 Python API, I switched off host key checking with:

import ansible.constants

ansible.constants.HOST_KEY_CHECKING = False

I managed to speed up Ansible considerably by setting the following in sshd_config on the managed machines. I think newer versions of sshd already default to this, so it might not be needed in your case.

/etc/ssh/sshd_config
----
UseDNS no
Licensed under: CC-BY-SA with attribution