
How to debug Ansible issues?

Sometimes Ansible doesn't do what you want, and increasing verbosity doesn't help. For example, I'm trying to start the coturn server, which ships with an init script, on a systemd OS (Debian Jessie). Ansible considers it running, but it's not. How do I look into what's happening under the hood: which commands are executed, and what are their output and exit codes?

x-yuri asked Feb 23 '17 13:02




2 Answers

Debugging modules

  • The most basic way is to run ansible/ansible-playbook with an increased verbosity level by adding -vvv to the execution line.

  • The most thorough way for the modules written in Python (Linux/Unix) is to run ansible/ansible-playbook with an environment variable ANSIBLE_KEEP_REMOTE_FILES set to 1 (on the control machine).

    It causes Ansible to leave an exact copy of the Python scripts it executed (whether they succeeded or not) on the target machine.

    The path to the scripts is printed in the Ansible log and for regular tasks they are stored under the SSH user's home directory: ~/.ansible/tmp/.

    The exact logic is embedded in the scripts and depends on each module. Some use Python with standard or external libraries; others call external commands.
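
A minimal sketch of the second approach, run on the control machine. The playbook name site.yml is a placeholder, not something from the question, and the guard only exists so the snippet does nothing harmful where Ansible or the playbook is missing:

```shell
#!/bin/sh
# Hypothetical playbook name; replace it with your own.
playbook="site.yml"

# Ask Ansible to leave the generated module scripts on the target machine.
export ANSIBLE_KEEP_REMOTE_FILES=1

if command -v ansible-playbook >/dev/null 2>&1 && [ -f "$playbook" ]; then
    # -vvv prints, among other things, the remote temporary paths being used.
    ansible-playbook "$playbook" -vvv
else
    echo "ansible-playbook or $playbook not found; nothing was run" >&2
fi
```

After the run, the kept scripts should be found under ~/.ansible/tmp/ in the SSH user's home directory on the target, where they can be read or re-run by hand.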

Debugging playbooks

  • As with debugging modules, increasing the verbosity level with the -vvv parameter causes more data to be printed to the Ansible log.

  • Since Ansible 2.1, the Playbook Debugger lets you debug failed tasks interactively: check and modify the data, then re-run the task.
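
To try the Playbook Debugger, something like the following sketch could be used. It writes a throwaway playbook (debug-demo.yml and undefined_demo_var are made-up names) that enables the debug strategy and fails on purpose, then runs it only when a terminal is attached, since the debugger is interactive. Newer Ansible releases also offer a debugger keyword (for example debugger: on_failed) as an alternative to the strategy setting.

```shell
#!/bin/sh
# Write a throwaway playbook; the file name and variable name are illustrative.
cat > debug-demo.yml <<'EOF'
- hosts: localhost
  connection: local
  gather_facts: false
  strategy: debug          # drop into the debugger when a task fails
  tasks:
    - name: Fail on purpose so the debugger prompt opens
      command: "echo {{ undefined_demo_var }}"
EOF

# The debugger needs an interactive terminal, so only run it when one is attached.
if command -v ansible-playbook >/dev/null 2>&1 && [ -t 0 ]; then
    ansible-playbook debug-demo.yml
else
    echo "wrote debug-demo.yml; run 'ansible-playbook debug-demo.yml' in a terminal" >&2
fi
```

At the (debug)> prompt, p task.args prints the task arguments, redo re-runs the task, continue moves on, and quit leaves the debugger.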

Debugging connections

  • Adding the -vvvv parameter to the ansible/ansible-playbook call causes the log to include the debugging information for the connections.

techraf answered Oct 11 '22 18:10
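
For connection problems, an ad hoc ping with -vvvv is often the quickest check. In this sketch the inventory file name and the host pattern are assumptions; adjust both to your setup:

```shell
#!/bin/sh
# Hypothetical inventory file and host pattern.
inventory="inventory"
target="all"
verbosity="-vvvv"   # four v's add connection-level debugging output

if command -v ansible >/dev/null 2>&1 && [ -f "$inventory" ]; then
    # The ping module only checks that Ansible can connect and run Python on the target.
    ansible "$target" -i "$inventory" -m ping "$verbosity"
else
    echo "ansible or $inventory not found; nothing was run" >&2
fi
```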


Debugging Ansible tasks can be almost impossible if the tasks are not your own, contrary to what the Ansible website states:

No special coding skills needed

Ansible requires highly specialized programming skills because it is neither plain YAML nor plain Python; it is a messy mix of both.

The idea of using markup languages for programming has been tried before. XML was very popular in the Java community at one time. XSLT is also a fine example.

As Ansible projects grow, the complexity grows exponentially as a result. Take for example the OpenShift Ansible project, which has the following task:

    - name: Create the master server certificate
      command: >
        {{ hostvars[openshift_ca_host]['first_master_client_binary'] }} adm ca create-server-cert
        {% for named_ca_certificate in openshift.master.named_certificates | default([]) | lib_utils_oo_collect('cafile') %}
        --certificate-authority {{ named_ca_certificate }}
        {% endfor %}
        {% for legacy_ca_certificate in g_master_legacy_ca_result.files | default([]) | lib_utils_oo_collect('path') %}
        --certificate-authority {{ legacy_ca_certificate }}
        {% endfor %}
        --hostnames={{ hostvars[item].openshift.common.all_hostnames | join(',') }}
        --cert={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.crt
        --key={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.key
        --expire-days={{ openshift_master_cert_expire_days }}
        --signer-cert={{ openshift_ca_cert }}
        --signer-key={{ openshift_ca_key }}
        --signer-serial={{ openshift_ca_serial }}
        --overwrite=false
      when: item != openshift_ca_host
      with_items: "{{ hostvars
                      | lib_utils_oo_select_keys(groups['oo_masters_to_config'])
                      | lib_utils_oo_collect(attribute='inventory_hostname', filters={'master_certs_missing':True}) }}"
      delegate_to: "{{ openshift_ca_host }}"
      run_once: true

I think we can all agree that this is programming in YAML. Not a very good idea. This specific snippet could fail with a message like

fatal: [master0]: FAILED! => {"msg": "The conditional check 'item != openshift_ca_host' failed. The error was: error while evaluating conditional (item != openshift_ca_host): 'item' is undefined\n\nThe error appears to have been in '/home/user/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml': line 39, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create the master server certificate\n ^ here\n"}

If you hit a message like that, you are doomed. But we have the debugger, right? Okay, let's take a look at what is going on.

    [master0] TASK: openshift_master_certificates : Create the master server certificate (debug)> p task.args
    {u'_raw_params': u"{{ hostvars[openshift_ca_host]['first_master_client_binary'] }} adm ca create-server-cert {% for named_ca_certificate in openshift.master.named_certificates | default([]) | lib_utils_oo_collect('cafile') %} --certificate-authority {{ named_ca_certificate }} {% endfor %} {% for legacy_ca_certificate in g_master_legacy_ca_result.files | default([]) | lib_utils_oo_collect('path') %} --certificate-authority {{ legacy_ca_certificate }} {% endfor %} --hostnames={{ hostvars[item].openshift.common.all_hostnames | join(',') }} --cert={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.crt --key={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.key --expire-days={{ openshift_master_cert_expire_days }} --signer-cert={{ openshift_ca_cert }} --signer-key={{ openshift_ca_key }} --signer-serial={{ openshift_ca_serial }} --overwrite=false"}
    [master0] TASK: openshift_master_certificates : Create the master server certificate (debug)> exit

How does that help? It doesn't.

The point here is that it is an incredibly bad idea to use YAML as a programming language. It is a mess. And the symptoms of the mess we are creating are everywhere.

Some additional facts: on Azure, the prerequisites provisioning phase of OpenShift Ansible takes more than 50 minutes, and the deploy phase takes more than 70 minutes. Each time! First run or subsequent runs. And there is no way to limit provisioning to a single node. This limit problem was part of Ansible in 2012 and it is still part of Ansible today. This fact tells us something.

The point here is that Ansible should be used as it was intended: for simple tasks without the YAML programming. It is fine for lots of servers, but it should not be used for complex configuration management tasks.

Ansible is not an Infrastructure as Code (IaC) tool.

If you ask how to debug Ansible issues, you are using it in a way it was not intended to be used. Don't use it as an IaC tool.

onknows answered Oct 11 '22 19:10