How to debug Ansible issues?

Tags:

debugging

ansible

Sometimes Ansible doesn't do what you want, and increasing verbosity doesn't help. For example, I'm now trying to start the coturn server, which comes with an init script, on a systemd OS (Debian Jessie). Ansible considers it running, but it's not. How do I look into what's happening under the hood? Which commands are executed, and what are their output and exit codes?

557

asked Feb 23 '17 13:02

x-yuri

2 Answers

Debugging modules

The most basic way is to run ansible/ansible-playbook with increased verbosity by adding -vvv to the command line.
The most thorough way, for modules written in Python (Linux/Unix), is to run ansible/ansible-playbook with the environment variable ANSIBLE_KEEP_REMOTE_FILES set to 1 on the control machine.

This makes Ansible leave an exact copy of the Python scripts it executed (whether they succeeded or not) on the target machine.

The path to the scripts is printed in the Ansible log; for regular tasks they are stored under the SSH user's home directory, in ~/.ansible/tmp/.

The exact logic is embedded in the scripts and depends on the module: some use Python with standard or external libraries, others call external commands.
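
As a rough sketch of the ANSIBLE_KEEP_REMOTE_FILES approach, the call can be wrapped in a small shell helper. The function name, the ANSIBLE_BIN override and the site.yml playbook name are assumptions made for illustration; only ANSIBLE_KEEP_REMOTE_FILES and -vvv come from Ansible itself.

```shell
#!/bin/sh
# Sketch: run a playbook so that the generated module scripts stay on the target.
# ANSIBLE_BIN is a hypothetical override (not an Ansible setting); pointing it
# at `echo` gives a dry run that only prints the arguments.
keep_remote_files_run() {
    ANSIBLE_KEEP_REMOTE_FILES=1 "${ANSIBLE_BIN:-ansible-playbook}" -vvv "$@"
}

# Example (site.yml is a placeholder playbook name):
#   keep_remote_files_run site.yml
#
# Afterwards, on the target host, the leftover scripts should be under:
#   ls -R ~/.ansible/tmp/
#
# On recent Ansible versions the leftover file is typically named
# AnsiballZ_<module>.py and, according to the Ansible module debugging guide,
# can be unpacked and re-run by hand (verify against your version):
#   python <path printed in the log>/AnsiballZ_<module>.py explode
#   python <path printed in the log>/AnsiballZ_<module>.py execute
```

Running the leftover script by hand on the target is what shows the commands a module actually calls, along with their output and exit codes.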

Debugging playbooks

As with debugging modules, increasing the verbosity level with the -vvv parameter causes more data to be printed to the Ansible log.
Since Ansible 2.1, the Playbook Debugger lets you debug failed tasks interactively: inspect and modify the data, then re-run the task.
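
One way to switch the Playbook Debugger on without editing the playbook is through the environment. The sketch below assumes the debug strategy plugin (available since Ansible 2.1) selected via ANSIBLE_STRATEGY; the function name, the ANSIBLE_BIN override and site.yml are hypothetical.

```shell
#!/bin/sh
# Sketch: run a playbook with the debug strategy, so that a failed task drops
# into the interactive Playbook Debugger.
# ANSIBLE_BIN is a hypothetical override (not an Ansible setting) for dry runs.
debug_playbook() {
    ANSIBLE_STRATEGY=debug "${ANSIBLE_BIN:-ansible-playbook}" "$@"
}

# Example (site.yml is a placeholder playbook name):
#   debug_playbook site.yml
#
# On Ansible 2.5 and later, ANSIBLE_ENABLE_TASK_DEBUGGER=True is an
# alternative way to enable the debugger.
#
# Commands typically available at the (debug)> prompt:
#   p task.args        print the task arguments
#   p task_vars        print the variables visible to the task
#   task.args['key'] = 'value'   change an argument ('key' is a placeholder)
#   redo               re-run the task
#   continue           carry on with the play
#   quit               leave the debugger
```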

Debugging connections

Adding the -vvvv parameter to the ansible/ansible-playbook call causes the log to include debugging information for the connections.
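
A quick way to look at connection debugging in isolation is an ad-hoc ping with -vvvv. This is only a sketch: the function name, the ANSIBLE_BIN override and the web1 host pattern are assumptions, while the ping module and the -vvvv flag are standard Ansible.

```shell
#!/bin/sh
# Sketch: run the ping module against a host pattern with connection-level
# debugging enabled.
# ANSIBLE_BIN is a hypothetical override (not an Ansible setting) for dry runs.
debug_connection() {
    pattern="${1:?usage: debug_connection <host-pattern> [extra ansible args]}"
    shift
    "${ANSIBLE_BIN:-ansible}" "$pattern" -m ping -vvvv "$@"
}

# Example (web1 and the inventory path are placeholders):
#   debug_connection web1 -i inventory
```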

197

answered Oct 11 '22 18:10

techraf

Debugging Ansible tasks can be almost impossible if the tasks are not your own, contrary to what the Ansible website states:

"No special coding skills needed"

Ansible requires highly specialized programming skills because it is neither YAML nor Python; it is a messy mix of both.

The idea of using markup languages for programming has been tried before. XML was very popular in the Java community at one time. XSLT is also a fine example.

As Ansible projects grow, the complexity grows exponentially as a result. Take, for example, the OpenShift Ansible project, which has the following task:

- name: Create the master server certificate
  command: >
    {{ hostvars[openshift_ca_host]['first_master_client_binary'] }} adm ca create-server-cert
    {% for named_ca_certificate in openshift.master.named_certificates | default([]) | lib_utils_oo_collect('cafile') %}
    --certificate-authority {{ named_ca_certificate }}
    {% endfor %}
    {% for legacy_ca_certificate in g_master_legacy_ca_result.files | default([]) | lib_utils_oo_collect('path') %}
    --certificate-authority {{ legacy_ca_certificate }}
    {% endfor %}
    --hostnames={{ hostvars[item].openshift.common.all_hostnames | join(',') }}
    --cert={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.crt
    --key={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.key
    --expire-days={{ openshift_master_cert_expire_days }}
    --signer-cert={{ openshift_ca_cert }}
    --signer-key={{ openshift_ca_key }}
    --signer-serial={{ openshift_ca_serial }}
    --overwrite=false
  when: item != openshift_ca_host
  with_items: "{{ hostvars
                  | lib_utils_oo_select_keys(groups['oo_masters_to_config'])
                  | lib_utils_oo_collect(attribute='inventory_hostname', filters={'master_certs_missing':True}) }}"
  delegate_to: "{{ openshift_ca_host }}"
  run_once: true

I think we can all agree that this is programming in YAML, which is not a very good idea. This specific snippet could fail with a message like:

fatal: [master0]: FAILED! => {"msg": "The conditional check 'item != openshift_ca_host' failed. The error was: error while evaluating conditional (item != openshift_ca_host): 'item' is undefined\n\nThe error appears to have been in '/home/user/openshift-ansible/roles/openshift_master_certificates/tasks/main.yml': line 39, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: Create the master server certificate\n ^ here\n"}

If you hit a message like that, you are doomed. But we have the debugger, right? Okay, let's take a look at what is going on.

[master0] TASK: openshift_master_certificates : Create the master server certificate (debug)> p task.args
{u'_raw_params': u"{{ hostvars[openshift_ca_host]['first_master_client_binary'] }} adm ca create-server-cert {% for named_ca_certificate in openshift.master.named_certificates | default([]) | lib_utils_oo_collect('cafile') %} --certificate-authority {{ named_ca_certificate }} {% endfor %} {% for legacy_ca_certificate in g_master_legacy_ca_result.files | default([]) | lib_utils_oo_collect('path') %} --certificate-authority {{ legacy_ca_certificate }} {% endfor %} --hostnames={{ hostvars[item].openshift.common.all_hostnames | join(',') }} --cert={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.crt --key={{ openshift_generated_configs_dir }}/master-{{ hostvars[item].openshift.common.hostname }}/master.server.key --expire-days={{ openshift_master_cert_expire_days }} --signer-cert={{ openshift_ca_cert }} --signer-key={{ openshift_ca_key }} --signer-serial={{ openshift_ca_serial }} --overwrite=false"}
[master0] TASK: openshift_master_certificates : Create the master server certificate (debug)> exit

How does that help? It doesn't.

The point here is that it is an incredibly bad idea to use YAML as a programming language. It is a mess. And the symptoms of the mess we are creating are everywhere.

Some additional facts. On Azure, the prerequisites provisioning phase of OpenShift Ansible takes 50+ minutes, and the deploy phase takes 70+ minutes. Every time, whether it is the first run or a subsequent one. And there is no way to limit provisioning to a single node. This limitation was part of Ansible in 2012 and it is still part of Ansible today. That tells us something.

The point here is that Ansible should be used as it was intended: for simple tasks, without the YAML programming. It is fine for lots of servers, but it should not be used for complex configuration management tasks.

Ansible is not an Infrastructure as Code (IaC) tool.

If you have to ask how to debug Ansible issues, you are using it in a way it was not intended to be used. Don't use it as an IaC tool.

answered Oct 11 '22 19:10

onknows
