Ansible Retry Examples - Retry a task until condition met | DevopsJunction

In this article, we are going to see how to retry an Ansible task until it meets a certain condition or passes a validation.

It is much like the do-something-until-some-condition construct available in most programming and scripting languages.

Ansible lets you repeat a task until a condition is satisfied, using the until, retries and delay keywords.

 


Ansible Retry until file exists - A Quick example to Ansible Retry

In this quick example, we validate whether a file has been created at a certain path.

You can think of it as a file watcher.

Our Ansible task keeps checking for the file until it either appears or the retries run out. We can define the number of retries and the interval between each check.

If the file is not created before the retries are exhausted, the task is marked as failed.

---
- name: Retry until a file is available
  hosts: localhost
  tasks:
    - name: Validate if the file is present
      shell: ls -lrt /tmp/myprocess.pid
      register: lsresult
      until: "lsresult is not failed"
      retries: 10
      delay: 10

When the file is not created, Ansible reruns the check 10 times, waiting 10 seconds between attempts, and then marks the task as failed.

The value you give for delay is interpreted in seconds. If you leave them out, retries defaults to 3 and delay to 5.

You can write your task with any module and make it retry until a certain condition is met.

Here I have used the shell module to run the ls -lrt command. You can do the same with the file module too.

---
- name: Retry until a file is available
  hosts: localhost
  tasks:
    - name: Validate if the file is present
      file:
        path: /tmp/myprocess.pid
        state: file
      register: lsresult
      until: "lsresult is not failed"
      retries: 10
      delay: 10

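Here state: file does not create anything. It only reports the current state of the given path and fails if the path does not exist or is not a regular file, which makes it a convenient way to validate the presence of the file.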
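As a hedged alternative sketch, the same check can be written with the stat module, which reports whether the path exists instead of failing when it is missing. The registered name pidfile is only an illustrative choice and is not part of the original playbook.

```yaml
---
- name: Retry until a file is available (stat variant)
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Check whether the PID file exists
      stat:
        path: /tmp/myprocess.pid
      register: pidfile            # illustrative variable name
      # stat succeeds even when the file is missing, so test the exists flag
      until: pidfile.stat.exists
      retries: 10
      delay: 10
```

With this form the until condition reads the result data directly, rather than relying on the task failing.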

 

Where can Ansible Retry until be used?

As you can see, we can add the until, retries and delay keywords to any task and make it retry a number of times at a defined interval.

I would not call this an alternative to the typical loop that runs a job until a condition is met.

As you might have already noticed, Ansible retry until reruns the task only up to the defined number of retries. It cannot run the task forever until the condition is met, like a typical loop.

So plain retry until is not a good fit for tasks whose duration we cannot foresee or estimate.

It is definitely not a replacement for a typical loop, but it does solve a major problem, the infinite loop, by timing out the task after the defined retries.

Sometimes, though, we want a task to keep going until a certain condition is met, no matter how many retries it takes.

Is there a way to make Ansible retry indefinitely?

Fortunately, yes. There is a way to make Ansible retry in an endless loop, and we will talk about it later in this article.

 

Ansible Retries without Until

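When you run a task with until and register the result as a variable, the registered variable includes a key called “attempts”, which records the number of attempts made for the task.

You must set the until parameter if you want a task to retry.

If until is not defined, the value of the retries parameter is forced to 1, so the task does not retry. This is the behaviour on the Ansible version used for this article; newer ansible-core releases may treat retries without until differently, so check the documentation for your version.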
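A minimal sketch of reading the attempts key, reusing the file watcher task from earlier. The debug task and its message are illustrative additions, and the default(1) filter is only a defensive assumption in case the key is absent.

```yaml
---
- name: Show how many attempts a retried task needed
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Validate if the file is present
      file:
        path: /tmp/myprocess.pid
        state: file
      register: lsresult
      until: lsresult is not failed
      retries: 10
      delay: 10

    - name: Report the attempt count
      debug:
        # attempts is added to the registered result when until is used
        msg: "File found after {{ lsresult.attempts | default(1) }} attempt(s)"
```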

Let us take the same file watcher playbook we have been using so far.

All I am going to do is remove the until keyword, while retries stays at 10.

---
- name: Retry until a file is available
  hosts: localhost
  tasks:
    - name: Validate if the file is present
      file:
        path: /tmp/myprocess.pid
        state: file
      register: lsresult
      retries: 10
      delay: 10

 

When this playbook is executed, the task runs once and fails straight away.

Despite retries being set to 10, the task did not rerun. This is the result of using retries without an until condition.

Difference between Ansible wait_for and Retry until

If you have been practising Ansible for some time, this question may already be on your mind:

How is it different from the Ansible wait_for module?

For those who do not know what Ansible wait_for is, please read our dedicated wait_for article and return here.

Ansible wait_for module examples – How to | Devops Junction

While wait_for serves a similar use case and makes the play wait until a certain condition is met, there are a few major differences between Ansible wait_for and Ansible retry until.

Here I have tried to summarize a few of them.

| Ansible wait_for | Ansible Retry until |
| --- | --- |
| Covers a limited number of use cases. It cannot be combined with other modules, because wait_for is a module itself. | Can rerun any module using retries and until. |
| Works with ports and network interfaces, and can wait until a reboot is completed. | Cannot work directly with network interfaces or be used for port and connection monitoring. |
| A customized failure message is possible. | Since retry is not a separate module but an add-on to rerun any module, it cannot have a customized failure message. |
| Polls only the condition inside a single task run, so the attempts made cannot be tracked. | Reruns the entire task on each attempt until the condition is met, so the number of attempts can be tracked. |

So both of them have their own use cases. Some of these overlap, like a file watcher, and we can choose whichever is the right fit for the job.
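To illustrate the port use case from the comparison above, here is a minimal sketch of wait_for waiting for a TCP port to accept connections. The host, port and timings are assumptions chosen to match the local examples in this article.

```yaml
---
- name: Wait for a local service port
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Wait for port 8080 to accept connections
      wait_for:
        host: localhost   # assumed host
        port: 8080        # assumed port
        delay: 5          # seconds to wait before the first check
        timeout: 60       # give up after this many seconds
        state: started
```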

In fact, the file watcher playbook we have been discussing in this article can be achieved with wait_for too.

Here is a task that waits for the file to be created using the wait_for module.

- name: Wait for the file to be available
  register: waitforfile
  wait_for:
    path: /tmp/myprocess.pid
    delay: 10
    timeout: 30
    state: present
    msg: "Specified PID FILE is not present"

When this task is executed, it simply waits for the condition in the background and fails once the timeout has elapsed.

So retries with until and wait_for approach the same job in different ways.

Here are some more examples of Ansible retries with until.

As already mentioned, you can use retries with any module and make it retry until a certain condition is met.
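As one hedged illustration with a different module, the sketch below uses the command module and retries until the command's return code is 0. The check itself (that the PID file holds only digits) is an assumption made up for this example, as is the variable name pidcheck.

```yaml
---
- name: Retry a command until it exits cleanly
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Wait for the PID file to contain a number
      # grep exits non-zero while the file is missing or has no numeric line
      command: grep -Eq '^[0-9]+$' /tmp/myprocess.pid
      register: pidcheck        # illustrative variable name
      until: pidcheck.rc == 0
      retries: 6
      delay: 5
      changed_when: false       # a read-only check never changes the host
```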

Ansible Retry until the remote URL is returning a message - Ansible wait_for URL or website

Here is an example of Ansible retry used with the uri module to repeatedly check a remote URL until it returns a certain message or content.

In other words, we are waiting for the website or web page to respond.

In this example, we hit a locally hosted web application at http://localhost:8080, but in real use cases it can be a remote URL.

We look for the message Completed in the response. If the response does not contain Completed, or the site is not responding or returns a 4xx or 5xx error, the job simply retries 2 times with a delay of 10 seconds.

You can increase the delay and retries as per your requirement.

---
- name: Ansible Retry Examples
  hosts: localhost
  tasks:
  - name: Job Status Check
    uri:
      url: http://localhost:8080
      return_content: yes
    register: result
    until: "'Completed' in result.content"
    ignore_errors: yes
    retries: 2
    delay: 10
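A slightly more defensive sketch of the same check, assuming that a failed request may come back without a content key at all. The default('') filter keeps the until expression valid in that case; whether it is needed depends on how the request fails in your environment.

```yaml
---
- name: Ansible Retry Examples (defensive condition)
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Job Status Check
      uri:
        url: http://localhost:8080
        return_content: yes
      register: result
      # fall back to an empty string when the response carries no content
      until: "'Completed' in (result.content | default(''))"
      ignore_errors: yes
      retries: 2
      delay: 10
```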

To test this, I used a simple Node.js server (server.js) that returns the message Completed.

If the message is present, the job succeeds immediately on the first attempt.

If the message is not present, the job retries the specified number of times before eventually failing.

For example, after changing the message in server.js to InProgress and raising retries to 10, the task fails after trying 10 times.

Ansible Retry until with include_tasks

Here is one more example, where we run a separate tasks file using include_tasks and retry inside it until the condition is satisfied.

It follows the same template we have been using. Instead of a single task, we invoke one or more tasks through include_tasks.

Now let us see an example that invokes two kinds of task:

  • Retry until the file exists
  • Send a Slack notification with the status of the file creation

Both live in the same file, which is called by include_tasks.

Since this is include_tasks, we are going to create two individual files:

  • main.yml - the main playbook
  • retry-until-file-create.yaml - tasks file

Here is the main.yml file, which invokes the other tasks using include_tasks.

main.yml

---
- name: Ansible Retry with include_tasks example
  hosts: localhost
  tasks:
    - name: Executing the task
      include_tasks:
        file: retry-until-file-create.yaml

 

retry-until-file-create.yaml

---
  - name: Wait for the file to be available
    register: fileexists
    file:
      path: /tmp/myprocess.pid
      state: file
    until: fileexists is not failed
    retries: 5
    delay: 10
    ignore_errors: true
  
  - name: notify Slack that the job is failing
    tags: slack
    community.general.slack:
      token: T02****8KPF/*******/WOa7r*****tXy7Ao0jnWn
      msg: |
          ### StatusUpdate ###
          – ------------------------------------
          ``
          `Server`: {{ansible_host}}
          `Status`: Ansible File Watcher Job failed
          – ------------------------------------
      channel: '#ansible'
      color: danger
      username: 'Ansible on {{ inventory_hostname }}'
      link_names: 0
      parse: 'none'
    when: fileexists is failed

  - name: notify Slack that the job is Successful
    tags: slack
    community.general.slack:
      token: T02****8KPF/*******/WOa7r*****tXy7Ao0jnWn
      msg: |
          ### StatusUpdate ###
          – ------------------------------------
          ``
          `Server`: {{ansible_host}}
          `Status`: Ansible File Watcher Job Successful.
          – ------------------------------------
      channel: '#ansible'
      color: good
      username: 'Ansible on {{ inventory_hostname }}'
      link_names: 0
      parse: 'none'
    when: fileexists is not failed

That is a quick example of using include_tasks with Ansible retries.

The main task checks for the file for the defined period and then fails. Since we have set ignore_errors: true, it does not fail the entire playbook.

The play then moves on to the next stage and sends a notification based on the status of the main task.

If the file exists, the main task succeeds and the success notification is sent to Slack.

In case of failure, the failure notification is sent instead.

Both are selected using the when: fileexists is failed and when: fileexists is not failed checks.

With this in place, a simple status notification is delivered to our Slack channel.

 

Ansible Retry until with include_tasks - Ansible Infinite Retry

As I mentioned earlier, there is a way to make Ansible retry indefinitely.

Let us see how to make Ansible retry never end.

To make Ansible retry continue even after the retries are exhausted, we use the same principle we used with include_tasks.

The first step is to split out the task you want to execute and run it through include_tasks.

  • main.yml - the main playbook
  • retry-until-file-create.yaml - tasks file

In the previous example, we saw how to create two files: the main playbook and the file that contains the tasks.

The difference between the previous example and the current one is small: one extra task at the end of the tasks file.

We make retry-until-file-create.yaml call itself once again using include_tasks.

The tasks file calls itself again (restarts) if the main task is not successful.

Here are the playbooks.

main.yml

---
- name: Ansible Retry with include_tasks example
  hosts: localhost
  tasks:
    - name: Executing the task
      include_tasks:
        file: retry-until-file-create.yaml

 

retry-until-file-create.yaml

---
  - name: Wait for the file to be available
    register: fileexists
    file:
      path: /tmp/myprocess.pid
      state: file
    until: fileexists is not failed
    retries: 5
    delay: 10
    ignore_errors: true
  
  - name: notify Slack that the job is failing
    tags: slack
    community.general.slack:
      token: T02****8KPF/*******/WOa7r*****tXy7Ao0jnWn
      msg: |
          ### StatusUpdate ###
          – ------------------------------------
          ``
          `Server`: {{ansible_host}}
          `Status`: Ansible File Watcher Job failed
          – ------------------------------------
      channel: '#ansible'
      color: danger
      username: 'Ansible on {{ inventory_hostname }}'
      link_names: 0
      parse: 'none'
    when: fileexists is failed
    ignore_errors: true

  - name: notify Slack that the job is Successful
    tags: slack
    community.general.slack:
      token: T02****8KPF/*******/WOa7r*****tXy7Ao0jnWn
      msg: |
          ### StatusUpdate ###
          – ------------------------------------
          ``
          `Server`: {{ansible_host}}
          `Status`: Ansible File Watcher Job Successful.
          – ------------------------------------
      channel: '#ansible'
      color: good
      username: 'Ansible on {{ inventory_hostname }}'
      link_names: 0
      parse: 'none'
    when: fileexists is not failed

  - name: Re run the task if failed
    include_tasks: retry-until-file-create.yaml
    when: "fileexists is failed"

 

When this playbook runs, the job keeps retrying as long as the main task fails and stops automatically once the main task (the file existence check) succeeds.

This is how we can make Ansible retry continue indefinitely without timing out.
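Because every failed pass includes the tasks file again, an unbounded loop can run for a very long time. Here is a hedged sketch of a bounded variant that gives up after a set number of passes. The variables watch_pass and max_passes are hypothetical names introduced for this illustration, and the Slack tasks are left out for brevity.

```yaml
---
# retry-until-file-create.yaml (bounded variant, illustrative only)
- name: Count this pass through the tasks file
  set_fact:
    # watch_pass is a hypothetical counter, starting at 0 when undefined
    watch_pass: "{{ (watch_pass | default(0) | int) + 1 }}"

- name: Wait for the file to be available
  file:
    path: /tmp/myprocess.pid
    state: file
  register: fileexists
  until: fileexists is not failed
  retries: 5
  delay: 10
  ignore_errors: true

- name: Give up after too many passes
  fail:
    msg: "File still missing after {{ watch_pass }} passes"
  when:
    - fileexists is failed
    # max_passes is a hypothetical limit, defaulting to 20 when not set
    - "(watch_pass | int) >= (max_passes | default(20) | int)"

- name: Re run the task if failed
  include_tasks: retry-until-file-create.yaml
  when: fileexists is failed
```

The limit could be supplied on the command line, for example with -e max_passes=50, while leaving it unset falls back to the default in the condition.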

 

Conclusion

In this article, we have learnt about Ansible retry until and how it differs from the Ansible wait_for module.

We also covered the following key topics.

  • How to retry until a task is complete with Ansible
  • How to retry until a file exists in Ansible
  • Retrying with include_tasks
  • How to send a Slack notification when the job fails after retrying
  • How to make Ansible retry run indefinitely without a timeout

Hope this helps. If you are looking for professional DevOps support, for your company or individually, do reach out to us at hello@gritfy.com

 

Cheers
Sarav AK
