Getting started with Ansible

If you have ever managed more than one machine, be it servers, IoT devices or others, you may have found it challenging to keep the configuration consistent across all of them. To make things worse, as you added more machines, it probably became almost impossible to keep up.

Configuration-as-code (CaC) tools like Ansible allow you to manage the configuration of machines as code, that is, write your configuration in code where it can be version controlled, reviewed, or re-used. CaC tools typically also allow you to manage the configuration of many machines at once, which saves a lot of time!

What is Ansible?

Ansible is an open-source tool for configuration management, software deployment and provisioning. Provisioning is the less common use case, as purpose-built tools exist that are better suited to it, but it’s worth mentioning. Ansible is praised for being approachable, agentless and straightforward.

Approachable

Ansible “playbooks” and other files are written in YAML, which makes them easy to read and approachable even for people who have never written code. Ansible’s setup is also straightforward, and its excellent documentation makes it easy to learn and get started quickly. These qualities make Ansible very approachable compared to its competitors.

Agentless

An important point about Ansible is that it is agentless, meaning no agent software needs to be installed on hosts for Ansible to configure them. Ansible operates by connecting to host systems over SSH (WinRM on Windows). Ansible then copies its modules to the host, where they are executed by the host system’s Python interpreter.

Agentlessness is very useful as the only requirement for Ansible to manage a host system is for that system to be accessible via SSH and for the host system to have Python installed.

Idempotent

An essential concept in Ansible and other tools is idempotence. Idempotence is the concept that an operation can be applied many times and not make changes beyond its initial purpose. Take a red off switch on a factory machine as an example; this is an idempotent operation. You can press the off switch to turn off the device, you can then continue to push the off switch again and again, but nothing beyond switching off the machine will occur.

Idempotence is important to keep in mind when writing playbooks, as Ansible can be very flexible and does not strictly ensure idempotency on its own. A well-written playbook is idempotent: when run repeatedly on the same machine, it makes only the configuration changes needed to reach the desired state, and makes none if that state is already met.
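As a minimal sketch of what an idempotent task looks like, the play below asks for a directory to exist rather than telling Ansible to create it. The path /tmp/ansible-demo is a hypothetical example chosen for illustration and is not used elsewhere in this post.

```yaml
---
- name: Idempotence sketch
  hosts: localhost
  gather_facts: false

  tasks:
    # Describes a desired state: the directory should exist with this mode.
    # /tmp/ansible-demo is a hypothetical path chosen for illustration.
    - name: Ensure a demo directory exists
      ansible.builtin.file:
        path: /tmp/ansible-demo
        state: directory
        mode: "0755"
```

On a first run this task should report changed; on later runs it should report ok, because the directory already exists with the requested mode.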

I will demonstrate idempotence at the end of this post, but discussing where Ansible can become non-idempotent goes beyond the scope of this post.

First steps

All examples in this post were created and tested using the following versions:

Name	Version
Ansible	`2.12.1`
Python	`3.10.2`
Jinja	`3.0.3`
Vagrant	`2.2.19`

Installation

Please refer to the installation instructions in Ansible’s documentation.

Note: at the time of writing, you cannot run Ansible from Windows without workarounds.

Optional requirements

For this demonstration, it’s recommended to install Vagrant as we will use Vagrant throughout for creating disposable test environments. The aim of this post is not to introduce you to Vagrant, but it will be a valuable tool for this, and I will provide the necessary Vagrant files and commands.

Note: Vagrant can use several providers, including VirtualBox, Hyper-V and Docker. Ensure you have a supported provider installed on your machine before attempting to run Vagrant.

Setting up Vagrant

After installing Vagrant, you’ll need a Vagrantfile to get started. Create a file named Vagrantfile at the root of your project with the below contents.

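# Two disposable test boxes, each forwarding guest port 80 to its own host port.
Vagrant.configure("2") do |config|
  boxes = [
    {
      :name => "ubuntu-test",
      :box => "generic/ubuntu2010",
      :http_port => 8080
    },
    {
      :name => "fedora-test",
      :box => "generic/fedora34",
      :http_port => 8081
    }
  ]
  boxes.each do |opts|
    # "machine" is the per-box configuration (renamed to avoid shadowing "config").
    config.vm.define opts[:name] do |machine|
      machine.vm.box = opts[:box]
      machine.vm.network "forwarded_port", guest: 80, host: opts[:http_port]
      # Attach the Ansible provisioner to the last box only, so the playbook
      # runs once, after every box is up; limit = "all" targets all boxes.
      if opts[:name] == boxes.last[:name]
        machine.vm.provision "ansible" do |ansible|
          ansible.playbook = "playbook.yml"
          ansible.limit = "all"
        end
      end
    end
  end
end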

Basic commands

To follow along, you will only need a few commands.

To create your Vagrant boxes, run:

vagrant up

Note: During vagrant up, the boxes will automatically be created and provisioned with our playbook.yml playbook.

To remove the boxes you’ve created, run:

vagrant destroy

To rerun the provisioning step (Ansible in this case) on existing boxes, run:

vagrant provision

Ad-hoc commands

Let us begin by running a simple ad-hoc command. Assuming you have followed Ansible’s installation instructions correctly, you should be able to run the below ad-hoc command.

ansible localhost -m ping

To better understand what’s happening here, let’s deconstruct this command. We call ansible, providing localhost as our first argument; this argument tells Ansible on what machines to execute its tasks, in this case our local machine. We also provide the -m option, which allows us to specify an Ansible module to run. Here, we have chosen ping, which checks that Ansible can connect to a host and find a usable Python interpreter there.

localhost | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

The output should be a ‘pong’ response like the example above.

Ad-hoc commands can be a handy tool, especially for once-off, single-step jobs like rebooting machines or gathering facts; any module available in Ansible can be run this way.

Our first playbook

Ad-hoc commands are great, but you will often have a whole set of tasks to run and may need to rerun them in the future. Playbooks allow us to maintain more complex sets of tasks in YAML files.

We can start simple and replicate our ping task as our first playbook. First, create a project folder and, within it, a file called ping_me.yml like below:

---
- name: My first playbook
  hosts: localhost

  tasks:
    - name: Ping me!
      ping:

Firstly, we begin our playbook with ---. Your playbook will likely run without this, but it is defined in the YAML specification. The --- is used as a separator for YAML directives (we won’t need them here) or to indicate the beginning of a document if no directives exist (like the example above).

Next, we define some information about this playbook. Firstly, we give it a name; this can be anything, but it’s best to keep it meaningful and descriptive. Then we define the hosts; here we use localhost, the same as with the ad-hoc command. As before, hosts tells Ansible on what machines to execute its tasks.

Lastly, we provide a list of tasks, just one task for now. We first define a name for our task; again, it’s best to keep it meaningful and descriptive. The final line, ping, tells Ansible what module we wish to use. Most modules will require additional information, which I will demonstrate later.
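As an aside, here is a small sketch of what passing information to a module looks like. The ping module accepts an optional data parameter that changes the value it returns; the value hello is arbitrary, and this variant is not part of the playbook we are building.

```yaml
---
- name: Passing a parameter to a module
  hosts: localhost

  tasks:
    - name: Ping me with a custom reply
      ping:
        data: hello   # returned in place of the default "pong"
```

Module parameters are written as key-value pairs indented beneath the module name, which is the pattern the later tasks in this post follow.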

You can then run your playbook with:

ansible-playbook ping_me.yml

You should get an output like below.

PLAY [My first playbook] **********************************************************************************************************************************************************************

TASK [Gathering Facts] ************************************************************************************************************************************************************************
ok: [localhost]

TASK [Ping me!] *******************************************************************************************************************************************************************************
ok: [localhost]

PLAY RECAP ************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

This output looks a lot different from the ad-hoc command. At the top, we see the play’s name; we then see that two tasks have run, Gathering Facts followed by our Ping me! task, and both have run successfully. Finally, we get a recap of our tasks.

So, where did Gathering Facts come from? By default, at the beginning of every play, Ansible runs an initial task to gather facts about the host machine. These facts can then be used in your playbooks like variables; further on, we will demonstrate the use of facts.

Note: Gathering Facts can be explicitly enabled or disabled with gather_facts but is enabled by default.
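For example, a variant of ping_me.yml that skips fact gathering might look like the sketch below. With gather_facts set to false, the Gathering Facts task should no longer appear in the output, and facts will not be available to the play’s tasks.

```yaml
---
- name: My first playbook, without fact gathering
  hosts: localhost
  gather_facts: false   # skip the implicit Gathering Facts task

  tasks:
    - name: Ping me!
      ping:
```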

Inventory

The inventory is where Ansible looks up what machines exist and how to connect to them. When using the Ansible Vagrant provisioner, Vagrant creates and manages our inventory for us. We won’t explore inventory in depth here as this post is aimed at the first steps, but it is essential to know it exists.

After running vagrant up, you can view the inventory created by Vagrant here:

.vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory
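Vagrant writes that file for us, but an inventory can also be written by hand. As a rough sketch only, a static inventory in YAML form for two similar hosts might look like the following; the file name, addresses, ports and user are assumed values for illustration, not ones taken from the generated file.

```yaml
# inventory.yml (hypothetical hand-written inventory)
all:
  hosts:
    ubuntu-test:
      ansible_host: 127.0.0.1   # assumed address
      ansible_port: 2222        # assumed forwarded SSH port
      ansible_user: vagrant     # assumed login user
    fedora-test:
      ansible_host: 127.0.0.1   # assumed address
      ansible_port: 2200        # assumed forwarded SSH port
      ansible_user: vagrant     # assumed login user
```

A file like this would be passed to Ansible with the -i option, for example ansible-playbook -i inventory.yml playbook.yml.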

A practical playbook

I find the best way to learn is through action, and having a practical goal is even better! We will create a simple playbook to install an Apache server and learn some more Ansible along the way.

Start with a new playbook and create a new file called playbook.yml. You can start this playbook like below:

---
- name: Install apache server
  hosts: all

  tasks:

Tasks

A playbook with no tasks is not very useful at all. Ansible tasks call on modules to get work done; Ansible has a long list of modules in the builtin collection, and even more community-supported modules and collections are packaged with Ansible or available through Ansible Galaxy.

Let’s add our first task. We can use the package module to manage packages on a host. Here we will use this module to install the Apache server. Add the below task to your playbook under tasks.

- name: Install Apache
  package:
    name: apache2
    state: present

The package module accepts a package name with name and a state, which tells Ansible what state you wish the package to be in. Common states are present, absent and latest.

Let’s create our boxes and run this playbook with:

vagrant up

You should see some permission errors; this leads us nicely to the next section about become.

Become

When we usually install packages on a Linux system via the command line, we may need to enter a command like sudo apt install. The sudo command elevates this command’s privileges, so we must do the same in Ansible through the become directive.

After the name, add become: true to your install Apache task. See below.

- name: Install Apache
  become: true
  package:
    name: apache2
    state: present

Now rerun this playbook with:

vagrant provision

This time, we should see that the Ubuntu machine has changed, but our Fedora machine tells us the package is not available. Why? Ubuntu and Fedora use different names for their Apache package, apache2 and httpd, respectively.

To deal with variants like this, we’ll need to use variables.

Variables

Here is where Ansible facts and the Gathering Facts step mentioned earlier become useful. One fact gathered from hosts is distribution, which tells us what operating system distribution a host is running.
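To see the value of this fact for yourself, a throwaway task using the debug module can print it. This is a sketch for exploration only and is not needed in the final playbook.

```yaml
- name: Show the distribution fact
  ansible.builtin.debug:
    var: ansible_facts.distribution
```

On our two boxes this should print Ubuntu and Fedora respectively.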

There are many options and ways to work with variables; today, we will load variables based on this distribution fact. We can use the vars_files directive to load variables from a file.

Add the below before your tasks section.

vars_files:
  - "vars/{{ ansible_facts.distribution | lower }}.yml"

Take note of | lower; this is a Jinja filter. Filters are used to manipulate data and can be very powerful when needed; lower here will convert the string to lower case, so Ubuntu becomes ubuntu and matches the file name vars/ubuntu.yml.
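As a further sketch, filters work on literals as well as facts, and there are many more than lower. The task below is illustrative only; upper and default are two other built-in Jinja filters, and my_undefined_variable is a hypothetical name that is deliberately not defined anywhere.

```yaml
- name: Try out a few Jinja filters
  ansible.builtin.debug:
    msg:
      # Lower-cases the gathered fact, e.g. "Ubuntu" becomes "ubuntu"
      - "{{ ansible_facts.distribution | lower }}"
      # Upper-cases a literal string
      - "{{ 'Fedora' | upper }}"
      # Falls back to the given value when the variable is undefined
      - "{{ my_undefined_variable | default('fallback') }}"
```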

Next, create a vars directory, and place two files within, vars/ubuntu.yml and vars/fedora.yml.

Contents of vars/ubuntu.yml:

---
apache_package: apache2
apache_service: apache2

Contents of vars/fedora.yml:

---
apache_package: httpd
apache_service: httpd

Now change your single Install Apache task to use the apache_package variable like below and rerun your provisioning.

- name: Install Apache
  become: true
  package:
    name: "{{ apache_package }}"
    state: present

All your tasks should complete successfully, stating either ok or changed. Apache should now be on both VMs; let us try to connect to them. In your browser, enter 127.0.0.1:8080 for our Ubuntu machine. You should be greeted with a placeholder webpage.

Next, try 127.0.0.1:8081 for our Fedora machine. This time it will not work, because after installing Apache on this version of Fedora, we must also start and enable the service and open a firewall port.

First, let’s add a new task after ‘Install Apache’ to ensure the service is running and enabled; we can do this with the service module.

- name: Ensure Apache is running and enabled
  become: true
  service:
    name: "{{ apache_service }}"
    state: started
    enabled: yes

Using Collections

Next, we want to ensure port 80 is allowed through the firewall. For this post, we will add a simple task to enable HTTP where the firewalld service is present; this will also allow us to explore and demonstrate collections and conditionals.

To allow HTTP traffic through the firewalld service, we will need to use the ansible.posix collection. First, let’s add a requirements file, requirements.yml, at the root of your project.

Contents of requirements.yml:

---
collections:
  - name: ansible.posix

Next, to install our requirements, run:

ansible-galaxy install -r requirements.yml

Finally, we add a new task to allow access to Apache on port 80, but we need to ensure this only runs where firewalld is present to prevent an error, so let’s talk conditionals.

Conditionals

Ansible has a useful when directive. This directive will only allow a playbook object to be run if the when condition evaluates to true; if false, the object will be skipped.
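As a minimal sketch of when in isolation, the task below runs only on hosts whose os_family fact is Debian (which includes Ubuntu) and is skipped elsewhere. The apt cache update is an assumed example task and is not part of the playbook we are building.

```yaml
- name: Update the apt cache on Debian-family hosts only
  become: true
  ansible.builtin.apt:
    update_cache: true
  # Evaluated per host; hosts where this is false report "skipping"
  when: ansible_facts.os_family == "Debian"
```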

Add the below tasks before the task that ensures Apache is running. The first task collects facts about the services on each host, and the second tells firewalld to allow HTTP traffic on port 80. Take note of the when condition here.

- name: Collect service facts
  service_facts:

- name: Permit http traffic
  become: true
  ansible.posix.firewalld:
    service: http
    permanent: yes
    immediate: yes
    state: enabled
  when: ansible_facts.services['firewalld.service'] is defined

Now, if you rerun the playbook and try to access 127.0.0.1:8080 or 127.0.0.1:8081, you should see the placeholder page on both.
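For reference, assembling the fragments from the sections above gives a playbook.yml along the following lines. Treat it as a sketch of the end result rather than a definitive listing; the task order follows the instructions above, with the firewall tasks placed before the service task.

```yaml
---
- name: Install apache server
  hosts: all

  # Loads vars/ubuntu.yml or vars/fedora.yml depending on the host
  vars_files:
    - "vars/{{ ansible_facts.distribution | lower }}.yml"

  tasks:
    - name: Install Apache
      become: true
      package:
        name: "{{ apache_package }}"
        state: present

    - name: Collect service facts
      service_facts:

    - name: Permit http traffic
      become: true
      ansible.posix.firewalld:
        service: http
        permanent: yes
        immediate: yes
        state: enabled
      when: ansible_facts.services['firewalld.service'] is defined

    - name: Ensure Apache is running and enabled
      become: true   # starting and enabling a service generally needs root
      service:
        name: "{{ apache_service }}"
        state: started
        enabled: yes
```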

Idempotence Example

Now that our machines are configured, rerun your playbook and observe the difference in the play recap. You should notice that all tasks are listed as ok or skipped: the desired state is already met, so Ansible has nothing further to do. As our playbook is idempotent, no changes are made.

Conclusion

We’ve discussed what Ansible is, some of its pros, and even created a simple yet working playbook to install and configure an Apache server. We explored tasks, variables, and collections and used directives such as become and when to achieve this. This knowledge should act as a good starting point to begin exploring more about Ansible and be on your way to automating more with configuration-as-code.

This post only scratches the surface of what Ansible is capable of, and we have yet to see handlers, roles, and so much more. I encourage you, the reader, to take this foundation and build upon it, take the following steps, learn, create and automate.
