Monday, February 29, 2016

Common Problems Using Ansible 2.0 OpenStack Modules

I've seen some recurring issues with the new Ansible 2.0 OpenStack modules popping up either on IRC or on the Ansible mailing lists. It's probably time to address these in a blog post.

Issue #1: Putting clouds.yaml in current directory

As I discussed in a previous blog post, all of the new OpenStack modules support using a clouds.yaml file to hold the cloud authentication details. The library that reads this file supports loading it from the current directory, among other places. I think this is what trips people up: people create the file in the same directory where they run their playbook and assume it will be read in.

This is not so. Why?

The answer lies in how Ansible works. It's not documented in a very obvious place, but the important aspect is documented with this Ansible configuration variable:
"Ansible works by transferring modules to your remote machines, running them, and then cleaning up after itself."
The modules are copied to the remote_tmp directory on the target host (even if it is localhost) and run from there. Your clouds.yaml is NOT copied along with the module, so it will not be found.
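The remote_tmp setting in ansible.cfg controls where that copy happens; the snippet below shows its default value:

[defaults]
remote_tmp = $HOME/.ansible/tmp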

The solution? Place clouds.yaml in one of the other locations on the target host where it can be found:
  • /etc/openstack/
  • ~/.config/openstack/
The second option uses the home directory of the user executing the playbook task. The simplest option is the /etc/openstack/ directory.
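As a quick sketch, a minimal /etc/openstack/clouds.yaml might look like the following (the cloud name, credentials, and endpoint here are placeholders; substitute your own provider's values):

clouds:
  mycloud:
    auth:
      auth_url: https://identity.example.com:5000/v2.0
      username: david
      password: SuperSecretPassword
      project_name: my-project-name
    region_name: RegionOne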

Issue #2: Running Ansible from a virtual Python environment

This second issue also comes from a fundamental misunderstanding of how Ansible works.

New users of Ansible, understandably excited by such an awesome piece of software, will often begin experimenting with it by installing Ansible in a virtual Python environment rather than installing it from a distro package or as part of the system Python. There's nothing wrong with doing this. But confusion sets in when they install all of the required Python libraries in this virtual environment, along with the version of Ansible they are testing, and then run their playbook (using localhost as the target host) only to get a "missing library" or similar error message. But you KNOW that library is installed in the virtual environment! What gives?

As discussed in the issue above, Ansible copies modules to the target host and runs them remotely. The Python interpreter it uses to run the modules, by default, is the system Python (i.e., not your virtual environment, but rather /usr/bin/python). It doesn't care about your virtual environment because it sees even localhost as a different machine.

How do you tell it to use your virtual environment Python when connecting to localhost? You have to set the ansible_python_interpreter option in your Ansible inventory file. It is mentioned on the inventory documentation page and also discussed in the Ansible FAQ. Basically, you want an entry that looks something like:

localhost   ansible_python_interpreter=/path/to/venv/bin/python

That should force Ansible to use your virtual environment.
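For example, assuming a virtual environment at /home/david/venv (a path made up for illustration), a minimal inventory file and playbook run might look like:

# hosts.ini
localhost   ansible_python_interpreter=/home/david/venv/bin/python

$ ansible-playbook -i hosts.ini my_playbook.yml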

Monday, July 6, 2015

Ansible OpenStack User Modules: The Rebirthening

Lately, I've been hard at work on improving the OpenStack Ansible modules. Thus far, we've been working on the modules oriented toward normal users: people who just want to spin up a new instance that works, and possibly add a volume or similar, without experiencing any pain.

It turns out that experience can be painful, depending on which public cloud provider (or possibly private cloud) you are using, because although the provider might be running OpenStack, they have made choices that differentiate them from other providers. Thus, the user experience can be somewhat inconsistent.

Our top goal is to hide these provider-specific pain points from the user and supply the user with something that "just works". The magic sauce that does that is the shade library. Shade is a topic for another blog post, but you should check it out for any of your own projects that need a Python client for OpenStack.
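As a tiny taste, here is a minimal sketch of using shade directly (it assumes an "hp" entry in a clouds.yaml file, which is described later in this post):

import shade

# Build a cloud object from the "hp" entry in clouds.yaml.
cloud = shade.openstack_cloud(cloud='hp')

# List servers without worrying about provider quirks.
for server in cloud.list_servers():
    print(server.name)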

The existing set of OpenStack Ansible modules (nova_compute, glance_image, etc) are being deprecated in favor of a new set of modules based on the shade library. All of the new modules are named with the prefix "os_" (e.g., os_server, os_image) and require that the shade library be installed.
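Shade is available on PyPI, so installing it should be as simple as:

$ pip install shade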

Below is a simple Ansible playbook demonstrating how to spin up a new instance.

- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Create instance
      os_server:
        name: my_test_server
        auth:
          auth_url: https://region-b.geo-1.identity.hpcloudsvc.com:35357/v2.0/
          username: david
          password: SuperSecretPassword
          project_name: my-project-name
        region_name: region-b.geo-1
        state: present
        flavor: standard.xsmall
        image: dbebe467-56c0-43e6-b7f1-7d0233109e2e

This example uses HP Public Cloud to spin up a small Ubuntu instance. If you want to use another provider, you would need to change the auth parameters, as well as region_name, flavor, and image values.

I hear you thinking... Hey! That's great. But I use multiple providers and managing all of that authentication mumbo-jumbo throughout several playbooks is a major pain in the rear.

Ah, yes. We totally agree! Built within the shade library is the option to define all of your providers within a single YAML file (provided by the os-client-config library), allowing you to easily refer to them by a user-defined name. This functionality is inherited by the Ansible modules as well!

Now I hear you thinking... Sweeeeeet.


Let's setup a YAML file called clouds.yaml and put it in the $HOME/.config/openstack directory. (This can actually go in several locations. Check the os-client-config README for details.)

clouds:
  hp:
    profile: hp
    region_name: region-b.geo-1
    auth:
      username: david
      password: SuperSecretPassword
      project_name: my-project-name
  rax:
    profile: rackspace
    auth:
      username: david
      password: SuperSecretPassword
      project_name: 012345
    region_name: IAD

The YAML above defines two providers: one named "hp" and another named "rax". The os-client-config library also defines several characteristics of each provider it knows about. That, too, is a topic for another blog post.

Now we can refer to either of these providers by name in our playbooks. Let's look at the first example again, only this time taking advantage of our clouds.yaml file:

- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Create instance
      os_server:
        name: my_test_server
        cloud: hp
        state: present
        flavor: standard.xsmall
        image: dbebe467-56c0-43e6-b7f1-7d0233109e2e

Well that's a bit cleaner. Now how about deleting it?

- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Delete instance
      os_server:
        name: my_test_server
        cloud: hp
        state: absent

Simple enough.
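And since I mentioned adding a volume earlier, here is a sketch of what that might look like using the new os_volume and os_server_volume modules (the volume name and size are made up for illustration):

- hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Create a 10GB volume
      os_volume:
        cloud: hp
        state: present
        size: 10
        display_name: my_test_volume

    - name: Attach volume to our instance
      os_server_volume:
        cloud: hp
        state: present
        server: my_test_server
        volume: my_test_volume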

The purpose of this post is to give new users a taste of the new Ansible modules, and a bit of the libraries behind them. I'm hoping this spurs me (and others!) to post more details and more complex examples in the near future.

Look for these new modules in the 2.0 release of Ansible, or in the current devel branch of the Ansible git repo.

Friday, November 2, 2012

Python, Gearman, Eventlet, Oh My!

There are several examples on the Interwebs of using the Python Gearman API, but I didn't really see any examples of a script acting as both a Gearman worker AND client simultaneously. Since the worker part of it sits in an infinite loop (via the work() method), you need to do some sort of multitasking to have the same script act as a Gearman client as well. You could handle this a few different ways:
  • Forking processes for the worker and client is a viable option, but is a bit overkill for processes that are most likely I/O bound (communicating with the Gearman job server).
  • Threads can be unattractive in Python because of the GIL, but I think they would still work well in this case since the Gearman tasks (in my case) are mostly I/O bound.

However, I chose to give the eventlet module a try. Mostly because I had never really used it before, and some other code that I'm working with is already using it.

Below is a quick script that I hacked out to test it. It's a simple echo server using Gearman to deliver the messages to echo. Obviously, this is a bit silly to use Gearman to communicate between functions in the same program. A real-world program would likely have the workers and clients using different Gearman function names. But, hey, it's an example.

#!/usr/bin/env python

import eventlet
# Important to do this before importing gearman so that the
# socket module is monkey patched to be non-blocking.
eventlet.monkey_patch()
import gearman

def echo(worker, job):
    """ The worker task to execute for each message """
    return job.data

def worker1(servers):
    """ The worker thread """
    worker = gearman.GearmanWorker(servers)
    worker.register_task("echo_func", echo)
    worker.work()

def client1(servers):
    """ The client thread """
    client = gearman.GearmanClient(servers)
    job_request = client.submit_job("echo_func", "Hello, World")
    check_request_status(job_request)

def check_request_status(job_request):
    if job_request.complete:
        print "Job %s finished!  Result: %s - %s" % \
            (job_request.job.unique, job_request.state,
             job_request.result)
    elif job_request.timed_out:
        print "Job %s timed out!" % job_request.unique
    elif job_request.state == gearman.JOB_UNKNOWN:
        print "Job %s connection failed!" % job_request.unique

class Server(object):
    def __init__(self):
        self._servers = []

    def add_job_server(self, host, port=4730):
        self._servers.append("%s:%s" % (host, port))

    def main(self, tasks):
        if not self._servers:
            raise Exception("Must add at least one job server.")
        # Spawn each task in its own green thread, then wait on
        # all of them. (The worker's work() loop never returns.)
        thread_list = []
        for task in tasks:
            thread_list.append(eventlet.spawn(task, self._servers))
        for thd in thread_list:
            thd.wait()

def main():
    server = Server()
    server.add_job_server("localhost")

    # These are the tasks (functions) we want to execute, each
    # within its own green thread.
    tasks = [ worker1, client1 ]
    server.main(tasks)

if __name__ == "__main__":
    main()

Friday, April 6, 2012

The Future Looks Cloudy

For the past year, I've been working for an awesome startup company here in Durham. But as of this Monday, April 9th, I'll be officially working for HP Cloud, joining some former coworkers from days gone by. Good things to come...

Thursday, December 22, 2011

2011 Year In Review

Wow, what an insane year. Looking back on it, a lot of stuff has happened this year for me. Let's look back:
The new job has kept me unbelievably busy, so I haven't had the time to contribute more to Drizzle, other than a bug fix here or there. That makes me a sad panda, but priorities have changed. Perhaps next year will offer more opportunity.

I don't know what 2012 has in store for me, but hopefully it is just as exciting. I hope everyone has a Happy New Year.

Wednesday, December 21, 2011

Going to Percona Live in DC

I'm looking forward to attending Percona Live in DC next month. If you're on the east coast of the US and you use MySQL in any capacity, you need to be there. Who's coming?

Friday, April 15, 2011

MySQL Users Conference Slides

Slides for my talk at the 2011 MySQL Users Conference are now up:

All in all, the talk went pretty well. There was quite a bit of interest in the subject. After attending my talk, Jay Janssen of Yahoo! put up a good blog post about some replication features that he would like to see in future versions. That is exactly the type of feedback that we want. Open source projects should be community driven. We want to implement the features that YOU want.