Friday, November 2, 2012

Python, Gearman, Eventlet, Oh My!

There are several examples on the Interwebs of using the Python Gearman API, but I didn't really see any examples of a script acting as both a Gearman worker AND client simultaneously. Since the worker part of it sits in an infinite loop (via the gearman.worker.work() method), you need to do some sort of multitasking to have the same script act as a Gearman client as well. You could handle this a few different ways:

  • Forking processes for the worker and client is a viable option, but it's a bit of overkill for processes that are most likely I/O bound (communicating with the Gearman job server).

  • Threads can be unattractive in Python because of the GIL, but I think they would still work well in this case since the Gearman tasks (in my case) are mostly I/O bound.
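
For comparison, here is a minimal sketch of what the threaded option could look like. It is written for Python 3 and makes no Gearman calls at all: a pair of queue.Queue objects stands in for the job server (that is an assumption made purely for illustration), and the names worker, client, and run_echo are hypothetical. The point is only the shape: a blocking worker loop in one thread, with the client running alongside it.

```python
import queue
import threading


def worker(jobs, results):
    """Stand-in for the blocking worker loop: echo each job until told to stop."""
    while True:
        data = jobs.get()       # blocks, much like waiting on a job server
        if data is None:        # sentinel value: shut the worker down
            break
        results.put(data)       # "echo" the payload back


def client(jobs, results, message):
    """Submit one job and wait for the echoed result."""
    jobs.put(message)
    return results.get(timeout=5)


def run_echo(message):
    """Run the worker in its own thread and the client in the caller's thread."""
    jobs, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(jobs, results))
    thread.start()
    try:
        return client(jobs, results, message)
    finally:
        jobs.put(None)          # always stop the worker, even on error
        thread.join()


if __name__ == "__main__":
    print(run_echo("Hello, World"))
```

Since both sides spend their time blocked on I/O (here, on the queues), the GIL is not much of a concern in this arrangement.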

However, I chose to give the eventlet module a try. Mostly because I had never really used it before, and some other code that I'm working with is already using it.

Below is a quick script that I hacked out to test it. It's a simple echo server using Gearman to deliver the messages to echo. Obviously, it's a bit silly to use Gearman to communicate between functions in the same program. A real-world program would likely have the workers and clients using different Gearman function names. But, hey, it's an example.


#!/usr/bin/env python

import eventlet
# Important to do this before importing gearman
eventlet.monkey_patch()
import gearman

def echo(worker, job):
    """ The worker task to execute for each message """
    return job.data

def worker1(servers):
    """ The worker thread """
    worker = gearman.GearmanWorker(servers)
    worker.register_task("echo_func", echo)
    worker.work()

def client1(servers):
    """ The client thread """
    client = gearman.GearmanClient(servers)
    job_request = client.submit_job("echo_func", "Hello, World")
    check_request_status(job_request)

def check_request_status(job_request):
    if job_request.complete:
        print "Job %s finished!  Result: %s - %s" % \
              (job_request.job.unique,
               job_request.state,
               job_request.result)
    elif job_request.timed_out:
        print "Job %s timed out!" % job_request.job.unique
    elif job_request.state == gearman.JOB_UNKNOWN:
        print "Job %s connection failed!" % job_request.job.unique

class Server(object):
    def __init__(self):
        self._servers = []

    def add_job_server(self, host, port=4730):
        self._servers.append("%s:%s" % (host, port))

    def main(self, tasks):
        if not self._servers:
            raise Exception("Must add at least one job server.")
        thread_list = []
        for task in tasks:
            thread_list.append(eventlet.spawn(task, self._servers))
        for thd in thread_list:
            thd.wait()

def main():
    server = Server()
    server.add_job_server('127.0.0.1')

    # These are the tasks (functions) we want to execute, each
    # within its own green thread.
    tasks = [ worker1, client1 ]

    server.main(tasks)

if __name__ == "__main__":
    main()

Friday, April 6, 2012

The Future Looks Cloudy

For the past year, I've been working for an awesome startup company here in Durham. But as of this Monday, April 9th, I'll be officially working for HP Cloud, joining some former coworkers from days gone by. Good things to come...

Thursday, December 22, 2011

2011 Year In Review

Wow, what an insane year. Looking back on it, a lot of stuff has happened this year for me. Let's look back:
The new job has kept me unbelievably busy, so I haven't had the time to contribute more to Drizzle, other than a bug fix here or there. That makes me a sad panda, but priorities have changed. Perhaps next year will offer more opportunity.

I don't know what 2012 has in store for me, but hopefully it is just as exciting. I hope everyone has a Happy New Year.




Wednesday, December 21, 2011

Going to Percona Live in DC

I'm looking forward to attending Percona Live in DC next month. If you're on the east coast of the US and you use MySQL in any capacity, you need to be there. Who's coming?

Friday, April 15, 2011

MySQL Users Conference Slides

Slides for my talk at the 2011 MySQL Users Conference are now up:


All in all, the talk went pretty well. There was quite a bit of interest in the subject. After attending my talk, Jay Janssen of Yahoo! put up a good blog post about some replication features that he would like to see in future versions. That is exactly the type of feedback that we want. Open source projects should be community driven. We want to implement the features that YOU want.

Saturday, March 19, 2011

Multi-Master Support in Drizzle Replication

So Brian asked me the other day what it would take to support multiple masters in our new Drizzle slave plugin. Not master to master replication, but multiple masters sending replication events to a single slave that simply ignores any conflicts and just chugs along. I told him I didn't know, but considering how simple the code is, it probably wouldn't take much.

To get a better understanding of what exactly would be involved in supporting multiple masters, I decided to just start hacking it up. I did this mainly to get a sense of what would need to be changed, since my original design didn't allow for this at all. (Shortsightedness on my part I suppose.)

So I have a beta version of my results available in this Launchpad branch:

lp:~dshrews/drizzle/beta-multi-master

From my simple tests, it seems to work. I'm not real happy with the code (like I said, this was a hack), but functionality is there. I'm not promising this will go into Drizzle trunk just yet. I would like to make some improvements on it, and I'd really like to get some feedback from people on it.

To use it, you'll first need to create a modified slave configuration file. Here is a sample one:

ignore-errors

[master1]
master-host = foo.my.domain
master-port = 3306
master-user = user1
master-pass = password

[master2]
master-host = bar.my.domain
master-port = 3306
master-user = user2
master-pass = password

Currently, up to 10 masters are supported. This was an arbitrary number. It was simplest to just predetermine a set number of masters due to some complications with config file parsing which I wasn't prepared to solve (this is one of the things I want to see fixed). One IO thread per master will be started, though we still use a single applier thread for the time being.

You'll notice in the sample config a new option, ignore-errors. If this option is present, the slave ignores any errors from replication events received from the masters that it executes locally. I highly recommend you have this option enabled. Also note the addition of the [master1] and [master2] sections that define options for each master. You can go all the way to a [master10] section.
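
To make the layout of that file concrete, here is a small Python 3 sketch that reads the sample config and checks it against the rules described above (an optional ignore-errors flag, plus sections named [master1] through [master10]). This is not how Drizzle parses the file. It uses Python's standard configparser module, and the [global] wrapper section and the parse_slave_config name are made up for illustration.

```python
import configparser

MAX_MASTERS = 10  # the limit stated in the post

SAMPLE = """\
ignore-errors

[master1]
master-host = foo.my.domain
master-port = 3306
master-user = user1
master-pass = password

[master2]
master-host = bar.my.domain
master-port = 3306
master-user = user2
master-pass = password
"""


def parse_slave_config(text):
    """Return (ignore_errors, {section_name: options}) for a slave config."""
    parser = configparser.ConfigParser(allow_no_value=True)
    # configparser insists on a section header before any option, so the
    # leading global options are wrapped in a hypothetical [global] section.
    parser.read_string("[global]\n" + text)
    ignore_errors = parser.has_option("global", "ignore-errors")

    allowed = {"master%d" % i for i in range(1, MAX_MASTERS + 1)}
    masters = {}
    for name in parser.sections():
        if name == "global":
            continue
        if name not in allowed:
            raise ValueError("unexpected section: %s" % name)
        masters[name] = dict(parser.items(name))
    return ignore_errors, masters


if __name__ == "__main__":
    ignore_errors, masters = parse_slave_config(SAMPLE)
    print(ignore_errors, sorted(masters))
```

Note that all values come back as strings here, so master-port would be "3306" rather than an integer.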

Nothing changes with how you start your slave or masters (see my post on setting up a simple replication example).

Give it a try and let me know how it works for you. Again, this is bleeding edge stuff (does any other database support this?  :) ), so be prepared for bugs.

Thursday, March 17, 2011

Installing Drizzle from Source on OS X

Installing Drizzle on OS X 10.6 is pretty simple. We have a page on our wiki that has the basic steps, but I thought that I'd detail what I do on my Macs in the hope that it may make someone's life easier. Note that we don't build on any 10.5 machines, and I don't use that version anymore, so YMMV with these instructions. Also, these instructions assume that you have the Xcode package already installed. I have Xcode 4 installed, but these instructions should work with Xcode 3, too. If they don't, let me know.

I used to use MacPorts on my Macs to install the libraries that Drizzle needs. I've recently dumped that because I didn't like all of the extra stuff that was installed (do you really need a separate Perl installation?). And a recent b0rk of their Perl installations was the final straw.

It turns out that all you really need are just a few extra packages to build Drizzle on your Mac. Here are the packages that I currently have installed on my machines:

The first three (autoconf, automake, and libtool) aren't strictly necessary. I install those because I want newer versions of those tools than what OS X provides by default. It makes building a little bit nicer (the output is much cleaner). The last three are what you really need.

Each package has its own instructions for how to compile and install. I use the default installation path (/usr/local) for each. Basically, building and installing each one is simply:

  • ./configure
  • make
  • sudo make install

The Boost package is the lone exception:

  • ./bootstrap.sh
  • ./bjam
  • sudo ./bjam install

If you install the libtool package, there is one additional step to take. That package installs the binaries libtool and libtoolize into /usr/local/bin. I rename these to glibtool and glibtoolize, respectively, because the Drizzle build system looks for those program names.

The last thing I do is make sure that /usr/local/bin is in my path. So in $HOME/.bash_profile, I have this line:

export PATH=/usr/local/bin:$PATH

With all that in place, building Drizzle is just:

  • ./config/autorun.sh
  • ./configure
No need to add any extra options to configure to find libraries.