Saturday, March 19, 2011

Multi-Master Support in Drizzle Replication

So Brian asked me the other day what it would take to support multiple masters in our new Drizzle slave plugin. Not master to master replication, but multiple masters sending replication events to a single slave that simply ignores any conflicts and just chugs along. I told him I didn't know, but considering how simple the code is, it probably wouldn't take much.

To get a better understanding of what exactly would be involved in supporting multiple masters, I decided to just start hacking it up. I did this mainly to get a sense of what would need to be changed, since my original design didn't allow for this at all. (Shortsightedness on my part I suppose.)

So I have a beta version of my results available in this Launchpad branch:

lp:~dshrews/drizzle/beta-multi-master

From my simple tests, it seems to work. I'm not really happy with the code (like I said, this was a hack), but the functionality is there. I'm not promising this will go into Drizzle trunk just yet. I would like to make some improvements to it first, and I'd really like to get some feedback from people.

To use it, you'll first need to create a modified slave configuration file. Here is a sample one:

ignore-errors

[master1]
master-host = foo.my.domain
master-port = 3306
master-user = user1
master-pass = password

[master2]
master-host = bar.my.domain
master-port = 3306
master-user = user2
master-pass = password

Currently, up to 10 masters are supported. This is an arbitrary number. It was simplest to just predetermine a fixed number of masters due to some complications with config file parsing that I wasn't prepared to solve (this is one of the things I want to see fixed). One IO thread is started per master, though we still use a single applier thread for the time being.
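To make the threading layout concrete, here is a rough Python sketch of the idea: one reader thread per master feeding a shared queue, with a single applier draining it. This is only an illustration and not the plugin's actual code. The names io_thread and replicate are made up, events are reduced to "create this table" strings, the applier runs in the calling thread for simplicity, and the ignore_errors parameter stands in for the ignore-errors option from the sample config.

```python
import queue
import threading

DONE = object()  # sentinel: one IO thread has no more events


def io_thread(master, events, shared):
    """Stand-in for a per-master IO thread: forward that master's events."""
    for event in events:
        shared.put((master, event))
    shared.put(DONE)


def replicate(masters, ignore_errors=True):
    """Start one IO thread per master; apply all events in the calling thread."""
    shared = queue.Queue()
    readers = [
        threading.Thread(target=io_thread, args=(name, events, shared))
        for name, events in masters.items()
    ]
    for reader in readers:
        reader.start()

    tables, skipped, finished = {}, [], 0
    try:
        while finished < len(readers):
            item = shared.get()
            if item is DONE:
                finished += 1
                continue
            master, table = item
            if table in tables:
                # Toy conflict: another master already created this table.
                if not ignore_errors:
                    raise RuntimeError(f"conflict on {table!r} from {master}")
                skipped.append((master, table))
            else:
                tables[table] = master
    finally:
        for reader in readers:
            reader.join()
    return tables, skipped
```

Because a single applier handles every event, whichever master's event arrives first wins and any later duplicate is either skipped or stops the loop, depending on ignore_errors.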

You'll notice a new option in the sample config, ignore-errors. If this option is present, the slave ignores any errors raised while it locally executes the replication events received from the masters. I highly recommend you enable this option. Also note the addition of the [master1] and [master2] sections that define options for each master. You can go all the way up to a [master10] section.
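If it helps to see the file layout spelled out, here is a small Python sketch that reads a config shaped like the sample: bare flags and options before any section, followed by [master1] through [master10] sections. It is not how the slave plugin parses its file. The function name parse_slave_config and the returned dictionary layout are my own assumptions for illustration.

```python
MAX_MASTERS = 10  # sections [master1] through [master10] are accepted


def parse_slave_config(text):
    """Parse the sample layout: global options first, then [masterN] sections."""
    config = {"global": {}, "masters": {}}
    target = config["global"]  # options seen before any section are global
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            number = section[len("master"):]
            valid = (
                section.startswith("master")
                and number.isdecimal()
                and 1 <= int(number) <= MAX_MASTERS
            )
            if not valid:
                raise ValueError(f"unsupported section: [{section}]")
            target = config["masters"].setdefault(section, {})
            continue
        # "key = value" stores the value; a bare word such as
        # "ignore-errors" is treated as a flag that is switched on.
        key, sep, value = line.partition("=")
        target[key.strip()] = value.strip() if sep else True
    return config


sample = """\
ignore-errors

[master1]
master-host = foo.my.domain
master-port = 3306

[master2]
master-host = bar.my.domain
master-port = 3306
"""

config = parse_slave_config(sample)
ignore_errors = config["global"].get("ignore-errors", False)
```

Values are kept as strings here, so a caller would still need to convert something like master-port to an integer.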

Nothing changes with how you start your slave or masters (see my post on setting up a simple replication example).

Give it a try and let me know how it works for you. Again, this is bleeding edge stuff (does any other database support this?  :) ), so be prepared for bugs.

7 comments:

  1. That is all kinds of awesome. So, you'd say that it is currently targeted for a scenario where we don't expect overlap from the masters? That we just want a consolidated copy of the database(s) somewhere?

    I look forward to giving this a spin : )

  2. @pcrews: Correct. I guess one common scenario would be where you may have your data divided up into separate tables (or databases) on each master, and then want to pull it all to a single slave to run some sort of data analysis. I'm sure there are more scenarios, including some that don't care about conflicts. Conflict resolution will come at a later point.

  3. Nice work!

    I was thinking about this a bit, and something else that may be useful in this scenario is to turn on the replication log for the slave and allow a master->slave->slave hierarchy. This would work currently, but any type of recovery would be very manual.

    Drizzle currently does not log where an update originates on a slave, so a failure in the middle makes recovery hard for the second-level slave, and upgrading a slave to master becomes cumbersome. I think a possible solution would be to allow the Execute class to accept a GPB message and then store the trx id and server id of the originating server in the slave's replication log. That would make upgrading a slave to master much easier. Just a thought. I've been fiddling with a prototype, but don't have it working yet.

  4. @joe: All good points. We do need to start thinking about more complex setups and dealing with issues like that. Glad to see you are doing some hacking in that area. :)

  5. @pcrews: And when I say "separate tables/database", what I mean is uniquely named tables/databases across masters, just to clarify.

  6. Why do you recommend ignore-errors? Is there something that causes errors to occur unexpectedly?

  7. @robert: There is no conflict resolution yet, so unless you set up your masters so that conflicts cannot occur on the slave, you can expect replication to halt on the first conflict. If you want to keep replication going despite conflicts, you'll need to use the ignore-errors option.
