Thursday, January 20, 2011

Transaction Log Quality and the Importance of QA

If you read my post a few months back that gave a progress report on Drizzle Replication, you know that our major focus lately has been the quality of the transaction log. It has to be solid before other things can be built upon it. And fixing bugs in it usually means fixing bugs in the core replication message stream, so other replication plugins will benefit as well.

Well, Patrick Crews, the Drizzle QA Superman, has sent this post to the drizzle-discuss mailing list:

It looks like we finally have a high-quality transaction log to build on! With that solid foundation in place, we can now work on adding more functionality.

This, of course, couldn't have happened without some super high quality QA work. Any developers out there know that finding your own flaws in code you maintain is difficult to do. You're too close to the code, can't see the forest for the trees. Sure, you'll find the obvious ones whenever you get a core dump, or when some unit or regression test fails (you do have those, right?). It's the super-hard-to-find-make-you-wish-you-chose-another-career bugs that are the ones that are usually missed. Sometimes that's because once we code it, we want to move on to something else as quickly as possible, so we'd like to assume that it works. The reality is usually different...

This is where QA comes in. Their job is to pull the reins in on you, show you where you messed up, and make sure it gets fixed. It's usually a thankless job, but it's also one that is absolutely necessary. Patrick continually amazes me with what he does and how thorough he is. How we got lucky enough to snag him for the Drizzle team is beyond me, but I'm glad we did. Without his QA expertise, we'd certainly be lagging well behind in the quality of our product. (He would try to give us developers the credit for fixing the bugs, but we are the ones introducing the bugs to begin with!)

So never underestimate the value of your QA team. You can't make a quality product without them. And have you thanked your QA team lately?  (Thanks Patrick!)

Wednesday, January 12, 2011

Change in Transaction Protobuf Message Segmenting

If you read my post on Drizzle transaction message limits, you know that it is possible to have multiple Transaction protobuf messages (segments) for a single database transaction. This was necessary to keep the Google protobuf messages from growing too large (there is a maximum limit on message size).

Before my most recent change, you would have to parse each Statement sub-message contained within the enclosing Transaction message to see if the Transaction was split up into multiple messages (they would be linked together by sharing the same transaction ID). This was kind of a pain, but since the segment information was only contained in the Statement, this was the only way to do it.

As of Bazaar revision number 2076 of the Drizzle trunk, we now have segment information stored in the Transaction message in addition to the Statement message. We added the values segment_id and end_segment to the Transaction message definition. These are just like the identically named values in the Statement message definition. From the drizzled/message/transaction.proto definition file:

message Transaction
{
  required TransactionContext transaction_context = 1;
  repeated Statement statement = 2;
  optional Event event = 3;

  /*
   * A single transaction in the database can possibly be represented with
   * multiple protobuf Transaction messages if the message grows too large.
   * This can happen if you have a bulk transaction, or a single statement
   * affecting a very large number of rows, or just a large transaction with
   * many statements/changes.
   *
   * For the first two examples, it is likely that the Statement sub-message
   * itself will get segmented, causing another Transaction message to be
   * created to hold the rest of the Statement's row changes. In these cases,
   * it is enough to look at the segment information stored in the Statement
   * message.
   *
   * For the last example, the Statement sub-messages may or may not be
   * segmented, but we could still need to split the Statements up into
   * multiple Transaction messages to keep the Transaction message size from
   * growing too large. In this case, the segment information in the Statement
   * submessages is not helpful if the Statement isn't segmented. We need this
   * information in the Transaction message itself.
   *
   * These values should be set appropriately whether or not the Statement
   * sub-messages are segmented.
   */
  optional uint32 segment_id = 4; /* Segment number of the Transaction msg */
  optional bool end_segment = 5;  /* FALSE if Transaction msg is split into multiples */
}

So other than making it easier to check to see if a Transaction is segmented, why add these new values?

Well, it turns out having the segment information only in the Statement doesn't allow us to segment large Transaction messages if none of the Statement sub-messages are themselves segmented. This is documented in this bug report.
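
To make that case concrete, here is a rough Python sketch of how a log reader might stitch segments back together using only the Transaction-level fields. This is not Drizzle code: the `Transaction` and `Statement` dataclasses are hypothetical stand-ins for the generated protobuf classes, `transaction_id` stands in for the ID carried in the transaction context, `label` is an invented field for illustration, and the sketch assumes the final segment of a transaction is the last one to arrive.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Statement:
    # Hypothetical stand-in for the Statement protobuf message.
    label: str


@dataclass
class Transaction:
    # Hypothetical stand-in for the Transaction protobuf message.
    transaction_id: int   # assumed to come from transaction_context
    segment_id: int       # mirrors the new Transaction.segment_id field
    end_segment: bool     # mirrors the new Transaction.end_segment field
    statements: List[Statement] = field(default_factory=list)


def reassemble(
    messages: Iterable[Transaction],
) -> Iterator[Tuple[int, List[Statement]]]:
    """Group segments by transaction ID and emit each transaction's
    statements once its final segment (end_segment is True) is seen."""
    pending: Dict[int, List[Transaction]] = defaultdict(list)
    for msg in messages:
        pending[msg.transaction_id].append(msg)
        if msg.end_segment:
            # Order the collected segments by segment number, then flatten.
            segments = sorted(
                pending.pop(msg.transaction_id), key=lambda m: m.segment_id
            )
            yield msg.transaction_id, [
                stmt for seg in segments for stmt in seg.statements
            ]
```

Note that the reader never has to open a Statement to decide whether a Transaction message is complete, which is exactly what was impossible when none of the Statements were themselves segmented.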