Sunday, August 29, 2010

Drizzle Transaction Message Limit

Some changes I made were recently pushed to Drizzle trunk that affect the size of the Transaction protobuf message that any replication stream will see (e.g., the transaction log). These changes were necessary to fix bug 600795.

Without a Transaction message size limit, a bulk operation such as LOAD DATA could produce a Transaction message containing a single very large Statement message that holds all of the INSERT data for the bulk load. This obviously could eat up a large amount of memory if we kept allowing the Statement to grow without bounds. The Drizzle kernel, when it can, keeps appending the values to INSERT onto the same record.

To circumvent this, we now allow multiple Transaction messages for a single database transaction. All Transaction GPB messages representing a single database transaction will have the same transaction ID, and only the last Transaction message will have the Statement's end_segment attribute set to true.

Here is an example of this change that you might now see in the transaction log:


transaction_context {
  server_id: 1
  transaction_id: 3
  start_timestamp: 1283118092815781
  end_timestamp: 1283118092815869
}
statement {
  type: INSERT
  start_timestamp: 1283118092815782
  end_timestamp: 1283118092815868
  insert_header {
    table_metadata {
      schema_name: "test"
      table_name: "t"
    }
    field_metadata {
      type: INTEGER
      name: "id"
    }
    field_metadata {
      type: VARCHAR
      name: "a"
    }
  }
  insert_data {
    segment_id: 1
    end_segment: false
    record {
      insert_value: "2"
      insert_value: "abc"
      is_null: false
      is_null: false
    }
    record {
      insert_value: "3"
      insert_value: "def"
      is_null: false
      is_null: false
    }
  }
}

transaction_context {
  server_id: 1
  transaction_id: 3
  start_timestamp: 1283118092816250
  end_timestamp: 1283118092816725
}
statement {
  type: INSERT
  start_timestamp: 1283118092816251
  end_timestamp: 1283118092816724
  insert_header {
    table_metadata {
      schema_name: "test"
      table_name: "t"
    }
    field_metadata {
      type: INTEGER
      name: "id"
    }
    field_metadata {
      type: VARCHAR
      name: "a"
    }
  }
  insert_data {
    segment_id: 1
    end_segment: true
    record {
      insert_value: "4"
      insert_value: "ghi"
      is_null: false
      is_null: false
    }
    record {
      insert_value: "5"
      insert_value: "jkl"
      is_null: false
      is_null: false
    }
  }
}


This example is a bit contrived as there is no need to split up such a small transaction, but you can see the basic changes here. We have two Transaction messages, both with the same transaction ID. You can see that the Statement's end_segment is set to false in the first message, while the Statement within the second Transaction message has end_segment set to true.

So, in case it isn't obvious, there are now two ways to determine when you should commit if you are a replication stream TransactionApplier, or if you are reading from the transaction log:

  1. If the transaction ID changes, COMMIT.
  2. Or, if the current Transaction has all Statement messages with end_segment set to true, COMMIT.
Choose whichever of the two methods best suits your needs.
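
As a rough illustration of the second rule, here is a minimal Python sketch of an applier that buffers records until it sees a Transaction message whose Statements all have end_segment set to true. The `Statement` and `Transaction` classes and the `apply_stream` function are hypothetical stand-ins invented for this example, not the generated protobuf classes or the real TransactionApplier interface. Buffering per transaction ID is an assumption made here so that messages from different transactions may arrive interleaved.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass
class Statement:
    # Hypothetical stand-in for the Statement message: only the
    # end_segment flag and the inserted records are modeled.
    end_segment: bool
    records: List[tuple] = field(default_factory=list)


@dataclass
class Transaction:
    # Hypothetical stand-in for the Transaction message.
    transaction_id: int
    statements: List[Statement]


def is_final_message(txn: Transaction) -> bool:
    """Rule 2: every Statement in this message has end_segment set to true."""
    return all(stmt.end_segment for stmt in txn.statements)


def apply_stream(messages: Iterable[Transaction]) -> Iterator[Tuple[int, List[tuple]]]:
    """Yield (transaction_id, records) at the point where a COMMIT would happen."""
    pending: Dict[int, List[tuple]] = {}
    for txn in messages:
        # Collect this message's records under its transaction ID.
        buffered = pending.setdefault(txn.transaction_id, [])
        for stmt in txn.statements:
            buffered.extend(stmt.records)
        if is_final_message(txn):
            # Last message of the transaction: hand everything over.
            yield txn.transaction_id, pending.pop(txn.transaction_id)


if __name__ == "__main__":
    stream = [
        Transaction(3, [Statement(False, [(2, "abc"), (3, "def")])]),
        Transaction(3, [Statement(True, [(4, "ghi"), (5, "jkl")])]),
    ]
    for txn_id, rows in apply_stream(stream):
        print("COMMIT", txn_id, rows)
```

Fed the two messages from the example above, the sketch reports a single commit for transaction 3 carrying all four records.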

Currently, if a Transaction message crosses the 1M threshold, the kernel will create a new Transaction message. Why did I choose 1M? Well, the Google Protobuf documentation says:
"Protocol Buffers are not designed to handle large messages. As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy."
So 1M seemed to be a reasonable default. I'll change this in the near future to be a configurable value once we get some changes to our sys var stuff merged.
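
To make the threshold behavior concrete, here is a small Python sketch of the general idea: keep appending records until the running size crosses the limit, then start a new chunk, and flag only the last chunk with end_segment. This is not the kernel's code. The `split_records` and `estimate_size` names are made up for this example, and the size estimate (UTF-8 bytes of the values) is a crude assumption rather than the real serialized protobuf size.

```python
from typing import Iterable, Iterator, List, Tuple

MAX_MESSAGE_BYTES = 1024 * 1024  # the 1M default discussed above

Record = Tuple[str, ...]


def estimate_size(record: Record) -> int:
    # Hypothetical estimate: counts only the UTF-8 bytes of the values,
    # ignoring any protobuf framing overhead.
    return sum(len(value.encode("utf-8")) for value in record)


def split_records(
    records: Iterable[Record], limit: int = MAX_MESSAGE_BYTES
) -> Iterator[Tuple[List[Record], bool]]:
    """Yield (chunk, end_segment) pairs; only the last pair has end_segment True."""
    chunk: List[Record] = []
    size = 0
    for record in records:
        # The previous chunk reached the limit and more data follows,
        # so emit it as a non-final segment and start a new one.
        if chunk and size >= limit:
            yield chunk, False
            chunk, size = [], 0
        chunk.append(record)
        size += estimate_size(record)
    # Whatever is left is the final segment of the transaction.
    yield chunk, True


if __name__ == "__main__":
    rows = [("2", "abc"), ("3", "def"), ("4", "ghi"), ("5", "jkl")]
    # A tiny limit forces a split so the effect is visible.
    for chunk, end_segment in split_records(rows, limit=8):
        print(end_segment, chunk)
```

With the artificially small limit of 8 bytes, the four rows split into two chunks of two records each, mirroring the two Transaction messages in the example log.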

1 comment:

  1. Because we now handle bulk loads, which may split up Transaction messages into multiples, looking for a change in transaction ID to know when to commit is no longer valid since Transaction messages may be intermingled.
