Archive for September, 2006


Followup: On IPs, hostnames, and MySQL

by Jeremy Cole on Tuesday, September 26th, 2006 at 18:15:44 in MySQL, MySQL Patches, MySQL Tips

Back in April, I wrote On IPs, hostnames, and MySQL, which described the (sometimes surprising) ways in which MySQL deals with IP addresses, hostnames, and privileges, as well as some basics about the host cache itself. In a footnote to that post, I mentioned a patch I had written against MySQL 4.1 to give some more visibility into the host cache.

Over the past two days, I have worked on porting that patch to MySQL 5.01, 2, and making some fairly large improvements to it. The patch implements a few things:

  1. Configurable Size — Without the patch, the size of the cache is fixed at 128 entries, and can only be changed by changing a #define and recompiling. You may now tune the size of the host cache using SET GLOBAL host_cache_size=X.
  2. Metrics — You may now see how well the cache is performing using SHOW GLOBAL STATUS LIKE 'Host%'. (See example below.)
  3. Display — You may now see the complete contents of the cache using SHOW HOST CACHE. (See example below.)

The above new features should give MySQL DBAs much better insight into what’s happening in their MySQL server, in an area where there has historically been very little information and a lot of misunderstandings.

Now, for the examples:

SHOW GLOBAL STATUS LIKE 'Host%'

mysql> show status like 'host%';
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| Host_cache_free    | 127   |
| Host_cache_hits    | 2     |
| Host_cache_inserts | 1     |
| Host_cache_misses  | 1     |
| Host_cache_prunes  | 0     |
| Host_cache_used    | 1     |
+--------------------+-------+
6 rows in set (0.01 sec)

SHOW HOST CACHE

mysql> show host cache;
+-----------+----------+--------+------+
| IP        | Hostname | Errors | Hits |
+-----------+----------+--------+------+
| 10.0.0.22 | hiriko   |      0 |    2 |
+-----------+----------+--------+------+
1 row in set (0.00 sec)

Enjoy!

1 Hopefully this will be accepted into MySQL 5.0. Brian?

2 The patch is against mysql-5.0.25-nightly-20060823 specifically.

To reason why…

by Jeremy Cole on Tuesday, September 19th, 2006 at 15:17:47 in HOTCW, Quotes

Heard from a colleague today while consulting:

Ours is not to reason why, ours is to reboot and retry!

Haven’t heard that one before. ;)

Jeremy or Adrienne 2.0

by Jeremy Cole on Sunday, September 10th, 2006 at 20:37:40 in Baby

Well, some of you already know—but many of you don’t. This is the official announcement, then… Adrienne and I are having a baby! It is due around mid-March. Exciting stuff! Here’s what the baby looked like as of a few days ago:

On 1U cases, PCI risers, and LSI MegaRAID

by Jeremy Cole on Monday, September 4th, 2006 at 15:42:29 in MySQL, MySQL Tips, Technology

I’ve been working with one Proven Scaling customer that has had some interesting issues recently, involving InnoDB corruption, resulting in messages similar to these:

InnoDB: Page checksum 3156980109, prior-to-4.0.14-form checksum 577557610
InnoDB: stored checksum 741279449, prior-to-4.0.14-form stored checksum 577557610
InnoDB: Page lsn 0 2323869442, low 4 bytes of lsn at page end 2323869442
InnoDB: Page number (if stored to page already) 195716,
InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0
InnoDB: Page may be an index page where index id is 0 2831
InnoDB: (index PRIMARY of table db/table)
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 195716.

The problem was encountered when testing hardware for a move from software RAID to hardware RAID using LSI MegaRAID SCSI 320-2 cards. The servers are 1U machines with Tyan motherboards, and a PCI riser card which the MegaRAID plugged into.

They were receiving the same messages on several different machines, ruling out a single bad piece of hardware. After spending weeks trying to figure out what the problem could be, testing different configurations and isolating variables, it was tracked down to the PCI riser cards. Searching for “lsi pci riser” shows quite a few people having similar issues.

It turns out that LSI “does not support” using their cards with PCI risers, at all. Maybe they should reword things a bit—if their cards don’t work with PCI risers.

The scariest part of the whole exercise, though, is that the corruption was occurring completely silently: data comes in, is written to disk, but gets corrupted in flight. Since the OS wrote certain data it is now caching the correct copy of the data, but the disks contain something different. The only way the corruption is discovered is when the page is read back quite a bit later, after having been flushed from cache.

You’d think that somewhere along the line, the OS or the RAID card would catch the corruption?