
MySQL: adding a field to a large table

Updated: 2023-11-30 19:44:04

I have a table with about 200,000 records. I want to add a field to it:

 ALTER TABLE `table` ADD `param_21` BOOL NOT NULL COMMENT 'about the field' AFTER `param_20`

but it seems to be a very heavy query and it takes a very long time, even on my quad-core AMD PC with 4 GB of RAM.

I am running under Windows/XAMPP with phpMyAdmin. Does MySQL have to touch every record when adding a field, or can I change the query so that it makes the change more quickly?

MySQL will, in almost all cases, rebuild the table during an ALTER**. This is because the row-based engines (i.e. all of them) HAVE to do this to retain the data in the right format for querying. It's also because there are many other changes you could make which would also require rebuilding the table (such as changing indexes, primary keys, etc.).
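One practical consequence of the rebuild: if you need several new columns, it is cheaper to add them all in a single ALTER so the table is rebuilt once rather than once per column. A hedged sketch (the second column name here is hypothetical, just to show the syntax):

```sql
-- One rebuild for both additions, instead of two separate ALTERs.
ALTER TABLE `table`
  ADD `param_21` BOOL NOT NULL COMMENT 'about the field' AFTER `param_20`,
  ADD `param_22` BOOL NOT NULL AFTER `param_21`;
```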

I don't know what engine you're using, but I will assume MyISAM. MyISAM copies the data file, making any necessary format changes - this is relatively quick and is not likely to take much longer than the time the IO hardware needs to read the old data file in and write the new one out to disc.

Rebuilding the indexes is really the killer. Depending on how you have it configured, MySQL will either: for each index, put the indexed columns into a filesort buffer (which may be in memory but is typically on disc), sort that using its filesort() function (which does a quicksort, recursively copying the data between two files if it's too big for memory), and then build the entire index from the sorted data.
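The sort-runs-then-merge pattern described above is a classic external sort. A minimal Python sketch of the idea (a simplified illustration only, not MySQL's actual filesort, which sorts key tuples rather than plain values):

```python
import heapq
import tempfile

def external_sort(values, buffer_size):
    """Sort data that may not fit in memory: sort fixed-size chunks
    in RAM, spill each sorted run to a temp file, then k-way merge
    the runs. Simplified stand-in for MySQL's filesort."""
    runs = []

    def spill(chunk):
        chunk.sort()                       # in-memory sort of one buffer-full
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{x}\n" for x in chunk)
        f.seek(0)
        runs.append(f)

    chunk = []
    for v in values:
        chunk.append(v)
        if len(chunk) >= buffer_size:      # buffer full: sort and spill to disc
            spill(chunk)
            chunk = []
    if chunk:
        spill(chunk)

    # Merge the sorted runs lazily; heapq.merge streams them in order.
    streams = ((int(line) for line in f) for f in runs)
    return list(heapq.merge(*streams))

print(external_sort([5, 3, 8, 1, 9, 2, 7], buffer_size=3))  # [1, 2, 3, 5, 7, 8, 9]
```

An index built from this sorted stream can be filled sequentially, which is why the filesort path is so much faster than inserting keys one row at a time.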

If it can't do the filesort trick, it will just behave as if you did an INSERT on every row, and populate the index blocks with each row's data in turn. This is painfully slow and results in far from optimal indexes.
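Whether MyISAM takes the filesort path depends partly on server limits. These are real MyISAM variables, but the exact values below are illustrative assumptions for a config fragment, not recommendations:

```sql
-- Larger limits make the fast "repair by sorting" path more likely.
SET GLOBAL myisam_max_sort_file_size = 10 * 1024 * 1024 * 1024;  -- allow ~10 GB of temp sort files
SET GLOBAL myisam_sort_buffer_size   = 256 * 1024 * 1024;        -- 256 MB sort buffer
```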

You can tell which it's doing by using SHOW PROCESSLIST during the process. "Repairing by filesort" is good, "Repairing with keycache" is bad.
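To check while the ALTER runs, open a second connection and inspect the State column (the exact state strings vary by MySQL version; recent versions report them as "Repair by sorting" and "Repair with keycache"):

```sql
SHOW PROCESSLIST;
-- State "Repair by sorting"   -> filesort path, fast
-- State "Repair with keycache" -> row-at-a-time path, slow
```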

All of this will use AT MOST one core, but will sometimes be IO bound as well (especially copying the data file).

** There are some exceptions, such as dropping secondary indexes on innodb plugin tables.