Updated: 2023-02-07 12:00:23
I am testing performance on a MySQL server by filling a table with more than 200 million records. The stored procedure is very slow at generating the big SQL string. Any help or comment is really welcome.
System Info:
The stored procedure creates an INSERT SQL query with all the values to be inserted into the table.
DELIMITER $$
USE `test`$$
DROP PROCEDURE IF EXISTS `inputRowsNoRandom`$$
CREATE DEFINER=`root`@`localhost` PROCEDURE `inputRowsNoRandom`(IN NumRows BIGINT)
BEGIN
/* BUILD INSERT STATEMENT WITH A LOT OF ROWS TO INSERT */
DECLARE i BIGINT;
DECLARE nMax BIGINT;
DECLARE squery LONGTEXT;
DECLARE svalues LONGTEXT;
SET i = 1;
SET nMax = NumRows + 1;
SET squery = 'INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, DATE) VALUES ';
SET svalues = '("1", "a1", 100, 1, 500000, "2013-06-14 12:40:45"),';
WHILE i < nMax DO
SET squery = CONCAT(squery, svalues);
SET i = i + 1;
END WHILE;
/*SELECT squery;*/
SET squery = LEFT(squery, CHAR_LENGTH(squery) - 1);
SET squery = CONCAT(squery, ";");
SELECT squery;
/* EXECUTE INSERT STATEMENT */
/*START TRANSACTION;*/
/*PREPARE stmt FROM squery;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
*/
/*COMMIT;*/
END$$
DELIMITER ;
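A likely reason the string-building loop is so slow: each CONCAT copies the entire ever-growing string, so total work grows roughly quadratically with the row count. The sketch below illustrates the same effect in Python (an analogy only, not MySQL internals); the function names are hypothetical.

```python
# Analogy in Python: building a big VALUES string by repeated concatenation
# vs. a single join. Repeated concatenation can copy the whole accumulated
# string on every pass; join makes one linear pass.
row = '("1", "a1", 100, 1, 500000, "2013-06-14 12:40:45")'

def build_by_concat(n):
    s = ""
    for _ in range(n):
        s = s + row + ","      # like SET squery = CONCAT(squery, svalues);
    return s[:-1]              # drop trailing comma, like LEFT(squery, ... - 1)

def build_by_join(n):
    return ",".join([row] * n)  # single linear pass

# Both produce the same VALUES list.
assert build_by_concat(1000) == build_by_join(1000)
```

The two functions produce identical output; only the cost profile differs as n grows.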
Results:
CALL test.inputRowsNoRandom(20000);
CALL test.inputRowsNoRandom(100000);
Result (ordered by duration) - state | duration (summed) in sec | percentage
freeing items   0.00005   50.00000
starting        0.00002   20.00000
executing       0.00001   10.00000
init            0.00001   10.00000
cleaning up     0.00001   10.00000
Total           0.00010  100.00000
Change Of STATUS VARIABLES Due To Execution Of Query
variable value description
Bytes_received 21 Bytes sent from the client to the server
Bytes_sent 97 Bytes sent from the server to the client
Com_select 1 Number of SELECT statements that have been executed
Questions 1 Number of statements executed by the server
Tests:
I have already tested with different MySQL configurations: from 12 to 64 threads, with the cache on and off, with logs moved to another hardware disk...
I have also tested using TEXT, INT...
Additional Information:
Questions:
SELECT squery;
is a NULL string. What's happening? (The error must be there but I don't see it.)
mysql -u mysqluser -p databasename < numbers.sql
UPDATE:
Don't use loops especially on that scale in RDBMS.
Try to quickly fill your table with 1m rows with a query
INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, date)
SELECT 1, 'a1', 100, 1, 500000, '2013-06-14 12:40:45'
FROM
(
select a.N + b.N * 10 + c.N * 100 + d.N * 1000 + e.N * 10000 + f.N * 100000 + 1 N
from (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) a
, (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) b
, (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) c
, (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) d
, (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) e
, (select 0 as N union all select 1 union all select 2 union all select 3 union all select 4 union all select 5 union all select 6 union all select 7 union all select 8 union all select 9) f
) t
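The six cross-joined digit tables work like a base-10 odometer: each derived table contributes one decimal place, so the expression enumerates every integer from 1 to 1,000,000 exactly once. A quick sanity check of that arithmetic in Python (an illustration, not part of the original answer):

```python
# Verify the cross join's digit arithmetic: six copies of the digits 0-9,
# weighted by powers of 10, enumerate 1..1,000,000 with no gaps or duplicates.
from itertools import product

digits = range(10)
values = {a + b * 10 + c * 100 + d * 1000 + e * 10000 + f * 100000 + 1
          for a, b, c, d, e, f in product(digits, repeat=6)}

assert len(values) == 1_000_000
assert min(values) == 1
assert max(values) == 1_000_000
```

Since the set has exactly 10^6 members spanning 1..1,000,000, every N is produced exactly once, which is why the INSERT ... SELECT yields precisely one million rows.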
It took ~8 sec to complete on my box (MacBook Pro, 16GB RAM, 2.6GHz Intel Core i7):
Query OK, 1000000 rows affected (7.63 sec) Records: 1000000 Duplicates: 0 Warnings: 0
UPDATE1 Now here is a version of the stored procedure that uses a prepared statement:
DELIMITER $$
CREATE PROCEDURE `inputRowsNoRandom`(IN NumRows INT)
BEGIN
DECLARE i INT DEFAULT 0;
PREPARE stmt
FROM 'INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, date)
VALUES(?, ?, ?, ?, ?, ?)';
SET @v1 = 1, @v2 = 'a1', @v3 = 100, @v4 = 1, @v5 = 500000, @v6 = '2013-06-14 12:40:45';
WHILE i < NumRows DO
EXECUTE stmt USING @v1, @v2, @v3, @v4, @v5, @v6;
SET i = i + 1;
END WHILE;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
Completed in ~3 min:
mysql> CALL inputRowsNoRandom(1000000); Query OK, 0 rows affected (2 min 51.57 sec)
Feel the difference: 8 sec vs 3 min.
UPDATE2 To speed things up, we can explicitly use transactions and commit the insertions in batches. So here is an improved version of the SP.
DELIMITER $$
CREATE PROCEDURE inputRowsNoRandom1(IN NumRows BIGINT, IN BatchSize INT)
BEGIN
DECLARE i INT DEFAULT 0;
PREPARE stmt
FROM 'INSERT INTO `entity_versionable` (fk_entity, str1, str2, bool1, double1, date)
VALUES(?, ?, ?, ?, ?, ?)';
SET @v1 = 1, @v2 = 'a1', @v3 = 100, @v4 = 1, @v5 = 500000, @v6 = '2013-06-14 12:40:45';
START TRANSACTION;
WHILE i < NumRows DO
EXECUTE stmt USING @v1, @v2, @v3, @v4, @v5, @v6;
SET i = i + 1;
IF i % BatchSize = 0 THEN
COMMIT;
START TRANSACTION;
END IF;
END WHILE;
COMMIT;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
Results with different batch sizes:
mysql> CALL inputRowsNoRandom1(1000000,1000); Query OK, 0 rows affected (27.25 sec)
mysql> CALL inputRowsNoRandom1(1000000,10000); Query OK, 0 rows affected (26.76 sec)
mysql> CALL inputRowsNoRandom1(1000000,100000); Query OK, 0 rows affected (26.43 sec)
You can see the difference yourself. Still more than 3 times slower than the cross join.
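The batch-commit pattern itself is portable. As a self-contained illustration (an analogy using Python's built-in sqlite3, not the actual MySQL procedure), the sketch below reuses one statement and commits every BatchSize rows instead of paying a commit per insert:

```python
# Same idea as inputRowsNoRandom1, sketched with sqlite3: insert row by row,
# but commit only once per batch, plus a final commit for the partial batch.
import sqlite3

def input_rows(conn, num_rows, batch_size):
    row = (1, "a1", 100, 1, 500000, "2013-06-14 12:40:45")
    cur = conn.cursor()
    for i in range(1, num_rows + 1):
        cur.execute(
            "INSERT INTO entity_versionable "
            "(fk_entity, str1, str2, bool1, double1, date) "
            "VALUES (?, ?, ?, ?, ?, ?)", row)
        if i % batch_size == 0:
            conn.commit()   # flush one full batch
    conn.commit()           # flush any remaining partial batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity_versionable "
             "(fk_entity, str1, str2, bool1, double1, date)")
input_rows(conn, 10_000, 1_000)
count = conn.execute("SELECT COUNT(*) FROM entity_versionable").fetchone()[0]
assert count == 10_000
```

In MySQL the win comes from amortizing the per-commit flush (e.g. the redo log write) over many rows, which is why batch sizes of 1,000 and 100,000 perform almost identically above.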