【MySQL】MySQL 快速创建千万级测试数据之mysql_random_data_load

mysql

发布日期: 2019-07-24

文章字数: 803

阅读时长: 4 分

阅读次数:

介绍

使用mysql_random_data_load将随机数据插入表中百万数据分分钟插入

源码地址：https://github.com/Percona-Lab/mysql_random_data_load/releases

备份地址：https://github.com/ciweigg2/mysql_random_data_load/releases

安装

下载安装包

https://github.com/ciweigg2/mysql_random_data_load/releases/download/v0.1.10/mysql_random_data_load_linux_amd64.tar.gz
tar xzf mysql_random_data_loader_linux_amd64.tar.gz
mv mysql_random_data_loader /usr/bin
chmod +x mysql_random_data_loader

测试

创建数据库

CREATE TABLE `test`.`t3` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tcol01` tinyint(4) DEFAULT NULL,
  `tcol02` smallint(6) DEFAULT NULL,
  `tcol03` mediumint(9) DEFAULT NULL,
  `tcol04` int(11) DEFAULT NULL,
  `tcol05` bigint(20) DEFAULT NULL,
  `tcol06` float DEFAULT NULL,
  `tcol07` double(10,2) DEFAULT NULL,
  `tcol08` decimal(10,2) DEFAULT NULL,
  `tcol09` date DEFAULT NULL,
  `tcol10` datetime DEFAULT NULL,
  `tcol11` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `tcol12` time DEFAULT NULL,
  `tcol13` year(4) DEFAULT NULL,
  `tcol14` varchar(100) DEFAULT NULL,
  `tcol15` char(2) DEFAULT NULL,
  `tcol16` blob,
  `tcol17` text,
  `tcol18` mediumtext,
  `tcol19` mediumblob,
  `tcol20` longblob,
  `tcol21` longtext,
  `tcol22` mediumtext,
  `tcol23` varchar(3) DEFAULT NULL,
  `tcol24` varbinary(10) DEFAULT NULL,
  `tcol25` enum('a','b','c') DEFAULT NULL,
  `tcol26` set('red','green','blue') DEFAULT NULL,
  `tcol27` float(5,3) DEFAULT NULL,
  `tcol28` double(4,2) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

运行

mysql_random_data_load test t3 2000 --host=ip --port=3306 --user=root --password=123456
INFO[2019-07-24T22:11:03+08:00] Starting                                     
   0s [====================================================================] 100%
INFO[2019-07-24T22:11:04+08:00] 200 rows inserted

如果这样的错排查了很久才发现是字段没有设置默长度呀

因为源码地址的创建脚本double没有默认长度所以报错了

支持的字段

Field type	Generated values
tinyint	0 ~ 0xFF
smallint	0 ~ 0XFFFF
mediumint	0 ~ 0xFFFFFF
int - integer	0 ~ 0xFFFFFFFF
bigint	0 ~ 0xFFFFFFFFFFFFFFFF
float	0 ~ 1e8
decimal(m,n)	0 ~ 10^(m-n)
double	0 ~ 1000
char(n)	up to n random chars
varchar(n)	up to n random chars
date	NOW() - 1 year ~ NOW()
datetime	NOW() - 1 year ~ NOW()
timestamp	NOW() - 1 year ~ NOW()
time	00:00:00 ~ 23:59:59
year	Current year - 1 ~ current year
tinyblob	up to 100 chars random paragraph
tinytext	up to 100 chars random paragraph
blob	up to 100 chars random paragraph
text	up to 100 chars random paragraph
mediumblob	up to 100 chars random paragraph
mediumtext	up to 100 chars random paragraph
longblob	up to 100 chars random paragraph
longtext	up to 100 chars random paragraph
varbinary	up to 100 chars random paragraph
enum	A random item from the valid items list
set	A random item from the valid items list

支持的参数

Option	Description
–bulk-size	Number of rows per INSERT statement (Default: 1000)
–debug	Show some debug information
–fk-samples-factor	Percentage used to get random samples for foreign keys fields. Default 0.3
–host	Host name/ip
–max-fk-samples	Maximum number of samples for fields having foreign keys constarints. Default: 100
–max-retries	Maximum number of rows to retry in case of errors. See duplicated keys. Deafult: 100
–no-progressbar	Skip showing the progress bar. Default: false
–password	Password
–port	Port number
–Print	Print queries to the standard output instead of inserting them into the db
–user	Username
–version	Show version and exit

外键约束

默认支持外键关联插入

If a field has Foreign Keys constraints, random-data-load will get up to --max-fk-samples random samples from the referenced tables in order to insert valid values for the field.
The number of samples to get follows this rules:
1. Get the aproximate number of rows in the referenced table using the rows field in:

EXPLAIN SELECT COUNT(*) FROM <referenced schema>.<referenced table>

1.1 If the number of rows is less than max-fk-samples, all rows are retrieved from the referenced table using this query:

SELECT <referenced field> FROM <referenced schema>.<referenced table>

1.2 If the number of rows is greater than max-fk-samples, samples are retrieved from the referenced table using this query:

SELECT <referenced field> FROM <referenced schema>.<referenced table> WHERE RAND() <= <fk-samples-factor> LIMIT <max-fk-samples>

Ciwei

https://ciweigg2.github.io/2019/07/24/mysql-mysql-kuai-su-chuang-jian-qian-wan-ji-ce-shi-shu-ju-zhi-mysql-random-data-load/

本博客所有文章除特別声明外，均采用 CC BY 4.0 许可协议。转载请注明来源 Ciwei !

mysql

SpringBoot集成自己实现redisson-spring-boot-starter(单机,集群,哨兵,主从,云托管)

redisson-spring-boot-starter参照一些大神例子自己实现的starter呀目前有很多项目还在使用jedis的 setNx 充当分布式锁,然而这个锁是有问题的,redisson是java支持redis的redlock

2019-07-26 综合

springboot redis redisson

数据可视化之redash(支持43种数据源)

安装npm官网有详细的教程：https://redash.io/ 查看当前版本 yum --showduplicates list nodejs | expand #以10.x 版本为例，要9.x的话只要把该命令中10.x改为9.x就好

2019-07-24 综合

数据可视化

介绍

安装

测试

支持的字段

支持的参数

外键约束

你的赏识是我前进的动力