MySQL dumps_MySQL

php中文网
发布: 2016-06-01 13:14:44
原创
1021人浏览过

as part of thehttp archiveproject, i create mysql dumps for each crawl (on the 1st and 15th of each month). you can access the list of dumps from thedownloads page. several people use these dumps, most notablyilya grigorik who imports the data into google bigquery.

For the last year I’ve hesitated on many feature requests because they require schema changes. I wasn’t sure how changing the schema would affect the use of the dump files that preceded the change. This blog post summarizes my findings.

Format

When I started the HTTP Archive all the dumps were exported in MySQL format using a command like the following:

mysqldump --opt --skip-add-drop-table -u USERNAME -p -h SERVER DBNAME TABLENAME | gzip > TABLENAME.gz
登录后复制

These MySQL formatted dump files are imported like this:

gunzip -c TABLENAME.gz | mysql -u USERNAME -p -h SERVER DBNAME
登录后复制

People using databases other than MySQL requested that I also export in CSV format. The output of this export command is two files: TABLENAME.txt and TABLENAME.sql. The .txt file is CSV formatted and can be gzipped with a separate command.

mysqldump --opt --complete-insert --skip-add-drop-table -u USERNAME -p -h SERVER -T DIR DBNAME TABLENAMEgzip -c DIR/TABLENAME.txt > DIR/TABLENAME.csv.gz
登录后复制

This CSV dump is imported like this:

gunzip DIR/TABLENAME.csv.gzmysqlimport --local --fields-optionally-enclosed-by="/"" --fields-terminated-by=, --user=USERNAME -p DBNAME DIR/TABLENAME.csv
登录后复制

The largest HTTP Archive dump file is ~25G unzipped and ~3G gzipped. This highlights a disadvantage of using CSV formatted dumps: there’s no way to gzip and ungzip in memory. This is because themysqlimport command uses the filename to determine which table to use – if you piped in the rows then it wouldn’t know the table name. Unzipping a 25G file can be a challenge if disk space is limited.

On the other hand, the CSV import is ~30% faster than using the MySQL format file. This can save over an hour when importing 30 million rows. The HTTP Archive currently provides dumps in both MySQL and CVS format so people can choose between less disk space or faster imports.

CRMEB单商户商城系统
CRMEB单商户商城系统

CRMEB单商户商城打通版基于Thinkphp6.0+vue+mysql+redis开发,前后台全部采用前后端分离式开发。前端框架为uni-app,多端合一,首页页面后台可视化编辑操作,后台采用iview框架。

CRMEB单商户商城系统 0
查看详情 CRMEB单商户商城系统

Forward Compatibility

My primary concern is with the flexibility of previously-generated dump files in light of later schema changes – namely adding and dropping columns.

Dump files in MySQL format work fine with added columns. The INSERT commands in the dump are tied to specific column names, so the new columns are simply ignored. CSV formatted dumps are less flexible. The values in a row are stuffed into the table’s columns in order. If a new column is added at the end, everything works fine. But if a column is added in the middle of the existing columns, the row values will all shift one column to the left.

Neither format works well with dropped columns. MySQL formatted files will fail with an “unknown column” error. CSV formatted files will work but all the columns will be shifted, this time to the right.

Takeaways

I now feel comfortable making schema changes without invalidating the existing dump files provided I follow these guidelines:

  • don’t drop columns– If a column is no longer needed, I’ll leave it in place and modify the column definition to be a small size.
  • add columns at the end– I prefer to organize my columns semantically, but all new columns from this point forward will be added at the end.

I’ll continue to create dumps in MySQL and CSV format. These guidelines ensure that all past and future dump files will work against the latest schema.

最佳 Windows 性能的顶级免费优化软件
最佳 Windows 性能的顶级免费优化软件

每个人都需要一台速度更快、更稳定的 PC。随着时间的推移,垃圾文件、旧注册表数据和不必要的后台进程会占用资源并降低性能。幸运的是,许多工具可以让 Windows 保持平稳运行。

下载
来源:php中文网
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn
最新问题
热门推荐
开源免费商场系统广告
热门教程
更多>
最新下载
更多>
网站特效
网站源码
网站素材
前端模板
关于我们 免责申明 举报中心 意见反馈 讲师合作 广告合作 最新更新
php中文网:公益在线php培训,帮助PHP学习者快速成长!
关注服务号 技术交流群
PHP中文网订阅号
每天精选资源文章推送

Copyright 2014-2025 https://www.php.cn/ All Rights Reserved | php.cn | 湘ICP备2023035733号