During the pandemic, I wrote a small tool that syncs MySQL to ClickHouse. - V2EX
jenlors

    jenlors · Feb 25, 2020 · 6849 views

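    For work I evaluated a number of OLAP systems, including Druid, Presto, Kylin, and ClickHouse, and finally settled on ClickHouse. Its advantages: it is fast, easy to install, and has few dependencies.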

    And so the journey through the pitfalls began.

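    I searched online for tools that sync data from MySQL to ClickHouse and, well, there really aren't many. The most recommended one is https://github.com/Altinity/clickhouse-mysql-data-reader, but it does not support updates or deletes, so I dropped it.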

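    Next I found canal, open-sourced by Alibaba, and spent half a day setting it up. With docker it simply would not come up; the container seems to start its own MySQL server, which conflicted with the one already running on my machine. Fine, I give up! So I downloaded and installed it directly, and that finally worked. I configured it following http://www.wuzhq.com/2019/12/16/mysql2clickhouse/, but the sync failed with what looked like a syntax parsing error, so I dropped that too.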

    Then I tried https://github.com/brokercap/Bifrost. After some wrestling I got it installed and configured, but the sync still failed, so I dropped it as well...

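    After that I found https://github.com/yymysql/mysql-clickhouse-replication, which works by parsing the MySQL binlog. I cloned it, and after some patching it was finally usable. Unfortunately it only supports python2 and the code is rather long-winded. Since there really was no other usable tool, I decided to rework it.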
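To make the binlog approach a bit more concrete, here is a minimal sketch of how one row change could be turned into ClickHouse statements. It is an illustration under assumptions, not the code of either project: `quote` and `to_clickhouse` are hypothetical helpers, the row dicts follow the shape that python-mysql-replication yields for row events, and modeling an update as delete plus insert is just one common choice.

```python
from typing import Any, Dict, List


def quote(value: Any) -> str:
    """Render a Python value as a SQL literal (illustration only, not injection-safe)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def to_clickhouse(kind: str, table: str, row: Dict[str, Any], pk: str) -> List[str]:
    """Translate one row change into a list of ClickHouse statements.

    kind is "insert", "update" or "delete". The row dict is assumed to look like
    {"values": {...}} for inserts and deletes, and
    {"before_values": {...}, "after_values": {...}} for updates.
    pk is the name of the primary key column (hypothetical parameter).
    """

    def insert(values: Dict[str, Any]) -> str:
        cols = ", ".join(values)
        vals = ", ".join(quote(v) for v in values.values())
        return f"INSERT INTO {table} ({cols}) VALUES ({vals})"

    def delete(values: Dict[str, Any]) -> str:
        # ClickHouse executes ALTER TABLE ... DELETE as a background mutation.
        return f"ALTER TABLE {table} DELETE WHERE {pk} = {quote(values[pk])}"

    if kind == "insert":
        return [insert(row["values"])]
    if kind == "delete":
        return [delete(row["values"])]
    if kind == "update":
        # Assumption: an update is modeled as "delete the old row, insert the new one".
        return [delete(row["before_values"]), insert(row["after_values"])]
    raise ValueError(f"unknown change kind: {kind}")
```

A real tool would batch inserts and pass values through the driver's parameter handling instead of building SQL strings by hand.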

    And the result is: https://github.com/long2ice/mysql2ch

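    It is based on mysql-clickhouse-replication, with modifications and optimizations: it adds a command-line interface, is packaged and published to pypi, and supports python3 & pypy3.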

    All it takes is one command, mysql2ch --log-pos-to=file, and you can enjoy syncing from MySQL to ClickHouse.
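The flag name suggests that the binlog position is kept in a file so that a restart can resume where the last run stopped. A minimal sketch of that idea follows; the JSON layout and the names `save_position` and `load_position` are assumptions for illustration, not mysql2ch's actual file format or API.

```python
import json
import os
import tempfile
from typing import Optional, Tuple


def save_position(path: str, log_file: str, log_pos: int) -> None:
    """Write the binlog coordinates to path, replacing the old file atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename over the target,
    # so a crash mid-write never leaves a half-written position file behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"log_file": log_file, "log_pos": log_pos}, f)
    os.replace(tmp_path, path)


def load_position(path: str) -> Optional[Tuple[str, int]]:
    """Return (log_file, log_pos), or None when no position has been saved yet."""
    try:
        with open(path) as f:
            data = json.load(f)
    except FileNotFoundError:
        return None
    return data["log_file"], int(data["log_pos"])
```

Saving the position only after a batch has been written to ClickHouse means a crash replays a few events instead of skipping them.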

    While I'm at it, let me recommend a BI tool, metabase. Paired with ClickHouse, you can play with your data any way you like~

    If you like it, feel free to give it a star~ https://github.com/long2ice/mysql2ch

    7 replies    2020-04-06 21:57:37 +08:00
    #1 · mywaiting · Feb 25, 2020
    If it needs to support py3, you could roll your own implementation on top of python-mysql-replication.

    This kind of thing has a very specific keyword: MySQL/XXXX Change Data Capture to ClickHouse/XXXXX
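As a rough idea of what a python-mysql-replication based implementation could look like, here is a sketch of the consuming loop. It dispatches on the event class names the library uses (`WriteRowsEvent`, `UpdateRowsEvent`, `DeleteRowsEvent`) and reads each event's `schema`, `table`, and `rows`. The function `consume` and the `handle` callback are hypothetical, and the connection values in the comment are placeholders.

```python
from typing import Any, Callable, Dict, Iterable

# Map python-mysql-replication row event class names to a change kind.
KINDS = {
    "WriteRowsEvent": "insert",
    "UpdateRowsEvent": "update",
    "DeleteRowsEvent": "delete",
}


def consume(
    stream: Iterable[Any],
    handle: Callable[[str, str, str, Dict[str, Any]], None],
) -> int:
    """Pass every row change in stream to handle(kind, schema, table, row).

    Returns the number of row changes handled.
    """
    count = 0
    for event in stream:
        kind = KINDS.get(type(event).__name__)
        if kind is None:
            continue  # not a row change (for example a query or rotate event)
        for row in event.rows:
            handle(kind, event.schema, event.table, row)
            count += 1
    return count


# With the library installed, the stream could be built roughly like this
# (host, user, password and server_id are placeholder values):
#
#   from pymysqlreplication import BinLogStreamReader
#   from pymysqlreplication.row_event import (
#       DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent,
#   )
#   stream = BinLogStreamReader(
#       connection_settings={"host": "127.0.0.1", "port": 3306,
#                            "user": "repl", "passwd": "secret"},
#       server_id=100,
#       only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent],
#       blocking=True,
#       resume_stream=True,
#   )
#   consume(stream, lambda kind, schema, table, row: print(kind, schema, table, row))
```

Taking the stream as a plain iterable keeps the loop testable without a running MySQL server.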
    #2 · heyyyy · Feb 25, 2020
    #3 · playniuniu · Feb 26, 2020 via iPhone
    Nice. I've always wanted to build something like this but never had the time.
    #4 · jimcaicn · Mar 16, 2020
    [root@master-node ~]# pip3 install mysql2ch
    WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
    Collecting mysql2ch
    Downloading https://pypi.doubanio.com/packages/93/fc/033f39f3570139d3e1ddc740059c9fefb1a38f21968f3bff0aa860dc5269/mysql2ch-0.0.3.tar.gz
    Collecting mysql-replication (from mysql2ch)
    Downloading https://pypi.doubanio.com/packages/e3/54/8c496e300d610299bf168e2068dc10a64b66b299cbe596a27aac5d5b3e7b/mysql-replication-0.21.tar.gz
    Collecting clickhouse-driver (from mysql2ch)
    Downloading https://pypi.doubanio.com/packages/54/ae/7b6d40a774760e7192888cc9855fea834e3ff4f266fa8c1962c266555fb1/clickhouse_driver-0.1.3-cp36-cp36m-manylinux1_x86_64.whl (595kB)
    100% || 604kB 4.2MB/s
    Collecting redis (from mysql2ch)
    Downloading https://pypi.doubanio.com/packages/f0/05/1fc7feedc19c123e7a95cfc9e7892eb6cdd2e5df4e9e8af6384349c1cc3d/redis-3.4.1-py2.py3-none-any.whl (71kB)
    100% || 71kB 42.4MB/s
    Collecting mysqlclient (from mysql2ch)
    Downloading https://pypi.doubanio.com/packages/d0/97/7326248ac8d5049968bf4ec708a5d3d4806e412a42e74160d7f266a3e03a/mysqlclient-1.4.6.tar.gz (85kB)
    100% || 92kB 3.0MB/s
    Complete output from command python setup.py egg_info:
    /bin/sh: mysql_config: command not found
    /bin/sh: mariadb_config: command not found
    /bin/sh: mysql_config: command not found
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-gl8l1_on/mysqlclient/setup.py", line 16, in <module>
        metadata, options = get_config()
      File "/tmp/pip-build-gl8l1_on/mysqlclient/setup_posix.py", line 61, in get_config
        libs = mysql_config("libs")
      File "/tmp/pip-build-gl8l1_on/mysqlclient/setup_posix.py", line 29, in mysql_config
        raise EnvironmentError("%s not found" % (_mysql_config_path,))
    OSError: mysql_config not found

    ----------------------------------------
    Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-gl8l1_on/mysqlclient/


    That is the error I get. What should I do?
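    #5 · jenlors (OP) · Mar 19, 2020
    I don't know what system you are on, but this clearly means MySQL isn't installed. On a Mac, just install mysql; on Linux, install mysql-devel or something similar. It differs slightly between systems.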
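The build fails because mysqlclient's setup script cannot find `mysql_config` or `mariadb_config` on the PATH. A small pre-flight check along these lines can confirm that before running pip again. `find_mysql_config` is a hypothetical helper; the package that provides the binary is typically mysql-devel on CentOS/RHEL and default-libmysqlclient-dev on Debian/Ubuntu, but check your distribution.

```python
import shutil
from typing import Optional


def find_mysql_config(search_path: Optional[str] = None) -> Optional[str]:
    """Return the path of mysql_config or mariadb_config, or None if neither exists.

    search_path is an optional PATH-style string; by default the process PATH is used.
    """
    for name in ("mysql_config", "mariadb_config"):
        found = shutil.which(name, path=search_path)
        if found:
            return found
    return None
```

If `find_mysql_config()` returns `None`, install the MySQL client development package first and then rerun `pip3 install mysql2ch`.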
    #6 · dtgxx · Apr 3, 2020
    Is this an offline sync?
    Most use cases need at least incremental sync.
    #7 · jenlors (OP) · Apr 6, 2020
    @dtgxx It is real-time online sync and supports both incremental and full sync. See my other post; it has already been upgraded to 2.0. [t/654385#reply4](t/654385#reply4)