
Kafka + Maxwell

Sync a MySQL database to Elasticsearch.

TODO

https://github.com/arnaud-lb/php-rdkafka/issues/198

optimise elasticsearch sync job

Maxwell + Kafka

MySQL Settings

[mysqld]
server_id = 1
log-bin=master
binlog_format=row

server_id MUST be set.

log-bin and binlog_format MUST also be set so that Maxwell can read row-based binlog events.

Maxwell also needs a MySQL user with permission to store state in the database named by the schema_database option (default: maxwell).

Maxwell records its state in a schema on the same MySQL server; the schema name is set by the schema_database option.
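
The user requirement above could be met with grants along the following lines. This is a sketch modelled on the privileges listed in Maxwell's quickstart; the helper name maxwell_grants_sql, the '%' host wildcard and the example credentials are assumptions made for illustration, so tighten them for a real deployment.

```shell
#!/bin/sh
# maxwell_grants_sql USER PASSWORD [SCHEMA]
# Hypothetical helper (not part of Maxwell): prints the SQL that creates
# a Maxwell user and grants what it needs. SCHEMA defaults to "maxwell",
# the default value of the schema_database option.
maxwell_grants_sql() {
  user="$1"
  password="$2"
  schema="${3:-maxwell}"
  cat <<SQL
CREATE USER '${user}'@'%' IDENTIFIED BY '${password}';
-- Maxwell stores its own state in the schema_database schema.
GRANT ALL ON ${schema}.* TO '${user}'@'%';
-- Needed to read table definitions and the row-based binlog.
GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO '${user}'@'%';
SQL
}

# Example (assumes a local root login is available):
#   maxwell_grants_sql maxwell maxwell | mysql -u root -p
```

Printing the SQL instead of running it directly makes it easy to review the statements before piping them into mysql.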

Start Zookeeper Server

bin/zookeeper-server-start.sh config/zookeeper.properties

Start Kafka Server

bin/kafka-server-start.sh config/server.properties

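Before starting the broker, check config/server.properties and set advertised.host.name=192.168.33.216 (add the key if it does not exist). If Maxwell runs in Docker and connects to this broker, the value must be the Kafka host's externally reachable IP address, not 127.0.0.1 or localhost. On Kafka 3.0 and later, advertised.host.name has been removed; set advertised.listeners instead (for example PLAINTEXT://192.168.33.216:9092).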
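
The "set the key, or add it if missing" step can be scripted. The sketch below is one possible way to do it; set_property is a hypothetical helper, not a Kafka tool, and it assumes simple key=value lines, a file that ends with a newline, and a value without '|', '&' or backslash characters.

```shell
#!/bin/sh
# set_property FILE KEY VALUE
# Hypothetical helper: replace KEY's line in a .properties file,
# or append KEY=VALUE if the key is not present.
# Note: dots in KEY are treated as regex wildcards by grep/sed, which is
# good enough for a sketch but not strictly exact. Commented-out lines
# (starting with '#') are left alone and the key is appended instead.
set_property() {
  file="$1"
  key="$2"
  value="$3"
  if grep -q "^${key}=" "$file"; then
    tmp="$(mktemp)"
    # Rewrite through a temp file, then copy back to keep file permissions.
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" \
      && cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example:
#   set_property config/server.properties advertised.host.name 192.168.33.216
```

Restart the broker after changing the file so the new advertised address takes effect.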

Start Maxwell

  • Producer: stdout (console)

docker run -it --rm zendesk/maxwell bin/maxwell --user='maxwell' \
--password='maxwell' --host='192.168.33.216' --producer=stdout \
--filter='exclude: *.*, include: logix_crm_local.acy_leads, include: logix_crm_local.acy_accounts, include: logix_crm_local.acy_opportunities, include: logix_crm_local.acy_leads_cstm, include: logix_crm_local.acy_accounts_cstm, include: logix_crm_local.acy_opportunities_cstm, include: logix_crm_local.acy_objects_assign'

  • Producer: Kafka

docker run -it --rm zendesk/maxwell bin/maxwell --user='maxwell' \
--password='maxwell' --host='192.168.33.216' --producer=kafka \
--kafka.bootstrap.servers='192.168.33.216:9092' --kafka_topic=maxwell \
--filter='exclude: *.*, include: logix_crm_local.acy_leads, include: logix_crm_local.acy_accounts, include: logix_crm_local.acy_opportunities, include: logix_crm_local.acy_leads_cstm, include: logix_crm_local.acy_accounts_cstm, include: logix_crm_local.acy_opportunities_cstm, include: logix_crm_local.acy_objects_assign'
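
The long --filter value in the commands above follows one pattern: exclude everything, then include specific tables. A small helper can build it from a table list, which is easier to maintain than the literal string. This is only a sketch; build_maxwell_filter is a hypothetical name, not a Maxwell feature.

```shell
#!/bin/sh
# build_maxwell_filter DATABASE TABLE...
# Hypothetical helper: prints a --filter value that excludes all tables
# and then includes only the named tables of one database.
build_maxwell_filter() {
  db="$1"
  shift
  filter='exclude: *.*'
  for table in "$@"; do
    filter="${filter}, include: ${db}.${table}"
  done
  printf '%s\n' "$filter"
}

# Example, reusing the connection settings from the commands above:
#   FILTER="$(build_maxwell_filter logix_crm_local acy_leads acy_accounts)"
#   docker run -it --rm zendesk/maxwell bin/maxwell --user='maxwell' \
#     --password='maxwell' --host='192.168.33.216' --producer=stdout \
#     --filter="$FILTER"
```

Keep the value double-quoted when passing it to Maxwell so the shell does not expand the `*.*` pattern.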

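When Maxwell runs in Docker, any address outside the container must be given as host.docker.internal or the host's public IP address (127.0.0.1 does not work). The address must match the host name and port that Kafka advertises (the KAFKA_ADVERTISED host name and port).

If several projects share one MySQL host, you can run multiple Maxwell processes, but each one needs its own client_id and replica_server_id.

e.g. --client_id=maxwell2 --replica_server_id=6380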

Consumer

Use the console consumer to test:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic maxwell --from-beginning

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic maxwell --consumer.config config/consumer.properties

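Other Kafka commands

These commands use the older --zookeeper and --broker-list flags; recent Kafka releases take --bootstrap-server localhost:9092 instead.

  • Create a topic

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

  • List all topics

bin/kafka-topics.sh --list --zookeeper localhost:2181

  • Start a producer, then send a few messages

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test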

This is a message
This is another message

Use php-rdkafka to consume Kafka messages and sync them to Elasticsearch

https://github.com/arnaud-lb/php-rdkafka


Last updated 4 years ago
