HStream IO User Guides
Data synchronization service is used to synchronize data between systems(e.g. databases) in real time, which is useful for many cases, for example, MySQL is a widely-used database, if your application is running on MySQL, and:
- You found its query performance is not enough.
- You want to migrate your data to another database (e.g. PostgreSQL), but you need to switch your application seamlessly.
- Your applications highly depended on MySQL, migrating is difficult, so you have to migrate gradually.
- You don't need to migrate the whole MySQL data, instead, just copy some data from MySQL to other databases(e.g. HStreamDB) for data analysis.
- You need to upgrade your MySQL version for some new features seamlessly.
- You need to back up your MySQL data in multiple regions in real time.
In those cases, you will find a data synchronization service is helpful, HStream IO is an internal data integration framework for HStreamDB, and it can be used as a data synchronization service, this document will show you how to use HStream IO to build a data synchronization service from a MySQL table to a PostgreSQL table, you will learn:
- How to create a source connector that synchronizes records from a MySQL table to an HStreamDB stream.
- How to create a sink connector that synchronizes records from an HStreamDB stream to a PostgreSQL table.
- How to use HStream SQLs to manage the connectors.
Set up an HStreamDB Cluster
You can check quick-start to find how to set up an HStreamDB cluster and connect to it.
Set up a MySQL
Set up a MySQL instance with docker:
docker run --network=hstream-quickstart --name mysql-s1 -e MYSQL_ROOT_PASSWORD=password -d mysql
Here we use the
hstream-quickstart network if you set up your HStreamDB cluster based on quick-start.
Connect to the MySQL instance:
docker exec -it mysql-s1 mysql -uroot -h127.0.0.1 -P3306 -ppassword
Create a database
d1, a table
person and insert some records:
create database d1; use d1; create table person (id int primary key, name varchar(256), age int); insert into person values (1, "John", 20), (2, "Jack", 21), (3, "Ken", 33);
person must include a primary key, or the
DELETE statement may not be synced correctly.
Set up a PostgreSQL
Set up a PostgreSQL instance with docker:
docker run --network=hstream-quickstart --name pg-s1 -e POSTGRES_PASSWORD=postgres -d postgres
Connect to the PostgreSQL instance:
docker exec -it pg-s1 psql -h 127.0.0.1 -U postgres
sink-postgresql doesn't support the automatic creation of a table yet, so you need to create the database
d1 and the table
create database d1; \c d1; create table person (id int primary key, name varchar(256), age int);
person must include a primary key.
Download Connector Plugins
A connector plugin is a docker image, so before you can set up the connectors, you should download and update their plugins with
docker pull hstreamdb/source-mysql:latest docker pull hstreamdb/sink-postgresql:latest
Here is a table of all available connectors.
After connecting an HStream Server, you can use create source/sink connector SQLs to create connectors.
Connect to the HStream server:
docker run -it --rm --network host hstreamdb/hstream:latest hstream sql --port 6570
Create a source connector:
create source connector source01 from mysql with ("host" = "mysql-s1", "port" = 3306, "user" = "root", "password" = "password", "database" = "d1", "table" = "person", "stream" = "stream01");
The source connector will run an HStream IO task, which continually synchronizes data from MySQL table
d1.person to stream
stream01. Whenever you update records in MySQL, the change events will be recorded in stream
stream01 if the connector is running.
You can use
SHOW CONNECTORS to check connectors and their status and use
RESUME to stop/restart connectors:
PAUSE connector source01; RESUME connector source01;
If resume the connector task, the task will not fetch data from the beginning, instead, it will continue from the point when it was paused.
Then you can create a sink connector that consumes the records from the stream
stream01 to PostgreSQL table
create sink connector sink01 to postgresql with ("host" = "pg-s1", "port" = 5432, "user" = "postgres", "password" = "postgres", "database" = "d1", "table" = "person", "stream" = "stream01");
sink01 connectors running, you get a synchronization service from MySQL to PostgreSQL.
You can use the
DROP CONNECTOR statement to delete the connectors:
drop connector source01; drop connector sink01;