Get Started with Kafka API
Requirement
For optimal performance, we suggest utilizing a Linux kernel version of 4.14 or higher when initializing an HStream Kafka Cluster.
TIP
In the case it is not possible for the user to use a Linux kernel version of 4.14 or above, we recommend adding the option --enable-dscp-reflection=false
to HStore while starting the HStream Kafka Cluster.
Installation
Install docker
TIP
If you have already installed docker, you can skip this step.
See Install Docker Engine, and install it for your operating system. Please carefully check that you have met all prerequisites.
Confirm that the Docker daemon is running:
docker version
TIP
On Linux, Docker needs root privileges. You can also run Docker as a non-root user, see Post-installation steps for Linux.
Install docker compose
TIP
If you have already installed docker compose, you can skip this step.
See Install Docker Compose, and install it for your operating system. Please carefully check that you met all prerequisites.
docker-compose version
Start HStreamDB Services
WARNING
Do NOT use this configuration in your production environment!
Create a docker-compose.yaml file for docker compose, you can download or paste the following contents:
version: "3.5"
services:
hserver:
image: hstreamdb/hstream:latest
depends_on:
- zookeeper
- hstore
ports:
- "127.0.0.1:9092:9092"
expose:
- 9092
networks:
- hstream-kafka-quickstart
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- /tmp:/tmp
- data_store:/data/store
command:
- bash
- "-c"
- |
set -e
/usr/local/script/wait-for-storage.sh hstore 6440 zookeeper 2181 600 \
/usr/local/bin/hstream-server kafka \
--bind-address 0.0.0.0 --port 9092 \
--server-id 100 \
--seed-nodes "$$(hostname -I | awk '{print $$1}'):6571" \
--advertised-address $$(hostname -I | awk '{print $$1}') \
--metastore-uri zk://zookeeper:2181 \
--store-config /data/store/logdevice.conf \
--store-log-level warning \
--log-with-color
hstore:
image: hstreamdb/hstream:latest
networks:
- hstream-kafka-quickstart
volumes:
- data_store:/data/store
command:
- bash
- "-c"
- |
set -ex
# N.B. "enable-dscp-reflection=false" is required for linux kernel which
# doesn't support dscp reflection, e.g. centos7.
/usr/local/bin/ld-dev-cluster -n 3 --root /data/store \
--use-tcp --tcp-host $$(hostname -I | awk '{print $$1}') \
--user-admin-port 6440 \
--param enable-dscp-reflection=false \
--no-interactive
zookeeper:
image: zookeeper:3.8
expose:
- 2181
networks:
- hstream-kafka-quickstart
volumes:
- data_zk_data:/data
- data_zk_datalog:/datalog
hserver-init:
image: hstreamdb/hstream:latest
depends_on:
- hserver
networks:
- hstream-kafka-quickstart
command:
- bash
- "-c"
- |
timeout=60
until ( \
/usr/local/bin/hstream-kafka --host hserver --port 9092 node status \
) >/dev/null 2>&1; do
>&2 echo 'Waiting for servers ...'
sleep 1
timeout=$$((timeout - 1))
[ $$timeout -le 0 ] && echo 'Timeout!' && exit 1;
done; \
/usr/local/bin/hstream-kafka --host hserver --port 9092 node init
networks:
hstream-kafka-quickstart:
name: hstream-kafka-quickstart
volumes:
data_store:
name: kafka_quickstart_data_store
data_zk_data:
name: kafka_quickstart_data_zk_data
data_zk_datalog:
name: kafka_quickstart_data_zk_datalog
data_prom_config:
name: kafka_quickstart_data_prom_config
then run:
docker-compose -f kafka-quick-start.yaml up
If you see some thing like this, then you have a running hstream kafka cluster:
hserver_1 |[INFO][2024-01-12T02:15:08+0000][app/lib/KafkaServer.hs:193:11][thread#30]Cluster is ready!
TIP
You can also run in background:
docker-compose -f kafka-quick-start.yaml up -d
TIP
If you want to show logs of server, run:
docker-compose -f kafka-quick-start.yaml logs -f hserver
Connect HStream Kafka with CLI
You can use hstream-kafka
command-line interface (CLI), which is included in the hstreamdb/hstream
image, to interactive with hstream kafka cluster
Start an instance of hstreamdb/hstream
using Docker:
docker run -it --rm --name kafka-cli --network host hstreamdb/hstream:latest bash
Create a topic
To create a topic, you can use hstream-kafka topic create
command. Now we will create a topic with 3 partitions
hstream-kafka topic create demo --partitions 3
+------+------------+--------------------+
| Name | Partitions | Replication-factor |
+------+------------+--------------------+
| demo | 3 | 1 |
+------+------------+--------------------+
Produce data to a topic
The hstream-kafka produce
command can be used to produce data to a topic in a interactive shell.
hstream-kafka produce demo --separator "@" -i
- With the
--separator
option, you can specify the separator for key. The default separator is "@". Using the separator, you can assign a key to each record. Record with same key will be append into same partition of the topic. - Using
-i
option to enter the interactive mode.
fruit@apple
animal@cat
hello hstream!
Here we have written three pieces of data. The first two are associated with specified key. The last one does not specify a key.
For additional information, you can use hstream-kafka produce -h
.
Consume data from a topic
To consume data from a particular topic, the hstream-kafka consume
command is used.
hstream-kafka consume --group-id test-group --topic demo --earliest --verbose --eof
- Using the
--verbose
option will print the creation timestamp and the key of the record. The--eof
option will tell the consumer to exit after receiving the last message in the partition.
CreateTimestamp: 1705026820718 Key: fruit apple
CreateTimestamp: 1705026912306 Key: hello hstream!
CreateTimestamp: 1705026833287 Key: animal cat
EOF reached for all 3 partition(s)
Consumed 3 messages (62 bytes)
For additional information, you can use hstream-kafka consume -h
.