via Kafka
Getting Started
Hydrolix projects and tables can continuously ingest data from one or more Kafka-based streaming sources.
Kafka source configuration is completed using the Kafka Sources API.
The basic steps are:
- Create a Project/Table
- Create a Transform
- Configure the Kafka Source and Scale
This guide assumes that the project, table, and transform are already configured. For more information on setting these up, see Projects & Tables and Write Transforms.
Setting up a Kafka source through the API
To create, update, or view a table's Kafka sources through the Hydrolix API, use the /sources/kafka API endpoint. Note that you will need the IDs of your target organization, project, and table in order to address that endpoint's full path.
To add Kafka sources to a table, send the /sources/kafka endpoint a JSON document describing the connection between your Hydrolix table and the Kafka data streams.
For example, the following JSON documents set up a connection between a pair of Kafka bootstrap servers running at the domain example.com and the Hydrolix table my_project.my_table.
Example (AWS)
{
  "name": "my-kafka-ingest",
  "type": "pull",
  "subtype": "kafka",
  "transform": "my_transform",
  "table": "my_project.my_table",
  "settings": {
    "bootstrap_servers": [
      "kafka-1.example.com:9092",
      "kafka-2.example.com:9092"
    ],
    "topics": [ "my_topic" ]
  },
  "pool_name": "my-kafka-pool",
  "instance_type": "m5.large",
  "instance_count": "2"
}
Example (Kubernetes)
{
  "name": "my-kafka-ingest",
  "type": "pull",
  "subtype": "kafka",
  "transform": "my_transform",
  "table": "my_project.my_table",
  "settings": {
    "bootstrap_servers": [
      "kafka-1.example.com:9092",
      "kafka-2.example.com:9092"
    ],
    "topics": [ "my_topic" ]
  },
  "pool_name": "my-kafka-pool",
  "k8s_deployment": {
    "cpu": 1,
    "replicas": 1,
    "service": "kafka-peer"
  }
}
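To register a source, POST its JSON document to the /sources/kafka endpoint. The following curl sketch assumes the endpoint's full path nests under your organization, project, and table IDs, as noted above; the /config/v1 prefix and the $ORG_ID, $PROJECT_ID, $TABLE_ID, and $HDX_TOKEN placeholders are assumptions to adapt to your own cluster:
$ curl -X POST \
    "https://YOUR-HYDROLIX-HOST/config/v1/orgs/$ORG_ID/projects/$PROJECT_ID/tables/$TABLE_ID/sources/kafka/" \
    -H "Authorization: Bearer $HDX_TOKEN" \
    -H "Content-Type: application/json" \
    -d @kafka-source.json
Here kafka-source.json contains one of the documents above. A GET request to the same endpoint lists the table's configured Kafka sources.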
Configuration properties
The JSON document describing a Kafka-based data source requires the following properties:
Property | Purpose | Platform |
---|---|---|
name | A name for this data source. Must be unique within the target table's organization. | AWS / Kubernetes |
type | The type of ingestion. Only "pull" is supported at this time. | AWS / Kubernetes |
subtype | The literal value "kafka". | AWS / Kubernetes |
transform | The name of the transform to apply to this ingestion. | AWS / Kubernetes |
table | The Hydrolix project and table to ingest into, expressed in the format "PROJECT.TABLE". | AWS / Kubernetes |
settings | The settings to use for this particular Kafka source. | AWS / Kubernetes |
pool_name | The name that Hydrolix will assign to the ingest pool. | AWS / Kubernetes |
instance_type | The type of AWS instance Hydrolix will use within the Kafka ingest pool. | AWS |
instance_count | The number of instances Hydrolix will apply to the Kafka ingest pool. | AWS |
k8s_deployment | Only used for Kubernetes deployments; describes the replicas, CPU, memory, and service to use in the Kafka ingest pool. For example: "k8s_deployment": { "cpu": 1, "replicas": 1, "service": "kafka-peer" } | Kubernetes |
The settings object
The settings property contains a JSON object that defines the Kafka servers and topics this table should receive events from.
Element | Description |
---|---|
bootstrap_servers | An array of Kafka bootstrap server addresses, in "HOST:PORT" format. |
topics | An array of Kafka topics to import from the given servers. |
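Both elements are arrays, so a single source can read several topics from several brokers. For example, a settings object pulling two hypothetical topics:
"settings": {
  "bootstrap_servers": [
    "kafka-1.example.com:9092",
    "kafka-2.example.com:9092"
  ],
  "topics": [ "my_topic", "my_other_topic" ]
}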
Setting up a Kafka source through the UI
To create, view, and manage your stack's Kafka sources through its web UI, visit https://YOUR-HYDROLIX-HOST.hydrolix.live/data_sources/<project_name>/<table_name>?nav_open=true in your web browser, then click Add New Source.
Authenticating Kafka connections with TLS
If your Kafka data source requires a TLS-authenticated connection, you can update your Hydrolix cluster with TLS certificate and key information.
To do this, use the hdxctl update CLIENT_ID CLUSTER_ID command, invoking it with its three Kafka-related options:
Option | Expected Value |
---|---|
--kafka-tls-ca | A TLS certificate authority file, in PEM format. |
--kafka-tls-cert | A TLS certificate file, in PEM format. |
--kafka-tls-key | A TLS key file, in PEM format. |
For example, to update a cluster with the ID hdx-example4321 and client ID hdxcli-example1234, using PEM files in your current working directory:
$ hdxctl update hdxcli-example1234 hdx-example4321 \
--kafka-tls-cert kafka_cert.pem \
--kafka-tls-key kafka_key.pem \
--kafka-tls-ca kafka_ca.pem
Exporting certificates and keys from Kafka's Java keystore
By default, Kafka stores its certificate and key information in a Java keystore (.jks) file. Because Hydrolix requires this information as files in PEM format, you must export it before updating your cluster.
To do this, you must have the keytool and openssl command-line programs installed on your system. Then, complete the following steps.
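If you're not sure whether both tools are present, a quick shell check before you begin:
$ command -v keytool openssl
$ openssl version
Each command should print a path or a version string; if either prints nothing, install the missing tool before continuing.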
Exporting your CA and certificate files
- List all the certificates present in your keystore:
$ keytool -list -keystore client.keystore.jks
Enter keystore password:
Keystore type: PKCS12
Keystore provider: SUN
Your keystore contains 2 entries
caroot, May 5, 2021, trustedCertEntry,
Certificate fingerprint (SHA-256): A5:87:D0:E4:F6:70:4F:8E:07:2E:EE:56:73:D4:AF:88:DA:D5:8C:9F:67:71:F2:C0:7D:A9:CA:64:2F:F7:04:18
clientcert, May 3, 2021, PrivateKeyEntry,
Certificate fingerprint (SHA-256): 80:A2:28:7C:D9:1B:A8:48:AB:24:76:CC:5A:19:47:29:12:CF:22:A1:8C:92:6E:E4:C0:30:0A:A0:34:73:F7:55
- Locate the CA certificate file (caroot, in this example) and export it:
$ keytool -export -alias caroot -file caroot.crt -keystore client.keystore.jks
Enter keystore password:
Certificate stored in file <caroot.crt>
- Use openssl to transform it into PEM format:
$ openssl x509 -inform DER -in caroot.crt -out kafka_ca.pem -outform PEM
- Follow the same steps for your TLS certificate file (clientcert, in this example):
$ keytool -export -alias clientcert -file clientcert.crt -keystore client.keystore.jks
Enter keystore password:
Certificate stored in file <clientcert.crt>
$ openssl x509 -inform DER -in clientcert.crt -out kafka_cert.pem -outform PEM
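At this point, a quick sanity check confirms that both exports produced valid PEM certificates. Each command below prints the certificate's subject and validity window; an error here means the export or conversion step failed:
$ openssl x509 -in kafka_ca.pem -noout -subject -dates
$ openssl x509 -in kafka_cert.pem -noout -subject -dates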
Exporting your key file
Exporting your key from the Java keystore takes a couple of additional steps.
- Use keytool to create a new PKCS12 store:
$ keytool -v -importkeystore -srckeystore client.keystore.jks \
-srcalias clientcert -destkeystore keystore.p12 -deststoretype PKCS12
Importing keystore client.keystore.jks to keystore.p12...
Enter destination keystore password:
Re-enter new password:
Enter source keystore password:
[Storing keystore.p12]
- Use openssl to extract the private key in PEM format, and use sed to remove extra information:
$ openssl pkcs12 -in keystore.p12 -nodes -nocerts \
| sed -ne '/-BEGIN PRIVATE KEY-/,/-END PRIVATE KEY-/p' \
> kafka_key.pem
Enter Import Password:
MAC verified OK
At this point, you should have the three PEM files you need to update your Hydrolix cluster with your Kafka TLS information.
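As a final check before running hdxctl update, you can confirm that the exported key and certificate belong together by comparing their public-key moduli. This assumes an RSA key, which is the common case for Kafka TLS setups:
$ openssl x509 -noout -modulus -in kafka_cert.pem | openssl md5
$ openssl rsa -noout -modulus -in kafka_key.pem | openssl md5
If the two digests match, the certificate and key are a pair.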
Getting more help
If you need more help using Kafka with Hydrolix, or you'd just like to learn more about this integration, please contact Hydrolix support.