Google Pub/Sub
Overview
Use a Google Cloud Pub/Sub push subscription to stream data into Hydrolix. The subscription sends each message to the /ingest/event endpoint of the Hydrolix HTTP Stream API, which applies your transform to the message payload and writes the result to the specified table.
Performance limitations
Google Pub/Sub push subscriptions send one message per HTTP request, which creates significant overhead and can result in poor ingestion performance. For high-throughput workloads, consider alternative streaming methods such as Kafka or Kinesis that support batching multiple messages per request.
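To get a rough sense of this overhead, the following sketch compares how many HTTP requests are needed when each request carries one message versus a batch. The message count and the batch size of 500 are illustrative assumptions, not Hydrolix or Pub/Sub figures, and the function name is hypothetical.

```python
import math


def requests_needed(message_count: int, messages_per_request: int = 1) -> int:
    """Return how many HTTP requests are needed to deliver message_count messages."""
    if message_count < 0:
        raise ValueError("message_count must be non-negative")
    if messages_per_request < 1:
        raise ValueError("messages_per_request must be at least 1")
    return math.ceil(message_count / messages_per_request)


if __name__ == "__main__":
    total = 1_000_000  # hypothetical number of messages
    print("push (1 message per request):", requests_needed(total))
    print("batched (500 messages per request):", requests_needed(total, 500))
```

Each request also carries its own connection, header, and authentication cost, so the gap in practice depends on more than the request count alone.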
Before you begin
Verify you have:
- A Google Cloud project with Pub/Sub API enabled
- A Pub/Sub topic with data to send to Hydrolix
- Permission to create service accounts and push subscriptions in Google Cloud
- The Hydrolix cluster endpoint URL
- Permission to create credentials in Hydrolix
- A Hydrolix project and table to receive the data. See Projects and Tables.
- A transform to process the incoming data. See Write Transforms.
Setup steps
- Create a Pub/Sub topic or use an existing topic.
- Create a Google Cloud service account with appropriate permissions.
- Download the service account JSON key file.
- Import the service account credentials into Hydrolix.
- Create a Hydrolix table and transform to receive the data.
- Create a push subscription that sends messages to the Hydrolix HTTP Stream API endpoint.
- Test the integration by publishing messages to the topic.
Create a service account
Create a Google Cloud service account for Pub/Sub authentication.
- In the Google Cloud Console, go to IAM & Admin > Service Accounts.
- Select Create service account.
- Enter a Service account name. For example, hydrolix-pubsub-push.
- Optionally, enter a Service account description.
- Select Create and continue.
- Skip granting roles or assign minimal permissions as needed.
- Select Continue, then Done.
Download the service account key
- In the Service Accounts list, select the service account you created.
- Open the Keys tab.
- Select Add key > Create new key.
- Select JSON.
- Select Create. The JSON key file downloads to your computer.
Protect the service account key
Store the JSON key file securely. Don't commit it to version control.
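Before importing the key into Hydrolix, it can help to confirm that the file really is a service account key and to note the account email without printing the private key. This is a minimal sketch, assuming the usual key file fields (type, project_id, private_key, client_email). The function name and the sample values are hypothetical.

```python
import json

REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def summarize_key(raw_json: str) -> dict:
    """Validate a service account key and return only its non-secret fields."""
    key = json.loads(raw_json)
    missing = [field for field in REQUIRED_FIELDS if not key.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    # Deliberately omit private_key so the summary is safe to print or log.
    return {"project_id": key["project_id"], "client_email": key["client_email"]}


if __name__ == "__main__":
    # Hypothetical key contents; a real file holds a PEM-encoded private key.
    sample = json.dumps({
        "type": "service_account",
        "project_id": "example-project",
        "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
        "client_email": "hydrolix-pubsub-push@example-project.iam.gserviceaccount.com",
    })
    print(summarize_key(sample))
```

The client_email value is the service account email that the push subscription is later configured to use.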
Import the service account into Hydrolix
Import the Google Cloud service account credentials into Hydrolix so the platform can authenticate incoming Pub/Sub push requests.
- In the Hydrolix UI, go to Admin > Credentials.
- Select Add credential.
- For Name, enter a descriptive name. For example, pubsub-service-account.
- For Type, select GCP Service Account Keys.
- For Cloud, select GCP.
- Upload the JSON key file downloaded from Google Cloud.
- Select Save.
Configure a Hydrolix table and transform
Define a table in your Hydrolix cluster to receive the data from Google Pub/Sub, then define a transform to process the data.
Create the destination table
To create a table in the Hydrolix UI, select Table from the Add new menu in the upper-right corner of the screen. You can also create the table with the Hydrolix Configuration API. See Create a Table via API.
Create the transform
Google Pub/Sub push subscriptions deliver messages to Hydrolix, but Hydrolix processes and stores only the message payload. Metadata like message IDs, timestamps, and attributes aren't stored in Hydrolix. This is similar to the intake stages, where the payload passes through the transform pipeline.
Design the transform to process the actual data payload that the application publishes to the Pub/Sub topic. The transform should match the structure of the message data, not the Pub/Sub envelope format.
For detailed information on creating transforms, see Write Transforms. To see the structure of the incoming data, consider using a catch-all transform.
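To illustrate the difference between the Pub/Sub envelope and the payload, the sketch below decodes the data field of a wrapped push request body, using the envelope layout that Google documents for push delivery (message.data is base64-encoded). The sample values and the function name are made up for illustration, and this isn't Hydrolix's own implementation.

```python
import base64
import json


def extract_payload(push_body: str) -> dict:
    """Return the JSON payload the application published, dropping the envelope."""
    envelope = json.loads(push_body)
    encoded = envelope["message"]["data"]
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical application payload.
    payload = {"timestamp": "2024-01-01T00:00:00Z", "status": 200}
    push_body = json.dumps({
        "message": {
            "data": base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii"),
            "messageId": "1234567890",
            "publishTime": "2024-01-01T00:00:01Z",
            "attributes": {"source": "example"},
        },
        "subscription": "projects/example-project/subscriptions/hydrolix-push-subscription",
    })
    # The transform schema should describe these fields, not the envelope.
    print(extract_payload(push_body))
```

In this sample, the transform would describe timestamp and status, while messageId, publishTime, and attributes belong to the envelope.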
Create a Pub/Sub push subscription
Create a push subscription in Google Cloud to send messages to Hydrolix.
- In the Google Cloud Console, go to Pub/Sub > Subscriptions.
- Select Create subscription.
- Enter a Subscription ID. For example, hydrolix-push-subscription.
- Select the Pub/Sub topic to send to Hydrolix.
- For Delivery type, select Push.
- Enter the Endpoint URL for the Hydrolix HTTP Stream API. This is the /ingest/event endpoint on your Hydrolix cluster.
- Configure authentication:
- Select Enable authentication.
- For Authentication method, select OIDC token.
- Select the service account created earlier. For example, hydrolix-pubsub-push.
- For Audience, enter the Hydrolix cluster URL. For example, https://hydrolix.example.com.
Service account authentication
Google Pub/Sub sends push requests with a JWT Bearer token in the Authorization header. Hydrolix validates this token using the service account credentials you imported. The service account email in the JWT must match a credential imported into Hydrolix.
- Configure message retention, acknowledgement deadline, and retry policy as needed:
- Message retention: Set how long Pub/Sub retains unacknowledged messages. See Subscription message retention.
- Acknowledgement deadline: Set how long Pub/Sub waits for Hydrolix to acknowledge a push request before it redelivers the message. A longer deadline reduces redelivery during processing delays.
- Retry policy: Configure exponential backoff parameters. See Retry and error handling.
- Select Create.
For more information, see Create push subscriptions and Authenticate push requests.
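The console settings above correspond to fields on the Pub/Sub subscription resource. As a rough sketch, the function below builds a request body for the Pub/Sub REST API's projects.subscriptions.create method using only the standard library. It doesn't send anything. The function name, project, topic, and default values are placeholders, and the endpoint URL shown omits anything your cluster needs to select the table and transform.

```python
import json


def build_push_subscription(
    topic: str,
    push_endpoint: str,
    service_account_email: str,
    audience: str,
    ack_deadline_seconds: int = 60,
    retention_seconds: int = 7 * 24 * 3600,
) -> dict:
    """Build a subscription body with push delivery and OIDC authentication."""
    # Pub/Sub accepts acknowledgement deadlines between 10 and 600 seconds.
    if not 10 <= ack_deadline_seconds <= 600:
        raise ValueError("ack_deadline_seconds must be between 10 and 600")
    return {
        "topic": topic,
        "pushConfig": {
            "pushEndpoint": push_endpoint,
            "oidcToken": {
                "serviceAccountEmail": service_account_email,
                "audience": audience,
            },
        },
        "ackDeadlineSeconds": ack_deadline_seconds,
        # Durations are expressed as a number of seconds followed by "s".
        "messageRetentionDuration": f"{retention_seconds}s",
    }


if __name__ == "__main__":
    body = build_push_subscription(
        topic="projects/example-project/topics/example-topic",  # placeholder
        push_endpoint="https://hydrolix.example.com/ingest/event",  # placeholder
        service_account_email="hydrolix-pubsub-push@example-project.iam.gserviceaccount.com",
        audience="https://hydrolix.example.com",
    )
    print(json.dumps(body, indent=2))
```

Keeping the body as plain data makes it easy to review the audience and service account email before creating the subscription with whichever client you prefer.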
Test the integration
Publish a test message to verify the integration works.
- In the Google Cloud Console, go to Pub/Sub > Topics.
- Select the topic.
- Open the Messages tab.
- Select Publish message.
- Enter a test payload in JSON format that matches the transform schema.
- Select Publish.
Query the Hydrolix table to verify the message arrived.
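If you publish test messages from a script instead of the console, the Pub/Sub REST publish method (projects.topics.publish) expects each message's data to be base64-encoded. The sketch below builds such a request body with the standard library and doesn't send it. The function name and payload fields are hypothetical, so replace the fields with ones from your transform schema.

```python
import base64
import json


def build_publish_body(payloads: list) -> dict:
    """Wrap JSON-serializable payloads in a Pub/Sub publish request body."""
    messages = []
    for payload in payloads:
        raw = json.dumps(payload).encode("utf-8")
        # The REST API carries message data as a base64-encoded string.
        messages.append({"data": base64.b64encode(raw).decode("ascii")})
    return {"messages": messages}


if __name__ == "__main__":
    # Hypothetical test payload; match it to your transform schema.
    body = build_publish_body([{"timestamp": "2024-01-01T00:00:00Z", "status": 200}])
    print(json.dumps(body, indent=2))
```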
Additional information
Scaling
Pub/Sub adjusts the push delivery rate automatically. When planning for high throughput, consider the following:
- Endpoint capacity: Verify the Hydrolix cluster can handle the message volume.
- Subscription parallelism: Pub/Sub sends multiple push requests to the endpoint concurrently.
- Message ordering: Standard push subscriptions don't guarantee message order. For ordered delivery, use ordering keys. See Message ordering.
- Acknowledgement deadlines: Set the acknowledgement deadline long enough that processing delays don't cause Pub/Sub to redeliver messages.
Delivery guarantees
Standard push subscriptions don't guarantee message order. If ordered delivery is needed, enable message ordering when creating the subscription and publish messages with ordering keys. Pub/Sub then delivers messages with the same ordering key in the order they were published.
For information about Pub/Sub quotas and limits, see Google Pub/Sub quotas and limits.
Retry and error handling
When Hydrolix fails to process a message, Pub/Sub redelivers it according to the subscription's retry policy:
- Configure exponential backoff parameters to control the delay between retries.
- Send messages that repeatedly fail to a dead-letter topic for analysis. See Dead-letter topics.
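To get a feel for how the backoff bounds shape retries, the sketch below computes an illustrative delay schedule that doubles from a minimum backoff up to a maximum. The doubling factor and the function name are assumptions made for illustration, and Pub/Sub doesn't promise this exact schedule. The 10 and 600 second defaults mirror the default bounds of the Pub/Sub retry policy.

```python
def backoff_schedule(
    min_backoff: float = 10.0,
    max_backoff: float = 600.0,
    attempts: int = 8,
    factor: float = 2.0,
) -> list:
    """Return the delay in seconds before each retry, capped at max_backoff."""
    # This sketch requires a positive minimum so the delays actually grow.
    if min_backoff <= 0 or max_backoff < min_backoff:
        raise ValueError("require 0 < min_backoff <= max_backoff")
    delays = []
    delay = min_backoff
    for _ in range(attempts):
        delays.append(min(delay, max_backoff))
        delay *= factor  # assumed growth factor, for illustration only
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_schedule(), start=1):
        print(f"retry {attempt}: wait about {delay:.0f}s")
```

A wider gap between the two bounds gives a struggling endpoint more time to recover, at the cost of slower redelivery.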
Security
- Service account authentication: Configure OIDC token authentication in the push subscription settings. Google sends a JWT Bearer token that Hydrolix validates using the imported service account credentials. This ensures only authorized Google Cloud projects can send data to the cluster.
- Network security: Use VPC Service Controls or Private Service Connect to restrict network access to the Hydrolix endpoints.
- Credential management: Protect service account key files. Don't commit them to version control.
- Principle of least privilege: Grant the service account only the minimum permissions needed.
Troubleshooting
Verify service account configuration
If messages aren't arriving in Hydrolix:
- Check the service account email: Verify the service account email configured in the push subscription matches the service account imported into Hydrolix.
- Verify credentials in Hydrolix: In the Hydrolix UI, go to Admin > Credentials and confirm the GCP service account credential exists and is properly configured.
- Check subscription status: In the Google Cloud Console, go to Pub/Sub > Subscriptions and check for delivery errors or failed push attempts.
- Review Hydrolix logs: Check the Hydrolix intake logs for authentication errors or validation failures.
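When checking for a service account mismatch, it can help to inspect the claims in a captured push token. The sketch below decodes a JWT's claims without verifying the signature, so it is suitable for debugging only and never for authentication. It assumes the email and aud claims that Google includes in Pub/Sub OIDC tokens. The helper names are hypothetical, and the token built here is a fake, unsigned sample.

```python
import base64
import json


def decode_claims(token: str) -> dict:
    """Decode the claims segment of a JWT without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    segment = parts[1]
    # JWT segments use unpadded base64url; restore padding before decoding.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


def check_claims(claims: dict, expected_email: str, expected_audience: str) -> list:
    """Return human-readable mismatches (an empty list when everything matches)."""
    problems = []
    if claims.get("email") != expected_email:
        problems.append(f"email is {claims.get('email')!r}, expected {expected_email!r}")
    if claims.get("aud") != expected_audience:
        problems.append(f"aud is {claims.get('aud')!r}, expected {expected_audience!r}")
    return problems


def make_unsigned_token(claims: dict) -> str:
    """Build a fake, unsigned token for the demo below. Not a real Google token."""
    def b64url(data: dict) -> str:
        raw = json.dumps(data).encode("utf-8")
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    return f"{b64url({'alg': 'none', 'typ': 'JWT'})}.{b64url(claims)}.signature"


if __name__ == "__main__":
    token = make_unsigned_token({
        "email": "hydrolix-pubsub-push@example-project.iam.gserviceaccount.com",
        "aud": "https://hydrolix.example.com",
    })
    issues = check_claims(
        decode_claims(token),
        expected_email="hydrolix-pubsub-push@example-project.iam.gserviceaccount.com",
        expected_audience="https://hydrolix.example.com",
    )
    print(issues or "email and audience match")
```

Compare the email claim with the client_email in the key imported into Hydrolix, and the aud claim with the Audience value set on the subscription.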
Test message delivery
- In the Google Cloud Console, go to Pub/Sub > Topics.
- Select the topic.
- Select Publish message to send a test message.
- Check the subscription's Metrics tab to see if messages were delivered successfully.
- Query the Hydrolix table to verify the data arrived.