Apache Spark Connector

Analyze your Hydrolix data using Apache Spark and Databricks

Hydrolix Spark Connector

Overview

The Hydrolix Spark Connector combines the cost and query efficiency of the Hydrolix platform with the rich data analysis and distributed computing power of Apache Spark. By integrating your Apache Spark ecosystem with Hydrolix, the connector gives you the cost and performance benefits of Hydrolix as the backing store while presenting the data in the notebooks and coding environments you are already familiar with.

The Apache Spark Connector JAR can be downloaded here.

Requirements

Dependencies

  • Hydrolix Cluster: A running Hydrolix cluster version 4.22.1 or higher. Deployment instructions for your preferred cloud vendor (AWS, Google, Linode, or Microsoft) can be found here.
  • Databricks Spark Cluster: A Databricks account for deploying your Spark cluster.

Runtime Requirements

  • Java Runtime: Must use Java 11 or later

Versioning

The Spark Connector version is of the format:

{Spark Connector version: major.minor.patch}-{HDX version: major.minor.patch}

For example:

v1.0.0-v4.22.1.jar

The Spark Connector is compatible with its corresponding Hydrolix version and all more recent versions. The following matrix lists each Spark Connector version and the Hydrolix cluster versions it supports:

Spark Connector Version    Hydrolix Version
v1.0.0                     v4.22.1+

Upload the C++ Init Script

When you configure a Spark cluster with the Spark Connector, you will also need to install a C++ initialization script. This script sets some environment variables and installs required C++ dependencies. Before setting up the cluster, upload the init script to your Databricks workspace.

You can obtain the C++ Init Script here. Download the contents into a file called install_libcxx.sh, then navigate to your Databricks workspace and do the following:

  • Choose or create a directory in which to store the init script
  • Select the kebab menu at the top right, then select Import
  • Select Browse
  • Find the init script file and select Open
  • Select Import
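Alternatively, if you prefer to script the upload, the following is a minimal sketch using the Databricks Python SDK. The target workspace path is hypothetical, and the sketch assumes the databricks-sdk package is installed and authenticated against your workspace; adjust both to your environment.

import base64

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.workspace import ImportFormat

# Sketch: upload install_libcxx.sh to the Databricks workspace.
# The target path below is hypothetical; substitute the directory you chose above.
w = WorkspaceClient()  # reads host and token from your Databricks config or environment

with open("install_libcxx.sh", "rb") as f:
    w.workspace.import_(
        path="/Users/you@example.com/init-scripts/install_libcxx.sh",
        content=base64.b64encode(f.read()).decode("utf-8"),
        format=ImportFormat.AUTO,
        overwrite=True,
    )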

The imported install_libcxx.sh file should now be accessible in your workspace directory.

Create a Summary Table of the Data You Will Be Querying

Create a summary table, including a transform, for the data you will be querying. A summary table pre-aggregates the data, reducing query time. Instructions for creating a summary table via the Hydrolix UI and API are on the Summary Tables page.

The general structure of summary transforms and their limitations are explained in this section on creating summary transforms.

Configure the Spark Cluster

Create a Spark cluster in your Databricks workspace with the following configuration:

  • Policy: Unrestricted
  • Access Mode: No Isolation Shared
  • Databricks Runtime Version: 14.3 LTS (Scala 2.12, Spark 3.5.0) or later

In the Databricks UI, the above configuration looks like this:

Configure the remaining settings under Performance, Instance Profile, and Tags to your preference, then continue with the configuration steps below before selecting Create compute.

Set the Hydrolix Spark Connector Parameters

The Hydrolix Spark Connector requires the following configuration parameters. These parameters can be specified in the Databricks UI when creating a cluster. Your Hydrolix cluster username and password, which are included in these parameters, should be stored in the Databricks Secrets Manager.

  • spark.sql.catalog.hydrolix.jdbc_url
    Value: jdbc:clickhouse:https://{hdx-cluster}.hydrolix.dev:8088?ssl=true
    Description: ClickHouse JDBC URL of your Hydrolix cluster
  • spark.sql.catalog.hydrolix.api_url
    Value: https://{hdx-cluster}.hydrolix.dev/config/v1/
    Description: API URL of your Hydrolix cluster; ends with /config/v1/
  • spark.sql.catalog.hydrolix
    Value: io.hydrolix.connectors.spark.SparkTableCatalog
    Description: Fully qualified name of the Scala class to instantiate when the hydrolix catalog is selected
  • spark.sql.catalog.hydrolix.username
    Value: {{secrets/path/to/username}}
    Description: Hydrolix cluster username
  • spark.sql.catalog.hydrolix.password
    Value: {{secrets/path/to/password}}
    Description: Hydrolix cluster password
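The username and password values reference Databricks secrets rather than plaintext credentials. If you have not created those secrets yet, the following is a minimal sketch using the Databricks Python SDK; the scope name hydrolix and the key names username and password are illustrative and should match the secrets/... paths you reference in the Spark config.

from databricks.sdk import WorkspaceClient

# Sketch: store the Hydrolix credentials as Databricks secrets.
# The scope and key names are illustrative; reuse whatever names you choose
# in the secrets/... references in the Spark config below.
w = WorkspaceClient()

w.secrets.create_scope(scope="hydrolix")
w.secrets.put_secret(scope="hydrolix", key="username", string_value="<your Hydrolix username>")
w.secrets.put_secret(scope="hydrolix", key="password", string_value="<your Hydrolix password>")

With that scope, the corresponding configuration references would be {{secrets/hydrolix/username}} and {{secrets/hydrolix/password}}.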

To set these parameters, expand the Advanced Options heading, open the Spark tab, and enter the key/value pairs into the Spark config. Each key should be separated from its value by a single space, as in the following (replace the placeholder values with your Hydrolix cluster's values):

spark.sql.catalog.hydrolix.jdbc_url jdbc:clickhouse:https://{hdx-cluster}.hydrolix.dev:8088?ssl=true
spark.sql.catalog.hydrolix.api_url https://{hdx-cluster}.hydrolix.dev/config/v1/
spark.sql.catalog.hydrolix io.hydrolix.connectors.spark.SparkTableCatalog
spark.sql.catalog.hydrolix.username {{secrets/path/to/username}}
spark.sql.catalog.hydrolix.password {{secrets/path/to/password}}

In the Databricks UI, the above configuration looks like this:


🚧

Required User Permissions

Querying your Hydrolix bucket via Spark requires the same permissions as querying via your cluster (seen here under SQL), plus the additional permission catalog_urls_table scoped to the table being queried. The latter permission ensures the supplied user can use the catalog_urls endpoint, which signs the partitions queried in a table.

Set the JNAME Environment Variable

Enable JDK 11 by setting the JNAME environment variable to zulu11-ca-amd64, as shown in the following image. In the Databricks UI, enter JNAME=zulu11-ca-amd64 in the Environment variables field under the Advanced options > Spark tab. Other JVM implementations may work with the Spark Connector as long as they provide Java 11 or later.

Set the C++ Init Script

In the Init Scripts tab under Advanced options:

  • Set Source to Workspace
  • Select the file icon, navigate to the C++ init script file you uploaded in an earlier step, and select Add

The install_libcxx.sh file should now be visible in the Init Scripts tab, as seen here:

Click the "Create Compute" button to create your Spark cluster.

Upload and Install the Hydrolix Spark Connector

You can obtain the Spark Connector JAR here.

In your Spark Cluster's UI, navigate to the Libraries tab and select Install new. Select the DBFS and JAR options as shown in the following image:

Select Drop JAR here. In the local file manager window that opens, locate the Spark Connector JAR file you downloaded earlier, select it, then select Open.

The file should begin uploading. Wait for the progress bar to complete, the "File upload is in progress" text to disappear, and a green check mark to appear before proceeding.

Select Install. Once installation is complete, restart your cluster. You can now start analyzing your Hydrolix data in Spark.
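As a quick check that the connector and the hydrolix catalog are wired up, you can run something like the following in a notebook cell. This sketch assumes a project named hydro, which is also used in the query examples later in this document; substitute one of your own projects.

# Sketch: confirm the hydrolix catalog is available and can list tables.
# "hydro" is the project used in the examples below; substitute your own project name.
spark.sql("use hydrolix")
spark.sql("show tables in hydro").show()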

(Google Cloud Platform only) Set Hydrolix Cluster Storage Bucket Credential

If you have set up your Hydrolix cluster with a default GCP storage bucket and you would like to query your default bucket, follow the GCP setup instructions to configure a credential with your storage bucket before proceeding with querying.

Querying

After you have configured your cluster, you can use the Hydrolix Spark Connector in a Spark notebook.

To begin using the connector in a Spark notebook, run one of the following commands, depending on the language of the notebook cell:

  • Python or Scala cell: sql("use hydrolix")
  • SQL cell: use hydrolix;

Alternatively, you can prefix each table you want to query from your Hydrolix backend with hydrolix..
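For example, both of the following forms query the same table (hydro.logs, which also appears in the examples below):

# Option 1: select the hydrolix catalog once, then refer to tables as project.table.
spark.sql("use hydrolix")
spark.sql("SELECT count(*) FROM hydro.logs").show()

# Option 2: fully qualify the table with the hydrolix catalog prefix.
spark.sql("SELECT count(*) FROM hydrolix.hydro.logs").show()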

If you will be querying the summary table you created during the Create a Summary Table step above, you must first register it. To do this, run the following line in a Spark shell or in a PySpark session:

io.hydrolix.connectors.spark.HdxUdfRegistry.registerSummaryTable(spark, "{project.summary_table_name}")

The following examples query the hydro.logs table directly rather than via a summary table.

import time

# Time a filtered SELECT against the hydro.logs table.
start = time.time()
df = spark.sql("SELECT app FROM hydrolix.hydro.logs WHERE app in ('query-peer', 'query-head', 'intake-head', 'intake-peer') AND timestamp >= '2023-12-17 12:00:00' AND timestamp < '2024-02-29 12:00:00'")
df.show()
end = time.time()
print(f"HDX select app query took {end - start} seconds")

# Time a COUNT(*) against the same table.
start = time.time()
df = spark.sql("SELECT COUNT(*) FROM hydrolix.hydro.logs WHERE timestamp < '2024-10-18 00:00:00'")
df.show()
end = time.time()
print(f"HDX count query took {end - start} seconds")
%sql

use hydrolix;

SELECT
  DISTINCT(`kubernetes.container_name`),
  `min_timestamp`
FROM 
  hydro.logs
ORDER BY `min_timestamp` DESC
LIMIT 100
%scala
import org.apache.spark.sql.functions.col

sql("use hydrolix");

val logs = spark.sqlContext.table("hydro.logs")

val recent = logs.filter(col("timestamp") > "2023-06-01T00:00:00.000Z")

recent.count()

Troubleshooting

Authentication Error

If you see "access denied" errors from the Hydrolix database when you are making queries, ensure the cluster username and password are correct, and make sure that user has query permissions.

User Permissions

Partitions in a table might be distributed across different storage buckets.
If the user set in your Spark Connector configuration does not have the required permissions to query all of those storage buckets via the ClickHouse JDBC Driver (listed here under SQL), as well as the catalog_urls_table permission for the table being queried, the cluster will be unable to sign partitions from a storage bucket for that table. As a result, the query will fail and return an error response.

Limitations

Read-only

The Hydrolix Spark Connector is currently read-only. ALTER and INSERT statements are not supported.