Full-Text Search
Enable Full-Text Search (FTS) to speed up text search queries on string columns. FTS indexes string data to locate text patterns without scanning entire column values.
When to use Full-Text Search⚓︎
Full-Text Search works best for:
- Log analysis: Search application logs, system logs, or error messages for specific keywords or patterns
- Large text fields: Query string columns with long text values like user agent strings, URLs, or message bodies
- Pattern matching: Find records containing specific words or terms across millions of rows
Without FTS, text search queries scan every value in the column. With FTS enabled, Hydrolix uses the index to identify matching segments and improve query performance.
How Full-Text Search works⚓︎
Full-Text Search splits text into major and minor segments with predefined separators. This segmentation allows Hydrolix to locate matching patterns efficiently.
Example⚓︎
The following log message demonstrates how FTS segments some text:
Full-Text Search identifies the first major segment as 66.249.65.159, which contains these minor segments:
6624965159
Hydrolix indexes these segments, allowing queries to match against the segments rather than scanning the full text.
Enable Full-Text Search⚓︎
Enable Full-Text Search on a column by setting index to true and index_options.fulltext to true in the transform configuration:
| Transform Configuration | |
|---|---|
Hydrolix creates the FTS index during data ingestion.
Query with Full-Text Search⚓︎
Full-Text Search works with multiple query patterns. These operators use the FTS index for better performance.
| Operator | Description | Example |
|---|---|---|
LIKE |
Case-sensitive pattern matching | message LIKE '%error%' |
ILIKE |
Case-insensitive pattern matching | message ILIKE '%ERROR%' |
MATCH |
Regular expression matching | match(message, 'error.*timeout') |
= |
Direct string comparison | message = 'Specific error text' |
| Search Query Example | |
|---|---|
This query searches for the word error in the message column within the last hour.
Additional examples⚓︎
Search user agent strings (case-sensitive):
| User Agent Search with LIKE | |
|---|---|
Case-insensitive search:
| Case-Insensitive Search with ILIKE | |
|---|---|
Find specific error patterns:
| Error Pattern Search | |
|---|---|
Verify index usage⚓︎
Confirm that the query uses an index by enabling query debug. When a column with Full-Text Search enabled appears in the indexes_used field, it indicates the FTS index was used for that column:
| Query Debug Output | |
|---|---|
Separators⚓︎
Full-Text Search uses two types of separators to segment text. These separators are predefined and can't be modified.
Major separators⚓︎
Major separators create text segments:
[]<>(){}|!;,'"*\n\r\s\t&?+
Minor separators⚓︎
Minor separators create sub-segments within major segments:
/:=@.-$#%\_
Recommendations⚓︎
- Match punctuation literally: Use literal punctuation symbols rather than wildcards for better performance.
- Prefer trailing over leading wildcards when possible: Patterns like
error%perform significantly better than leading or middle wildcards (%errororer%or), which can reduce performance and match unintended patterns. For example,er%ormatches both "error" and "herbivore." When searching for terms that may appear anywhere in the text, such as log messages or URLs, leading wildcards (%error%) are often necessary despite the performance trade-off.
Considerations⚓︎
- Storage overhead: FTS indexes require storage space. Enable FTS only on columns searched regularly with text pattern matching. To check storage usage, query the table catalog:
SELECT sum(manifest_size + data_size + index_size) FROM project_name.\table_name#.catalog`;` - Ingestion-time indexing: Hydrolix creates FTS indexes during data ingestion. Adding FTS to existing data requires reingestion for indexing to work.
- ALTER TABLE behavior: When you run
ALTER TABLEon a column, Hydrolix rewrites the partition with the altered data and generates a new index file. This means FTS indexes are automatically updated when columns are altered. - Supported operations: FTS optimizes
LIKE,ILIKE,MATCH, and equality (=) operations. - Index structure: FTS effectiveness depends on separator alignment with the data structure. Text with clear word boundaries and standard punctuation works best.
- position() vs pattern matching: For columns without FTS, the
position()function is more efficient thanLIKEfor substring searches. With FTS enabled, pattern matching queries (LIKE,ILIKE,MATCH) perform well and offer capabilities thatposition()doesn't support.
See also⚓︎
- Transform structure - Configure FTS in output columns
- Writing efficient queries - Query optimization best practices
- Hydrologs query examples - Log searching patterns
- Query troubleshooting - Diagnose query issues