Full Text Search

Hydrolix supports full text search.
Text is split into words on the following standard delimiters:
[ ] < > ( ) { } | ! ; , ' " * \n \r \s \t & ? + / : = @ . - $ # % \ _

For example, a column named message containing: Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. will have every word indexed and made searchable. At query time, the indexed words are searched using LIKE.
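
A rough way to see which words would end up in the index is to split a value on the delimiters listed above. The Python sketch below does only that. The names DELIMITERS and tokenize are illustrative, and this is an approximation of the splitting rule, not Hydrolix's actual indexing code.

```python
import re

# Delimiters as listed in this page; \s (any whitespace) also covers \n, \r and \t.
DELIMITERS = "[]<>(){}|!;,'\"*&?+/:=@.-$#%\\_"

# One or more consecutive delimiters separate two words.
_SPLIT = re.compile("[" + re.escape(DELIMITERS) + r"\s]+")


def tokenize(text):
    """Split text into words on the delimiter set, dropping empty pieces."""
    return [word for word in _SPLIT.split(text) if word]


words = tokenize("Lorem ipsum dolor sit amet, consectetur adipiscing elit")
print(words)
```

Because _ and = are delimiters, a value such as user_id=42 would be split into user, id and 42 under this rule.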

To enable full text search, define the column as follows in the transform:

{
    "name": "message",
    "datatype": {
        "type":"string",
        "index": false,
        "fulltext": true
    }
}

Here the column message is a string with full text search enabled.
No specific analyzer or stemming is supported; currently, each whole word is indexed as-is.

Full Text Search requires indexing to be disabled and a datatype of string.
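
The requirement above can be checked before a transform is submitted. The following Python sketch inspects a single column definition using only the fields shown in this page (type, index, fulltext). The function name fulltext_problems is hypothetical, and the check is deliberately conservative: it assumes index must be set to false explicitly, since the default is not stated here.

```python
def fulltext_problems(column):
    """Return the reasons a column definition cannot use full text search."""
    datatype = column.get("datatype", {})
    problems = []
    if datatype.get("type") != "string":
        problems.append("type must be string")
    # Assumption: a missing "index" key is treated as a problem.
    if datatype.get("index") is not False:
        problems.append("index must be false")
    if datatype.get("fulltext") is not True:
        problems.append("fulltext must be true")
    return problems


# The column definition from the transform example above.
message_column = {
    "name": "message",
    "datatype": {"type": "string", "index": False, "fulltext": True},
}

print(fulltext_problems(message_column))
```

An empty list means the definition meets the stated requirements.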

By default, Hydrolix uses LIKE to search the full text index:

SELECT message
FROM project.table
WHERE message LIKE '%error%'
AND timestamp < now()
AND timestamp > (now() - INTERVAL 60 MINUTE)
ORDER BY timestamp DESC
LIMIT 50
SETTINGS hdx_query_debug=true

In this example, we search for the word error in the message column over the last hour.

With query debugging enabled (hdx_query_debug=true), the returned query stats show that the index was used for this query:

X-Hdx-Query-Stats: exec_time=107 rows_read=0 bytes_read=0 num_partitions=58 num_peers=3 query_attempts=1 memory_usage=9491822
index_stats=[{"project.table":{"columns_read":["message","timestamp"],"indexes_used":["message","timestamp"],"shard_key_values_used":[]}}]
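
The stats can also be checked programmatically. The Python sketch below assumes the format matches the sample output above: whitespace-separated key=value pairs followed by an index_stats JSON array. The helpers parse_query_stats and column_used_index are hypothetical names, not part of any Hydrolix client.

```python
import json

HEADER_PREFIX = "X-Hdx-Query-Stats:"


def parse_query_stats(stats):
    """Parse the query stats text into a dict (format assumed from the sample)."""
    stats = stats.strip()
    if stats.startswith(HEADER_PREFIX):
        stats = stats[len(HEADER_PREFIX):]
    head, found, index_json = stats.partition("index_stats=")
    parsed = {}
    for pair in head.split():
        key, _, value = pair.partition("=")
        parsed[key] = int(value) if value.isdigit() else value
    parsed["index_stats"] = json.loads(index_json) if found else []
    return parsed


def column_used_index(parsed, table, column):
    """True if the column appears in indexes_used for the given table."""
    return any(
        column in entry.get(table, {}).get("indexes_used", [])
        for entry in parsed["index_stats"]
    )


# The sample output shown above.
SAMPLE = (
    "X-Hdx-Query-Stats: exec_time=107 rows_read=0 bytes_read=0 "
    "num_partitions=58 num_peers=3 query_attempts=1 memory_usage=9491822\n"
    'index_stats=[{"project.table":{"columns_read":["message","timestamp"],'
    '"indexes_used":["message","timestamp"],"shard_key_values_used":[]}}]'
)

parsed = parse_query_stats(SAMPLE)
print(column_used_index(parsed, "project.table", "message"))
```

If message is missing from indexes_used, the column definition in the transform is the first thing to check.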

With Full Text Search enabled, filtering and searching for words, as split on the standard delimiters, is much faster.

