General Errors

Alerts

Alert	Check	Solution
Disk Usage Alert	Fires when root volume disk utilisation is above the set threshold. Hydrolix uses active disk management, with log and cache eviction managed on-box, so this typically resolves quickly on its own.	If the alert does not clear, inspect the log files. For stateless components, which are the majority, an alternative is to power off the box over SSH (`sudo poweroff`) and allow autoscaling to replace it. Please report this to Hydrolix if the problem persists.
RAM Usage Alert	RAM usage is continuously high. For Query, RAM usage is typically an indication of the queried footprint.	This can typically be managed with configuration changes or capacity increases of components. For Stream, Batch, Kafka and Merge, a review of the configuration may be required to either reduce the amount of data written at a time or adjust the timing of writes to storage; a capacity increase may also be required. For Query, additional capacity at the Head (where larger query responses are expected) or the peers can resolve the issue. Query optimisation can also help significantly: making more use of predicates in a query typically selects fewer partitions, which reduces the RAM footprint required and improves performance.
CPU Usage Monitoring Alert	CPU usage is continuously high.	This can typically be managed with capacity increases of components.
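
As a rough illustration of the disk usage check, the sketch below compares root volume utilisation against a threshold and, when it is exceeded, lists the largest entries in a log directory. The threshold of 85 and the `/var/log` path are assumptions for illustration, not Hydrolix defaults, and the function names are hypothetical.

```shell
#!/bin/sh
# Assumed values for illustration; neither is a Hydrolix default.
THRESHOLD="${THRESHOLD:-85}"
LOG_DIR="${LOG_DIR:-/var/log}"

# Read `df -P` output on stdin and print the used percentage of the
# first listed filesystem as a bare integer.
usage_pct() {
  awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Succeed when the percentage in $1 is above the threshold in $2.
over_threshold() {
  [ "$1" -gt "$2" ]
}

pct="$(df -P / | usage_pct)"
if over_threshold "${pct:-0}" "$THRESHOLD"; then
  echo "root volume at ${pct}% (threshold ${THRESHOLD}%); largest entries in ${LOG_DIR}:"
  du -sk "${LOG_DIR}"/* 2>/dev/null | sort -rn | head -n 5
else
  echo "root volume at ${pct:-0}%, at or below ${THRESHOLD}%"
fi
```

Running this over SSH before powering off a box shows whether log files are the likely cause of the alert.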

Scaling

Issue: Unable to Scale a Component

Check	Fix
Check the operator status: `kubectl logs -l app=operator --tail=100` The log should end with no errors and include: `Resuming is processed: 1 succeeded; 0 failed.`	Restart the Operator: `kubectl rollout restart deployment operator`
Check additional nodes can be deployed: `kubectl describe nodes`
Check whether `hydrolixcluster.yaml` has the `scale_off` setting applied: `scale_off: true`	Disable `scale_off` in `hydrolixcluster.yaml`: `kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml` Set `scale_off: false`. `kubectl apply -f hydrolixcluster.yaml`
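
The checks in the table above could be scripted roughly as follows. This is a sketch, not an official tool: the function names are hypothetical, the success line is the one quoted in the table, and the `scale_off` match assumes the key appears on its own line in the exported YAML.

```shell
#!/bin/sh
# Succeed when the operator log on stdin contains the success line
# quoted in the table above.
operator_resumed_ok() {
  grep -q -F 'Resuming is processed: 1 succeeded; 0 failed.'
}

# Succeed when the YAML on stdin sets scale_off to true.
# Assumes the key sits on its own line, e.g. "  scale_off: true".
scale_off_enabled() {
  grep -q -E '^[[:space:]]*scale_off:[[:space:]]*true[[:space:]]*$'
}

# Only talk to the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  if kubectl logs -l app=operator --tail=100 2>/dev/null | operator_resumed_ok; then
    echo "operator: resume succeeded"
  else
    echo "operator: success line not found; consider restarting the operator"
  fi

  if kubectl get hydrolixcluster -o yaml 2>/dev/null | scale_off_enabled; then
    echo "scale_off is true; set it to false and re-apply"
  else
    echo "scale_off is not set to true"
  fi
fi
```

Both helpers read from stdin, so they can also be run against a saved log or an exported `hydrolixcluster.yaml` without cluster access.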

For more information, see the documentation on scale profiles.

Issue: Unable to Connect to the Cluster

Check	Fix
Check that the requesting IP has access.	Add the IP to the allow list. `kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml` Add the IPs. `kubectl apply -f hydrolixcluster.yaml`
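
Before editing the allow list, it can help to confirm whether the address is already present in the exported configuration. The sketch below does a literal search only: the helper name and the example address are hypothetical, and it will not recognise an address that is covered by a wider CIDR range.

```shell
#!/bin/sh
# Succeed when the literal address $1 appears in the file $2.
# -F treats the address as a fixed string; -w avoids matching
# 10.0.0.1 inside 10.0.0.10.
ip_listed() {
  grep -q -F -w -- "$1" "$2"
}

# Usage: sh check_ip.sh 203.0.113.7 hydrolixcluster.yaml
if [ "$#" -eq 2 ]; then
  if ip_listed "$1" "$2"; then
    echo "$1 is present in $2"
  else
    echo "$1 not found in $2; add it and run: kubectl apply -f $2"
  fi
fi
```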

For more information, see Configure IP Access.
