General Errors

Alerts

Alert: Disk Usage Alert
Check and solution: This alert fires when root volume disk utilisation exceeds the set threshold. Hydrolix manages disk space actively, with log and cache eviction handled on-box, so the condition typically resolves itself quickly. If it does not, inspect the log files. Because the majority of components are stateless, an alternative for stateless components is to power off the box over SSH (sudo poweroff) and let autoscaling replace it. If the problem is continuous, report it to Hydrolix.

Alert: RAM Usage Alert
Check and solution: Continuously high RAM usage can typically be managed with configuration changes or capacity increases for the affected components.
For Stream, Batch, Kafka and Merge: a review of the configuration may be required, either to reduce the amount of data written at a time or to adjust the timing of writes to storage. A capacity increase may also be required.
For Query: RAM usage is typically an indication of the queried footprint. Additional capacity at the head (where larger query responses are expected) or at the peers can resolve the issue. Query optimisation also helps significantly: using more predicates in a query forces better, and typically smaller, partition selection, which reduces the RAM footprint required and improves performance.

Alert: CPU Usage Monitoring Alert
Check and solution: Continuously high CPU usage can typically be managed with capacity increases for the affected components.
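
To confirm a disk usage alert by hand on the box, the root volume utilisation can be read from df and compared with the threshold. The following is a minimal sketch, not the alerting logic Hydrolix itself uses: the function names usage_pct and over_threshold are hypothetical, and the default of 85 is only a placeholder for whatever threshold your alert is set to.

```shell
#!/bin/sh
# Print the use percentage (without the % sign) from `df -P` output on stdin.
# `df -P` prints one header line, then one line per filesystem, with the
# capacity in the fifth column.
usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Exit 0 when the given percentage is above the threshold.
# $1: usage percentage, $2: threshold (placeholder default of 85, an assumption).
over_threshold() {
  [ "$1" -gt "${2:-85}" ]
}
```

A typical call would be pct=$(df -P / | usage_pct), followed by over_threshold "$pct" to decide whether the log files need inspecting.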

Scaling

Issue: Unable to Scale a Component

Check: Check the operator status:
kubectl logs -l app=operator --tail=100
The log should end with no errors and include:
Resuming is processed: 1 succeeded; 0 failed.
Fix: Restart the operator:
kubectl rollout restart deployment operator

Check: Check that additional nodes can be deployed:
kubectl describe nodes

Check: Check whether hydrolixcluster.yaml has the scale_off setting applied:
scale_off: true
Fix: Change scale_off: true in hydrolixcluster.yaml:
kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml
Set scale_off: false.
kubectl apply -f hydrolixcluster.yaml

For more information, see the documentation on scale profiles.
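
The two text checks in this section (the operator log line and the scale_off setting) can be scripted. This is a rough sketch under stated assumptions: the function names are hypothetical, the log pattern accepts any number of successes as long as the failure count is 0, and scale_off is assumed to sit on its own line in the saved YAML.

```shell
#!/bin/sh
# Exit 0 when operator log text on stdin contains a resume line with 0 failures,
# e.g. "Resuming is processed: 1 succeeded; 0 failed."
operator_log_ok() {
  grep -Eq 'Resuming is processed: [0-9]+ succeeded; 0 failed'
}

# Exit 0 when the saved cluster spec ($1) still has scale_off set to true.
scale_off_enabled() {
  grep -Eq '^[[:space:]]*scale_off:[[:space:]]*true[[:space:]]*$' "$1"
}
```

For example, kubectl logs -l app=operator --tail=100 | operator_log_ok checks the operator, and scale_off_enabled hydrolixcluster.yaml checks the file written by kubectl get hydrolixcluster -o yaml. Neither replaces reading the log for other errors.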

Issue: Unable to Connect to the Cluster

Check: Check that the requesting IP has access.
Fix: Add the IP to the allow list:
kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml
Add the IPs.
kubectl apply -f hydrolixcluster.yaml

For more information, see Configure IP Access.
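
Before editing, it can help to check whether an address is already present in the saved file. The sketch below is only an illustration: ip_listed is a hypothetical helper, it assumes allow-list entries are stored one per line as YAML list items, and it deliberately does not assume the name of the allow-list field (see Configure IP Access for that).

```shell
#!/bin/sh
# Exit 0 when the file ($1) has a YAML list item exactly equal to the entry ($2),
# e.g. 203.0.113.7/32. Quotes and surrounding whitespace are ignored.
ip_listed() {
  awk -v want="$2" -v sq="'" '
    {
      line = $0
      sub(/^[[:space:]]*-[[:space:]]*/, "", line)  # drop the list marker
      gsub(/"/, "", line)                          # drop double quotes
      gsub(sq, "", line)                           # drop single quotes
      gsub(/[[:space:]]/, "", line)                # drop remaining whitespace
      if (line == want) found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

For example, ip_listed hydrolixcluster.yaml 203.0.113.7/32 succeeds only on an exact match, so a narrower or wider CIDR range covering the same address is not detected.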