General Errors
Alerts
Alert | Check | Solution |
---|---|---|
Disk Usage Alert | Fires when root volume disk utilisation is above the set threshold. Hydrolix uses active disk management, with log and cache eviction managed on-box, so the condition typically resolves itself quickly. If it does not, inspect the log files. Because the majority of components are stateless, an alternative is to power off the box over SSH (`sudo poweroff`) and allow autoscaling to replace it. Report this to Hydrolix if the problem is continuous. | |
RAM Usage Alert | Continuously high RAM usage can typically be managed with configuration changes or capacity increases for the affected components. For Stream, Batch, Kafka and Merge, a review of the configuration may be required to either reduce the amount of data being written at a time or to increase the timing of writes to storage. This may also require a capacity increase. For Query, RAM usage is typically an indication of the queried footprint. Additional capacity at the head (where larger query responses are expected) or at the peers can resolve the issue. Query optimisation can also help significantly: using more predicates in a query forces better, and typically smaller, partition selection, which reduces the RAM footprint required and improves performance. | |
CPU Usage Monitoring Alert | Continuously high CPU usage can typically be managed with capacity increases for the affected components. | |
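When a disk usage alert does not clear on its own, it can help to confirm the current root volume utilisation on the box before inspecting logs or powering it off. The following is a minimal sketch, not a Hydrolix tool: the function names and the `THRESHOLD_PCT` value are hypothetical, and the threshold should be replaced with whatever your alert is actually configured to use.

```shell
# Sketch only: THRESHOLD_PCT is a hypothetical value, not a Hydrolix default.
THRESHOLD_PCT=85

# Print the used percentage of the filesystem that holds the given path
# (defaults to /), as an integer without the % sign.
root_usage_pct() {
  # df -P forces one line per filesystem; field 5 is the capacity, e.g. "42%".
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is strictly above the threshold.
usage_above_threshold() {
  [ "$1" -gt "$2" ]
}

# Example usage:
#   if usage_above_threshold "$(root_usage_pct /)" "$THRESHOLD_PCT"; then
#     echo "root volume above ${THRESHOLD_PCT}%"
#   fi
```

Keeping the comparison separate from the `df` call makes it easy to reuse the same check against a value reported by a monitoring system.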
Scaling
Issue: Unable to Scale a Component
Check | Fix |
---|---|
Check the operator status: `kubectl logs -l app=operator --tail=100`. The log should end with no errors and include: `Resuming is processed: 1 succeeded; 0 failed.` | Restart the operator: `kubectl rollout restart deployment operator` |
Check additional nodes can be deployed: `kubectl describe nodes` | |
Check whether hydrolixcluster.yaml has the scale_off setting applied: `scale_off: true` | Export the configuration: `kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml`. Remove `scale_off: true` or set it to `scale_off: false`, then apply the change: `kubectl apply -f hydrolixcluster.yaml` |
For more information, see the documentation on scale profiles.
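The operator log check above can be scripted. The sketch below assumes only the log line quoted in the table; the function name `operator_log_ok` is hypothetical, and allowing any number of successes (rather than exactly 1) is an assumption about how the line varies.

```shell
# Succeed (exit status 0) when the log text on stdin contains the
# "Resuming is processed" line with zero failures.
# Assumption: the succeeded count may be any non-negative integer.
operator_log_ok() {
  grep -Eq 'Resuming is processed: [0-9]+ succeeded; 0 failed'
}

# Example usage:
#   if kubectl logs -l app=operator --tail=100 | operator_log_ok; then
#     echo "operator resumed cleanly"
#   else
#     echo "success line not found; inspect the operator log"
#   fi
```

This checks only for the presence of the success line. It does not detect other errors in the log, so a manual read of the last lines is still worthwhile.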
Issue: Unable to Connect to the Cluster
Check | Fix |
---|---|
Check the requesting IP has access. | Add the IP to the allow list. Export the configuration: `kubectl get hydrolixcluster -o yaml > hydrolixcluster.yaml`. Add the IPs, then apply the change: `kubectl apply -f hydrolixcluster.yaml` |
For more information, see Configure IP Access.