Server Monitoring at Scale – Lessons from HPE Cloudera

As part of a Managed Services & Operational project at PT. Bringin Inti Teknologi (bit.),
I worked on monitoring 112 HPE servers that hosted a Cloudera environment.
The monitoring was performed using HPE OneView, along with supporting tools like Zabbix.
Key Responsibilities
During this project, my main tasks included:
- Basic Server Setup – Installing and configuring Red Hat Enterprise Linux (RHEL) as the primary operating system.
- Monitoring & Health Check – Setting up monitoring for 112 HPE servers using HPE OneView, ensuring that CPU, memory, storage, and network utilization were tracked in real time.
- Alerting & Event Management – Configuring alerts and log collection to quickly identify failures, bottlenecks, or hardware degradation.
- Integration with Monitoring Tools – Complementing OneView data with Zabbix dashboards for better visualization and trend analysis.
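
The health checks and alerts listed above largely come down to comparing utilization figures against thresholds. Here is a minimal sketch of that idea in Python. The threshold values, metric names, and the `evaluate` helper are hypothetical and chosen only for illustration; they are not the settings or the OneView/Zabbix configuration used on the project.

```python
from dataclasses import dataclass

# Hypothetical (warning, critical) thresholds in percent.
# Illustrative values only, not the project's real settings.
THRESHOLDS = {
    "cpu": (80.0, 95.0),
    "memory": (85.0, 95.0),
    "storage": (80.0, 90.0),
    "network": (70.0, 90.0),
}


@dataclass
class Alert:
    host: str
    metric: str
    value: float
    severity: str  # "warning" or "critical"


def evaluate(host, utilization):
    """Return an Alert for each metric at or above its warning threshold."""
    alerts = []
    for metric, (warn, crit) in THRESHOLDS.items():
        value = utilization.get(metric)
        if value is None:
            continue  # metric not reported for this host
        if value >= crit:
            alerts.append(Alert(host, metric, value, "critical"))
        elif value >= warn:
            alerts.append(Alert(host, metric, value, "warning"))
    return alerts


if __name__ == "__main__":
    # Hypothetical sample reading for one server.
    for alert in evaluate("node-001", {"cpu": 97.0, "memory": 86.0, "storage": 40.0}):
        print(alert)
```

Keeping the thresholds in one table means the same policy can be applied to every server, rather than being tuned by hand per host.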
Tools & Technologies
- HPE OneView → Infrastructure and hardware monitoring.
- Red Hat Enterprise Linux (RHEL) → Operating system for most of the servers.
- Zabbix → Visualization, metrics collection, and alerting.
- Cloudera → The main platform running on top of these servers.
Key Takeaways
Working on a large-scale monitoring environment taught me several important lessons:
- Scalability matters – Managing more than 100 servers requires automation and consistent monitoring policies.
- Clear documentation – With multiple engineers on the project, documenting monitoring procedures was essential.
- Proactive alerting – Early detection of hardware issues prevents downtime and reduces business impact.
- Cross-tool integration – Combining HPE OneView with Zabbix provided both low-level hardware visibility and high-level metrics visualization.
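
To make the cross-tool point concrete, the sketch below merges a per-host hardware status (the kind of view OneView provides) with a per-host OS-level status (the kind of view Zabbix provides) and keeps the worse of the two. The record shapes, status names, and the `combine` helper are assumptions made for this example. Real OneView and Zabbix payloads carry many more fields, and this is not how the project's integration was actually built.

```python
# Hypothetical severity ordering. "Unknown" ranks above "OK" so that a host
# missing from one source stays visible instead of looking healthy.
SEVERITY_RANK = {"OK": 0, "Unknown": 1, "Warning": 2, "Critical": 3}


def worst(*statuses):
    """Return the most severe status; unrecognized values rank as Unknown."""
    return max(statuses, key=lambda s: SEVERITY_RANK.get(s, SEVERITY_RANK["Unknown"]))


def combine(hardware_status, os_status):
    """Merge two host -> status mappings into one per-host view.

    hardware_status: status from the hardware layer (e.g. exported from OneView)
    os_status: status derived from OS-level metrics (e.g. exported from Zabbix)
    """
    combined = {}
    for host in sorted(set(hardware_status) | set(os_status)):
        hw = hardware_status.get(host, "Unknown")
        os_level = os_status.get(host, "Unknown")
        combined[host] = {
            "hardware": hw,
            "os": os_level,
            "overall": worst(hw, os_level),
        }
    return combined


if __name__ == "__main__":
    # Hypothetical sample data for three servers.
    hardware = {"node-001": "OK", "node-002": "Critical"}
    os_metrics = {"node-001": "Warning", "node-003": "OK"}
    for host, view in combine(hardware, os_metrics).items():
        print(host, view)
```

Taking the worse of the two statuses is a deliberately conservative choice: a server with healthy OS metrics but a degraded hardware component is still flagged.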
This experience strengthened my skills in infrastructure monitoring, incident response, and system administration.
It also gave me a deeper understanding of how large-scale enterprise environments maintain reliability and performance through proactive monitoring.