There are many ways to monitor computer systems and their services, but the principles remain the same. Adequate monitoring and alerting is the only way to ensure that we know about a problem before our customers do. Whether it relies on SNMP traps or on agents that run on each machine and are tailored to the services it hosts, monitoring configuration is an essential step in any production deployment of OpenStack. This section introduces some tools that can be used to monitor services within our OpenStack environment.
In researching and developing the monitoring sections for the Grizzly release, we found a wide and varied range of available tools. We therefore cover Nagios in this section and will continue to update openstackcookbook.com with information on other available tools.
Nagios is a mature, robust, open-source network and system monitoring application. It consists of a Nagios server and a number of plugins, or checks. Plugins can run either locally on the Nagios server or, as we will be installing them, on remote hosts through NRPE (Nagios Remote Plugin Executor). NRPE allows us to run agent-like checks on remote systems.
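To make the idea of a plugin concrete: a Nagios check is simply a program that prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). The following is a minimal sketch of that convention in Python. It is not one of the stock plugins, and the names `classify` and `check_disk`, as well as the default thresholds, are hypothetical and chosen only for illustration.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify(used_pct, warn_pct, crit_pct):
    """Map a usage percentage to a Nagios exit code (hypothetical helper)."""
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK


def check_disk(path="/", warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself could not run, which is UNKNOWN rather than CRITICAL.
        return UNKNOWN, f"DISK UNKNOWN - {path}: {exc}"
    used_pct = 100.0 * usage.used / usage.total
    code = classify(used_pct, warn_pct, crit_pct)
    return code, f"DISK {LABELS[code]} - {path} is {used_pct:.1f}% full"
```

A real plugin would print the status line and then call `sys.exit()` with the returned code, so that Nagios (or NRPE on its behalf) can read both.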
We will be configuring Nagios on a server with the IP address 172.16.0.212 that has access to the hosts in our OpenStack environment. Ensure this server has enough RAM, disk, and CPU capacity for the environment you are running. As a bare minimum in a test environment, it is possible to run this on a VM with 1 vCPU, 1.5 GB of RAM, and 8 GB of disk space.
To set up Nagios with OpenStack, carry out the following steps:
The Nagios server both provides the web interface and monitors the services. Before we can start monitoring with Nagios, we must install it as follows:
sudo apt-get update
sudo apt-get -y install nagios3 nagios-nrpe-plugin
Tip
If you are installing Nagios in an automated, non-interactive way, you may need to run sudo apt-get -f install to configure Postfix.
With the Nagios server installed, we can now configure NRPE on each node that we want to monitor:
sudo apt-get update
sudo apt-get -y install nagios-nrpe-server nagios-plugins-standard
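Once installed, edit /etc/nagios/nrpe.cfg on each node and set allowed_hosts to the IP address of the Nagios server:
allowed_hosts=172.16.0.212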
On the controller node, add the following to /etc/nagios/nrpe.cfg (each command must be on a single line):
command[check_keystone_api]=/usr/lib/nagios/plugins/check_http localhost -p 5000 -R application/vnd.openstack.identity
command[check_keystone_procs]=/usr/lib/nagios/plugins/check_procs -C keystone-all -u keystone -c 1:1
command[check_glance_api_procs]=/usr/lib/nagios/plugins/check_procs -C glance-api -u glance -c 1:4
command[check_glance_registry]=/usr/lib/nagios/plugins/check_procs -C glance-registry -u glance -c 1:2
# As written, this check probes port 5000 (the Keystone endpoint). The Nova API
# listens on port 8774 by default, so adjust the port and pattern for your deployment.
command[check_nova_api]=/usr/lib/nagios/plugins/check_http localhost -p 5000 -R application/vnd.openstack.identity
On the compute node, add the following to /etc/nagios/nrpe.cfg:
command[check_nova_metadata]=/usr/lib/nagios/plugins/check_procs -C nova-api-metadata -u nova -c 1:4
command[check_nova_compute]=/usr/lib/nagios/plugins/check_procs -C nova-compute -u nova -c 1:4
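The `-c 1:1` and `-c 1:4` arguments in the check_procs commands above are critical ranges: check_procs counts the processes matching the given command name (`-C`) and owner (`-u`), and reports CRITICAL when that count falls outside the range. The sketch below illustrates only this plain `min:max` form. The function names are hypothetical, and the other range forms that check_procs accepts are deliberately not handled.

```python
def outside_range(value, spec):
    """Return True if `value` lies outside a 'min:max' range such as '1:4'.

    Simplified for illustration: only the plain 'min:max' form with
    min <= max is accepted; anything else raises ValueError.
    """
    low_text, sep, high_text = spec.partition(":")
    if not sep or not low_text or not high_text:
        raise ValueError(f"expected 'min:max', got {spec!r}")
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError(f"min must not exceed max in {spec!r}")
    return value < low or value > high


def procs_status(count, critical_range):
    """Return (exit_code, status_line) for a process count, check_procs style."""
    if outside_range(count, critical_range):
        return 2, f"PROCS CRITICAL: {count} processes"
    return 0, f"PROCS OK: {count} processes"
```

With `1:1`, exactly one keystone-all process is healthy and zero or two are critical; with `1:4`, anything from one to four nova-compute processes passes.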
To check Swift, you will need to install the check_swift plugin in the /usr/lib/nagios/plugins folder of your Swift servers. At the time of writing, this plugin can be downloaded from: https://exchange.nagios.org/directory/Plugins/Clustering-and-High-2DAvailability/check_swift/details
This can be set up as follows in the Swift server's /etc/nagios/nrpe.cfg file:
command[check_swift_api]=/usr/lib/nagios/plugins/check_swift -A https://172.16.0.200:5000/v2.0/ -U swift -K swift -V 2 -c nagios
Beyond the services above, you can add further OpenStack-related checks to your environment using similar check_procs commands and the NRPE server on your various nodes. Additionally, while outside the scope of this book, there is a robust set of Chef cookbooks for Nagios, so you can integrate monitoring as you scale out your OpenStack build.
Once these lines are in each node's nrpe.cfg file, restart the nagios-nrpe-server service to pick up the change:
sudo service nagios-nrpe-server restart
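After editing nrpe.cfg, it can be useful to confirm which checks are actually defined, since a `command[...]` entry that has been wrapped across two lines will not carry its full argument list. The following is a small sketch that lists the command definitions found in the text of an nrpe.cfg file. The function name `parse_nrpe_commands` is hypothetical, and the parser is a simplification that only recognizes `command[name]=...` lines.

```python
import re

# Matches lines of the form: command[check_name]=/path/to/plugin args...
COMMAND_LINE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<cmdline>.+)$")


def parse_nrpe_commands(text):
    """Return {check_name: command_line} for every command[...] line in `text`."""
    commands = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = COMMAND_LINE.match(line)
        if match:
            commands[match.group("name")] = match.group("cmdline").strip()
    return commands
```

For example, reading the file with `open("/etc/nagios/nrpe.cfg").read()` and printing the resulting dictionary shows each check name next to the exact command line NRPE will run, which makes a truncated entry easy to spot.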
Now that we have configured NRPE checks on each of our OpenStack nodes, we need to tell the Nagios server which hosts it should monitor with the NRPE plugin. To do this, we create a file for each node in /etc/nagios3/conf.d/ on the Nagios server.
The following is an example file for the controller node (/etc/nagios3/conf.d/cookbook-controller.cfg):
define host {
        use                     generic-host
        host_name               controller
        alias                   controller
        address                 172.16.0.200
}
define service {
        host_name               controller
        check_command           check_nrpe_1arg!check_keystone_api
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-keystone
}
define service {
        host_name               controller
        check_command           check_nrpe_1arg!check_keystone_procs
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-keystone_procs
}
define service {
        host_name               controller
        check_command           check_nrpe_1arg!check_glance_api_procs
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-glance_api_procs
}
define service {
        host_name               controller
        check_command           check_nrpe_1arg!check_glance_registry
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-glance_registry
}
define service {
        host_name               controller
        check_command           check_nrpe_1arg!check_nova_api
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-nova_api
}
The following is an example file for the compute node (/etc/nagios3/conf.d/cookbook-compute.cfg):
define host {
        use                     generic-host
        host_name               compute
        alias                   compute
        address                 172.16.0.201
}
define service {
        host_name               compute
        check_command           check_nrpe_1arg!check_nova_compute
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-nova_compute
}
define service {
        host_name               compute
        check_command           check_nrpe_1arg!check_nova_metadata
        use                     generic-service
        notification_period     24x7
        service_description     cookbook-nova-metadata
}
If you build checks for the remaining OpenStack services, you will need to configure them in the same way.
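Because every service definition follows the same pattern, the files for additional checks can be generated rather than typed by hand. The following is a minimal sketch, assuming the same `generic-service` template and `check_nrpe_1arg` command used in the examples above. The function names, and the check names in the usage note, are hypothetical; each check still needs a matching `command[...]` entry in the node's nrpe.cfg.

```python
# Template mirroring the hand-written service definitions in this section.
SERVICE_TEMPLATE = """\
define service {{
        host_name               {host}
        check_command           check_nrpe_1arg!{nrpe_command}
        use                     generic-service
        notification_period     24x7
        service_description     {description}
}}
"""


def render_service(host, nrpe_command, description):
    """Render one Nagios service definition for an NRPE check."""
    return SERVICE_TEMPLATE.format(
        host=host, nrpe_command=nrpe_command, description=description
    )


def render_services(host, checks):
    """Render definitions for a list of (nrpe_command, description) pairs."""
    return "".join(render_service(host, cmd, desc) for cmd, desc in checks)
```

For example, `render_services("controller", [("check_cinder_volume", "cookbook-cinder_volume")])` returns text that could be saved under /etc/nagios3/conf.d/ on the Nagios server, where `check_cinder_volume` is an illustrative name rather than a check defined in this section.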
Nagios is an excellent open-source, networked resource-monitoring tool that can help us analyze resource trends and identify problems in our OpenStack environment. Configuration is straightforward, and the out-of-the-box configuration already provides a set of monitoring checks. By adding a few extra configuration options and plugins, we can extend this to monitor our OpenStack environment.
Once Nagios has been installed, we have to do a few things to configure it to produce graphed statistics for our environment:
Ravindra Savaram is a Technical Lead at Mindmajix.com. His passion lies in writing articles on the most popular IT platforms including Machine learning, DevOps, Data Science, Artificial Intelligence, RPA, Deep Learning, and so on. You can stay up to date on all these technologies by following him on LinkedIn and Twitter.