I will not present Graphite here; if you are reading this, I assume you already have a Graphite instance up and running. If not, getting a usable instance takes less than an hour.
Hadoop uses the metrics2 framework, which allows multiple metrics output plugins to be used in parallel, supports dynamic reconfiguration of metrics plugins, provides metrics filtering, and allows all metrics to be exported via JMX.
Those metrics can very easily be exported to Graphite and then sliced and diced to your heart’s content.
You only need to modify the file hadoop-metrics2.properties by adding the following snippet:
# Sampling period
*.period=10

# Graphite sink class
*.sink.graphite.class=org.apache.hadoop.metrics2.sink.GraphiteSink

# Location of your graphite instance
*.sink.graphite.server_host=10.x.x.x
*.sink.graphite.server_port=2003

# Define for each metric group (* in *.prefix) how it should be named
# in graphite (part after the =)
datanode.sink.graphite.metrics_prefix=hadoop.datanode
namenode.sink.graphite.metrics_prefix=hadoop.namenode
resourcemanager.sink.graphite.metrics_prefix=hadoop.resourcemanager
nodemanager.sink.graphite.metrics_prefix=hadoop.nodemanager
jobhistoryserver.sink.graphite.metrics_prefix=hadoop.jobhistoryserver
journalnode.sink.graphite.metrics_prefix=hadoop.journalnode
maptask.sink.graphite.metrics_prefix=hadoop.maptask
reducetask.sink.graphite.metrics_prefix=hadoop.reducetask
applicationhistoryserver.sink.graphite.metrics_prefix=hadoop.applicationhistoryserver
In Ambari, just go to HDFS > Config > Advanced hadoop-metrics2.properties. In other distributions, the file should be easy to find.
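Port 2003 is Carbon's plaintext listener, which accepts lines of the form "path value timestamp". Before touching any Hadoop service, it can be worth checking that a Hadoop host can actually reach that port. The following is only a minimal sketch: the metric name hadoop.test.connectivity and the function names are made up for illustration, and the host and port are whatever you put in server_host and server_port.

```python
import socket
import sys
import time


def format_metric(path, value, timestamp):
    """Build one Graphite plaintext line: path, value, epoch seconds, newline."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(host, port, line, timeout=5.0):
    """Open a TCP connection to Carbon and send a single metric line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("ascii"))


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: python graphite_ping.py <server_host> <server_port>")
    else:
        # Hypothetical test metric; it is not something Hadoop itself emits.
        test_line = format_metric("hadoop.test.connectivity", 1, time.time())
        send_metric(sys.argv[1], int(sys.argv[2]), test_line)
        print("sent:", test_line.strip())
```

If the script raises a connection error, the problem is most likely the network or Carbon rather than the Hadoop configuration.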
After that, restart HDFS and every other service you asked to monitor (if you asked to monitor resourcemanager, restart the ResourceManagers, and so on).
That’s it, you’re set.
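To confirm that metrics are arriving, you can ask graphite-web which nodes exist under the hadoop prefix. This sketch assumes graphite-web exposes its /metrics/find endpoint at the base URL you pass in and that the response is the usual treejson list of nodes with a "text" field; the function names are illustrative only.

```python
import json
import sys
import urllib.parse
import urllib.request


def find_url(base_url, query):
    """Build a graphite-web /metrics/find URL for the given glob query."""
    params = urllib.parse.urlencode({"query": query, "format": "treejson"})
    return f"{base_url.rstrip('/')}/metrics/find?{params}"


def node_names(payload):
    """Extract the 'text' field of each node from a treejson response body."""
    return [node["text"] for node in json.loads(payload)]


def list_children(base_url, query, timeout=10.0):
    """Fetch and return the names of the nodes matching the query."""
    with urllib.request.urlopen(find_url(base_url, query), timeout=timeout) as resp:
        return node_names(resp.read().decode("utf-8"))


if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("usage: python graphite_find.py <graphite-web base URL>")
    else:
        # With the prefixes configured above, entries such as "namenode"
        # or "datanode" should show up once the services have restarted.
        for name in list_children(sys.argv[1], "hadoop.*"):
            print(name)
```

An empty list after a couple of sampling periods usually means the services have not been restarted yet or cannot reach the Graphite host.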
If you are on HDP, you can go a bit further. HDP actually ships with a Grafana instance (if you installed Ambari Metrics), which can use Graphite as a data source. The data will be the same; the display will be a tad prettier.
Grafana talks to graphite-web (port 80 by default), which needs CORS enabled. You can do that in Apache (the default graphite-web HTTP server) by adding this line to your Graphite vhost (the Header directive requires mod_headers to be loaded):
Header set Access-Control-Allow-Origin "*"
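After reloading Apache, you can check that the header is really being sent. The sketch below requests a URL with an Origin header and inspects the response; the function names are illustrative, and the URL and origin are whatever your graphite-web and Grafana instances use.

```python
import sys
import urllib.request


def allows_origin(headers, origin):
    """Return True if Access-Control-Allow-Origin permits the given origin."""
    normalised = {key.lower(): value for key, value in headers.items()}
    allowed = normalised.get("access-control-allow-origin")
    return allowed is not None and allowed.strip() in ("*", origin)


def fetch_headers(url, origin, timeout=10.0):
    """Request the URL with an Origin header and return the response headers."""
    request = urllib.request.Request(url, headers={"Origin": origin})
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return dict(resp.headers.items())


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("usage: python cors_check.py <graphite-web URL> <grafana origin>")
    else:
        headers = fetch_headers(sys.argv[1], sys.argv[2])
        print("CORS allowed:", allows_origin(headers, sys.argv[2]))
```

Note that "*" opens the API to every origin; putting the Grafana origin in the header instead is a stricter choice if your setup allows it.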