Resource monitoring with collectd
Telemetry is bad. Everybody knows it. However, monitoring the health and performance of your own server is critical for catching problems early, so it is important to have a system in place that collects system and application data accurately and displays it in an understandable manner.
What is collectd?
collectd is a daemon which collects system and application performance metrics periodically and provides mechanisms to store the values in a variety of ways, for example in RRD files.
collectd gathers metrics from various sources, e.g. the operating system, applications, logfiles and external devices, and stores this information or makes it available over the network. Those statistics can be used to monitor systems, find performance bottlenecks (i.e. performance analysis) and predict future system load (i.e. capacity planning). Or if you just want pretty graphs of your private server and are fed up with some homegrown solution you’re at the right place, too ;).
Source: the collectd homepage.
This configuration monitors the following:
- basic system resources (load, processes, CPU cores)
- swap and RAM
- disk(s) usage
- networking on ports 22 (SSH), 80 (HTTP) and 443 (HTTPS)
- number of logged-in users
- sensors data
There are some prerequisites: a machine running a Debian-based OS (on other distributions, substitute your own package manager for apt) and an installed web server (preferably nginx).
Start by installing collectd:
$ sudo apt install collectd
Open /etc/collectd/collectd.conf and replace its contents with:
Hostname "localhost"
FQDNLookup true
BaseDir "/var/lib/collectd"
PluginDir "/usr/lib/collectd"
TypesDB "/usr/share/collectd/types.db"
Interval 30
Timeout 2
ReadThreads 4
LoadPlugin syslog
<plugin syslog>
LogLevel info
</plugin>
LoadPlugin cpu
LoadPlugin users
LoadPlugin load
LoadPlugin memory
LoadPlugin network
LoadPlugin processes
LoadPlugin sensors
LoadPlugin df
<plugin "df">
Device "/dev/sda1"
MountPoint "/"
FSType "ext4"
IgnoreSelected false
ReportInodes false
</plugin>
LoadPlugin disk
<plugin disk>
</plugin>
LoadPlugin interface
<plugin interface>
Interface "eth0"
IgnoreSelected false
</plugin>
LoadPlugin rrdtool
<plugin rrdtool>
DataDir "/var/lib/collectd/rrd"
</plugin>
LoadPlugin swap
<plugin swap>
ReportByDevice true
</plugin>
LoadPlugin tcpconns
<plugin tcpconns>
LocalPort "80"
LocalPort "443"
LocalPort "22"
</plugin>
<Include "/etc/collectd/collectd.conf.d">
Filter "*.conf"
</Include>
Some configuration values might be different for you: your network interface might not be named eth0, your main disk might not be /dev/sda1, and you might want to monitor different ports.
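To find the right values for your own machine, the interface names and the root device can be read from /proc. The following is a minimal sketch, assuming a Linux system. The function names list_interfaces and mount_info are made up for this example, and the optional file argument exists only so that the functions can be tried against a saved copy of the /proc files.

```shell
#!/bin/sh
# Hypothetical helpers, not part of collectd.

# List network interface names from a /proc/net/dev style file.
# The first two lines are headers; each data line starts with "name:".
list_interfaces() {
    awk -F: 'NR > 2 { gsub(/^[ \t]+/, "", $1); print $1 }' "${1:-/proc/net/dev}"
}

# Print "device fstype" for the filesystem mounted at a mount point,
# read from a /proc/mounts style file (device mountpoint fstype ...).
# If the mount point is listed more than once, the last entry wins.
mount_info() {
    awk -v mp="$1" '$2 == mp { dev = $1; fs = $3 }
                    END { if (dev != "") print dev, fs }' "${2:-/proc/mounts}"
}
```

Calling list_interfaces and mount_info / without a file argument reads the live /proc files. The values they print are candidates for the Interface, Device and FSType options above; on some systems the root device may show up under a generic name, in which case it has to be looked up another way.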
Start the collectd service and give it a bit of time (1-3 minutes) to gather some data.
$ sudo systemctl start collectd
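Instead of guessing how long to wait, you can poll the output directory until the first RRD file shows up. This is only a sketch: wait_for_rrd is a hypothetical helper name, and the default timeout of 180 seconds and poll interval of 5 seconds are assumptions based on the 1-3 minute estimate above.

```shell
#!/bin/sh
# Hypothetical helper, not part of collectd.
# Poll until at least one .rrd file exists under DIR.
# Usage: wait_for_rrd DIR [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 as soon as a file is found, 1 once the timeout is reached.
wait_for_rrd() {
    dir=$1
    timeout=${2:-180}
    interval=${3:-5}
    waited=0
    while :; do
        # find prints matching paths; head keeps only the first one.
        if [ -n "$(find "$dir" -name '*.rrd' -print 2>/dev/null | head -n 1)" ]; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

For example, wait_for_rrd /var/lib/collectd/rrd && echo "data is ready" returns once collectd has written its first file, or gives up silently after three minutes.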
Now that we have configured collectd to output RRD files (into /var/lib/collectd/rrd), we need a way to turn those into pretty graphs (you can actually use something like RRDtool to parse them, but everybody loves pretty graphs, right?). Jarmon does just that.
What is Jarmon?
Jarmon is a JavaScript library for creating interactive charts from RRD data, entirely within your web browser.
Jarmon downloads multiple RRD files in parallel from one or more servers using asynchronous XHR requests, extracts and merges the data and then plots the results using HTML5 canvas elements. Jarmon includes a fully working dashboard application - a graphical frontend to collectd.
Jarmon depends on the JavascriptRRD, Flot and jQuery libraries.
Source: the Jarmon homepage.
Download the latest version (should be 11.8), unzip the archive somewhere and move the docs/examples directory to a location served by your web server, for example /var/www/mon/.
Open the index.html file and replace
<script type="text/javascript" src="../../jarmon/jarmon.js"></script>
with
<script type="text/javascript" src="jarmon.js"></script>
Then copy the jarmon.js file from the ZIP archive to /var/www/mon/.
To configure Jarmon to use our RRD files we need to supply a recipe file. Open the index.html file again and replace
<script type="text/javascript" src="jarmon_example_recipes.js"></script>
with
<script type="text/javascript" src="config.js"></script>
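Both script-tag replacements in index.html can also be scripted. The sketch below assumes the two tags appear in your copy of index.html exactly as quoted above; patch_index is a hypothetical helper name. It keeps a .bak copy so the original can be restored if the result is not what you expected.

```shell
#!/bin/sh
# Hypothetical helper, not part of Jarmon.
# Rewrite the two <script> src attributes in Jarmon's example index.html,
# keeping the untouched original as index.html.bak.
patch_index() {
    index=$1
    cp "$index" "$index.bak" || return 1
    # Read from the backup and write the patched result over the original.
    sed -e 's|src="\.\./\.\./jarmon/jarmon\.js"|src="jarmon.js"|' \
        -e 's|src="jarmon_example_recipes\.js"|src="config.js"|' \
        "$index.bak" > "$index"
}
```

Running patch_index /var/www/mon/index.html performs both edits in one step. Lines that do not match either pattern are copied through unchanged.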
Then copy the JavaScript below into a new file named config.js inside the /var/www/mon/ directory:
if (typeof(jarmon) === 'undefined') {
var jarmon = {};
}
jarmon.TAB_RECIPES_STANDARD = [
['System', [
'load',
'processes',
'cpu-0',
'cpu-1',
'fork-rate',
'memory',
'swap-io',
'swap-disk'
]
],
['Disk', [
'disk-sda1'
]
],
['Networking', [
'interface',
'tcpconns-22-local',
'tcpconns-80-local',
'tcpconns-443-local'
]
],
['Others', [
'users',
'sensors'
]
]
];
jarmon.CHART_RECIPES_COLLECTD = {
'cpu-0': {
title: 'CPU0 Usage',
data: [
['data/cpu-0/cpu-idle.rrd', 0, 'Idle', '%'],
['data/cpu-0/cpu-interrupt.rrd', 0, 'Interrupt', '%'],
['data/cpu-0/cpu-nice.rrd', 0, 'Nice', '%'],
['data/cpu-0/cpu-softirq.rrd', 0, 'SoftIRQ', '%'],
['data/cpu-0/cpu-system.rrd', 0, 'System', '%'],
['data/cpu-0/cpu-user.rrd', 0, 'User', '%'],
['data/cpu-0/cpu-wait.rrd', 0, 'Wait', '%'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'cpu-1': {
title: 'CPU1 Usage',
data: [
['data/cpu-1/cpu-idle.rrd', 0, 'Idle', '%'],
['data/cpu-1/cpu-interrupt.rrd', 0, 'Interrupt', '%'],
['data/cpu-1/cpu-nice.rrd', 0, 'Nice', '%'],
['data/cpu-1/cpu-softirq.rrd', 0, 'SoftIRQ', '%'],
['data/cpu-1/cpu-system.rrd', 0, 'System', '%'],
['data/cpu-1/cpu-user.rrd', 0, 'User', '%'],
['data/cpu-1/cpu-wait.rrd', 0, 'Wait', '%'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'users': {
title: 'Users',
data: [
['data/users/users.rrd', 0, 'Users', '#']
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
},
'load': {
title: 'Load Average',
data: [
['data/load/load.rrd', 'shortterm', 'Short Term', ''],
['data/load/load.rrd', 'midterm', 'Medium Term', ''],
['data/load/load.rrd', 'longterm', 'Long Term', '']
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
},
'sensors': {
title: 'Sensors',
data: [
['data/sensors-coretemp-isa-0000/temperature-temp2.rrd', 0, 'CPU', '°C'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
},
'processes': {
title: 'Processes state',
data: [
['data/processes/ps_state-blocked.rrd', 0, 'Blocked', '#'],
['data/processes/ps_state-paging.rrd', 0, 'Paging', '#'],
['data/processes/ps_state-running.rrd', 0, 'Running', '#'],
['data/processes/ps_state-zombies.rrd', 0, 'Zombie', '#'],
['data/processes/ps_state-stopped.rrd', 0, 'Stopped', '#'],
['data/processes/ps_state-sleeping.rrd', 0, 'Sleeping', '#'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'fork-rate': {
title: 'Fork rate',
data: [
['data/processes/fork_rate.rrd', 0, 'Fork rate', '#'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'memory': {
title: 'Memory',
data: [
['data/memory/memory-buffered.rrd', 0, 'Buffered', 'B'],
['data/memory/memory-used.rrd', 0, 'Used', 'B'],
['data/memory/memory-cached.rrd', 0, 'Cached', 'B'],
['data/memory/memory-free.rrd', 0, 'Free', 'B']
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'swap-io': {
title: 'Swap',
data: [
['data/swap/swap_io-in.rrd', 0, 'IO in', 'B'],
['data/swap/swap_io-out.rrd', 0, 'IO out', 'B'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'swap-disk': {
title: 'Swap /dev/sda2',
data: [
['data/swap-dev_sda2/swap-free.rrd', 0, 'Free', 'Bytes'],
['data/swap-dev_sda2/swap-used.rrd', 0, 'Used', 'Bytes'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'disk-sda1': {
title: '/dev/sda1',
data: [
['data/disk-sda1/disk_octets.rrd', 0, 'disk_octets', 'Bytes/s'],
['data/disk-sda1/disk_ops.rrd', 0, 'disk_ops', 'Ops/s'],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'interface': {
title: 'Interface',
data: [
['data/interface-eth0/if_octets.rrd', 0, 'if_octets', 'Bytes/s']
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
},
'tcpconns-22-local': {
title: 'Port 22 (SSH)',
data: [
['data/tcpconns-22-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
['data/tcpconns-22-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
['data/tcpconns-22-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
['data/tcpconns-22-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
['data/tcpconns-22-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
['data/tcpconns-22-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
['data/tcpconns-22-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
['data/tcpconns-22-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
['data/tcpconns-22-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
['data/tcpconns-22-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
['data/tcpconns-22-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'tcpconns-80-local': {
title: 'Port 80 (HTTP)',
data: [
['data/tcpconns-80-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
['data/tcpconns-80-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
['data/tcpconns-80-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
['data/tcpconns-80-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
['data/tcpconns-80-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
['data/tcpconns-80-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
['data/tcpconns-80-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
['data/tcpconns-80-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
['data/tcpconns-80-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
['data/tcpconns-80-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
['data/tcpconns-80-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
},
'tcpconns-443-local': {
title: 'Port 443 (HTTPS)',
data: [
['data/tcpconns-443-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
['data/tcpconns-443-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
['data/tcpconns-443-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
['data/tcpconns-443-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
['data/tcpconns-443-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
['data/tcpconns-443-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
['data/tcpconns-443-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
['data/tcpconns-443-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
['data/tcpconns-443-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
['data/tcpconns-443-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
['data/tcpconns-443-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
],
options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
jarmon.Chart.STACKED_OPTIONS)
}
};
Your sensor data might be different. After starting collectd, check the file structure inside the RRD output directory:
$ ls /var/lib/collectd/rrd/localhost
Look for sensors-* directories and change your config.js file accordingly. If your HDD has a temperature sensor, it can be accessed by loading the drivetemp kernel module (available in Linux 5.6 and later) and restarting collectd.
$ sudo modprobe drivetemp
$ sudo systemctl restart collectd
In my case, the HDD temperature sensor data is available in /var/lib/collectd/rrd/localhost/sensors-drivetemp-scsi-0-0/temperature-temp1.rrd, so I need to add one extra line to the config.js file to make Jarmon pick up the sensor data and show it in the Sensors graph.
After this line
['data/sensors-coretemp-isa-0000/temperature-temp2.rrd', 0, 'CPU', '°C'],
add
['data/sensors-drivetemp-scsi-0-0/temperature-temp1.rrd', 0, 'HDD', '°C']
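Because sensor directory names differ from machine to machine, it can help to print the available sensor files in the form that config.js expects. The helper below is a sketch; list_sensor_rrds is a hypothetical name, and the default directory is the one used throughout this article.

```shell
#!/bin/sh
# Hypothetical helper, not part of collectd or Jarmon.
# Print every RRD file found in a sensors-* directory as a path
# relative to Jarmon's "data" link, e.g. data/sensors-foo/bar.rrd.
list_sensor_rrds() {
    rrd_dir=${1:-/var/lib/collectd/rrd/localhost}
    for f in "$rrd_dir"/sensors-*/*.rrd; do
        [ -e "$f" ] || continue        # the glob matched nothing
        # Strip the directory prefix and prepend "data/".
        printf 'data/%s\n' "${f#"$rrd_dir"/}"
    done
}
```

Each printed path can be pasted into the data array of the sensors recipe, together with a label and unit of your choice.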
Link Jarmon to the collectd RRD output directory:
$ ln -s /var/lib/collectd/rrd/localhost /var/www/mon/data
The structure of your /var/www/mon/ directory should look like this:
$ tree /var/www/mon/
/var/www/mon/
├── assets
│   ├── css
│   │   ├── jquerytools.dateinput.skin1.css
│   │   ├── style.css
│   │   └── tabs-no-images.css
│   ├── icons
│   │   ├── calendar.png
│   │   ├── loading.gif
│   │   ├── next.gif
│   │   └── prev.gif
│   └── js
│       └── dependencies.js
├── data -> /var/lib/collectd/rrd/localhost
├── index.html
├── config.js
└── jarmon.js
Any extra files can be safely removed.
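Before opening the page, it is worth checking that every RRD path mentioned in config.js actually exists behind the data link, since a path that collectd never wrote will presumably leave a chart empty. The following is a rough sketch; missing_rrds is a hypothetical helper name, and it assumes the paths are written as single-quoted 'data/....rrd' strings, as in the config.js above.

```shell
#!/bin/sh
# Hypothetical helper, not part of collectd or Jarmon.
# Print every 'data/....rrd' path referenced in a recipe file that does
# not exist under the web root. Prints nothing if all paths resolve.
# Usage: missing_rrds [WEBROOT] [RECIPE_FILE]
missing_rrds() {
    webroot=${1:-/var/www/mon}
    recipes=${2:-$webroot/config.js}
    # Extract the quoted paths, drop the quotes, remove duplicates.
    grep -o "'data/[^']*\.rrd'" "$recipes" | tr -d "'" | sort -u |
    while IFS= read -r rel; do
        # -e follows the data symlink into the collectd directory.
        [ -e "$webroot/$rel" ] || printf '%s\n' "$rel"
    done
}
```

Any path it prints points to a recipe entry that needs adjusting, for example a second CPU core, a swap device or a sensor that does not exist on your machine.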
That’s it. Point your web browser to the vhost that’s configured to serve /var/www/mon (configuring nginx or Apache is outside the scope of this article) and enjoy your pretty graphs.