Resource monitoring with collectd

February 20, 2023

Telemetry is bad. Everybody knows it. However, monitoring the health and performance of your own server is critical for catching problems early, so it is important to have a system in place that collects system and application data accurately and displays it in an understandable manner.

What is collectd?

collectd is a daemon which collects system and application performance metrics periodically and provides mechanisms to store the values in a variety of ways, for example in RRD files. collectd gathers metrics from various sources, e.g. the operating system, applications, logfiles and external devices, and stores this information or makes it available over the network. Those statistics can be used to monitor systems, find performance bottlenecks (i.e. performance analysis) and predict future system load (i.e. capacity planning). Or if you just want pretty graphs of your private server and are fed up with some homegrown solution you’re at the right place, too ;). (Source: collectd homepage.)

This configuration monitors the following:

  • basic system resources (load, processes, CPU cores)
  • swap and RAM
  • disk(s) usage
  • networking on ports 22 (SSH), 80 (HTTP) and 443 (HTTPS)
  • number of logged-in users
  • sensors data

There are a few prerequisites: a machine running a Debian-based OS (on other distributions, replace apt with your package manager) and an installed web server (preferably nginx).

Start by installing collectd:

$ sudo apt install collectd

Open /etc/collectd/collectd.conf and replace its contents with:

Hostname "localhost"
FQDNLookup true
BaseDir "/var/lib/collectd"
PluginDir "/usr/lib/collectd"
TypesDB "/usr/share/collectd/types.db"
Interval 30
Timeout 2
ReadThreads 4

LoadPlugin syslog
<plugin syslog>
	LogLevel info
</plugin>

LoadPlugin cpu
LoadPlugin users
LoadPlugin load
LoadPlugin memory
LoadPlugin network
LoadPlugin processes
LoadPlugin sensors

LoadPlugin df
<plugin "df">
	Device "/dev/sda1"
	MountPoint "/"
	FSType "ext4"
	IgnoreSelected false
	ReportInodes false
</plugin>

LoadPlugin disk
<plugin disk>
</plugin>

LoadPlugin interface
<plugin interface>
	Interface "eth0"
	IgnoreSelected false
</plugin>

LoadPlugin rrdtool
<plugin rrdtool>
	DataDir "/var/lib/collectd/rrd"
</plugin>

LoadPlugin swap
<plugin swap>
	ReportByDevice true
</plugin>

LoadPlugin tcpconns
<plugin tcpconns>
	LocalPort "80"
	LocalPort "443"
	LocalPort "22"
</plugin>

<Include "/etc/collectd/collectd.conf.d">
	Filter "*.conf"
</Include>

Some configuration values might be different for you: the network interface might not be named eth0, the main disk might not be /dev/sda1, and you might want to monitor different ports.
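
If you are unsure what to put in place of eth0 and /dev/sda1, both values can be looked up on the machine itself. The following is a minimal sketch, assuming Node.js happens to be installed on the server (neither collectd nor Jarmon requires it). interfaceNames and rootDevice are hypothetical helper names, and /proc/mounts is the Linux mount table.

```javascript
const os = require('os');
const fs = require('fs');

// Names of network interfaces that have at least one non-loopback address.
// Note: os.networkInterfaces() only reports interfaces with an assigned address.
function interfaceNames() {
	const all = os.networkInterfaces();
	return Object.keys(all).filter(function (name) {
		return all[name].some(function (addr) {
			return !addr.internal;
		});
	});
}

// Device mounted at "/" according to mount-table text in /proc/mounts format
// ("device mountpoint fstype options ..."). The last matching entry wins,
// because later mounts shadow earlier ones.
function rootDevice(mountsText) {
	const matches = mountsText.split('\n')
		.map(function (line) { return line.split(' '); })
		.filter(function (fields) { return fields[1] === '/'; });
	return matches.length ? matches[matches.length - 1][0] : null;
}

console.log('Interfaces:', interfaceNames());
if (fs.existsSync('/proc/mounts')) {
	console.log('Root device:', rootDevice(fs.readFileSync('/proc/mounts', 'utf8')));
}
```

The printed interface name would go into the Interface option, and the root device into the Device option of the df plugin.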

Start the collectd service and give it a bit of time (1-3 minutes) to gather some data.

$ sudo systemctl start collectd

Now that collectd is configured to write RRD files (into /var/lib/collectd/rrd), we need a way to turn those into pretty graphs (you could use something like RRDtool to parse them, but everybody loves pretty graphs, right?). Jarmon does just that.

What is Jarmon?

Jarmon is a JavaScript library for creating interactive charts from RRD data, entirely within your web browser.

Jarmon downloads multiple RRD files in parallel from one or more servers using asynchronous XHR requests, extracts and merges the data and then plots the results using HTML5 canvas elements. Jarmon includes a fully working dashboard application - a graphical frontend to collectd.

Jarmon depends on the JavascriptRRD, Flot and jQuery libraries. (Source: Jarmon homepage.)

Download the latest version (should be 11.8), unzip the archive somewhere and move the docs/examples directory to a location your web server can serve, for example /var/www/mon/.

Open the index.html file and replace

<script type="text/javascript" src="../../jarmon/jarmon.js"></script>

with

<script type="text/javascript" src="jarmon.js"></script>

Then copy the jarmon.js file from the ZIP archive to /var/www/mon/.

To configure Jarmon to use our RRD files we need to supply a recipe file. Open the index.html file again and replace

<script type="text/javascript" src="jarmon_example_recipes.js"></script>

with

<script type="text/javascript" src="config.js"></script>

Then copy the JavaScript below into a new file named config.js inside the /var/www/mon/ directory:

if (typeof(jarmon) === 'undefined') {
	var jarmon = {};
}
jarmon.TAB_RECIPES_STANDARD = [
	['System', [
		'load',
		'processes',
		'cpu-0',
		'cpu-1',
		'fork-rate',
		'memory',
		'swap-io',
		'swap-disk'
		]
	],
	['Disk', [
		'disk-sda1'
		]
	],
	['Networking', [
		'interface',
		'tcpconns-22-local',
		'tcpconns-80-local',
		'tcpconns-443-local'
		]
	],
	['Others', [
		'users',
		'sensors'
		]
	]
];

jarmon.CHART_RECIPES_COLLECTD = {
	'cpu-0': {
		title: 'CPU0 Usage',
		data: [
			['data/cpu-0/cpu-idle.rrd', 0, 'Idle', '%'],
			['data/cpu-0/cpu-interrupt.rrd', 0, 'Interrupt', '%'],
			['data/cpu-0/cpu-nice.rrd', 0, 'Nice', '%'],
			['data/cpu-0/cpu-softirq.rrd', 0, 'SoftIRQ', '%'],
			['data/cpu-0/cpu-system.rrd', 0, 'System', '%'],
			['data/cpu-0/cpu-user.rrd', 0, 'User', '%'],
			['data/cpu-0/cpu-wait.rrd', 0, 'Wait', '%'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'cpu-1': {
		title: 'CPU1 Usage',
		data: [
			['data/cpu-1/cpu-idle.rrd', 0, 'Idle', '%'],
			['data/cpu-1/cpu-interrupt.rrd', 0, 'Interrupt', '%'],
			['data/cpu-1/cpu-nice.rrd', 0, 'Nice', '%'],
			['data/cpu-1/cpu-softirq.rrd', 0, 'SoftIRQ', '%'],
			['data/cpu-1/cpu-system.rrd', 0, 'System', '%'],
			['data/cpu-1/cpu-user.rrd', 0, 'User', '%'],
			['data/cpu-1/cpu-wait.rrd', 0, 'Wait', '%'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'users': {
		title: 'Users',
		data: [
			['data/users/users.rrd', 0, 'Users', '#']
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
	},
	'load': {
		title: 'Load Average',
		data: [
			['data/load/load.rrd', 'shortterm', 'Short Term', ''],
			['data/load/load.rrd', 'midterm', 'Medium Term', ''],
			['data/load/load.rrd', 'longterm', 'Long Term', '']
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
	},
	'sensors': {
		title: 'Sensors',
		data: [
			['data/sensors-coretemp-isa-0000/temperature-temp2.rrd', 0, 'CPU', '°C'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
	},
	'processes': {
		title: 'Processes state',
		data: [
			['data/processes/ps_state-blocked.rrd', 0, 'Blocked', '#'],
			['data/processes/ps_state-paging.rrd', 0, 'Paging', '#'],
			['data/processes/ps_state-running.rrd', 0, 'Running', '#'],
			['data/processes/ps_state-zombies.rrd', 0, 'Zombie', '#'],
			['data/processes/ps_state-stopped.rrd', 0, 'Stopped', '#'],
			['data/processes/ps_state-sleeping.rrd', 0, 'Sleeping', '#'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'fork-rate': {
		title: 'Fork rate',
		data: [
			['data/processes/fork_rate.rrd', 0, 'Fork rate', '#'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'memory': {
		title: 'Memory',
		data: [
			['data/memory/memory-buffered.rrd', 0, 'Buffered', 'B'],
			['data/memory/memory-used.rrd', 0, 'Used', 'B'],
			['data/memory/memory-cached.rrd', 0, 'Cached', 'B'],
			['data/memory/memory-free.rrd', 0, 'Free', 'B']
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'swap-io': {
		title: 'Swap',
		data: [
			['data/swap/swap_io-in.rrd', 0, 'IO in', 'B'],
			['data/swap/swap_io-out.rrd', 0, 'IO out', 'B'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'swap-disk': {
		title: 'Swap /dev/sda2',
		data: [
			['data/swap-dev_sda2/swap-free.rrd', 0, 'Free', 'Bytes'],
			['data/swap-dev_sda2/swap-used.rrd', 0, 'Used', 'Bytes'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'disk-sda1': {
		title: '/dev/sda1',
		data: [
			['data/disk-sda1/disk_octets.rrd', 0, 'disk_octets', 'Bytes/s'],
			['data/disk-sda1/disk_ops.rrd', 0, 'disk_ops', 'Ops/s'],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'interface': {
		title: 'Interface',
		data: [
			['data/interface-eth0/if_octets.rrd', 0, 'if_octets', 'Bytes/s']
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS)
	},
	'tcpconns-22-local': {
		title: 'Port 22 (SSH)',
		data: [
			['data/tcpconns-22-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
			['data/tcpconns-22-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
			['data/tcpconns-22-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
			['data/tcpconns-22-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
			['data/tcpconns-22-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
			['data/tcpconns-22-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
			['data/tcpconns-22-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
			['data/tcpconns-22-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
			['data/tcpconns-22-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
			['data/tcpconns-22-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
			['data/tcpconns-22-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'tcpconns-80-local': {
		title: 'Port 80 (HTTP)',
		data: [
			['data/tcpconns-80-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
			['data/tcpconns-80-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
			['data/tcpconns-80-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
			['data/tcpconns-80-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
			['data/tcpconns-80-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
			['data/tcpconns-80-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
			['data/tcpconns-80-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
			['data/tcpconns-80-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
			['data/tcpconns-80-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
			['data/tcpconns-80-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
			['data/tcpconns-80-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	},
	'tcpconns-443-local': {
		title: 'Port 443 (HTTPS)',
		data: [
			['data/tcpconns-443-local/tcp_connections-CLOSING.rrd', 0, 'CLOSING', ''],
			['data/tcpconns-443-local/tcp_connections-SYN_SENT.rrd', 0, 'SYN_SENT', ''],
			['data/tcpconns-443-local/tcp_connections-LISTEN.rrd', 0, 'LISTEN', ''],
			['data/tcpconns-443-local/tcp_connections-TIME_WAIT.rrd', 0, 'TIME_WAIT', ''],
			['data/tcpconns-443-local/tcp_connections-SYN_RECV.rrd', 0, 'SYN_RECV', ''],
			['data/tcpconns-443-local/tcp_connections-CLOSE_WAIT.rrd', 0, 'CLOSE_WAIT', ''],
			['data/tcpconns-443-local/tcp_connections-CLOSED.rrd', 0, 'CLOSED', ''],
			['data/tcpconns-443-local/tcp_connections-LAST_ACK.rrd', 0, 'LAST_ACK', ''],
			['data/tcpconns-443-local/tcp_connections-FIN_WAIT1.rrd', 0, 'FIN_WAIT1', ''],
			['data/tcpconns-443-local/tcp_connections-FIN_WAIT2.rrd', 0, 'FIN_WAIT2', ''],
			['data/tcpconns-443-local/tcp_connections-ESTABLISHED.rrd', 0, 'ESTABLISHED', ''],
		],
		options: jQuery.extend(true, {}, jarmon.Chart.BASE_OPTIONS,
			jarmon.Chart.STACKED_OPTIONS)
	}
};
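
The per-core and per-port recipes above differ only in the core index or the port number, so their data rows can also be generated instead of copied by hand. This is a minimal sketch: cpuData, tcpconnsData and the two state lists are hypothetical helpers, not part of Jarmon, and the file names simply mirror the ones used in config.js above.

```javascript
// CPU states as [RRD file suffix, chart label], matching the recipes above.
var CPU_STATES = [
	['idle', 'Idle'], ['interrupt', 'Interrupt'], ['nice', 'Nice'],
	['softirq', 'SoftIRQ'], ['system', 'System'], ['user', 'User'], ['wait', 'Wait']
];

// TCP connection states as they appear in the tcpconns RRD file names.
var TCP_STATES = [
	'CLOSING', 'SYN_SENT', 'LISTEN', 'TIME_WAIT', 'SYN_RECV', 'CLOSE_WAIT',
	'CLOSED', 'LAST_ACK', 'FIN_WAIT1', 'FIN_WAIT2', 'ESTABLISHED'
];

// Data rows for one CPU core, e.g. cpuData(0) for the 'cpu-0' recipe.
function cpuData(core) {
	return CPU_STATES.map(function (state) {
		return ['data/cpu-' + core + '/cpu-' + state[0] + '.rrd', 0, state[1], '%'];
	});
}

// Data rows for one monitored local port, e.g. tcpconnsData(22).
function tcpconnsData(port) {
	return TCP_STATES.map(function (state) {
		return ['data/tcpconns-' + port + '-local/tcp_connections-' + state + '.rrd', 0, state, ''];
	});
}
```

Inside config.js, a recipe could then use data: cpuData(0) or data: tcpconnsData(443) in place of the literal array. The helpers would have to be defined before jarmon.CHART_RECIPES_COLLECTD so that the state lists are initialized when they are called.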

Your sensor data might be different. After starting collectd, check the file structure inside the RRD output directory (the localhost subdirectory is named after the Hostname value in collectd.conf):

$ ls /var/lib/collectd/rrd/localhost

Look for sensors-* directories and change your config.js file accordingly. If your HDD has a temperature sensor, it can be exposed by loading the drivetemp kernel module and restarting collectd.

$ sudo modprobe drivetemp
$ sudo systemctl restart collectd

In my case the HDD temperature data ends up in /var/lib/collectd/rrd/localhost/sensors-drivetemp-scsi-0-0/temperature-temp1.rrd, so I need to add one extra line to the config.js file for Jarmon to pick it up and show it in the sensors graph.

After this line

			['data/sensors-coretemp-isa-0000/temperature-temp2.rrd', 0, 'CPU', '°C'],

add

			['data/sensors-drivetemp-scsi-0-0/temperature-temp1.rrd', 0, 'HDD', '°C']

Link Jarmon to the collectd RRD output directory:

$ ln -s /var/lib/collectd/rrd/localhost /var/www/mon/data
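
With the link in place, every path in config.js that starts with data/ should resolve to an RRD file written by collectd. A mistyped path is easy to miss, so a quick check can help. The sketch below assumes Node.js is available on the server. missingRrdFiles is a hypothetical helper, and its recipes argument is expected to have the same shape as jarmon.CHART_RECIPES_COLLECTD.

```javascript
const fs = require('fs');
const path = require('path');

// Returns the files referenced by the recipes that do not exist under webRoot.
// `exists` can be swapped out for testing; it defaults to fs.existsSync,
// which follows symlinks such as the data link.
function missingRrdFiles(recipes, webRoot, exists) {
	const check = exists || fs.existsSync;
	const missing = [];
	Object.keys(recipes).forEach(function (name) {
		recipes[name].data.forEach(function (row) {
			const file = path.join(webRoot, row[0]); // row[0] is the RRD path
			if (!check(file)) {
				missing.push(file);
			}
		});
	});
	return missing;
}

// Example with a cut-down recipe object; replace it with the real recipes.
const sample = {
	'users': { data: [['data/users/users.rrd', 0, 'Users', '#']] }
};
console.log(missingRrdFiles(sample, '/var/www/mon'));
```

An empty array means every referenced file was found; anything listed points at a recipe row to fix or a collectd plugin that is not writing data yet.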

The structure of your /var/www/mon/ directory should be identical to this:

$ tree /var/www/mon/
/var/www/mon/
├── assets
│   ├── css
│   │   ├── jquerytools.dateinput.skin1.css
│   │   ├── style.css
│   │   └── tabs-no-images.css
│   ├── icons
│   │   ├── calendar.png
│   │   ├── loading.gif
│   │   ├── next.gif
│   │   └── prev.gif
│   └── js
│       └── dependencies.js
├── data -> /var/lib/collectd/rrd/localhost
├── index.html
├── config.js
└── jarmon.js

Any extra files can be safely removed.

That’s it. Point your web browser at the vhost that’s configured to serve /var/www/mon (configuring nginx or Apache is outside the scope of this article) and enjoy your pretty graphs.