Monitoring SQL Server: setting up Logstash with telegraf

Having set up basic ELK stack we probably should get on to ingesting some data.

Wait, what about schema?

Theoretically, we could play it very strict and define our expected documetns similar to the way we define SQL tables. But it’s not fun and besides, Elastic comes with sensible defaults. It has done a decent enough job for us to have never needed to correct it. Not bad at all.

Setting up the pipeline

Logstash operates in terms of pipelines. If I were to reduce it to the bare basics, each pipeline allows us to define where data comes from (input), how Logstash should pick events to process it (filter) and finally, where to send it to (output). The last bit is kinda obvious and easy – we want to send processed events to Elastic. And by virtue of running doker we happen to know that it’s hosted at http://elasticsearch:9200. Too easy.

Input

Being quite modular, Logstash operates in terms of plugins. And there are HEAPS to chose from! Apart from obvious bits you get to see in the documentation, one thing that might not be quite obvious – you can have multiple plugins running for one input stream!

Our telemetry agent of choice, telegraf supports sending metrics via Influx Line Protocol, but Logstash does not have a plugin for that – so we will need to craft something bit more complex. Suppose we want basic metrics and sql-stats

input {
	tcp {
		port => 8098
		type => "telegraf"
	} # opens up a TCP listener for events sent by telegraf (which happens to support Influx Line Protocol)
	tcp {
		port => 8099
		type => "sql-stats"
	} # opens up a TCP listener for events coded in Influx Line protocol
}

Filter

This bit is optional, but sometimes is very helpful. And again, we’ve got quite a variety of plugins to suit. Here we’ve got a chance to inspect incoming events and either transform or outright reject some. For example here we parse Influx Line protocol sent by telegraf and do some basic enhancements on data coming from SQL-stats poller:

filter {
	if [type] == "telegraf" {		
		dissect {
			mapping => {
				"message" => "%{measurement}.%{metric};%{tags} %{value} %{ts}"
			}
		}
		kv {
			source => "tags"
			field_split => ";"
		}	
		date {
			match => ["ts", "UNIX"]
		}
		mutate {
			convert => {
				"value" => "float"
			}
			remove_field => ["tags", "message", "ts", "port"]
		}
	}
	if [type] == "sql-stats" {		
		grok {
			match => {
				"message" => "%{WORD:measurement}(\.%{WORD:metric})?;%{GREEDYDATA:tags} (%{BASE10NUM:value}) %{NUMBER:ts}"
			}
		}
		kv {
			source => "tags"
			field_split => ";"
		}	
		date {
			match => ["ts", "UNIX"]
		}
		mutate {
			convert => {
				"value" => "float"
			}
			rename => {
				"counter" => "metric"
				"clerk_type" => "metric"
			}
			remove_field => ["tags", "message", "ts", "port", "host"]
		}
	}
}

All that’s left

To store the events, we invoke output like so:

output {
	if [type] == "telegraf" {		
		elasticsearch {
			hosts => "elasticsearch:9200"
			index => "telegraf-%{+YYYY.MM}"
		}
	}
	if [type] == "sql-stats" {		
		elasticsearch {
			hosts => "elasticsearch:9200"
			index => "sql-stats-%{+YYYY.MM}"
		}
	}
}

Monitoring SQL Server: setting up ELK+G

In 9 cases out of 10 our clients have some sort of database that they want to interface with. And 8 out of those 9 cases the database is going to be SQL Server. Yes, this is us being biased, but you know what?

It does not matter

The important bit is – out clients like to know how the database is doing. Some are happy to pay for commercial APMs, others either have very specific needs or love the challenge to DIY.

We are here to help

One way to get better picture of what’s happening with the DB would be to keep grabbing vitals over time and plotting them on a graph of some sort. Grafana is a fantastic way to achieve that. It supports a whole bunch of backends (including SQL server) and allows insane amount of customisations.

Diversify

It is possible to store SQL telemetry in another SQL database on the same server (you could even set up SQL Agent jobs to do the polling – all nicely packaged). We however thought it might be a good idea to not store all data on the same machine. We’d like to not overstrain the main database in time of pinch and completely decouple analytics from critical business processes.

ELK G stack

One of many ways to approach this is to introduce a (somewhat) free and open source ElasticSearch into the mix. And mightly Logstash for data ingestion. This is where we’d normally go on to Kibana for dashboards and nice UI (and we did end up running it), but the main focus of this exercise will still fall onto Grafana.

Setting it up

Theres no point repeating official documentation for respective products, let’s instead write up a docker-compose file:

version: '3'
services:
    elasticsearch:
        image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1
        environment:
            - node.name=elastic01
            - discovery.type=single-node  
            - bootstrap.memory_lock=true
            - "ES_JAVA_OPTS=-Xms512m -Xmx512m"        
        volumes:
            - ./elastic:/usr/share/elasticsearch/data
    logstash:
        image: docker.elastic.co/logstash/logstash:7.6.1
        volumes: 
            - ./logstash-pipeline:/usr/share/logstash/pipeline/
            - ./logstash-config/usr/share/logstash/config/
        depends_on:
          - elasticsearch
    kibana:
        image: docker.elastic.co/kibana/kibana:7.6.1
        environment:
          - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
        ports:
          - 5601:5601
        depends_on:
          - elasticsearch
    grafana:
        image: grafana/grafana
        ports:
          - 3000:3000
        depends_on:
          - elasticsearch

All that’s left to do is docker-compose up -d and run. Stay tuned for next posts in the series.