Infrastructure Archives - Page 2 of 2

Having set up basic ELK stack we probably should get on to ingesting some data.

Wait, what about schema?

Theoretically, we could play it very strict and define our expected documetns similar to the way we define SQL tables. But it’s not fun and besides, Elastic comes with sensible defaults. It has done a decent enough job for us to have never needed to correct it. Not bad at all.

Setting up the pipeline

Logstash operates in terms of pipelines. If I were to reduce it to the bare basics, each pipeline allows us to define where data comes from (input), how Logstash should pick events to process it (filter) and finally, where to send it to (output). The last bit is kinda obvious and easy – we want to send processed events to Elastic. And by virtue of running doker we happen to know that it’s hosted at http://elasticsearch:9200. Too easy.

Input

Being quite modular, Logstash operates in terms of plugins. And there are HEAPS to chose from! Apart from obvious bits you get to see in the documentation, one thing that might not be quite obvious – you can have multiple plugins running for one input stream!

Our telemetry agent of choice, telegraf supports sending metrics via Influx Line Protocol, but Logstash does not have a plugin for that – so we will need to craft something bit more complex. Suppose we want basic metrics and sql-stats

input {
	tcp {
		port => 8098
		type => "telegraf"
	} # opens up a TCP listener for events sent by telegraf (which happens to support Influx Line Protocol)
	tcp {
		port => 8099
		type => "sql-stats"
	} # opens up a TCP listener for events coded in Influx Line protocol
}

Filter

This bit is optional, but sometimes is very helpful. And again, we’ve got quite a variety of plugins to suit. Here we’ve got a chance to inspect incoming events and either transform or outright reject some. For example here we parse Influx Line protocol sent by telegraf and do some basic enhancements on data coming from SQL-stats poller:

filter {
	if [type] == "telegraf" {		
		dissect {
			mapping => {
				"message" => "%{measurement}.%{metric};%{tags} %{value} %{ts}"
			}
		}
		kv {
			source => "tags"
			field_split => ";"
		}	
		date {
			match => ["ts", "UNIX"]
		}
		mutate {
			convert => {
				"value" => "float"
			}
			remove_field => ["tags", "message", "ts", "port"]
		}
	}
	if [type] == "sql-stats" {		
		grok {
			match => {
				"message" => "%{WORD:measurement}(\.%{WORD:metric})?;%{GREEDYDATA:tags} (%{BASE10NUM:value}) %{NUMBER:ts}"
			}
		}
		kv {
			source => "tags"
			field_split => ";"
		}	
		date {
			match => ["ts", "UNIX"]
		}
		mutate {
			convert => {
				"value" => "float"
			}
			rename => {
				"counter" => "metric"
				"clerk_type" => "metric"
			}
			remove_field => ["tags", "message", "ts", "port", "host"]
		}
	}
}

All that’s left

To store the events, we invoke output like so:

output {
	if [type] == "telegraf" {		
		elasticsearch {
			hosts => "elasticsearch:9200"
			index => "telegraf-%{+YYYY.MM}"
		}
	}
	if [type] == "sql-stats" {		
		elasticsearch {
			hosts => "elasticsearch:9200"
			index => "sql-stats-%{+YYYY.MM}"
		}
	}
}

Diversify

It is possible to store SQL telemetry in another SQL database on the same server (you could even set up SQL Agent jobs to do the polling – all nicely packaged). We however thought it might be a good idea to not store all data on the same machine. We’d like to not overstrain the main database in time of pinch and completely decouple analytics from critical business processes.

ELK G stack

One of many ways to approach this is to introduce a (somewhat) free and open source ElasticSearch into the mix. And mightly Logstash for data ingestion. This is where we’d normally go on to Kibana for dashboards and nice UI (and we did end up running it), but the main focus of this exercise will still fall onto Grafana.

Setting it up

Theres no point repeating official documentation for respective products, let’s instead write up a docker-compose file:

version: '3' services: elasticsearch: image: docker.elastic.co/elasticsearch/elasticsearch:7.6.1 environment: - node.name=elastic01 - discovery.type=single-node - bootstrap.memory_lock=true - "ES_JAVA_OPTS=-Xms512m -Xmx512m" volumes: - ./elastic:/usr/share/elasticsearch/data logstash: image: docker.elastic.co/logstash/logstash:7.6.1 volumes: - ./logstash-pipeline:/usr/share/logstash/pipeline/ - ./logstash-config/usr/share/logstash/config/ depends_on: - elasticsearch kibana: image: docker.elastic.co/kibana/kibana:7.6.1 environment: - ELASTICSEARCH_HOSTS=http://elasticsearch:9200 ports: - 5601:5601 depends_on: - elasticsearch grafana: image: grafana/grafana ports: - 3000:3000 depends_on: - elasticsearch

All that’s left to do is docker-compose up -d and run. Stay tuned for next posts in the series.

Category: Infrastructure

Monitoring SQL Server: setting up Logstash with telegraf

Input

Filter

Monitoring SQL Server: setting up ELK+G

It does not matter

We are here to help

Diversify

ELK G stack

Setting it up