Easy analytics with Grafana, Postgres, and Kubernetes. The fast and free path to application insights.

You don’t need to pay for commercial data metrics platforms — you can build powerful dashboards with Grafana for free on any Kubernetes cluster.

7 min read · Jul 8, 2019

Why we need data insights

If you run a business or an app, you probably want to know who’s using it and what they do most. You might also want to know whether your user count is growing and what kind of load your servers are under.

At MailSlurp we use Grafana, Postgres, and Kubernetes to graph growth and user acquisition in a free and flexible way. In this article we will show you how we do it and explain why open source data platforms are a good BI solution. First, some background.

Background

Our company, MailSlurp, is an API for testing transactional emails. It lets you send and receive email from randomly generated email addresses via REST. We use metrics to analyse product usage and costs. Data dashboards also help us to price our product and plan product features.

Existing SaaS offerings

Google Analytics is a popular free data platform. But do you trust Google with your data?

To answer data-related questions, most people turn to commercial analytics platforms. These services include Google Analytics, MixPanel, DataDog, Tableau and many more.

While these services offer a range of great features they do have a number of issues:

pricing (hidden costs)
privacy (data retention and ownership)
scale (will feature-set scale with requirements?)

MixPanel for instance costs $779 per year while hosted Tableau prices will make a small company’s head spin.

Free services like Google Analytics are very popular but raise serious privacy concerns: is your sensitive company data safe? Will Google sell your customer activity? Is your company under GDPR requirements?

Lastly, many of these SaaS products work well at small scale. When your company’s needs grow, will they still provide the insights you require?

Open source solutions

Luckily, there are a number of open source data analytics options:

the ELK stack is a great one (Elasticsearch, Logstash, Kibana)
Prometheus is an excellent metrics platform
and Grafana is an extremely flexible and fun to use data dashboard package.

In this article we’ll concentrate on Grafana and how you can use it to gain powerful insights into your users — free, securely, and at scale.

What is Grafana?

No matter where your data is, or what kind of database it lives in, you can bring it together with Grafana. Beautifully.

Grafana is a self-hosted service that lets you load in data and display it on interactive graphs, tables, and text areas. More specifically, it is a backend server written in Go (that handles data connections and persists graphs) and an interactive frontend written in TypeScript (that displays information and handles graph creation).

The Grafana website has a great demo dashboard you can play around with to get a feel for its features. Here’s what a typical dashboard looks like:

Deploying Grafana with Kubernetes

If your application runs on a Kubernetes cluster it’s easy to deploy Grafana. Grafana publishes a number of official Docker images, so all we need to do is write a Kubernetes service and deployment for the image and we can access it via port-forwarding. MailSlurp relies heavily on Kubernetes so integrating Grafana was easy.

Here’s our full Kubernetes setup for Grafana. Notice that we create a PersistentVolumeClaim. This lets the Grafana image persist dashboards and config to disk. We also specify grafana/grafana:5.0.0 as our image as later versions made Docker deployments a little bit more difficult.

# deployment of grafana 5 docker image
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      component: grafana
  template:
    metadata:
      labels:
        component: grafana
    spec:
      volumes:
      - name: grafana-claim
        persistentVolumeClaim:
          claimName: grafana-claim
      containers:
      - name: grafana
        image: grafana/grafana:5.0.0
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 3000
        resources:
          limits:
            cpu: 500m
            memory: 2500Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /var/lib/grafana
          name: grafana-claim
---
# grafana persistent volume claim for storing dashboards
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    component: grafana
  name: grafana-claim
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# grafana service for accessing via port 3000
apiVersion: v1
kind: Service
metadata:
  name: grafana-ip-service
  namespace: default
spec:
  type: ClusterIP
  selector:
    component: grafana
  ports:
  - port: 3000
    targetPort: 3000

If we save this to a YAML file named grafana.yml and apply it to our cluster, we should see the Grafana service and pods with kubectl.

kubectl apply -f grafana.yml

Connecting to Grafana inside a cluster

Now that we have deployed Grafana, we should see its pods, deployments and services with kubectl.
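
To check each of those resources in one go, the commands can be wrapped in a small shell function. This is only a sketch: the function name grafana_status is made up for illustration, while the resource names and the component=grafana label come from the manifest above.

```shell
# Hypothetical helper: report on the resources created by grafana.yml.
grafana_status() {
  # Wait until the Deployment has finished rolling out.
  kubectl rollout status deployment/grafana-deployment --namespace default || return 1
  # List the Grafana pods, matched by the label from the pod template.
  kubectl get pods --namespace default -l component=grafana || return 1
  # Show the Service and the PersistentVolumeClaim by name.
  kubectl get service grafana-ip-service --namespace default || return 1
  kubectl get pvc grafana-claim --namespace default
}
```

Calling grafana_status after the apply step should print the rollout result followed by the pod, service, and volume claim listings.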

As we did not add an ingress rule for this service we can only access the Grafana frontend via port-forwarding. This suits us nicely as we don’t want to expose our data to the outside world. (If you did want to expose the service, just add an ingress route for this service.)
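
For readers who do want external access, a minimal Ingress might look like the sketch below, written to a file so it can be applied the same way as the main manifest. Several things here are assumptions rather than part of our setup: an ingress controller is already installed, the cluster supports the networking.k8s.io/v1 Ingress API, grafana.example.com is a placeholder hostname, and the resource name grafana-ingress is made up.

```shell
# Write a hypothetical Ingress manifest that routes a hostname to the Grafana service.
cat > grafana-ingress.yml <<'YAML'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress        # assumed name
  namespace: default
spec:
  rules:
  - host: grafana.example.com  # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: grafana-ip-service
            port:
              number: 3000
YAML
# Apply it once the hostname has been adjusted:
#   kubectl apply -f grafana-ingress.yml
```

Anything exposed this way should at least sit behind TLS and a changed admin password.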

We can connect to the Grafana service using the kubectl port-forward command, which makes the remote port 3000 available on our localhost.

kubectl port-forward svc/grafana-ip-service 3000:3000

A screenshot of the initial login page accessed via port-forwarding

Now if you type http://localhost:3000 into a browser you’ll see the default Grafana login page. The initial username and password are admin and admin. I recommend changing these immediately.

Configuring a datasource (connecting Postgres)

Once we have Grafana deployed and accessible via port-forwarding we need to enable a datasource. This will allow us to query a database and build graphical dashboards from the results.

Many applications these days run on Postgres but Grafana also supports MySQL, Graphite and a variety of other data sources.

If you navigate to the configuration page and click on datasources you will see a configuration form. Use this to add your application database. You may want to create a special READ_ONLY database user for Grafana so that you don’t accidentally destroy data while building your graphs.
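
One way to set up such a user is sketched below as a SQL script written to a file. The role name grafana_reader, the password placeholder, the database name app_db, the public schema, and the ADMIN_DATABASE_URL variable are all assumptions, so adjust them to your own database.

```shell
# Write a SQL script that creates a hypothetical read-only role for Grafana.
cat > grafana-readonly.sql <<'SQL'
-- Role name, password, and database name are placeholders.
CREATE ROLE grafana_reader LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE app_db TO grafana_reader;
-- Allow the role to see objects in the schema and read every existing table.
GRANT USAGE ON SCHEMA public TO grafana_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO grafana_reader;
-- Also grant SELECT on tables created later by the role that runs this script.
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO grafana_reader;
SQL
# Run it as a database admin, for example:
#   psql "$ADMIN_DATABASE_URL" -f grafana-readonly.sql
```

Because the role only ever receives SELECT, a mistyped query in a panel cannot modify or delete rows.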

Screenshot of how to add Postgres datasource to Grafana

Building dashboards with Grafana

Now that you have a datasource enabled you can build your first dashboard. Dashboards are collections of graphs, text, and tables that display the results of queries against your datasource. So what you want to display dictates the type of graph you should use and the type of query you should write.

A simple Text example

For a simple example let’s display the total number of rows in a table.

Say we have an event table that our application writes to each time it does something interesting. Let’s query the total event count and display it in a text field. The steps are as follows:

Create a new Dashboard (or select an existing one)
Create a new panel and select “Text” type
Edit the panel
Change the output format to “Table” and write an SQL query
select count(*) from event will display the total rows in the event table

Screenshots show the process of adding a new panel and editing a query
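
To follow along without an existing application, a stand-in event table can be created first. The article does not show MailSlurp’s real schema, so the columns and sample event names below are assumptions; only the table name and the created_at column (used by the time series query later on) come from the text.

```shell
# Write a SQL script that creates and fills a hypothetical event table.
cat > event-table.sql <<'SQL'
-- Hypothetical schema: only the table name and created_at appear in the article.
CREATE TABLE IF NOT EXISTS event (
  id         BIGSERIAL PRIMARY KEY,
  name       TEXT NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- A few sample rows spread over the last three hours.
INSERT INTO event (name, created_at) VALUES
  ('inbox_created',  now() - interval '3 hours'),
  ('email_received', now() - interval '2 hours'),
  ('email_received', now() - interval '1 hour');
-- The query used in the Text panel.
SELECT count(*) FROM event;
SQL
# Run it against a scratch database, for example:
#   psql "$ADMIN_DATABASE_URL" -f event-table.sql
```

Run this with a user that is allowed to create tables, not with the read-only Grafana user.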

Timeseries data

What about something more complicated (and interesting)? Let’s graph the events over time on a line graph. This time the query needs a little more work.

Grafana has a number of macros for use with Postgres timeseries data. These include $__timeGroup and $__timeFilter. We can use these macros to easily partition data into time groups so that we can graph it over time.

SELECT
  $__timeGroup(created_at, '1h'),
  count(*)
FROM
  event
WHERE
  $__timeFilter(created_at)
GROUP BY 1
ORDER BY 1

Notice that the query groups and orders by the $__timeGroup macro result. This buckets the events into 1-hour periods, so the line in our chart rises and falls with activity instead of running flat. Here’s an example line chart from the Grafana demo page.

Example of a Grafana line graph using the Postgres macros.
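
To sanity-check the numbers outside Grafana, roughly the same aggregation can be written in plain Postgres SQL. This is not the literal macro expansion, which varies between Grafana versions. As a stand-in it uses date_trunc for the hourly buckets and a fixed 24-hour window in place of the dashboard’s time picker.

```shell
# Write a macro-free approximation of the time series query to a file.
cat > events-per-hour.sql <<'SQL'
-- Plain SQL approximation of the macro query (not Grafana's exact expansion).
SELECT
  date_trunc('hour', created_at) AS time,  -- stands in for $__timeGroup(created_at, '1h')
  count(*) AS events
FROM
  event
WHERE
  created_at >= now() - interval '24 hours'  -- stands in for $__timeFilter(created_at)
GROUP BY 1
ORDER BY 1;
SQL
# Run it with the read-only user, for example:
#   psql "$DATABASE_URL" -f events-per-hour.sql
```

If the hourly counts here match the points on the Grafana line, the panel query and the datasource are wired up as expected.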

Conclusion

Understanding application data can be crucial to a business’s success. Having insights lets you focus on what matters. There are many commercial analytics platforms but most have weaknesses. The open source combination of Grafana, Kubernetes, and Postgres is a free, scalable and secure way to deploy your own data dashboards. MailSlurp, an email testing API, uses this stack to analyse customer usage and plan features. We hope you gain as much from it as we did!

❤ MailSlurp

Addendum

If your application sends or receives emails in any way, use MailSlurp to test that functionality end to end. Create email addresses during tests, trigger email functions and then verify the results. Check it out.