Easy analytics with Grafana, Postgres, and Kubernetes. The fast and free path to application insights.
You don’t need to pay for commercial data metrics platforms — you can build powerful dashboards with Grafana for free on any Kubernetes cluster.
Why we need data insights
If you have a business or app, you probably want to know who’s using it and what they do most. You might also want to know whether your user count is increasing and what kind of load your servers are under.
At MailSlurp we use Grafana, Postgres, and Kubernetes to graph growth and user acquisition in a free and flexible way. In this article we will show you how we do it and explain why open source data platforms are a good BI solution. First, some background.
Background
Our company, MailSlurp, is an API for testing transactional emails. It lets you send and receive email from randomly generated email addresses via REST. We use metrics to analyse product usage and costs. Data dashboards also help us price our product and plan product features.
Existing SaaS offerings
To answer data-related questions, most people turn to commercial analytics platforms. These services include Google Analytics, MixPanel, DataDog, Tableau and many more.
While these services offer a range of great features they do have a number of issues:
- pricing (hidden costs)
- privacy (data retention and ownership)
- scale (will feature-set scale with requirements?)
MixPanel, for instance, costs $779 per year, while hosted Tableau prices will make a small company’s head spin.
Free services like Google Analytics are very popular but raise serious privacy concerns: is your sensitive company data safe? Will Google sell your customer activity? Is your company under GDPR requirements?
Lastly, many of these SaaS products work well at small scale. When your company’s needs grow, will they still provide the insights you require?
Open source solutions
Luckily, there are a number of open source data analytics options:
- the ELK stack is a great one (Elasticsearch, Logstash, Kibana)
- Prometheus is an excellent metrics platform
- and Grafana is an extremely flexible and fun to use data dashboard package.
In this article we’ll concentrate on Grafana and how you can use it to gain powerful insights into your users — free, securely, and at scale.
What is Grafana?
No matter where your data is, or what kind of database it lives in, you can bring it together with Grafana. Beautifully.
Grafana is a self-hosted service that lets you load in data and display it on interactive graphs, tables, and text areas. More specifically, it is a backend server written in Go (that handles data connections and persists graphs) and an interactive frontend written in TypeScript (that displays information and handles graph creation).
The Grafana website has a great demo dashboard you can play around with to get a feel for its features. Here’s what a typical dashboard looks like:
Deploying Grafana with Kubernetes
If your application runs on a Kubernetes cluster it’s easy to deploy Grafana. Grafana has a number of official docker images — all we need to do is write a Kubernetes service and deployment for the image and we can access it via port-forwarding. MailSlurp relies heavily on Kubernetes so integrating Grafana was easy.
Here’s our full Kubernetes setup for Grafana. Notice that we create a PersistentVolumeClaim. This lets the Grafana image persist dashboards and config to disk. We also specify grafana/grafana:5.0.0 as our image, as later versions made Docker deployments a little bit more difficult.
# deployment of grafana 5 docker image
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      component: grafana
  template:
    metadata:
      labels:
        component: grafana
    spec:
      volumes:
        - name: grafana-claim
          persistentVolumeClaim:
            claimName: grafana-claim
      containers:
        - name: grafana
          image: grafana/grafana:5.0.0
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
          resources:
            limits:
              cpu: 500m
              memory: 2500Mi
            requests:
              cpu: 100m
              memory: 100Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-claim
---
# grafana persistent volume claim for storing dashboards
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    component: grafana
  name: grafana-claim
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
# grafana service for accessing via port 3000
apiVersion: v1
kind: Service
metadata:
  name: grafana-ip-service
  namespace: default
spec:
  type: ClusterIP
  selector:
    component: grafana
  ports:
    - port: 3000
      targetPort: 3000
If we save this to a YAML file and apply it to our cluster, we should see the Grafana service and pods with kubectl.
kubectl apply -f grafana.yml
Connecting to Grafana inside a cluster
Now that we have deployed Grafana, we should see its pods, deployments and services with kubectl.
As we did not add an ingress rule for this service we can only access the Grafana frontend via port-forwarding. This suits us nicely as we don’t want to expose our data to the outside world. (If you did want to expose the service, just add an ingress route for this service.)
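If you did want to expose the service, an ingress could look roughly like the sketch below. It assumes a cluster recent enough to serve the networking.k8s.io/v1 Ingress API and an ingress controller that is already installed. The name grafana-ingress and the host grafana.example.com are placeholders for illustration, not part of our setup.

```yaml
# Hypothetical ingress routing external traffic to the grafana-ip-service above.
# The host is a placeholder; replace it with a domain you control.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: default
spec:
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana-ip-service
                port:
                  number: 3000
```

Exposing Grafana this way makes the login page reachable from outside the cluster, so it is worth adding TLS and replacing the default credentials before applying it.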
We can connect to the Grafana service using the kubectl port-forward command to make the remote port 3000 available on our localhost.
kubectl port-forward svc/grafana-ip-service 3000:3000
Now if you type http://localhost:3000 into a browser you’ll see the default Grafana login page. The initial username and password are admin and admin. I recommend changing these immediately.
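One way to avoid starting with the default password at all is Grafana’s GF_SECURITY_ADMIN_PASSWORD environment variable. The sketch below assumes a Secret named grafana-admin with a key called password; both names are made up for this example. Grafana applies the value only when it first creates the admin user, so it will not change the password stored on an existing volume.

```yaml
# Hypothetical Secret holding the initial admin password (name and value are placeholders)
apiVersion: v1
kind: Secret
metadata:
  name: grafana-admin
  namespace: default
type: Opaque
stringData:
  password: change-me
---
# Fragment to merge into the grafana container spec of the Deployment
env:
  - name: GF_SECURITY_ADMIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: grafana-admin
        key: password
```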
Configuring a datasource (connecting Postgres)
Once we have Grafana deployed and accessible via port-forwarding we need to enable a datasource. This will allow us to query a database and build graphical dashboards from the results.
Many applications these days run on Postgres but Grafana also supports MySQL, Graphite and a variety of other data sources.
If you navigate to the configuration page and click on datasources you will see a configuration form. Use this to add your application database. You may want to create a special READ_ONLY database user for Grafana so that you don’t accidentally destroy data while building your graphs.
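Grafana 5 can also load datasources from a provisioning file instead of the form. The following is a minimal sketch, assuming a Postgres service reachable at postgres.default.svc.cluster.local, a database called app, and a read-only user called grafana_reader. All three are placeholder names, and field names can differ between Grafana versions, so check the documentation for the version you run.

```yaml
# Hypothetical datasource provisioning file.
# Host, database, user, and password below are placeholders.
apiVersion: 1
datasources:
  - name: App Postgres
    type: postgres
    access: proxy
    url: postgres.default.svc.cluster.local:5432
    database: app
    user: grafana_reader
    secureJsonData:
      password: change-me
    jsonData:
      sslmode: disable
```

The file would need to be mounted into the datasources folder of Grafana’s provisioning directory (for example from a ConfigMap), which is left out here.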
Building dashboards with Grafana
Now that you have a datasource enabled you can build your first dashboard. Dashboards are collections of graphs, text, and tables that display the results of queries against your datasource. So what you want to display dictates the type of graph you should use and the type of query you should write.
A simple Text example
For a simple example let’s display the total number of rows in a table.
Say we have an event table that our application writes to each time it does something interesting. Let’s query the total event count and display it in a text field. The steps are as follows:
- Create a new Dashboard (or select an existing one)
- Create a new panel and select “Text” type
- Edit the panel
- Change the output format to “Table” and write an SQL query
select count(*) from event will display the total number of rows in the event table.
Timeseries data
What about something more complicated (and interesting)? Let’s graph the events over time on a line graph. This time the query is a bit more involved.
Grafana has a number of macros for use with Postgres timeseries data. These include $__timeGroup and $__timeFilter. We can use these macros to easily partition data into time groups so that we can graph it over time.
SELECT
  $__timeGroup(created_at, '1h'),
  count(*)
FROM
  event
WHERE
  $__timeFilter(created_at)
GROUP BY 1
ORDER BY 1
Notice that the query groups and orders by the $__timeGroup macro result. Each point on the line is then the event count for a one-hour period, so we see jagged data instead of a straight line. Here’s an example line chart from the Grafana demo page.
Conclusion
Understanding application data can be crucial to a business’s success. Having insights lets you focus on what matters. There are many commercial analytics platforms but most have weaknesses. The open source combination of Grafana, Kubernetes, and Postgres is a free, scalable and secure way to deploy your own data dashboards. MailSlurp, an email testing API, uses this stack to analyse customer usage and plan features. We hope you gain as much out of it as we did!
❤ MailSlurp
Addendum
If your application sends or receives emails in any way, use MailSlurp to test that functionality end to end. Create email addresses during tests, trigger email functions and then verify the results. Check it out.