Jan 28, 2015

Google Analytics Dashboards with R & Shiny

One of the key activities of any web or digital analyst is to design and create dashboards. The main objective of a web analytics dashboard is to display the current status of your key web metrics and arrange them in a single view, so that information can be monitored at a glance. Great dashboards should allow you, your boss, or your client to take action quickly and spot trends in the data.

There are plenty of tools for creating dashboards out there. You can create your dashboard directly in Google Analytics, use a spreadsheet (e.g. Excel or Google Sheets), or go for a dedicated dashboarding solution such as Tableau or Klipfolio (I am a heavy user of the latter).

In this blog post I aim to move away a bit from traditional dashboarding tools, and I will show you an example of a Google Analytics dashboard I've built using the R programming language and the Shiny package. Finally, I will also summarize the main benefits of using such tools for creating dashboards and performing data analysis in a digital analytics context.

[UPDATE: I've recently built a more sophisticated and better looking dashboard using the shinydashboard package. Click here to see it.]

R and Shiny introduction 

R is a very powerful platform for data analysis. It is good at many things, including statistical modelling and data visualization, and it relies on a very large and enthusiastic community of users and developers who keep the product growing and improving. For all these reasons, R is widely used today by scientists, researchers, and statisticians, and many companies routinely use it for data analysis: Google, Facebook, The New York Times, Twitter, and Coursera, to name a few. As Dave Smith wrote in a recent paper, "R is still hot and getting hotter".

Shiny, on the other hand, is an R package developed by the team at RStudio that allows you to build interactive web applications using R code. So let's say you have performed some data analysis with R: you can now wrap it into an app and share it with other people, who do not need to be R users.

With the development of Shiny, R is gaining more interactivity and is becoming quite an attractive option for analysts (learn from those who are already using R for Digital Analytics) to build interactive dashboards and share data with their boss, clients, or co-workers. Pretty cool, isn't it?

Let's show you an example...

A simple dashboard scenario: segment traffic by device

The Shiny application I created simulates a simple dashboard scenario where users can segment data by traffic device (desktop vs mobile vs tablet) through a radio button.


The dashboard is composed of 4 visualizations:

  1. A line chart showing the daily trend of sessions and sign-ups. Sessions are measured on the primary axis and sign-ups on the secondary axis.
  2. A bubble chart plotting three metrics for each traffic channel: number of sessions, avg. pages per session and revenue. This visualization can be quite useful for analysing channel performance with respect to the website objective (e.g. revenue), and it is currently not available in the Google Analytics acquisition reports.
  3. A line chart showing the daily evolution of the bounce rate.
  4. A world map visualizing the number of new users: the darker the country, the more new users visited the site from that country.
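
The first chart in the list, with sessions on the primary axis and sign-ups on the secondary one, could be sketched with googleVis roughly as follows. The data frame `daily` and its values are invented for illustration, and the `series` and `vAxes` option strings follow the two-axis line chart example from the googleVis documentation; the actual dashboard may configure its axes differently.

```r
# Hypothetical daily data; the real dashboard uses a Google Analytics export.
daily <- data.frame(
  date     = as.character(seq(as.Date("2015-01-01"), by = "day", length.out = 7)),
  sessions = c(1200, 1350, 1280, 1500, 1420, 900, 860),
  signup   = c(30, 42, 35, 51, 47, 20, 18)
)

# Map each series to its own vertical axis (0 = primary, 1 = secondary).
axis_options <- list(
  series = "[{targetAxisIndex: 0}, {targetAxisIndex: 1}]",
  vAxes  = "[{title: 'Sessions'}, {title: 'Sign-ups'}]"
)

# Build the chart only if googleVis is installed.
if (requireNamespace("googleVis", quietly = TRUE)) {
  trend <- googleVis::gvisLineChart(daily, xvar = "date",
                                    yvar = c("sessions", "signup"),
                                    options = axis_options)
  # plot(trend) opens the chart in a browser in an interactive session
}
```

Putting the two metrics on separate axes keeps the much smaller sign-up numbers from being flattened against the sessions line.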

The app is hosted at Shinyapps.io, a dedicated website where you can deploy and share your Shiny applications online. (Sorry: because of the current Shinyapps free plan limits, the app is temporarily unavailable. You can still get the code on GitHub here and run it on your own machine.)

As you can see, it's a very simple scenario, both in terms of the user interface and the calculations running in the background. Nothing complex and no statistical modelling involved, though that would definitely be a very powerful feature to include in an R-coded dashboard.

What I did was play a little with Google Charts visualizations through the googleVis package. googleVis is an R package that provides an interface to the Google Charts API and makes creating interactive plots quite easy. Interactive means that users can manipulate the data and look for the info they need.

Except for the bubble chart, all the charts I used to create the dashboard are available in Google Analytics reports. But if you like, you can do much, much more. Among the charts available in the googleVis package there are scatter charts, histograms, stepped area charts, org charts, tree maps, gauge charts and boxplots. Here is the complete list of visualizations you can do with Google Charts.

Like all Shiny applications, this dashboard app is made of two code files (ui.R and server.R) which must be placed in the same directory:

  • ui.R = it defines how the web application looks to users. All the calls you make in this file generate HTML code.
  • server.R = this is normal R code where you perform your data analysis.
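
As a rough illustration of how the two files fit together, here is a minimal sketch of a device selector wired to a single chart. It is written as one script for brevity: the `ui` object would go in ui.R and the `server` function in server.R. The data frame `traffic`, the helper `filter_device`, the input id `device` and the output id `trend` are hypothetical names, not the ones used in the actual dashboard.

```r
# Hypothetical data: one row per date and device category.
traffic <- data.frame(
  date     = rep(c("2015-01-01", "2015-01-02"), times = 3),
  device   = rep(c("desktop", "mobile", "tablet"), each = 2),
  sessions = c(800, 820, 300, 340, 90, 95),
  stringsAsFactors = FALSE
)

# Plain R helper: keep only the rows for the selected device.
filter_device <- function(data, selected) {
  data[data$device == selected, , drop = FALSE]
}

if (requireNamespace("shiny", quietly = TRUE) &&
    requireNamespace("googleVis", quietly = TRUE)) {

  # ui.R part: a radio button and a placeholder for the chart.
  ui <- shiny::fluidPage(
    shiny::radioButtons("device", "Device:",
                        choices = c("desktop", "mobile", "tablet")),
    shiny::htmlOutput("trend")
  )

  # server.R part: the chart is rebuilt whenever input$device changes.
  server <- function(input, output) {
    output$trend <- googleVis::renderGvis({
      googleVis::gvisLineChart(filter_device(traffic, input$device),
                               xvar = "date", yvar = "sessions")
    })
  }

  # shiny::shinyApp(ui, server)  # uncomment to launch the app locally
}
```

Keeping the filtering in a plain function makes it easy to check outside Shiny, for example `filter_device(traffic, "mobile")` returns the two mobile rows.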

To build the actual four-chart dashboard, I first created each chart object separately and then merged them in pairs as follows:

library(googleVis)

# One chart object per visualization
D1 <- gvisLineChart(dataDevice, "date", c("sessions","signup"))
D2 <- gvisBubbleChart(channelsDevice, idvar="channel", xvar="sessions", yvar="pages.sessions")
D3 <- gvisLineChart(dataDevice, xvar="date", yvar="bounce.rate")
D4 <- gvisGeoChart(countriesDevice, "country", "new.users")

# Merge the charts in pairs (side by side), then stack the two rows
D12 <- gvisMerge(D1, D2, horizontal=TRUE)
D34 <- gvisMerge(D3, D4, horizontal=TRUE)
D12D34 <- gvisMerge(D12, D34, horizontal=FALSE)
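
To try the merging step without a Google Analytics export, the snippet above can be fed with toy data. The sketch below reuses the data frame and column names from the snippet, but every value is invented, and the helper `grid_2x2` is a hypothetical convenience wrapper rather than part of googleVis.

```r
# Toy data with the column names used in the snippet above (values invented).
dataDevice <- data.frame(
  date        = c("2015-01-01", "2015-01-02", "2015-01-03"),
  sessions    = c(800, 820, 790),
  signup      = c(20, 25, 22),
  bounce.rate = c(45.2, 43.8, 46.1)
)
channelsDevice <- data.frame(
  channel        = c("Organic", "Direct", "Referral"),
  sessions       = c(1500, 600, 310),
  pages.sessions = c(3.2, 2.7, 4.1)
)
countriesDevice <- data.frame(
  country   = c("Italy", "Spain", "United States"),
  new.users = c(400, 150, 90)
)

# Arrange four gvis objects in a 2 x 2 grid: two rows, stacked vertically.
grid_2x2 <- function(a, b, c, d) {
  top    <- googleVis::gvisMerge(a, b, horizontal = TRUE)
  bottom <- googleVis::gvisMerge(c, d, horizontal = TRUE)
  googleVis::gvisMerge(top, bottom, horizontal = FALSE)
}

if (requireNamespace("googleVis", quietly = TRUE)) {
  dashboard <- grid_2x2(
    googleVis::gvisLineChart(dataDevice, "date", c("sessions", "signup")),
    googleVis::gvisBubbleChart(channelsDevice, idvar = "channel",
                               xvar = "sessions", yvar = "pages.sessions"),
    googleVis::gvisLineChart(dataDevice, xvar = "date", yvar = "bounce.rate"),
    googleVis::gvisGeoChart(countriesDevice, "country", "new.users")
  )
  # plot(dashboard) would open the merged page in a browser
}
```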

All of the code for this dashboard application lives in this GitHub repo. The raw data was downloaded manually from Google Analytics in .csv format, though this step can be automated by connecting directly to the Google Analytics API (see the RGoogleAnalytics package).
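
For the manual route, loading an exported .csv file into R takes only a few lines of base R. The sketch below assumes an export whose descriptive header lines start with `#` and whose large numbers carry thousands separators; the inline text stands in for a real file, and the column names are hypothetical. Your own export may be laid out differently.

```r
# Stand-in for the contents of an exported file; the layout is an assumption.
raw_export <- '# ----------------------------------------
# All Web Site Data
# ----------------------------------------

Day Index,Sessions
1/1/15,"1,204"
1/2/15,"1,350"
1/3/15,980
'

# Skip the commented header lines and keep text columns as character.
# With a real file, replace `text = raw_export` with the file path.
sessions <- read.csv(text = raw_export, comment.char = "#",
                     stringsAsFactors = FALSE)

# Tidy up: simple names, real dates, numbers without thousands separators.
names(sessions)   <- c("date", "sessions")
sessions$date     <- as.Date(sessions$date, format = "%m/%d/%y")
sessions$sessions <- as.numeric(gsub(",", "", sessions$sessions))
```

After this step `sessions` is an ordinary data frame that can be passed to any of the chart functions.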

Benefits of using R & Shiny to create a Google Analytics Dashboard

So, what are the main benefits of using R & Shiny to create a Google Analytics dashboard? And to answer a broader question: why should you use R for web analytics?

With the development of a package such as Shiny, R definitely becomes a more attractive option for analysts who want to build dashboards. Below I have put together a list of the 12 main benefits you would gain by using R to create a Google Analytics dashboard:

  1. Advanced statistics capabilities & prediction models. R was born as a statistical language and remains the reference language for statisticians. It has packages for a huge range of specialized functions, and it stays up to date thanks to its open source nature. Using R for web analytics would allow you to incorporate sophisticated prediction models easily in your dashboard and, more importantly, let your boss/client explore and interact with the models you have built (e-commerce is a very interesting field in which to apply prediction models).
  2. State-of-the-art visualizations. R has very advanced graphics capabilities which let you create beautiful and interactive dashboards. R offers several powerful packages like googleVis (the one I used in the above dashboard), ggplot2, ggvis or dygraphs.
  3. Connect directly to the Google Analytics API. In my dashboard example I manually downloaded the data in .csv format (mainly for privacy reasons), but you can certainly automate the retrieval of data through dedicated R packages. Check out this recent post that explains how to connect Google Analytics to R using the RGoogleAnalytics package. And learn how to query multiple Views using R.
  4. No web development knowledge is required, although if you know some HTML/CSS/JavaScript you can fully customize the user interface and make it suitable for you and your final users.
  5. Attractive default UI theme, based on Twitter Bootstrap.
  6. Shiny can integrate JavaScript libraries like d3.js for visualizations.
  7. Shiny uses a reactive programming model like modern web applications do, which means that when the user changes a value in a UI control (e.g. the radio button), the R code in the background gets recalculated and the output that is bound to the UI (e.g. the 4 charts in the dashboard) is re-rendered.
  8. Reproducibility. This is a very important concept at the basis of R (and other programming languages too), and it means being able to capture each step of your data analysis so that you or other people can reproduce it. In a business scenario, reproducibility means being able to reuse complex functions and dashboards for more than one client.
  9. Scalability. R is much more powerful and robust than tools like Excel when it comes to processing large amounts of data.
  10. Integrate different data sources. R can read almost any type of data (.txt, .csv, etc.). There are R packages specifically designed to read Excel, JSON, XML, etc., and you can even scrape data from websites and execute SQL queries. This means you could potentially integrate different sources of data in the same dashboard. Once you have imported and cleaned the data, you can build a data frame on which you can use all R functions. Very powerful.
  11. R is an open source project, which means it is continually improved, upgraded, enhanced, and expanded by a global community of incredibly passionate developers and users. Currently R has over 5,000 add-on packages.
  12. It's free!
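
To give a flavour of benefit 1 in the list above, here is a deliberately simple sketch of a prediction model in base R: a straight-line trend fitted to daily sessions and extended three days ahead. The numbers are invented, and a real dashboard would likely need a model that accounts for weekly seasonality; this is only meant to show how little code the basic idea takes.

```r
# Hypothetical daily sessions (values invented for illustration).
history <- data.frame(
  day      = 1:10,
  sessions = c(100, 112, 119, 131, 138, 152, 159, 171, 178, 192)
)

# Fit a straight-line trend: sessions as a linear function of the day index.
trend_model <- lm(sessions ~ day, data = history)

# Predict the next three days from the fitted trend.
future   <- data.frame(day = 11:13)
forecast <- predict(trend_model, newdata = future)
```

In a Shiny app, code like this could sit in server.R, with the forecast horizon exposed as an input control so that the boss or client can explore the model.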

What do you think about implementing a dashboard with R & Shiny? What are the main obstacles you might encounter when moving from traditional dashboarding tools to R?

Do you see R & Shiny playing an important role in digital analytics in the near future?

Share your thoughts and be social!

