Aug 17, 2015

Playing with R, Shiny Dashboard and Google Analytics Data

In this post, I want to share some examples of data visualization I was playing with recently. Like in many other occasions, my field of application is digital analytics data. Precisely, data from Google Analytics.

You might remember a previous post where I built a tentative dashboard using R, Shiny and Google Charts. The final result was not too bad, however the layout was somewhat too rigid since I was using the command "merge" to merge the charts and create the final dashboard.

So, I thought to spend some time improving my previous dashboard and include a couple of new visualizations, which will be hopefully inspiring. Of course, I am still using R, Shiny, and in particular shinydashboard: an ad hoc package to build dashboard with R.

The dashboard I've made makes use of the following visualizations:

  • Value boxes
  • Interactive Time Series (dygraphs)
  • Bubble charts
  • Streamgraphs
  • Treemaps 

You can see the final dashboard at shinyapps.io (though, because of basic plan current limits, it might be temporarily unavailable), or better you can check the code at github. Here is a screenshot:



Let's go quickly through each visualization to see what Google Analytics dimension/metrics it shows.

Value Boxes

When you build a dashboard, boxes are probably the main building blocks since they allow organize the information you want to show within the page. When I build a dashboard, I normally start by sketching the layout, and this means placing the main boxes.

A particular type of box available in the Shiny Dashboard package is the valueBox, which lets you display  numeric or text values, and also add an icon. Value boxes are great components to be placed at the top of a dashboard and display main KPI's, change % or add a description to the rest of the dashboard.

In my dashboard I placed 3 boxes at the top, showing the value for my 3 main KPI's: sessions to the website, transactions (conversions) and conversion rate. The code to build a value box with shiny dashboard is very simple and if you want to have dynamic values, like in my case, you have to create in both the server.R and ui.R section of your Shiny app:

Value Boxes with Shiny Dashbard



Interactive Time Series (dygraphs)

Time series charts might get chaotics and not provide clear insights when filled with too many data and series (you might end up with the so called "spaghetti-effect").

But if time series are interactive, user can easily explore and make sense of complex datasets.

For example, users could highlight specific data points, include/exclude time series, zoom in specific time intervals, enrich the graph with shaded regions or annotations, etc. All of these features are offered by the dygraphs Javascript charting library.

I used the R dygraph package (which provides an interface to the Javascript dygraph charting library) to make an interactive time series with my Google Analytics dataset. The simple chart I made shows 3 metrics: sessions, transactions and conversion rate (of those transactions) over the period selected by the user. Both sessions and transactions use the left axis while conversion rate the right one. I included a dyRangeSelector placed at the bottom of the chart that lets you narrow down the time interval.

Dygraphs with R Shiny



Bubble charts

With bubble charts you can show three dimensions of data. I used a bubble chart to visualize the performance of traffic channels: x axis represents the number of sessions, y axis thee avg. pages per session, and finally transactions (that is the ultimate objective of many websites) are proportional to the size of the bubble. The larger the bubble and the higher is the number of transactions  produced by that channel of traffic.

To make this chart I used the GoogleVis package.


Bubble Charts to visualize Traffic Channel Performance


In the dashboard I've also included a one-dimensional bubble chart using the bubbles library. This type of chart works similar to a bar chart though the latter is more accurate in terms of understanding the real value you are showing.

On the other hand, this bubble chart might look more attractive than bar charts and it allows to display lots of values in a small area. I used this chart to show screen resolutions data from Google Analytics mobile reports.

Bubbles showing Sessions by Screen Resolution


Streamgraphs 

Streamgraphs are a type of stacked area charts that are displaced around a central horizontal axis. Stremgraphs are very effective to visualize data series that varies over time, especially if you need to show many categories.

The result is a flowing, organic shape, with strong aesthetic appeal, which is why streamgraphs are becoming more and more popular.

In the dashboard I made a streamgraph to visualize the evolution of sessions among devices (desktop, mobile, tablet) over the past years. To do it in R. I had to play a bit with the streamgraph package.

Here below is the final data viz (I am not completely happy with this visualization as for some reason when I mouse over the series the value showed is always the total of the period, not the one of the specific date I am pointing on. Any help?).

Streamgraph to show Devices Share of Traffic


Another interesting application on web analytics data,  would be using streamgraphs to analyse channels share of traffic over time (direct vs organic vs paid vs referral, etc.).


Treemaps

Treemap visualizations are very effective in showing hierarchical (tree-structured) data in a compact way. They can display lot of information within a limited space and at the same allow users to drilldown into the represented segments.

An example of hierarchical data in Google Analytics reports, is devices as principal segment (main rectangles) and browser as sub-segment (nested rectangles). The area of each rectangle is proportional to the amount of sessions produced by its corresponding segment/sub-segment.

To make in R, I used the treemap library (unfortunately the visualization is not interactive, but you can have a try with the d3treeR library).

Treemap to show Devices and OS Share of Sessions.

I hope you can get inspiration from these visualizations and include some of them in your digital analytics dashboard or reports. My plan is to keep adding more interesting visualizations (that are not currently offered in Google Analytics reports) to this dashboard, to better show digital data. If you have suggestions please leave a comment here or share it via github repo.

May 18, 2015

Query Multiple Google Analytics View IDs with R

Query Multiple View IDs with R

Extracting Google Analytics data from one website is pretty easy, and there are several options to do it quickly. But what if you need to extract data from multiple websites or, to be more precise, from multiple Views? And perhaps you also need to summarize it within a single data frame?

Not long ago I was working on a reporting project, where the client owned over 60 distinct websites. All of them tracked using Google Analytics.

Mar 29, 2015

R Stats + Digital Analytics: 8 Blogs you should Follow


Are you interested in using R for your digital analytics projects? Do you need to perform prediction modelling and visualizations on your digital data and Excel can´t just do the job as you wanted?

Or, you simply have no idea how R could help you in your digital analytics problems and you would like to see some real working examples first?

Well, there are 2 good news for you.

The first one is that you are not alone. There is a quite vibrant community out there, sharing more and more examples on how to get real value from using R in digital analytics. They often post/tweet around the #rstats hashtag.

The second news is that I decided to write a post on this. I am going to list here the main blogs (and people) that might be useful to add to your "R Stats + Digital Analytics" reading list.