Feb 8, 2016

What happens when you have outliers in your data?


In this post I am going to talk briefly about outliers and the effect they might have on your data, with an example of course. Let's start by defining the word "outlier": what is an outlier in math/statistics?

An outlier is basically a number (or data point) in a set of data that is either much smaller or much larger than most of the other data points.

Let's go through a practical example in order to understand the implications of having an outlier within your data set.
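
As a minimal sketch of that kind of effect (not the example from the full post), the snippet below compares the mean and the median of a small data set with and without one extreme value. The session counts are made-up numbers, chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical daily session counts; the last value is a deliberate outlier.
sessions = [120, 135, 128, 140, 131, 125, 2400]

# The same data with the outlier left out.
without_outlier = sessions[:-1]

mean_with = mean(sessions)
mean_without = mean(without_outlier)
median_with = median(sessions)
median_without = median(without_outlier)

# The mean more than triples because of one value; the median barely moves.
print(f"mean:   {mean_with:.2f} with outlier, {mean_without:.2f} without")
print(f"median: {median_with} with outlier, {median_without} without")
```

A single extreme value drags the mean far away from the typical day, while the median stays close to it. That is why it is worth checking for outliers before reporting an average.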

Jan 17, 2016

Scheduling R Markdown Reports via Email

R Markdown is an amazing tool that allows you to blend bits of R code with ordinary text and produce well-formatted data analysis reports very quickly. You can export the final report in many formats like HTML, PDF or MS Word, which makes it easy to share with others. And of course, you can modify or update it with fresh data very easily.

I have recently been using R Markdown to pull data from various data sources, such as the Google Analytics API and a MySQL database, perform several operations on it (merging, for example) and present the outputs with tables, visualizations and insights (text).

But what about automating the whole report generation and emailing the final report as an attached document every month at a specific time?

Jan 2, 2016

Happy New Year! Most Popular Posts in 2015


2015 has been my 2nd year blogging and I want to thank everyone who has taken the time to read, share and comment on my posts. Some of you left precious feedback, which gave me ideas for new posts and the strength to keep blogging. Thank you everyone!

On a personal level, 2015 has been a very productive year in terms of learning. I've been playing quite a lot with the R language and Google Analytics, often combining both and trying to explore new uses and applications for daily job tasks. R has become an irreplaceable tool in my daily job. And blogging about it gave me the confidence to use it and recommend it to other colleagues in my team.

Looking quickly at some web analytics metrics, last year was a positive one too. Sessions almost quadrupled compared to 2014 (quite a big number, but note that this is a very young blog). Organic traffic increased by over 900% (yes, this is very good news!) and referral traffic saw a big increase as well, mainly thanks to the inclusion of my R posts in R-bloggers.com (being accepted there was more good news).

Moving to a more meaningful KPI, there has been a 115% increase in subscribers compared to 2014. Thank you guys! My major challenge for 2016 will definitely be producing more content, more frequently, while keeping the posts interesting and valuable for the audience.

Below are my 3 most popular posts in 2015. That is, the content that you, the readers, found most interesting. Check them out if you have not seen them yet.

1. Google Analytics Dashboards with R & Shiny

2. Playing with R, Shiny Dashboard and Google Analytics Data

3. R Statistics for Digital Analytics: 8 blogs you should follow

I wish you a great 2016 and thanks again for following my blog! I will be back soon with more content.

Oct 12, 2015

Query your Google Analytics Data with the GAR package

Recently my friend Andrew Geisler released a new version of the GAR package. Like other similar packages, the GAR package is designed to help you retrieve data from Google Analytics using R, but with some new features.

I have been playing a bit with the package and the feature I enjoy the most is the ability to query multiple Google Analytics View IDs in the same query. To do that, you simply need to pass a vector of the View IDs in the corresponding gaRequest() command, and you get back a data frame with each view/profile clearly identified, along with all the corresponding metrics/dimensions you included in the query.

Aug 17, 2015

Playing with R, Shiny Dashboard and Google Analytics Data

In this post, I want to share some examples of data visualization I was playing with recently. As on many other occasions, my field of application is digital analytics data: specifically, data from Google Analytics.

You might remember a previous post where I built a tentative dashboard using R, Shiny and Google Charts. The final result was not too bad; however, the layout was somewhat too rigid, since I was using the "merge" command to merge the charts and create the final dashboard.

So I decided to spend some time improving my previous dashboard and adding a couple of new visualizations, which will hopefully be inspiring. Of course, I am still using R, Shiny, and in particular shinydashboard: an ad hoc package to build dashboards with R.

May 18, 2015

Query Multiple Google Analytics View IDs with R

Extracting Google Analytics data from one website is pretty easy, and there are several options to do it quickly. But what if you need to extract data from multiple websites or, to be more precise, from multiple Views? And perhaps you also need to summarize it within a single data frame?

Not long ago I was working on a reporting project where the client owned over 60 distinct websites, all of them tracked using Google Analytics.

Mar 29, 2015

R Statistics for Digital Analytics: 8 Blogs you should Follow


Are you interested in using R for your digital analytics projects? Do you need to perform predictive modelling and build visualizations on your digital data, and Excel just can't do the job the way you want?

Or do you simply have no idea how R could help you with your digital analytics problems, and you would like to see some real working examples first?

Well, there are 2 pieces of good news for you.

The first one is that you are not alone. There is quite a vibrant community out there, sharing more and more examples of how to get real value from using R in digital analytics. They often post/tweet around the #rstats hashtag.

The second piece of news is that I decided to write a post on this. I am going to list here the main blogs (and people) that might be useful to add to your "R Stats + Digital Analytics" reading list.

Jan 27, 2015

Google Analytics Dashboards with R & Shiny


One of the key activities of any web or digital analyst is to design and create dashboards. The main objective of a web analytics dashboard is to display the current status of your key web metrics and arrange them on a single view, so that information can be monitored at a glance. Great dashboards should allow you/your boss or client to take action quickly and spot trends in data.

There are plenty of tools for creating dashboards out there. You can decide to create your dashboard directly in Google Analytics, use a spreadsheet (e.g. Excel or Google Sheets), or go for an ad hoc dashboarding solution such as Tableau or Klipfolio (I am a heavy user of the latter).

In this blog post I aim to move away a bit from traditional dashboarding tools, and I will show you an example of a Google Analytics dashboard I've built using the R programming language and the Shiny package. Finally, I will also summarize the main benefits of using such tools for creating dashboards and performing data analysis in a digital analytics context.

[UPDATE: I've recently built a more sophisticated and better looking dashboard using the shinydashboard package. Click here to see it.]

Nov 23, 2014

Drawbacks of Using Time Metrics to Measure Blogs

When it comes to blogging, we all know that CONTENT is king. We also understand that social interactions and reader engagement play a primary role in making the blog successful.

So far, so good.

But then it's time to analyse data and make decisions. And that's where we often fail.

We usually take a web analytics tool like Google Analytics, install basic tracking code on pages, and analyze the blog like any other website. We look at the most common metrics and take them as standard references to evaluate future performance. But we forget about the unique features that differentiate blogs from other digital properties: content consumption and social interactions.

This post will help you understand some of the most misused metrics for measuring blog performance: I am talking about time on page and time on site. Most bloggers don't understand what time metrics actually measure. So, first of all I will try to explain how they are calculated in a typical web analytics tool (it might be different from what you think!).

I will then discuss some of the drawbacks of using time metrics to measure blog performance and finally suggest a couple of more solid KPIs to better measure content engagement.

After reading this post, I am sure you will start looking at time metrics with a bit more critical thinking than before. And perhaps shift your blog analytics focus to other more powerful metrics.

Let's go!
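
To make the idea concrete, here is a minimal sketch of how such a calculation might look. It assumes the common approach in which time on page is the gap between the timestamps of consecutive pageview hits, as classic Google Analytics does when no other interaction hits are sent. The page paths, the timestamps and the time_on_page helper are hypothetical and only serve as an illustration.

```python
from datetime import datetime

# Hypothetical pageview hits for a single session: (page, timestamp).
hits = [
    ("/home", datetime(2014, 11, 23, 10, 0, 0)),
    ("/blog/post-a", datetime(2014, 11, 23, 10, 0, 30)),
    ("/blog/post-b", datetime(2014, 11, 23, 10, 5, 30)),
]

def time_on_page(hits):
    """Return seconds per page, measured as the gap to the next hit.

    Assumes at least one hit and that each page appears once in the session.
    """
    durations = {}
    for (page, ts), (_, next_ts) in zip(hits, hits[1:]):
        durations[page] = (next_ts - ts).total_seconds()
    # The last page has no later hit, so no time can be measured for it.
    durations[hits[-1][0]] = 0.0
    return durations

durations = time_on_page(hits)
session_duration = sum(durations.values())

for page, seconds in durations.items():
    print(f"{page}: {seconds:.0f}s")
print(f"time on site: {session_duration:.0f}s")
```

Under this assumption the last page of the session always gets zero seconds, however long the visitor actually spent reading it. On a blog, where many sessions consist of a single post, that can hide most of the real reading time.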

Oct 27, 2014

How I Measure Success for my Blog. A Framework using Google Analytics.

If you are serious about blogging, then you must have a measurement plan. It doesn't matter whether you have just started and have only a dozen visitors, or you already have a very popular blog whose primary purpose is making revenue from advertising. As long as you have some objectives for your blog, then you must decide what you need to measure.

Why? Because this is the only way to understand your blog performance and whether you are successful or not for your readers (I assume you are not writing only for yourself!).

Developing a measurement plan is the only way to understand whether you are successful or not for your readers.

In this post I am going to draft a measurement plan for MY BLOG and use it as a learning exercise to discuss critical aspects like choosing KPIs (Key Performance Indicators) and segments to analyse performance. Google Analytics will be my reference platform for implementing the measurement plan.