Sep 18, 2016

Analyzing Stack Overflow questions and tags with the StackLite dataset

The guys at Stack Overflow have recently released a very interesting dataset containing the entire history of questions asked by users since the beginning of the site, back in 2008. It's called StackLite and it contains, for each Stack Overflow question, the following data:
  • Question ID
  • Creation Date
  • Closed Date (when applicable)
  • Deletion Date (when applicable)
  • Score
  • Owner user ID
  • Number of answers
  • Tags 

As David Robinson explains in his introductory post, the StackLite dataset is designed to be easy to read and analyse with any programming language or statistical tool. A fantastic resource if you are a data analyst/scientist and want to crunch some real data!

I thought I would give it a go and perform some exploratory analysis using R. More specifically, I am going to answer the following business questions:
  • What are the most popular tags?
  • How many questions have more than one tag?
  • What is the overall closure rate for the site and which tags present higher values?
  • How long does it take, on average, to close a question?
  • Which tags tend to have higher/lower scores?
  • And in particular: how do data science languages perform on the above questions?
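
To make the first three questions concrete, here is a minimal sketch in Python (standard library only), not the R analysis from the post. It assumes the dataset ships as two CSV tables, one row per question and one row per question/tag pair, with columns named Id, ClosedDate and Tag, and that a missing closed date is written as NA. Those names, the NA marker and the tiny inline sample are illustrative assumptions, not the real data.

```python
import csv
import io
from collections import Counter

# Tiny inline stand-ins for the two tables. The column names
# (Id, ClosedDate, Tag) and the "NA" marker are assumptions.
QUESTIONS_CSV = """Id,CreationDate,ClosedDate,Score
1,2008-07-31T21:26:37Z,NA,10
2,2008-07-31T21:42:52Z,2008-08-01T10:00:00Z,3
3,2008-08-01T13:57:07Z,NA,-1
4,2008-08-01T14:10:00Z,2008-08-03T14:10:00Z,0
"""

TAGS_CSV = """Id,Tag
1,r
1,ggplot2
2,python
3,r
4,python
4,pandas
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def tag_counts(tag_rows):
    """Count how many questions carry each tag."""
    return Counter(row["Tag"] for row in tag_rows)


def multi_tag_question_count(tag_rows):
    """Count questions that appear with more than one tag."""
    tags_per_question = Counter(row["Id"] for row in tag_rows)
    return sum(1 for n in tags_per_question.values() if n > 1)


def closure_rate(question_rows, missing="NA"):
    """Share of questions whose ClosedDate is filled in."""
    closed = sum(
        1 for row in question_rows if row["ClosedDate"] not in ("", missing)
    )
    return closed / len(question_rows)


questions = load_rows(QUESTIONS_CSV)
tags = load_rows(TAGS_CSV)

print(tag_counts(tags).most_common(2))
print(multi_tag_question_count(tags))
print(closure_rate(questions))
```

With the real files you would replace the inline strings with open() calls on the downloaded CSVs; the per-tag closure rate follows the same pattern once the two tables are joined on Id.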

Aug 11, 2016

Google Analytics makes Demo Account available to all

Playing with GA data is much, much easier now.

Last week's biggest news was definitely Google making a Demo Google Analytics Account available to everyone. As the word "demo" says, the main purpose is to demonstrate all the features and reports GA offers, and to serve as a learning platform for analysts. But it's actually real numbers! All the data available come from the Google Merchandise Store (which sells Google-branded merchandise), so you can apply your favorite algorithm, find valuable insights in the data and show off your analytics skills to others.

Click on this link to access the GA Demo Account.

  • If you already have a Google Analytics account, Google will add the demo account to it (then you can access it via the Home tab in Google Analytics).
  • If you do not have a Google Analytics account, Google will create one for you in association with your Google account (yes, you need a Google account first) and add the demo account to it.


What can you do with the GA Demo Account?

As I said, it's real data from an e-commerce site. So you will be able to see standard reports such as audience, traffic acquisition and behavior, as well as transaction data and shopping behavior throughout the visitor journey. Most advanced GA features are already implemented, and these include:
  • Enhanced Ecommerce
  • Goals (there are a couple set up) and Funnels
  • Filters
  • Demographics & Interests reports
  • AdWords integrated reports
  • Search Console reports
  • Site Search data
  • Content Groupings
  • Calculated metrics

As an analyst (whether new or more experienced), here are a few things you might like to do:
  • familiarize yourself with the Admin interface and all the account/property/view features available (remember you will have just "Read & Analyze" rights, so you won't be able to make any changes).
  • dive into all the standard reports and study visitor flow throughout the website. Create your own segments, custom reports and dashboards.
  • analyse conversions and shopping behavior (Enhanced Ecommerce section).
  • if you are an educator, GA Partner or university teaching digital analytics, the GA Demo account will be your best friend in classes.
  • if you are a blogger like me, you might want real e-commerce data to build proofs of concept and dashboards, or to perform powerful analyses.

GA demo account shopping behavior



...and if you are an R user?

Of course you can use R to analyse the GA Demo data. It's real data from the Google Merchandise Store, so you might be interested in applying machine learning algorithms or creating beautiful visualizations and dashboards.

On more than one occasion on this blog I have shared examples of GA dashboards made with R and Shiny. Some readers asked me for the original dataset in order to reproduce the code, because they did not have access to any GA account. With the demo account available, it's now easy to export the data and import it into R, let's say in .csv format.

As far as I have seen, due to the limited rights granted (only "Read & Analyze"), it's currently not possible to access and extract the data via the API. That would be very handy, since you could then use one of the many available R packages that connect to Google Analytics.


Jun 17, 2016

Where to Live in Barcelona in a Dashboard


Barcelona best barrio visualization


Sometimes data can tell a story much faster and more effectively than many words. That's why I've decided to start sharing more data stories via this blog, hoping to both:

  1. address specific topics readers want to dive into (often these readers will not be data people; they will be new to my blog, probably arriving after googling a specific question, e.g. "which are the best boroughs to live in Barcelona?").
  2. showcase data visualization tools and best practices to present your data (these readers are data people; yes, you, my regular readers, who might like to see a tool in action).

Mar 27, 2016

Enhance your Blog Measurement with these Google Analytics Calculated Metrics


Calculated Metrics in GA

Google Analytics has recently incorporated a powerful new feature that offers more flexibility for measuring your own business objectives. I am talking about calculated metrics.

In this post I am going to suggest a list of calculated metrics that you can easily configure in Google Analytics to better measure your blog performance.

As a blogger, when it comes to measuring the performance of my content, I am very focused on measuring reader engagement with the content I publish. Also, I am constantly looking to grow my reader base, giving my blog more exposure and acquiring new subscribers. Here is an outline of my measurement plan using Google Analytics (I highly recommend this read if you are new to the concept of a digital measurement plan).

The new calculated metrics feature gives me the opportunity to customize my own measurement plan. How?