Sep 6, 2014

All Data Journalism Graduates in a Map





This week I got my certificate of completion from the course "Doing Journalism with Data: First Steps, Skills and Tools"(if you like to know more about data journalism check out my post "3 Great Examples of Data Journalism Stories"). I enjoyed the course a lot, and I am proud of being one of the 1250 people who successfully completed the course. I was a bit surprised we were only 1250 graduates!

So, where did we come from and who we are? Above is a map I built using the R programming language, and in particular the GoogleVis package. GoogleVis is a great package that provides an interface to the Gogle Vis API, and make creating interactive plots quite easy. Interactive means that users can manipulate data and look for the info they need. Here a list of visualizations you can do with Google Charts.

The other great thing about this visualization, is that you can make it available over HTML, like I did above (you can edit the HTML if you like). No more static charts on your desktop then, but beautiful, interactive visualization shared on the web!

Below is the simple R code I used to prepare the data and plot the charts. To plot the data about graduates titles (the title people indicated when they enrolled to the course) I used Google Refine and some of its cluster methods to clean/group data (e.g.: "journalist" or journalists" or "periodista" falled into the general category of "Journalist"). Then I load it into R as a .csv file.


ddj<-  read.csv("ddjCleaned.csv")
summary(ddj)
studCountry<-  as.data.frame(table(ddj$country))
names(studCountry)<- c("country","graduates")

studTitle<- as.data.frame(table(ddj$title))
names(studTitle)<-c("title","graduates")

install.packages("googleVis")
library(googleVis)

C<- gvisGeoChart(studCountry, locationvar = "country", colorvar = "graduates", options = list(width = 500, height = 400))
plot(C)

T<- gvisPieChart(head(studTitle[order(studTitle$graduates, decreasing =TRUE),],10), labelvar = "title", numvar="graduates",options = list(width = 500, height = 300))
plot(T)

CT <- gvisMerge(D,T, horizontal=FALSE)
plot(CT)

# to get the HTML code of your visualization you can either print execute the following command:

print(CT)  #print the Object you have just created

# or you can click on the Chart ID link below your visualization.