Jan 18, 2016

Scheduling R Markdown Reports via Email

GA markdown report using R
R Markdown is an amazing tool that allows you to blend bits of R code with ordinary text and produce well-formatted data analysis reports very quickly. You can export the final report in many formats like HTML, pdf or MS Words which makes it easy to share with others. And of course, you can modify or update it with fresh data very easily.

I have recently been using it R Markdown for pulling data from various data source such Google Analytics API and MySQL database, perform several operations on it (merging for example) and present the outputs with tables, visualizations and insights (text).

But what about automating the whole report generation and emailing the final report as an attached document every month at a specific time?
In this post I am going to explain how to do it in Windows. If you do a search on google, you will find several threads on stackoverflow and a few good specific posts on it. However it took me sometimes to get it working and had to try different options before. That's why I am writing this quick tutorial, including screenshots, hoping you might get it your report automated faster!

1. Create your Rmarkdown report

In RStudio create a new Rmarkdown document where you will enter your R code and texts. Mine is called "Schedule_Report.Rmd" and here is what is does:

  • retrieve some data from Google Analytics API using the RGoogleAnalytics library
  • turn dates into a more friendly format
  • create a trend line chart of sessions using the ggplot2 package 

A very basic report. Remember that in Rmarkdown you can decide whether to show each chunk of code or not. I showed just the final outputs that are the table and the bar chart.

2. Create an R script that executes and email your Rmarkdown report 

Create a new R script which will:

  • locate your Rmarkdown document (set the working directory to where your report is located)
  • generate an HTML file (or pdf, MS Word) from your Rmarkdown document
  • send the HTML file via email

To email the report I have used the gmailR library which allows you to generate and send emails directly from R. To make sure the gmailR library will work, first you might need to enable the "Less Secure apps" option in your google account. Open your Google personal account and go to Sign-in and Security section, scrolll down to the bottom of the page and switch on the "Allow less secure apps".

I also made a few tries with the mailR package but without success. I guess this was because of security issues with my google account, I have gmail. Anyway the gmailR package worked perfectly so I sticked to it! Here is the code contained into my R script, which is named "Script.R".

R script scheduling reports

3. Schedule a task in Windows

From the main Windows menu, go to Programs>Accessories>System Tools>Task Scheduler (at least this is the path in my Windows edition). The task scheduler will open up:

Click on Action>Create basic task. Type a name for your task and add a short description if you like. Now select the trigger which means how often you want the task to be executed (to try it first I recommend choosing "One time"). Select the date and time and on the action field choose "Start a program".

In the "Start a Program" step, complete the fields as follows:

>Program/Script: the directory path to where to find both the executable file for R

>Add arguments: CMD BATCH followed by the path of the R script you created at step two. Remember to put the directories path between quotation marks "" like in the image below.

Schedule R reports via email

Click on next and you should now reach the last step and see a confirmation window. Press finish and voila' your task is created and it should execute correctly at the time you set.

4. Check your mail

At the time you set the task you should see the "taskeng" window popping up and disappearing after a few seconds (depending the workload you placed on your R files). Now open the mail account where you sent the report to. Did you get the email with your report attached?

In case you did not receive the email, I recommend you to:

  • Check if the the task was executed in Windows. Open the task scheduler and you will see the         list of tasks. Look for your task name. Make sure the status says "Success" and not failed;
  • If the status says failed, double check you set the task correctly as per step 3. An alternative       is  creating a .bat file separately and enter the path of the .bat file in the task scheduler;
  • As a general troubleshooting method, I also suggest opening your R console (double click     on R.exe) and execute line by line the code of your R script at step 2. This way you can             realize if there is an error inside your R code. I mean, Windows executes the task correcly         but no data is generated/sent by R.

Here below are a few issues that might prevent R from executing the code contained in your R script properly:

*To be able to send mails via gmailR package, make sure you enable the "less secured apps" option in your Google account.

**To be able to create an html document from an Rmarkdown file, make sure you have installed the latest version of pandoc library. To do that, you should, in order:
     > install.packages("installr")
     > install.pandoc()
     > Restart your machine

To recap, the process you have just automated will work as follows:
  1. Windows will start a task at the day/time you specified in the task scheduler
  2. the taskeng will open and executes your R script you create at point step 2 through R
  3. the Rmarkdown report will be converted into an HTML file and sent by email
If you like to reproduce the whole process using my files, you can find them both the Rmarkdown report and the R script at this github repository. I hope the post was helpful and will push you to use R for generating business reports.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.