I like R, but I LOVE doing my data work in Python pandas. Unfortunately, the Python community has not gotten around to writing a SiteCatalyst API wrapper as quickly as the R community did (thanks Randy Zwitch + team). That's why I reached out to Randy last week to see if anybody had used rpy2 or any other packages to run RSiteCatalyst within Python. He wrote a great blog post in response on how to run RSiteCatalyst using rpy2 here.
In the meantime, I found a completely different way to run RSiteCatalyst with Python using the power of the Jupyter notebook and its magic functions.
This post shows how to run RSiteCatalyst using R magics inside a Python Jupyter notebook. If you're interested in magics or passing data between R and Python, this post should be valuable.
Before this will work, you need to pip install rpy2.

I am also using the Python 2.7 Anaconda distribution and installed the conda R Essentials bundle, but I am not sure whether that is necessary to get this running.
#Imports
import readline
import rpy2.robjects
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
#Load the R magics that make this work!
%load_ext rpy2.ipython

#Load inline graphing
%matplotlib inline
I can declare variables within Python and push them right into R using the %Rpush magic.
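For example, the two date variables can be defined as ordinary Python strings before being pushed (the range shown is illustrative; RSiteCatalyst expects 'YYYY-MM-DD' dates):

```python
# Illustrative date range; RSiteCatalyst expects 'YYYY-MM-DD' strings
date1 = "2015-01-01"
date2 = "2015-01-31"
```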
%Rpush date1 date2
I use the %%R cell magic to declare that a full cell will be R code.
%%R
#Load appropriate R code
library(RSiteCatalyst)
library(httr)
library(curl)

#This line is needed to make RSiteCatalyst work in IPython
set_config(config(ssl_verifypeer = 0L))

#Authenticate into the SiteCatalyst API (enter your credentials here)
SCAuth('username:company', 'Web Services Credentials')

#Pull a basic report; notice how my date vars come from Python
tr <- QueueOvertime('rsid', date1, date2, metrics="pageviews")
"Credentials Saved in RSiteCatalyst Namespace."
"Requesting URL attempt #1"
"Received overtime report."
Easily pull the R data frame back to Python using the %Rpull magic.
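The cell itself is a one-liner (assuming the R data frame is named tr, as in the query above):

```
%Rpull tr
```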
And it seamlessly acts as a pandas DataFrame.
| | datetime | name | year | month | day | pageviews |
|---|---|---|---|---|---|---|
| 1 | 1420088400 | Thu. 1 Jan. 2015 | 2015 | 1 | 1 | 2983 |
| 2 | 1420174800 | Fri. 2 Jan. 2015 | 2015 | 1 | 2 | 8567 |
| 3 | 1420261200 | Sat. 3 Jan. 2015 | 2015 | 1 | 3 | 2908 |
| 4 | 1420347600 | Sun. 4 Jan. 2015 | 2015 | 1 | 4 | 4833 |
| 5 | 1420434000 | Mon. 5 Jan. 2015 | 2015 | 1 | 5 | 19685 |
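To illustrate, here is a small stand-in DataFrame built from the first two rows above; the real tr pulled back from R supports the same pandas operations:

```python
import pandas as pd

# Stand-in for the data frame pulled back from R (first two rows above)
tr = pd.DataFrame({
    "datetime": [1420088400, 1420174800],
    "name": ["Thu. 1 Jan. 2015", "Fri. 2 Jan. 2015"],
    "pageviews": [2983, 8567],
})

# Ordinary pandas operations work on it
print(tr["pageviews"].sum())  # 11550
```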
You can even plot the data using Python plotting libraries.
sns.distplot(tr["pageviews"], kde=False, color="b")
<matplotlib.axes._subplots.AxesSubplot at 0x7feaae195c10>
This solution is clean to read and easy to execute within the Jupyter notebook. I'll be using it to automate reporting pipelines that include Python's unique FuzzyWuzzy package for text cleaning, scikit-learn for machine learning algorithms, and seaborn for advanced statistical image generation.
Thanks for reading!