Python in RStudio with Reticulate

This tutorial shows how to use reticulate to plot with seaborn inside an R Markdown file and compares the output to a ggplot2 plot.

Image credit: Unsplash/David Clode

Python in R Markdown

The following is a short guide to using Python in R Markdown with the reticulate package and its use_python() function. It assumes that you do not yet have Python (or at least Python 3) installed.

Before we begin, complete the following steps:

  • Download the miniconda Python distribution
  • Install the reticulate package in R
  • Find the folder (on your computer) where you installed miniconda3

Now let’s load reticulate and find the version of Python we want to use. Then we load the Python interpreter for our R session.

# Load the reticulate package
library(reticulate)
# Point reticulate at the miniconda3 installation
use_python("C:/Users/alhdz/miniconda3", required = TRUE)
# Start the interactive Python shell (REPL)
repl_python()

Let’s check which version of Python we are using.

#Inspect the Python version
py_config()
## python:         C:/Users/alhdz/miniconda3/python.exe
## libpython:      C:/Users/alhdz/miniconda3/python39.dll
## pythonhome:     C:/Users/alhdz/miniconda3
## version:        3.9.1 (default, Dec 11 2020, 09:29:25) [MSC v.1916 64 bit (AMD64)]
## Architecture:   64bit
## numpy:          C:/Users/alhdz/miniconda3/Lib/site-packages/numpy
## numpy_version:  1.19.2
## sys:            [builtin module]
## 
## NOTE: Python version was forced by use_python function
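
The output above comes from R. A minimal sketch of the same check from inside a python chunk, using only the standard library, might look like the following. The helper name describe_interpreter is made up for illustration, and under reticulate the reported prefix would normally be expected to match the pythonhome entry above.

```python
import platform
import sys


def describe_interpreter():
    """Return the installation prefix, version and architecture of this Python."""
    return {
        "prefix": sys.prefix,                         # folder Python runs from
        "version": platform.python_version(),         # version as a "major.minor.patch" string
        "architecture": platform.architecture()[0],   # typically "64bit" or "32bit"
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the version printed here is not the one you asked for with use_python(), restarting the R session before calling use_python() again is usually the first thing to try.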

To run Python code, a chunk must use the python engine rather than r. You can use the Insert option in the top right corner of the RStudio editor window and select the Python option. Remember that objects loaded into the Python environment will not show up in your RStudio environment window. Similarly, if you type repl_python() in your console, the prompt changes from > to >>> while it is expecting Python code. You can go back to R by typing exit.

Let’s try an example. If you run the syntax below in an r chunk, it will fail, because R does not concatenate strings with +. In a python chunk, however, it works:

a="Hello"+" World"
print(a)
## Hello World
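
Because Python objects do not appear in the RStudio environment window, it can help to list them from the Python side. The sketch below is one possible way to do that in a python chunk. The helper user_variables is hypothetical, and in a real session the list may include extra names that the session itself defines.

```python
import types

a = "Hello" + " World"


def user_variables(namespace):
    """List names of plain data objects, skipping private names, modules and callables."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_")
        and not isinstance(value, types.ModuleType)
        and not callable(value)
    )


# globals() is the namespace that holds everything defined in the chunk
print(user_variables(globals()))
```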

Python Libraries

Now let’s install some Python libraries, which are like packages in R. Open the Anaconda Prompt (miniconda3) and type conda install seaborn, then type pip install -U scikit-learn to install scikit-learn. seaborn is a data visualization library (similar to ggplot2) and scikit-learn is a machine learning library.
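
Before loading the libraries, you may want to confirm that the interpreter can actually find them. The following is a small sketch using only the standard library. The helper is_installed is a made-up name, and the check only looks for the module without importing it.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# scikit-learn is installed under the name "scikit-learn" but imported as "sklearn"
for name in ["seaborn", "sklearn"]:
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If a library shows up as missing, it was most likely installed into a different Python than the one reticulate is using.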

Now, let’s load the libraries we just installed. We can use R syntax in an r chunk, for example:

sklearn <- import('sklearn')
sklearn
## Module(sklearn)

We can also use Python syntax in a python chunk. First, let’s work around a problem that is common for Python plots in R Markdown on Windows.

# R Markdown sometimes cannot find the Qt platform plugins used for plotting
import os
os.environ['QT_QPA_PLATFORM_PLUGIN_PATH'] = 'C:/Users/alhdz/miniconda3/Library/plugins/platforms'
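
The path above is specific to one machine. A slightly more defensive sketch, assuming you only want to set the variable when the folder really exists, could look like this. The function set_qt_plugin_path is a hypothetical helper, not part of any library.

```python
import os


def set_qt_plugin_path(plugin_dir):
    """Set QT_QPA_PLATFORM_PLUGIN_PATH only if plugin_dir exists; return True on success."""
    if os.path.isdir(plugin_dir):
        os.environ["QT_QPA_PLATFORM_PLUGIN_PATH"] = plugin_dir
        return True
    return False


# Path taken from the example above; adjust it to your own installation
ok = set_qt_plugin_path("C:/Users/alhdz/miniconda3/Library/plugins/platforms")
print("plugin path set" if ok else "plugin folder not found, variable left unchanged")
```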

Penguins with seaborn

Now let’s import seaborn using Python syntax and use the Palmer Penguins data set to make a histogram.

import seaborn as sns
df = sns.load_dataset("penguins")
sns.histplot(data=df, x="flipper_length_mm", hue="species", multiple="stack")

Penguins with ggplot2

Finally, let’s do the same thing, but this time with ggplot2 and R syntax.

library(palmerpenguins) # you may need to install this package first
library(tidyverse)
df <- data.frame(penguins)

df %>% ggplot(aes(x=flipper_length_mm, fill=species)) + geom_histogram(color="black", bins = 10)

Who wore it better?

Alfredo Hernandez Sanchez
Research Master’s in International Studies Coordinator

Quantitative Text Analysis, Data Visualization, Policy Evaluation
