Interactive Data Visualization with plotly

The plotly package in R provides a powerful framework for creating interactive, web-based visualizations. It builds on the JavaScript Plotly library, offering dynamic plots like scatter, line, bar, and 3D charts that support zooming, panning, and tooltips. This tutorial introduces plotly for beginners, using built-in datasets like iris and mtcars. We’ll cover basic plots, customization, and interactive features, assuming basic R knowledge.

Installation and Setup

Install the plotly package from CRAN and load it:

Code
#install.packages("plotly")
library(plotly)
Loading required package: ggplot2

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout

Basic Scatter Plot

Create a scatter plot using the iris dataset to visualize Sepal Length vs. Sepal Width.

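The plot_ly() function is the core of plotly. It uses a formula interface (e.g., x = ~variable) to map columns of a data frame to plot attributes. A pipe operator chains commands for customization: either %>% (from magrittr, re-exported by plotly) or the native |> (built into R 4.1 and later).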

Code
p1 <- plot_ly(data = iris, 
              x = ~Sepal.Length, 
              y = ~Sepal.Width, 
              type = "scatter", 
              mode = "markers",
              marker = list(size = 10),
              text = ~paste("Species:", Species),
              hoverinfo = "text+x+y") %>%
  layout(title = "Sepal Length vs. Width",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"))

# Display the plot
p1
  • type = "scatter" and mode = "markers" create a scatter plot.
  • text and hoverinfo customize tooltips to show species and coordinates.
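
Because the pipe only passes the plot object on to the next function, the same kind of chart can be written with the native |> operator. The sketch below is a simplified variant of p1 and assumes R 4.1 or later; the object name p1_native is illustrative.

```r
library(plotly)

# Simplified version of p1, chained with the native pipe (assumes R >= 4.1)
p1_native <- plot_ly(data = iris,
                     x = ~Sepal.Length,
                     y = ~Sepal.Width,
                     type = "scatter",
                     mode = "markers") |>
  layout(title = "Sepal Length vs. Width")

p1_native
```

Either pipe works here; the rest of this tutorial uses %>%.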

Scatter Plot with Grouping

Add color grouping by species:

Code
p2 <- plot_ly(data = iris, 
              x = ~Sepal.Length, 
              y = ~Sepal.Width, 
              color = ~Species,  # Color by species
              type = "scatter", 
              mode = "markers",
              marker = list(size = 10),
              text = ~paste("Species:", Species),
              hoverinfo = "text+x+y") %>%
  layout(title = "Sepal Length vs. Width by Species",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"),
         showlegend = TRUE)

p2
  • color = ~Species assigns colors to each species.
  • showlegend = TRUE adds a legend.

Bar Plot

Create a bar plot of cylinder counts from the mtcars dataset:

Code
# Summarize cylinder counts
mtcars_cyl <- table(mtcars$cyl)
cyl_data <- data.frame(cyl = names(mtcars_cyl), count = as.numeric(mtcars_cyl))

p3 <- plot_ly(data = cyl_data, 
              x = ~cyl, 
              y = ~count, 
              type = "bar",
              marker = list(color = c("#1f77b4", "#ff7f0e", "#2ca02c"))) %>%
  layout(title = "Distribution of Cylinders in mtcars",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Count"))

p3
  • table() and data.frame() prepare the data.
  • marker sets custom bar colors.
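
The same counts can also be produced with dplyr instead of table(). This is a minimal sketch, assuming the dplyr package is installed; the names cyl_counts and p3_dplyr are illustrative and not part of the example above.

```r
library(plotly)
library(dplyr)

# count() returns one row per cylinder value, with the tally in column n
cyl_counts <- mtcars %>%
  count(cyl) %>%
  mutate(cyl = factor(cyl))  # treat cylinder counts as categories

p3_dplyr <- plot_ly(data = cyl_counts,
                    x = ~cyl,
                    y = ~n,
                    type = "bar") %>%
  layout(title = "Distribution of Cylinders in mtcars",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Count"))

p3_dplyr
```

Converting cyl to a factor keeps the x-axis categorical, which matches the character labels produced by names(mtcars_cyl) in the table() version.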

Line Plot

Plot a time series of unemployment from the economics dataset (available in ggplot2):

Code
library(ggplot2)
data(economics)

p4 <- plot_ly(data = economics, 
              x = ~date, 
              y = ~unemploy, 
              type = "scatter", 
              mode = "lines",
              line = list(color = "#d62728", width = 2)) %>%
  layout(title = "US Unemployment Over Time",
         xaxis = list(title = "Year"),
         yaxis = list(title = "Unemployment"))

p4
  • mode = "lines" creates a continuous line.
  • line customizes color and width.

3D Scatter Plot

Create a 3D scatter plot with iris data:

Code
p5 <- plot_ly(data = iris, 
              x = ~Sepal.Length, 
              y = ~Sepal.Width, 
              z = ~Petal.Length, 
              color = ~Species,
              type = "scatter3d", 
              mode = "markers",
              marker = list(size = 5)) %>%
  layout(title = "3D Scatter Plot of Iris Data",
         scene = list(xaxis = list(title = "Sepal Length"),
                      yaxis = list(title = "Sepal Width"),
                      zaxis = list(title = "Petal Length")))

p5
  • type = "scatter3d" enables 3D plotting.
  • scene customizes 3D axes.

Box Plot

Create a box plot of earthquake magnitude by depth bin, using the built-in quakes dataset:

Code
data(quakes)  # Load quakes dataset
# Create depth bins
quakes$depth_bin <- cut(quakes$depth, breaks = seq(0, 700, by = 100),
                        labels = paste0(seq(0, 600, by = 100), "-", seq(100, 700, by = 100), " km"))

# Box plot, named p_box so it does not overwrite the iris scatter plot p2.
# width and height are set in plot_ly(); setting them in layout() is deprecated.
p_box <- plot_ly(data = quakes, x = ~depth_bin, y = ~mag, type = "box",
                 color = ~depth_bin, colors = "Set2",
                 width = 700,       # Width in pixels
                 height = 500) %>%  # Height in pixels
  layout(title = "Box Plot of Earthquake Magnitude by Depth Bin",
         xaxis = list(title = "Depth Bin (km)"),
         yaxis = list(title = "Magnitude (Richter)"),
         showlegend = FALSE)

p_box

Heatmap

Create a 2D histogram heatmap of earthquake locations:

Code
# Named p_heat so it does not overwrite the basic scatter plot p1.
# width and height are set in plot_ly(); setting them in layout() is deprecated.
p_heat <- plot_ly(data = quakes, 
                  x = ~long, 
                  y = ~lat, 
                  type = "histogram2d",
                  nbinsx = 20, nbinsy = 20,
                  colorscale = "Viridis",
                  width = 700,       # Width in pixels
                  height = 500) %>%  # Height in pixels
  layout(title = "Heatmap of Earthquake Locations",
         xaxis = list(title = "Longitude"),
         yaxis = list(title = "Latitude"))

p_heat

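Contour Plot

Create a contour plot of earthquake depth across longitude and latitude:

Code
# Named p_contour so it does not overwrite the bar plot p3
p_contour <- plot_ly(data = quakes, 
                     x = ~long, 
                     y = ~lat, 
                     z = ~depth, 
                     type = "contour",
                     contours = list(showlabels = TRUE),
                     colorscale = "Hot") %>%
  layout(title = "Contour Plot of Earthquake Depth",
         xaxis = list(title = "Longitude"),
         yaxis = list(title = "Latitude"))

p_contour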
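
Contour traces are commonly drawn from z values laid out on a regular grid. As a point of comparison with the scattered quakes points, this sketch uses R's built-in volcano matrix (an 87 x 61 grid of elevations); the name p_volcano is illustrative.

```r
library(plotly)

# volcano is a numeric matrix; each cell is an elevation on a regular grid
p_volcano <- plot_ly(z = ~volcano,
                     type = "contour",
                     contours = list(showlabels = TRUE)) %>%
  layout(title = "Contour Plot of the volcano Elevation Grid")

p_volcano
```

With matrix input, the x and y axes default to the column and row indices of the matrix.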

Subplots

Combine multiple plots into a single figure:

Code
p6 <- subplot(p2, p3, nrows = 2, shareX = FALSE) %>%
  layout(title = "Combined Scatter and Bar Plots",
         showlegend = TRUE)

p6
  • subplot() stacks plots vertically (nrows = 2).
  • shareX = FALSE allows independent x-axes.
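
By default, subplot() keeps axis titles only for axes that are shared, so with shareX = FALSE the x-axis titles are dropped. The sketch below keeps them by setting titleX and titleY; the two small plots (left and right) are illustrative stand-ins rather than the p2 and p3 objects used above.

```r
library(plotly)

# Two small example plots, each with its own axis titles
left <- plot_ly(iris, x = ~Sepal.Length, y = ~Sepal.Width,
                type = "scatter", mode = "markers", name = "Sepal") %>%
  layout(xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"))

right <- plot_ly(iris, x = ~Petal.Length, y = ~Petal.Width,
                 type = "scatter", mode = "markers", name = "Petal") %>%
  layout(xaxis = list(title = "Petal Length"),
         yaxis = list(title = "Petal Width"))

# titleX/titleY = TRUE retain each panel's axis titles;
# margin adds space between panels so the titles do not collide
combined <- subplot(left, right, nrows = 1,
                    titleX = TRUE, titleY = TRUE, margin = 0.06)

combined
```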

Adding Annotations

Add text annotations to a scatter plot:

Code
p7 <- plot_ly(data = iris, 
              x = ~Sepal.Length, 
              y = ~Sepal.Width, 
              type = "scatter", 
              mode = "markers") %>%
  layout(title = "Scatter Plot with Annotation",
         xaxis = list(title = "Sepal Length"),
         yaxis = list(title = "Sepal Width"),
         annotations = list(
           list(x = 6, y = 4, 
                text = "Note: Zoom or pan to explore!",
                showarrow = TRUE, 
                arrowhead = 2,
                ax = 20, ay = -30)))

p7
  • annotations adds text with an arrow at coordinates (6, 4).
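
Annotation coordinates can also be computed from the data instead of being typed in by hand. This sketch, with illustrative names top and p7_max, labels the observation with the largest sepal width.

```r
library(plotly)

# Locate the observation with the largest sepal width
top <- iris[which.max(iris$Sepal.Width), ]

p7_max <- plot_ly(data = iris,
                  x = ~Sepal.Length,
                  y = ~Sepal.Width,
                  type = "scatter",
                  mode = "markers") %>%
  layout(title = "Scatter Plot with Data-Driven Annotation",
         annotations = list(
           list(x = top$Sepal.Length, y = top$Sepal.Width,
                text = paste("Widest sepal:", top$Species),
                showarrow = TRUE,
                arrowhead = 2,
                ax = 40, ay = -30)))

p7_max
```

Here ax and ay offset the label from the point in pixels, as in the example above.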

Exporting a Plot

Save an interactive plot as an HTML file:

Code
# Save as HTML (uncomment to use)
# htmlwidgets::saveWidget(p2, "iris_scatter.html")
  • saveWidget() from htmlwidgets exports plots for sharing.
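
A runnable sketch of the export step, writing to a temporary directory so nothing is left in the working directory. It assumes the htmlwidgets package is installed (it is a dependency of plotly). selfcontained = FALSE is used here because a single self-contained file requires pandoc; the names p_export and out_file are illustrative.

```r
library(plotly)

p_export <- plot_ly(data = iris,
                    x = ~Sepal.Length,
                    y = ~Sepal.Width,
                    color = ~Species,
                    type = "scatter",
                    mode = "markers")

# Write the widget to a temporary location; supporting JavaScript and CSS
# files are placed in a directory next to the HTML file
out_file <- file.path(tempdir(), "iris_scatter.html")
htmlwidgets::saveWidget(p_export, out_file, selfcontained = FALSE)

file.exists(out_file)
```

If pandoc is available, selfcontained = TRUE produces one HTML file that is easier to share by email.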

Best Practices

  • Data Preparation: Use data frames. Summarize data with table() or dplyr for bar plots or aggregations.
  • Interactivity: Leverage plotly’s zoom, pan, and hover features for data exploration.
  • Customization: Use layout() for titles, axes, and annotations; marker or line for styling points and lines.
  • Performance: For large datasets, sample or aggregate to improve rendering speed.
  • Resources: Explore ?plot_ly or the Plotly R documentation for further trace types and advanced features such as animations.
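
The performance advice above can be sketched as follows. The data are simulated for illustration only, and the sample size of 5,000 is an arbitrary assumption rather than a recommended threshold.

```r
library(plotly)

set.seed(42)  # make the random sample reproducible

# Simulated "large" dataset (hypothetical data, for illustration only)
n <- 100000
big <- data.frame(x = rnorm(n), y = rnorm(n))

# Plot a random subset of rows instead of all 100,000 points
small <- big[sample(nrow(big), 5000), ]

p_sample <- plot_ly(data = small,
                    x = ~x,
                    y = ~y,
                    type = "scatter",
                    mode = "markers",
                    marker = list(size = 3, opacity = 0.5)) %>%
  layout(title = "Random Sample of 5,000 Points")

p_sample
```

Aggregating instead of sampling, for example with the histogram2d trace shown in the Heatmap section, is the other option when the overall density matters more than individual points.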

Summary and Conclusions

This plotly tutorial demonstrated creating interactive visualizations in R with plot_ly(), using the iris, mtcars, economics, and quakes datasets (quakes records 1000 seismic events near Fiji). It covered scatter, bar, line, 3D scatter, box, heatmap (histogram2d), and contour plots, along with subplots, annotations, HTML export, and figure size adjustments via width and height. The quakes examples showed earthquake location density, magnitude distributions by depth bin, and depth contours across latitude and longitude. The tutorial also provided best practices for data preparation, interactivity, and performance.

plotly excels at creating interactive, web-based plots with zooming, panning, and tooltips, surpassing lattice for dynamic visualizations. It’s ideal for exploring multivariate data like quakes but requires careful binning or subsampling for large datasets. The formula interface and customization options make it versatile, complementing static tools like lattice or ggplot2.

Resources