This R code creates a three-panel visualization of Groundhog Day predictions in the USA, using the TidyTuesday data for week 5, 2024. The three subplots show the distribution of predictors across the country, temporal trends in the predictions, and the top predictors.

The Code

# Load necessary libraries
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.2     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(tidytuesdayR)
library(showtext)
## Loading required package: sysfonts
## Loading required package: showtextdb
library(glue)
library(ggtext)
library(cropcircles)
## Warning: package 'cropcircles' was built under R version 4.3.2
library(magick)
## Linking to ImageMagick 6.9.12.3
## Enabled features: cairo, freetype, fftw, ghostscript, heic, lcms, pango, raw, rsvg, webp
## Disabled features: fontconfig, x11
library(patchwork)
library(maps)
## 
## Attaching package: 'maps'
## 
## The following object is masked from 'package:purrr':
## 
##     map
library(thematic)

# Set background color and colors for data points
bg <- "white"
col1 <- okabe_ito()[1]
col2 <- okabe_ito()[2]

# Define symbols for Twitter, Mastodon, Link, and Data
twitter <- glue("<span style='color:{col2};font-family:fa-brands;'>&#xf099;</span>")
mastodon <- glue("<span style='color:{col2};font-family:fa-brands;'>&#xf4f6;</span>")
link <- glue("<span style='color:{col2};font-family:fa-solid;'>&#xf0c1;</span>")
data <- glue("<span style='color:{col2};font-family:fa-solid;'>&#xf1c0;</span>")
space <- glue("<span style='color:{bg}'>-</span>")
space2 <- glue("<span style='color:{bg}'>--</span>") # Invisible spacers: hyphens drawn in the background color.

# Define subtitles for each subplot
s.1 <- glue("
	<strong>Predictor Distribution Across the USA</strong><br>
	The size of the points indicates how many predictions have already been made by the predictor.
	<br>The color indicates whether this predictor <span style='color:{col2};'><strong>
	is a living groundhog</strong></span> or <span style='color:{col1};'><strong>not</strong></span>.
	")
s.2 <- glue("
	<strong>Temporal Trends in Groundhog Predictions:</strong><br>
	An Exploration of Predictions in the USA (Post-2000)<br>
	<span style='color:{col1};'><strong>Early Spring</strong></span> vs. 
	<span style='color:{col2};'><strong>More Winter</strong></span>.
	")
s.3 <- glue("
	<strong>Visualizing Groundhog Predictions:</strong><br>
	Top 6 predictors with most predictions in the USA<br>
	<span style='color:{col1};'><strong>Early Spring</strong></span> vs. 
	<span style='color:{col2};'><strong>More Winter</strong></span>.
	")

# Define title and caption
t <- "<strong>Exploring Groundhog Day: Predictor Distribution, 
	Temporal Trends, and Top Predictors in the USA</strong>"
cap <- glue("{twitter}{space2}@web_design_fh{space2} 
	{space2}{mastodon}{space2}@frankhaenel@fosstodon.org{space2}
	{space2}{link}{space}{space2}www.frankhaenel.de{space2}
	{data}{space2}groundhog-day.com")

# Load TidyTuesday data for week 5, 2024
tuesdata <- tidytuesdayR::tt_load(2024, week = 5)
## --- Compiling #TidyTuesday Information for 2024-01-30 ----
## --- There are 2 files available ---
## --- Starting Download ---
## 
## 	Downloading file 1 of 2: `predictions.csv`
## 	Downloading file 2 of 2: `groundhogs.csv`
## --- Download complete ---
groundhogs <- tuesdata$groundhogs
predictions <- tuesdata$predictions

# Add Font Awesome fonts
font_add('fa-reg', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Free-Regular-400.otf')
font_add('fa-brands', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Brands-Regular-400.otf')
font_add('fa-solid', 'c:/Users/info/OneDrive/Dokumente/fonts/Font Awesome 6 Free-Solid-900.otf')
showtext_auto()

# Create subplot 1: US States map with groundhog predictions
us_states <- map_data("state")
sub.1 <- ggplot() +
  geom_polygon(data = us_states,
               mapping = aes(x = long, y = lat,
                             group = group), fill = "white", color = "black") +
  geom_point(data = groundhogs %>% filter(country == "USA"),
             aes(x=longitude, y = latitude,
                 size=predictions_count, color=is_groundhog),
             show.legend = FALSE) +
  theme_void() +
  scale_color_manual(values=c(col1, col2)) +
  labs(subtitle = s.1) +
  theme(plot.subtitle = element_markdown())

# Create subplot 2: Temporal trends in groundhog predictions
a <- predictions %>%
  select(-details) %>%
  drop_na() %>%
  left_join(groundhogs, by = "id") %>%
  filter(country == "USA" & year > 2000) %>%
  group_by(year, shadow) %>%
  count()

sub.2 <- a %>%
  ggplot(aes(x=year, y=n, group=shadow, color=shadow)) +
  geom_line(show.legend = FALSE, linewidth=2) +
  scale_color_manual(values=c(col1, col2)) +
  theme_bw() +
  theme(plot.subtitle = element_markdown(),
        axis.title.x = element_blank()) +
  labs(subtitle = s.2)

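# Create subplot 3: Visualizing top 6 predictors with most predictions
# Prepare axis labels for the first 10 groundhogs: a circular-cropped image
# plus the short name, written as HTML for element_markdown().
groundhogs$crop <- NA_character_
for (i in 1:10){
  # Download the image and save a local copy; image_write() returns the path
  img_path <- image_read(groundhogs$image[i]) %>%
    image_write(paste0(i,".jpeg"))
  # Crop to a circle; crop_circle() returns the path of the cropped file
  cropped <- crop_circle(
    img_path,
    to = NULL,
    border_size = NULL,
    border_colour = "black",
    bg_fill = NULL,
    just = "center"
  )
  # Save the cropped image as PNG and keep its path
  groundhogs$crop[i] <- image_read(cropped) %>%
    image_write(paste0(i,".png"))
  # Wrap the path in an <img> tag with the short name underneath
  groundhogs$crop[i] <- glue("<img src='{groundhogs$crop[i]}' 
							width='60'/><br>
                           {groundhogs$shortname[i]}")
}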

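# Count predictions per predictor and outcome, then keep the first 12 rows
# (up to 6 predictors with two outcomes each). Note: count() orders the rows
# by id, not by the number of predictions.
a <- head(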
  predictions %>%
    select(-details) %>%
    drop_na() %>%
    group_by(id, shadow) %>%
    count() %>%
    left_join(groundhogs, by = "id") %>%
    filter(country == "USA"),
  12)

sub.3 <- ggplot(data=a, aes(x=crop, y=n, fill=shadow)) +
  theme_bw() +
  theme(plot.subtitle = element_markdown(),
        axis.text.x = element_markdown(lineheight = 1.3),
        axis.title.x = element_blank()) +
  scale_fill_manual(values=c(col1, col2)) +
  geom_col(show.legend = FALSE) +
  labs(subtitle = s.3)

# Create final plot combining all subplots
sub.1 + (sub.2/sub.3) +
  plot_layout(widths = c(1.5, 1)) + plot_annotation(
    title = t,
    caption = cap) &
  theme(plot.title = element_markdown(size = 15, hjust = 0.5, lineheight = 1.3),
        plot.caption = element_markdown(size = 10, hjust = 0, lineheight = 1.3))

Figure: Three-panel visualization of Groundhog Day predictions in the USA, showing predictor distribution, temporal trends, and top predictors. Panel 1: USA map with groundhog predictions. Panel 2: Line chart depicting temporal trends. Panel 3: Bar chart presenting top predictors with circular-cropped groundhog images.

R Code Documentation

Overview

This R code generates a three-panel visualization of Groundhog Day predictions across the USA from the TidyTuesday data for week 5, 2024. The subplots cover predictor distribution, temporal trends, and top predictors.

Libraries Used

  • tidyverse (Wickham et al. 2019): A collection of packages for data manipulation and visualization, including dplyr and ggplot2.
  • tidytuesdayR (Hughes 2022): Downloads TidyTuesday datasets.
  • showtext (Qiu 2023): Registers and renders custom fonts in the plot.
  • glue (Hester and Bryan 2022): Interpolates R values into strings; used here to build the HTML snippets for the subtitles, icons, and caption.
  • ggtext (Wilke and Wiernik 2022): Renders Markdown and HTML formatting in ggplot2 text elements.
  • cropcircles (Oehm 2023), magick (Ooms 2023): Read the groundhog images and crop them to circles.
  • patchwork (Pedersen 2023): Arranges multiple plots in one layout.
  • maps (Brownrigg 2022): Supplies the US state outlines used for the map.
  • thematic (Sievert, Schloerke, and Cheng 2021): Supplies the Okabe-Ito color palette via okabe_ito().

Symbols and Styling

  • Custom symbols from Font Awesome are used for Twitter, Mastodon, Link, and Data icons.
  • glue inserts the colors into HTML span strings, which are later rendered as styled text.
  • Hyphens colored like the background act as invisible spacers between the caption elements.
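
The icon and spacer strings described above can be sketched as follows. This is a minimal illustration, not the author's exact code: the color value and the variable names icon_color, link_icon, and caption_text are assumptions made for the example, and the icon only renders if a font family named fa-solid has been registered.

```r
library(glue)

# Assumed values for illustration only
icon_color <- "#009E73"
bg <- "white"

# A Font Awesome glyph (HTML entity) wrapped in a colored span
link_icon <- glue("<span style='color:{icon_color};font-family:fa-solid;'>&#xf0c1;</span>")

# Hyphens drawn in the background color work as an invisible spacer
spacer <- glue("<span style='color:{bg}'>--</span>")

# Combine icon, spacer, and text into one caption string
caption_text <- glue("{link_icon}{spacer}www.frankhaenel.de")
```

Such a string only becomes styled text when the matching theme element uses ggtext's element_markdown(), for example theme(plot.caption = element_markdown()).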

Subplots

  1. Predictor Distribution Across the USA
    • A map of US states with one point per predictor.
    • Point size shows how many predictions a predictor has made; color shows whether the predictor is a living groundhog.
  2. Temporal Trends in Groundhog Predictions
    • A line chart showcasing temporal trends in groundhog predictions post-2000.
    • Separate lines count the “Early Spring” and “More Winter” predictions per year.
  3. Visualizing Top Predictors
    • A bar chart displaying the top 6 predictors with the most predictions.
    • Circular-cropped images of the groundhogs accompany the chart.
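
One way to pick the predictors with the most predictions is to rank them by their total count before counting outcomes. The sketch below is an assumption-laden illustration rather than the script's own method: toy_predictions is made-up data that only mimics the id and shadow columns, and top_predictor_ids() is a hypothetical helper.

```r
library(dplyr)

# Made-up stand-in for the predictions table (id = predictor, shadow = outcome)
toy_predictions <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3),
  shadow = c(TRUE, FALSE, TRUE, TRUE, FALSE, NA, NA, NA, TRUE, TRUE, FALSE, TRUE)
)

# Hypothetical helper: ids of the n_top predictors with the most
# non-missing predictions
top_predictor_ids <- function(predictions, n_top = 6) {
  predictions %>%
    filter(!is.na(shadow)) %>%
    count(id, name = "total") %>%
    slice_max(total, n = n_top, with_ties = FALSE) %>%
    pull(id)
}

top_ids <- top_predictor_ids(toy_predictions, n_top = 2)

# Outcome counts for the selected predictors only, ready for geom_col()
top_counts <- toy_predictions %>%
  filter(id %in% top_ids, !is.na(shadow)) %>%
  count(id, shadow)
```

In the toy data, predictor 2 has the most rows but only two usable predictions, so filtering out missing outcomes before ranking changes which predictors are selected.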

Fonts and Styling

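  • Font Awesome font files are registered under the family names fa-reg, fa-brands, and fa-solid; they supply the icon glyphs in the caption. The file paths in the script are specific to the author's machine.
  • showtext_auto() turns on showtext rendering so that the registered fonts are used when the plot is drawn.
  • HTML and Markdown markup in the title, subtitles, and caption is rendered by ggtext's element_markdown().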
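
Because the font paths are machine-specific, it can help to check that a font file exists before registering it. The following is a sketch under assumptions: register_font_if_present() is a hypothetical helper, and the fonts directory is a placeholder that must point at your own copy of the Font Awesome files.

```r
# Hypothetical helper (not part of the original script): register a font file
# with sysfonts only if it exists, and report whether that succeeded.
register_font_if_present <- function(family, path) {
  if (!file.exists(path)) {
    message("Font file not found, skipping: ", path)
    return(FALSE)
  }
  sysfonts::font_add(family = family, regular = path)
  TRUE
}

# Placeholder directory; adjust to where the font files are stored
font_dir <- "fonts"
ok <- register_font_if_present(
  "fa-solid",
  file.path(font_dir, "Font Awesome 6 Free-Solid-900.otf")
)

# Switch on showtext rendering only when the font was registered
if (ok) showtext::showtext_auto()
```

If the file is missing, the helper returns FALSE and the plot can still be drawn, although the icons will not render.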

Execution Steps

  1. Load necessary libraries.
  2. Define symbols and styling elements.
  3. Load TidyTuesday data for week 5, 2024.
  4. Create three subplots, each focusing on different aspects of Groundhog Day predictions.
  5. Combine subplots into a final visualization.
  6. Apply formatting and styling to the overall plot.
  7. Display the plot with an overall title and caption; each subplot carries its own subtitle.
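
The combining steps can be sketched with patchwork on placeholder plots. The three plots below are built from the built-in mtcars data purely for illustration; the names p_left, p_top, and p_bottom and the title and caption text are assumptions, not the original subplots.

```r
library(ggplot2)
library(patchwork)

# Placeholder panels standing in for the map, line chart, and bar chart
p_left   <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + labs(subtitle = "Panel 1")
p_top    <- ggplot(mtcars, aes(hp, mpg)) + geom_line() + labs(subtitle = "Panel 2")
p_bottom <- ggplot(mtcars, aes(factor(cyl))) + geom_bar() + labs(subtitle = "Panel 3")

# One wide panel on the left, two stacked panels on the right
combined <- p_left + (p_top / p_bottom) +
  plot_layout(widths = c(1.5, 1)) +
  plot_annotation(title = "Overall title", caption = "Overall caption") &
  theme(plot.title = element_text(hjust = 0.5))
```

The / operator stacks plots vertically, + places them side by side, and & applies the theme to every panel as well as to the overall annotation. Calling print(combined) draws the figure.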

Conclusion

This code explores Groundhog Day predictions by combining data wrangling, mapping, image processing, and rich-text styling in a single figure. It can serve as a starting point for educational material, presentations, or personal exploration.

References