Day 03 Applied Skills

Data Visualization with ggplot2

ggplot2 is one of the most widely used data visualization libraries in any language. Based on the Grammar of Graphics, it builds plots by layering components. Today yo

~1 hour Hands-on Precision AI Academy

Today's Objective

stat_smooth() adds regression lines with confidence intervals.

01

The Grammar of Graphics

ggplot2 maps data variables (aesthetics: x, y, color, size, shape) to geometric objects (geoms: points, lines, bars, boxes). Every plot starts with ggplot(data, aes(x=var, y=var)) then adds layers with +. geom_point() makes scatter plots; geom_line() line charts; geom_histogram() histograms; geom_col()/geom_bar() bar charts; geom_boxplot() box plots. Facets (facet_wrap, facet_grid) create small multiples.
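The layering idea above can be sketched with the built-in mtcars data. This is a minimal illustration, not the course's own example; the object names (p_base, p_scatter, cyl_means, p_bar, p_col) are made up for this sketch.

```r
library(ggplot2)

# Base layer: declare the data and map weight to x, fuel economy to y
p_base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Add a point geom, then split into small multiples by cylinder count
p_scatter <- p_base +
  geom_point(aes(color = factor(gear))) +
  facet_wrap(~ cyl)

# geom_bar() counts rows itself; geom_col() plots values you supply
cyl_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) + geom_bar()
p_col <- ggplot(cyl_means, aes(x = factor(cyl), y = mpg)) + geom_col()

print(p_scatter)
```

Printing p_bar and p_col side by side shows the difference: the first bar height is a row count per cylinder group, the second is the mean mpg computed beforehand.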

02

Scales, Themes, and Labels

Scales control how data maps to visual properties: scale_color_brewer() uses ColorBrewer palettes (designed for clarity and accessibility), scale_x_log10() log-transforms the x axis, scale_fill_manual() sets custom colors. Themes control non-data elements: theme_minimal(), theme_classic(), theme_bw(). labs(title, x, y, color) sets labels (note the lowercase name; R is case-sensitive). theme() customizes individual elements: theme(legend.position='bottom').
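As a rough sketch of how these pieces combine, the snippet below applies a log scale, a ColorBrewer palette, labels, and a theme tweak to mtcars. The palette choice, the colors, and the object names p_styled and p_manual are assumptions made for illustration.

```r
library(ggplot2)

# Scales, labels, and theme applied to one scatter plot
p_styled <- ggplot(mtcars, aes(x = hp, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +
  scale_x_log10() +                           # log10-transform the x axis
  scale_color_brewer(palette = 'Dark2') +     # ColorBrewer qualitative palette
  labs(
    title = 'Horsepower vs Fuel Efficiency',
    x     = 'Horsepower (log scale)',
    y     = 'Miles per Gallon',
    color = 'Cylinders'
  ) +
  theme_minimal() +
  theme(legend.position = 'bottom')           # override one theme element

# Custom fill colors keyed by factor level (in mtcars, am = 0 is automatic)
p_manual <- ggplot(mtcars, aes(x = factor(am), fill = factor(am))) +
  geom_bar() +
  scale_fill_manual(
    values = c('0' = 'grey50', '1' = 'steelblue'),
    labels = c('Automatic', 'Manual'),
    name   = 'Transmission'
  )

print(p_styled)
```

Putting theme() after theme_minimal() matters: a complete theme replaces earlier theme settings, so element-level overrides should come last.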

03

Advanced ggplot2: Statistical Layers

stat_smooth() (or, equivalently, geom_smooth()) adds fitted regression lines with confidence bands. geom_violin() shows distribution shape. geom_density_2d() draws 2D density contours. geom_tile() makes heat maps. coord_flip() swaps the x and y axes, which gives horizontal bar charts. The patchwork package combines multiple plots into one figure. ggsave() exports to PDF, PNG, or SVG (SVG requires the svglite package) at a size and resolution you specify, so plots can go to print without post-processing.
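Before the main listing, here is a small sketch of three of the layers named above: a violin plot, a tile heat map, and an export. It uses mtcars and writes to a temporary directory. The variable selection and the names p_violin, p_heat, cor_long, and out_file are illustrative assumptions.

```r
library(ggplot2)

# Violin plot: distribution of mpg within each cylinder group, drawn horizontally
p_violin <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_violin(fill = 'grey85') +
  coord_flip()

# Heat map: reshape a correlation matrix to long form, one row per cell
vars     <- c('mpg', 'wt', 'hp', 'disp')
cor_long <- as.data.frame(as.table(cor(mtcars[, vars])))
names(cor_long) <- c('var1', 'var2', 'r')

p_heat <- ggplot(cor_long, aes(x = var1, y = var2, fill = r)) +
  geom_tile() +
  scale_fill_gradient2(limits = c(-1, 1))   # diverging scale centered at 0

# Export the violin plot as a vector PDF; width and height are in inches
out_file <- file.path(tempdir(), 'violin_example.pdf')
ggsave(out_file, plot = p_violin, width = 6, height = 4)
```

If the patchwork package is installed, loading it and evaluating p_violin + p_heat places the two plots side by side in one figure.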

r
library(ggplot2)  # also provides the diamonds dataset used below

# Scatter plot with regression line
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3, alpha = 0.8) +
  geom_smooth(method = 'lm', se = TRUE, aes(group = 1),
              color = 'black', linetype = 'dashed') +
  scale_color_brewer(palette = 'Set1', name = 'Cylinders') +
  labs(
    title    = 'Car Weight vs Fuel Efficiency',
    subtitle = 'Data: mtcars | Dashed line: linear fit',
    x        = 'Weight (1000 lbs)',
    y        = 'Miles per Gallon'
  ) +
  theme_minimal(base_size = 13)

# Save the most recent plot as a vector PDF
# (dpi only affects raster formats such as PNG)
ggsave('mpg_plot.pdf', width = 8, height = 5, dpi = 300)

# Faceted bar chart
ggplot(diamonds, aes(x = cut, fill = clarity)) +
  geom_bar(position = 'fill') +
  facet_wrap(~color) +
  coord_flip() +
  labs(y = 'Proportion', title = 'Diamond Cut by Clarity and Color') +
  theme_bw()
Tip: Save ggplot2 graphics with ggsave() as PDF or SVG for scalable vector output. PNG at 300 DPI is fine for presentations. Avoid screenshotting plots from the RStudio Plots pane, because the resolution will be poor.


Day 3 Checkpoint

Before moving on, confirm understanding of these key concepts:

Continue To Day 4
Day 4 of the R Programming in 5 Days course