Tidy Time Series Graphics with R

IASSL Workshop

Dr. Priyanga D. Talagala, University of Moratuwa

21-25 February 2022

1

Main packages required

# Data manipulation and plotting functions
library(tidyverse)
# Data for "Forecasting: Principles and Practice" (3rd Edition)
library(fpp3)
  • tsibble Package: Time series manipulation
  • tsibbledata Package: Tidy time series data
  • feasts Package: Time series graphics and statistics
  • fable Package: Forecasting functions

tidyverts: Tidy tools for time series: https://tidyverts.org/

2

tsibble objects

  • A tsibble allows storage and manipulation of multiple time series in R.
  • It contains:

    – An index: time information about the observation

    – Measured variable(s): numbers of interest

    – Key variable(s): optional unique identifiers for each series

  • It works with tidyverse functions.
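
As a minimal sketch of these three components, the block below builds a small tsibble and asks it which column plays which role. The `sales` data, and the column names `region` and `amount`, are hypothetical and used only for illustration.

```r
library(tsibble)

# Hypothetical data: two short yearly series, identified by `region`
sales <- tsibble(
  year   = rep(2018:2020, times = 2),
  region = rep(c("North", "South"), each = 3),
  amount = c(10, 12, 15, 7, 9, 8),
  index  = year,    # time information
  key    = region   # identifies each series
)

index_var(sales)      # name of the index column
key_vars(sales)       # name(s) of the key column(s)
measured_vars(sales)  # remaining columns: the measured variables
n_keys(sales)         # number of distinct series
```

Each series needs a unique index value per row; the key is what allows the same years to appear once for each region.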

3

The tsibble index

set.seed(1)
ts <- tsibble(t = seq(36),
              y = rnorm(36),
              index = t)
ts
## # A tsibble: 36 x 2 [1]
## t y
## <int> <dbl>
## 1 1 -0.626
## 2 2 0.184
## 3 3 -0.836
## 4 4 1.60
## 5 5 0.330
## 6 6 -0.820
## 7 7 0.487
## 8 8 0.738
## 9 9 0.576
## 10 10 -0.305
## # … with 26 more rows
mydata <- tsibble(
  year = 2016:2020,
  y = c(123, 39, 78, 52, 110),
  index = year)
mydata
## # A tsibble: 5 x 2 [1Y]
## year y
## <int> <dbl>
## 1 2016 123
## 2 2017 39
## 3 2018 78
## 4 2019 52
## 5 2020 110
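
The interval shown in square brackets ([1], [1Y]) is inferred from the index column. As a hedged sketch, a monthly series can be set up by storing the index as a `yearmonth` value from the tsibble package; the object name `monthly` and its values are made up for illustration.

```r
library(tsibble)

# Hypothetical monthly data: six months starting in January 2020
monthly <- tsibble(
  month = yearmonth("2020 Jan") + 0:5,  # adding integers steps forward by month
  y     = c(5, 8, 6, 9, 7, 10),
  index = month
)

monthly
index_var(monthly)   # the index column
is_regular(monthly)  # whether the observations are regularly spaced
```

Keeping monthly dates as plain `Date` values would normally be read as daily data with gaps, so converting to `yearmonth` first is usually the safer choice.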
4

tibble vs tsibble

# tibble
mytibble <- tibble(
  date = as.Date("2017-01-01") + 0:10,
  y = c(123, 39, 78, 52, 110, 59,
        78, 67, 67, 80, 90))
mytibble
## # A tibble: 11 × 2
## date y
## <date> <dbl>
## 1 2017-01-01 123
## 2 2017-01-02 39
## 3 2017-01-03 78
## 4 2017-01-04 52
## 5 2017-01-05 110
## 6 2017-01-06 59
## 7 2017-01-07 78
## 8 2017-01-08 67
## 9 2017-01-09 67
## 10 2017-01-10 80
## 11 2017-01-11 90
# Converting to a tsibble
mytsibble <- mytibble %>%
  as_tsibble(index = date)
mytsibble
## # A tsibble: 11 x 2 [1D]
## date y
## <date> <dbl>
## 1 2017-01-01 123
## 2 2017-01-02 39
## 3 2017-01-03 78
## 4 2017-01-04 52
## 5 2017-01-05 110
## 6 2017-01-06 59
## 7 2017-01-07 78
## 8 2017-01-08 67
## 9 2017-01-09 67
## 10 2017-01-10 80
## 11 2017-01-11 90
5
mytsibble %>% autoplot(y)

6

Working with tsibble objects

Example: Quarterly Australian Electricity Production

elec <- aus_production %>%
  select(Quarter, Electricity) %>%
  filter(year(Quarter) >= 1992)
elec
## # A tsibble: 74 x 2 [1Q]
## Quarter Electricity
## <qtr> <dbl>
## 1 1992 Q1 38332
## 2 1992 Q2 39774
## 3 1992 Q3 42246
## 4 1992 Q4 38498
## 5 1993 Q1 39460
## 6 1993 Q2 41356
## 7 1993 Q3 42949
## 8 1993 Q4 40974
## 9 1994 Q1 40162
## 10 1994 Q2 41199
## # … with 64 more rows
elec %>%
  autoplot(Electricity)

7

Seasonal plots

elec %>%
  gg_season(Electricity,
            labels = "both")

8

Seasonal plots

elec %>%
  gg_season(Electricity,
            labels = "left")

  • Data plotted against the individual “seasons” in which the data were observed. (In this case a “season” is a quarter.)

  • Something like a time plot except that the data from each season are overlapped.

  • Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.
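
To make the idea concrete, the following sketch draws a comparable plot by hand with ggplot2: electricity production against quarter, with one line per year. It is only an illustration of the concept, not how `gg_season()` is implemented; the helper columns `Year` and `Qtr` and the object names are introduced here and are not part of the original data.

```r
library(dplyr)
library(ggplot2)
library(lubridate)
library(tsibble)
library(tsibbledata)

# Same subset as `elec`, converted to a plain tibble with helper columns
elec_df <- aus_production %>%
  select(Quarter, Electricity) %>%
  filter(year(Quarter) >= 1992) %>%
  as_tibble() %>%
  mutate(Year = year(Quarter),
         Qtr  = quarter(as.Date(Quarter)))

# One line per year, all drawn over the same quarterly x-axis
season_plot <- ggplot(elec_df,
                      aes(x = Qtr, y = Electricity,
                          group = Year, colour = factor(Year))) +
  geom_line() +
  labs(x = "Quarter", colour = "Year")

season_plot
```

Because every year shares the same x-axis, a year that departs from the usual quarterly shape stands out as a line crossing the others.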

9
elec %>%
  gg_season(Electricity, labels = "left") +
  ylab("Electricity production in gigawatt hours") +
  ggtitle("Quarterly Australian Electricity Production")

10

Seasonal subseries plots

  • Data for each season are collected together and plotted as separate mini time series.

  • Enables the underlying seasonal pattern to be seen clearly, and changes in seasonality over time to be visualized.

elec %>% gg_subseries(Electricity)
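
A numeric companion to the subseries plot, sketched under the assumption that the per-season average is the quantity of interest: the block below computes the mean electricity production for each quarter from 1992 onward. The names `quarter_means`, `Qtr` and `mean_electricity` are introduced here for illustration.

```r
library(dplyr)
library(lubridate)
library(tsibble)
library(tsibbledata)

# Average production within each quarter across all years from 1992
quarter_means <- aus_production %>%
  select(Quarter, Electricity) %>%
  filter(year(Quarter) >= 1992) %>%
  as_tibble() %>%
  mutate(Qtr = quarter(as.Date(Quarter))) %>%
  group_by(Qtr) %>%
  summarise(mean_electricity = mean(Electricity))

quarter_means
```

Comparing the four means gives a quick summary of the seasonal pattern, while the panels of the subseries plot show how each quarter has changed over the years.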

11

Multiple seasonal periods

Half-hourly electricity demand for Victoria, Australia

vic_elec
## # A tsibble: 52,608 x 5 [30m] <Australia/Melbourne>
## Time Demand Temperature Date Holiday
## <dttm> <dbl> <dbl> <date> <lgl>
## 1 2012-01-01 00:00:00 4383. 21.4 2012-01-01 TRUE
## 2 2012-01-01 00:30:00 4263. 21.0 2012-01-01 TRUE
## 3 2012-01-01 01:00:00 4049. 20.7 2012-01-01 TRUE
## 4 2012-01-01 01:30:00 3878. 20.6 2012-01-01 TRUE
## 5 2012-01-01 02:00:00 4036. 20.4 2012-01-01 TRUE
## 6 2012-01-01 02:30:00 3866. 20.2 2012-01-01 TRUE
## 7 2012-01-01 03:00:00 3694. 20.1 2012-01-01 TRUE
## 8 2012-01-01 03:30:00 3562. 19.6 2012-01-01 TRUE
## 9 2012-01-01 04:00:00 3433. 19.1 2012-01-01 TRUE
## 10 2012-01-01 04:30:00 3359. 19.0 2012-01-01 TRUE
## # … with 52,598 more rows
12
vic_elec %>%
  autoplot(Demand)

13
vic_elec %>%
  gg_season(Demand)

14
vic_elec %>%
  gg_season(Demand, period = "month")

15
vic_elec %>%
  gg_season(Demand, period = "week")

16
vic_elec %>%
  gg_season(Demand, period = "day")

17
vic_elec %>%
  gg_season(Demand, period = "day") +
  theme(text = element_text(size = 14))

18
# install.packages("devtools")
# devtools::install_github("thiyangt/covid19srilanka")
library(covid19srilanka)
district.wise.cases
## # A tibble: 832 × 3
## Date District Count
## <date> <chr> <dbl>
## 1 2021-08-01 Colombo 71267
## 2 2021-08-01 Gampaha 56085
## 3 2021-08-01 Kalutara 33300
## 4 2021-08-01 Kandy 14576
## 5 2021-08-01 Kurunagala 15327
## 6 2021-08-01 Galle 14841
## 7 2021-08-01 Ratnapura 12267
## 8 2021-08-01 Matara 7778
## 9 2021-08-01 Matale 5694
## 10 2021-08-01 Nuwara Eliya 7917
## # … with 822 more rows
covidsl <- district.wise.cases %>%
  as_tsibble(index = Date,
             key = District)
covidsl
## # A tsibble: 832 x 3 [1D]
## # Key: District [26]
## Date District Count
## <date> <chr> <dbl>
## 1 2021-08-01 Ampara 2117
## 2 2021-08-02 Ampara 2125
## 3 2021-08-03 Ampara 2137
## 4 2021-08-04 Ampara 2144
## 5 2021-08-05 Ampara 2150
## 6 2021-08-06 Ampara 2150
## 7 2021-08-07 Ampara 2153
## 8 2021-08-08 Ampara 2175
## 9 2021-08-09 Ampara 2180
## 10 2021-08-10 Ampara 2190
## # … with 822 more rows
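
The key is needed here because every date appears once per district, so the index alone does not identify a row. A small sketch with made-up numbers (the `toy` data below is hypothetical, not taken from `district.wise.cases`) shows how to check this before converting:

```r
library(tibble)
library(tsibble)

# Hypothetical counts for two districts over the same three days
toy <- tibble(
  Date     = rep(as.Date("2021-08-01") + 0:2, times = 2),
  District = rep(c("Colombo", "Kandy"), each = 3),
  Count    = c(100, 110, 125, 20, 22, 27)
)

# Without a key, the repeated dates count as duplicates
is_duplicated(toy, index = Date)

# With District as the key, each (District, Date) pair is unique
is_duplicated(toy, key = District, index = Date)

toy_ts <- as_tsibble(toy, index = Date, key = District)
n_keys(toy_ts)  # number of separate series
```

Running the same check on real data before calling `as_tsibble()` can help explain a failed conversion.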
19
p <- covidsl %>% autoplot(Count)
p

20
plotly::ggplotly(p)
[Interactive plot: Count (y-axis) against Date [1D] (x-axis), one line per District]
21
library(coronavirus)
head(coronavirus)
## date province country lat long type cases
## 1 2020-01-22 Alberta Canada 53.9333 -116.5765 confirmed 0
## 2 2020-01-23 Alberta Canada 53.9333 -116.5765 confirmed 0
## 3 2020-01-24 Alberta Canada 53.9333 -116.5765 confirmed 0
## 4 2020-01-25 Alberta Canada 53.9333 -116.5765 confirmed 0
## 5 2020-01-26 Alberta Canada 53.9333 -116.5765 confirmed 0
## 6 2020-01-27 Alberta Canada 53.9333 -116.5765 confirmed 0
22
covidsl <- coronavirus %>%
  filter(country == "Sri Lanka") %>%
  select(date, type, cases)
head(covidsl)
## date type cases
## 1 2020-01-22 confirmed 0
## 2 2020-01-23 confirmed 0
## 3 2020-01-24 confirmed 0
## 4 2020-01-25 confirmed 0
## 5 2020-01-26 confirmed 0
## 6 2020-01-27 confirmed 1
covidsl <- covidsl %>%
  as_tsibble(index = date,
             key = type)
head(covidsl)
## # A tsibble: 6 x 3 [1D]
## # Key: type [1]
## date type cases
## <date> <chr> <int>
## 1 2020-01-22 confirmed 0
## 2 2020-01-23 confirmed 0
## 3 2020-01-24 confirmed 0
## 4 2020-01-25 confirmed 0
## 5 2020-01-26 confirmed 0
## 6 2020-01-27 confirmed 1
23
p <- covidsl %>% autoplot(cases)
p

24
library(gganimate)
p +
  transition_reveal(along = date)

25

pridiltal and thiyangt

Acknowledgements:

This work was supported in part by RETINA research lab funded by the OWSD, a program unit of United Nations Educational, Scientific and Cultural Organization (UNESCO).

Key References

All rights reserved by Thiyanga S. Talagala and Priyanga D Talagala

26
