Daily Routines Visualisation
In this take-home exercise 4, we are going to analyse and describe the daily routines of two selected participants of the city of Engagement, Ohio USA by using appropriate visual analytics methods.
The data is processed by using appropriate tidyverse family of packages, whereas the statistical graphics are prepared using ggplot2/ViSiElse and its extensions.
Before we get started, it is important for us to ensure that the required R packages have been installed. If yes, we will load the R packages. If they have yet to be installed, we will install the R packages and load them onto R environment.
The packages required for this exercise are tidyverse, data.table, ViSiElse, zoo and patchwork.
packages = c('tidyverse','data.table','ViSiElse','zoo','patchwork')
for(p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p, character.only = T)
}
The original datasets were obtained from VAST Challenge 2022 in csv format. They show the daily routines of 1011 residents of Engagement, OH that have agreed to participate in this study.
The code chunk below imports datasets from a folder called
ActivityLogs into R by using fread() of
data.table and saves it as tibble data frame called
logs. But firstly, list.files() is used to define
the path where the data files reside and map_df() of
purrr is used to apply a function to elements of a
list, and then bind the data frames together.
logs <- list.files(path = "./data/ActivityLogs/",
pattern = "*.csv",
full.names = T) %>%
map_df(~fread(.))
As the datasets are large and the process is time consuming, we will
save the output tibble into an output file in rds format. We will
randomly select 2 participants from the joined dataset, namely
Participant #1 and #2 by using filter() of dplyr and we
will see how typical Monday and Sunday look like for each participant by
selecting May 2, 2022 and May 8, 2022. To do this, we will create a new
column called Date by using as.Date() of zoo.
In the future, we just need to read the saved rds data file into R by
using readRDS().
logs <- readRDS('data/logs.rds')
In this section, we will combine the 2 columns - currentMode and
sleepStatus into one column called Activity by using
unite().
The code chunk below will select participant #1 and his/her activities on Monday and save it as a new data frame called logs1_mon.
The code chunk below will select participant #2 and his/her activities on Monday and save it as a new data frame called logs2_mon.
The code chunk below will select participant #1 and his/her activities on Sunday and save it as a new data frame called logs1_sun.
The code chunk below will select participant #2 and his/her activities on Sunday and save it as a new data frame called logs2_sun.
m1 <- ggplot(data = logs1_mon,
aes(x = timestamp, y = Activity)) +
geom_point() +
labs(x = "Time",
y = "Activity",
title = "Typical Monday for Participant #1") +
theme(axis.title.y= element_text(angle=0))
m2 <- ggplot(data = logs2_mon,
aes(x = timestamp, y = Activity)) +
geom_point() +
labs(x = "Time",
y = "Activity",
title = "Typical Monday for Participant #2") +
theme(axis.title.y= element_text(angle=0))
m1 / m2
It is observed that participant #2 wakes up earlier than participant #1 and sleeps earlier at night. In the morning, participant #2 arrives at the office and immediately goes to restaurant for breakfast, while participant #1 does not go to restaurant at all throughout the day. The working hours for both participants are similar, but participant #2 starts and stops working earlier than participant #1.
s1<- ggplot(data = logs1_sun,
aes(x = timestamp, y = Activity)) +
geom_point() +
labs(x = "Time",
y = "Activity",
title = "Typical Sunday for Participant #1") +
theme(axis.title.y= element_text(angle=0))
s2<- ggplot(data = logs2_sun,
aes(x = timestamp, y = Activity)) +
geom_point() +
labs(x = "Time",
y = "Activity",
title = "Typical Sunday for Participant #2") +
theme(axis.title.y= element_text(angle=0))
s1 / s2
Similar to Monday’s routines, on Sunday participant #2 also wakes up and sleeps earlier than participant #1. Participant #2 spends his/her entire day at home, while participant #1 spends some time in the morning to go to the restaurant for breakfast.
Daily activities could also be visualised by using ViSiElse method. However, we must specify duration for each activity and put this on x-axis instead of timestamp. The below snapshot shows the example of ViSiElse graph.
The initial code chunk is shown here. To get a total duration for each activity, we will aggregate a new column called Duration.
logs1_v <- logs %>%
filter(participantId == '1') %>%
unite(Activity, c(currentMode,sleepStatus), sep = '-', remove = FALSE) %>%
mutate(Date=as.Date(timestamp),
EndTime = lead(timestamp),
Duration = difftime(EndTime,timestamp,units="mins")) %>%
filter(Date == '2022-05-02') %>%
pivot_wider(names_from = "Activity",
values_from = "Duration")
print(logs1_v)
# A tibble: 288 x 18
timestamp currentLocation participantId currentMode
<dttm> <chr> <int> <chr>
1 2022-05-02 00:00:00 POINT (-1526.9372331~ 1 AtHome
2 2022-05-02 00:05:00 POINT (-1526.9372331~ 1 AtHome
3 2022-05-02 00:10:00 POINT (-1526.9372331~ 1 AtHome
4 2022-05-02 00:15:00 POINT (-1526.9372331~ 1 AtHome
5 2022-05-02 00:20:00 POINT (-1526.9372331~ 1 AtHome
6 2022-05-02 00:25:00 POINT (-1526.9372331~ 1 AtHome
7 2022-05-02 00:30:00 POINT (-1526.9372331~ 1 AtHome
8 2022-05-02 00:35:00 POINT (-1526.9372331~ 1 AtHome
9 2022-05-02 00:40:00 POINT (-1526.9372331~ 1 AtHome
10 2022-05-02 00:45:00 POINT (-1526.9372331~ 1 AtHome
# ... with 278 more rows, and 14 more variables: hungerStatus <chr>,
# sleepStatus <chr>, apartmentId <int>, availableBalance <dbl>,
# jobId <int>, financialStatus <chr>, dailyFoodBudget <int>,
# weeklyExtraBudget <dbl>, Date <date>, EndTime <dttm>,
# `AtHome-Sleeping` <drtn>, `AtHome-Awake` <drtn>,
# `Transport-Awake` <drtn>, `AtWork-Awake` <drtn>
Marsja, E. (2021, February 14). How to Concatenate Two Columns (or More) in R – stringr, tidyr. https://www.marsja.se/how-to-concatenate-two-columns-or-more-in-r-stringr-tidyr/