The Safe Blues Campus Experiment is taking place at The University of Auckland during 2021 and 2022. Live data from the experiment is presented on the Safe Blues Dashboard. Further, participant prize statistics are here.
Importantly, data from Phases 1, 2, and 3 of the experiment that took place during 2021 is now available. Download the 2021 data zip file (43.8Mb). Open publication of this fully anonymised data is approved within the experiment's ethics application and its revisions (UAHPEC22143).
This is a dataset generated by the Safe Blues campus experiment at the University of Auckland. This experiment involved the simulation of several simultaneous real-world epidemics by spreading safe virtual virus-like tokens (or strands) between participating smartphones. Here, we present anonymised and aggregated results regarding strand transmission and campus attendance.
The Safe Blues dataset is organised within the data folder, which includes the strand parameters, the transmission data, and the participant data. This folder is arranged in the following layout:
data
├── daily
│ ├── strands
│ │ ├── strand1.csv
│ │ ├── strand2.csv
│ │ └── ...
│ └── participants.csv
├── hourly
│ ├── strands
│ │ ├── strand1.csv
│ │ ├── strand2.csv
│ │ └── ...
│ └── participants.csv
└── strands.csv
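
As a rough sketch of how the layout above might be navigated from Python, the helpers below build the path to each table. The function names and the `root` parameter are hypothetical and not part of the dataset; only the folder and file names are taken from the layout, and the `strandN.csv` pattern is assumed to continue beyond the two files shown.

```python
from pathlib import Path

# The two time resolutions that appear as folders in the layout.
RESOLUTIONS = ("daily", "hourly")


def _check_resolution(resolution):
    """Reject anything other than the two folders shown in the layout."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"resolution must be one of {RESOLUTIONS}")


def strands_table(root="data"):
    """Path to the table of strand parameters."""
    return Path(root) / "strands.csv"


def participants_table(resolution, root="data"):
    """Path to the participant table at the given resolution."""
    _check_resolution(resolution)
    return Path(root) / resolution / "participants.csv"


def strand_series(strand_id, resolution, root="data"):
    """Path to one strand's time series (assumes the strandN.csv naming)."""
    _check_resolution(resolution)
    return Path(root) / resolution / "strands" / f"strand{strand_id}.csv"
```

For example, `strand_series(2, "hourly")` points at the hourly series for the second strand, relative to wherever the zip file was extracted.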
The strand parameters determine the epidemiological behaviour of each virtual virus-like token, including its transmissibility, incubation duration, and infection duration. A table of these parameters is stored in data/strands.csv. This table contains a row for each strand circulated during the campus experiment and contains the following columns:
- strand_id: the unique numerical identifier given to each strand,
- batch: the release batch or group of strands within which each strand was released,
- model: the epidemiological model used by each strand (either "SI", "SEI", "SIR", or "SEIR"),
- start_utc & start_nzt: the time at which each strand's reporting begins in either the UTC or NZST/NZDT timezone,
- seed_utc & seed_nzt: the time at which each strand's initial infections occur in either the UTC or NZST/NZDT timezone,
- stop_utc & stop_nzt: the time at which each strand's reporting ends in either the UTC or NZST/NZDT timezone,
- initial: a participant's probability of initial infection for each strand,
- strength: the strength of transmission for each strand,
- radius: the maximum radius of transmission for each strand,
- incubation_mean & incubation_shape: the mean and shape parameters of each strand's gamma-distributed incubation duration (missing when model == "SI" or model == "SIR"), and
- infection_mean & infection_shape: the mean and shape parameters of each strand's gamma-distributed infection duration (missing when model == "SI" or model == "SEI").

The transmission data provides aggregate measurements of the spread of strands throughout their reporting periods. We store the progression of the iᵗʰ strand (strand_id == i) as a daily time series in data/daily/strands/strand(i).csv and as an hourly time series in data/hourly/strands/strand(i).csv. These tables have a row for each time point (either daily or hourly) and have the following columns:
- strand_id: the unique numerical identifier given to the strand,
- time_utc & time_nzt: the time of each measurement in either the UTC or NZST/NZDT timezone,
- susceptible: the current number of participants that are susceptible to the strand,
- exposed: the current number of participants that are incubating the strand (missing when model == "SI" or model == "SIR"),
- infected: the current number of participants that are infected with the strand,
- recovered: the current number of participants that have recovered from the strand (missing when model == "SI" or model == "SEI"), and
- distance_factor: the artificial distance multiplier used to emulate social distancing.

The participant data presents aggregate information on the engagement of participants throughout the course of the experiment. This is also available as either a daily or hourly time series in data/daily/participants.csv and data/hourly/participants.csv, respectively. Again, these tables contain a row for each time point (either daily or hourly) and contain the following columns:
- time_utc & time_nzt: the time of each measurement in either the UTC or NZST/NZDT timezone,
- count_campus: the number of participants on campus during the current NZST/NZDT day,
- count_reporting: the number of participants whose phones are sending reports during the current UTC day,
- count_registered: the number of participants who have registered for the experiment by the current NZST/NZDT day,
- hours_mean: the mean number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5),
- hours_min: the minimum number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5),
- hours_q1: the lower quartile number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5),
- hours_q2: the median number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5),
- hours_q3: the upper quartile number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5), and
- hours_max: the maximum number of campus hours collected by participants who attended campus on the current NZST/NZDT day (missing when count_campus <= 5).

Note that, as mentioned above, all statistics reported in data/daily/participants.csv and data/hourly/participants.csv are in reference to NZST/NZDT days except for count_reporting, which is in reference to UTC days.
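
Several columns described above are documented as missing under certain conditions: exposed and recovered depend on the strand's model, and the hours_* statistics are withheld when count_campus <= 5. The following is a minimal sketch of how a reader might account for this in Python. It assumes that missing values appear as empty fields or as "NA", which should be checked against the actual files. The function names are hypothetical, and the sample rows (including the timestamp format) are invented for illustration and are not values from the dataset.

```python
import csv
import io

# Assumption: missing values are written as empty fields or "NA".
MISSING_MARKERS = {"", "NA"}


def missing_compartments(model):
    """Strand time-series columns documented as missing for a given model."""
    missing = set()
    if model in ("SI", "SIR"):
        missing.add("exposed")    # these models have no incubation stage
    if model in ("SI", "SEI"):
        missing.add("recovered")  # these models have no recovery stage
    return missing


def optional_float(text):
    """Convert a CSV field to float, or None if it is a missing marker."""
    text = text.strip()
    return None if text in MISSING_MARKERS else float(text)


def read_hours_mean(csv_text):
    """Yield (time_nzt, count_campus, hours_mean) from participants CSV text."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield (
            row["time_nzt"],
            int(row["count_campus"]),
            optional_float(row["hours_mean"]),
        )


# Invented rows for illustration only.
sample = (
    "time_nzt,count_campus,hours_mean\n"
    "2021-01-01 00:00:00,3,\n"
    "2021-01-02 00:00:00,12,4.5\n"
)
rows = list(read_hours_mean(sample))
```

Returning None for withheld statistics keeps them distinct from a genuine zero, which matters when averaging hours_mean across days.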