
Takes a data frame with an `onset_date` and a `report_date` column and generates every observable combination of onset date and report date, stratified by the covariates specified in `strata`.

Usage

preprocess_for_nowcast(
  .disease_data,
  onset_date,
  report_date,
  strata = NULL,
  now,
  units,
  max_delay = Inf,
  data_type = c("auto", "linelist", "count")
)

Arguments

.disease_data

A time series of reporting data in aggregated line list format such that each row has a column for onset date, report date, and

onset_date

Name (as a quoted string) of the column of datatype Date that holds the date of case onset, e.g. "onset_week".

report_date

Name (as a quoted string) of the column of datatype Date that holds the date of case report, e.g. "report_week".

strata

Character vector of names of the strata included in the data.

now

An object of datatype Date indicating the date at which to perform the nowcast.

units

Time scale of reporting. Options: "1 day", "1 week".

max_delay

Maximum possible delay observed or considered for estimation of the delay distribution (numeric). Default: `Inf`

data_type

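One of `"auto"`, `"linelist"` or `"count"`. Use `"linelist"` if each row represents a single test, or `"count"` if there is a column named `n` with the number of tests that share that onset date and report date. With the default `"auto"`, the function chooses the type itself and prints a message stating what it assumed.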
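The two layouts that `data_type` distinguishes can be illustrated with a small base R sketch. The data frame below is hypothetical (its dates are borrowed from the examples on this page), and the conversion uses only `aggregate()` from base R; it is not taken from the package itself.

```r
# Hypothetical line list: one row per test
linelist <- data.frame(
  onset_week  = as.Date(c("1994-09-19", "1994-10-03", "1994-10-03", "1994-10-03")),
  report_week = as.Date(c("1994-09-19", "1994-10-03", "1994-10-10", "1994-10-10"))
)

# Count layout: one row per (onset, report) pair plus a column `n`
linelist$n <- 1L
counts <- aggregate(
  linelist["n"],
  by  = linelist[c("onset_week", "report_week")],
  FUN = sum
)

# Sort for readability
counts <- counts[order(counts$onset_week, counts$report_week), ]
rownames(counts) <- NULL
counts
```

A table shaped like `counts` is what the `"count"` option describes, while `linelist` without the helper column `n` corresponds to `"linelist"`.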

Value

A `data.frame` with the counts for every onset-delay combination. The counts are stored in a new column named `n`. Two further columns are added: `.tval` encodes the onset dates as consecutive numbers (the examples below start at 1) and `.delay` encodes the difference between onset and report, measured in the time scale given by `units`.
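To make the meaning of `n`, `.delay` and `.tval` concrete, here is a minimal base R sketch that rebuilds the table of the second example on this page by hand. It is not the package's implementation. It assumes weekly data, a grid running from the earliest onset week to `now`, and delays from 0 up to the largest observed delay; those assumptions are inferred from the example output only.

```r
# Line list from the second example on this page
df <- data.frame(
  onset_week  = as.Date(c("1994-09-19", "1994-10-03", "1994-10-03", "1994-10-10")),
  report_week = as.Date(c("1994-09-19", "1994-10-03", "1994-10-10", "1994-10-10"))
)
now <- as.Date("1994-10-10")

# Delay between onset and report, in whole weeks
df$.delay <- as.numeric(df$report_week - df$onset_week) / 7

# Observed counts per (onset, delay) pair
df$n <- 1L
obs <- aggregate(df["n"], by = df[c("onset_week", ".delay")], FUN = sum)

# Every onset week up to `now`, crossed with every delay (assumed grid)
grid <- expand.grid(
  .delay     = as.numeric(0:max(df$.delay)),
  onset_week = seq(min(df$onset_week), now, by = "1 week")
)
grid$report_week <- grid$onset_week + 7 * grid$.delay

# A report dated after `now` cannot have been observed yet
grid <- grid[grid$report_week <= now, ]

# Attach the observed counts; combinations never seen get n = 0
out <- merge(grid, obs, by = c("onset_week", ".delay"), all.x = TRUE)
out$n[is.na(out$n)] <- 0L
out <- out[order(out$onset_week, out$.delay), ]

# Number the onset weeks consecutively, starting at 1
out$.tval <- as.numeric(out$onset_week - min(out$onset_week)) / 7 + 1
rownames(out) <- NULL
out
```

The week of 1994-09-26 has no cases in `df`, yet it still appears in `out` with `n = 0`, which is the completion step the function description refers to.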

Examples

data(denguedat)

# Get counts by onset week and report week, considering all possible delays
preprocess_for_nowcast(denguedat, "onset_week", "report_week",
  units = "weeks", now = as.Date("1990-03-05")
)
#>  Assuming data is linelist-data where each observation is a test. If you are working with count-data set `data_type = "count"`
#> # A tibble: 55 × 5
#>    onset_week .delay report_week     n .tval
#>    <date>      <dbl> <date>      <int> <dbl>
#>  1 1990-01-01      0 1990-01-01      3     1
#>  2 1990-01-01      1 1990-01-08     24     1
#>  3 1990-01-01      2 1990-01-15     23     1
#>  4 1990-01-01      3 1990-01-22      8     1
#>  5 1990-01-01      4 1990-01-29      1     1
#>  6 1990-01-01      5 1990-02-05      0     1
#>  7 1990-01-01      6 1990-02-12      1     1
#>  8 1990-01-01      7 1990-02-19      0     1
#>  9 1990-01-01      8 1990-02-26      0     1
#> 10 1990-01-01      9 1990-03-05      1     1
#> # ℹ 45 more rows

# Complete an onset week (1994-09-26) in which no case had its onset
df <- data.frame(
  onset_week  = as.Date(c("1994-09-19", "1994-10-03", "1994-10-03", "1994-10-10")),
  report_week = as.Date(c("1994-09-19", "1994-10-03", "1994-10-10", "1994-10-10"))
)
preprocess_for_nowcast(df, "onset_week", "report_week",
  units = "weeks",
  now = as.Date("1994-10-10")
)
#>  Assuming data is linelist-data where each observation is a test. If you are working with count-data set `data_type = "count"`
#> # A tibble: 7 × 5
#>   onset_week .delay report_week     n .tval
#>   <date>      <dbl> <date>      <int> <dbl>
#> 1 1994-09-19      0 1994-09-19      1     1
#> 2 1994-09-19      1 1994-09-26      0     1
#> 3 1994-09-26      0 1994-09-26      0     2
#> 4 1994-09-26      1 1994-10-03      0     2
#> 5 1994-10-03      0 1994-10-03      1     3
#> 6 1994-10-03      1 1994-10-10      1     3
#> 7 1994-10-10      0 1994-10-10      1     4

# Complete the missing onset-delay combinations when one case is reported with a delay of 3 weeks
df <- data.frame(
  onset_week  = as.Date(c("1994-09-19", "1994-10-03", "1994-10-03", "1994-10-10")),
  report_week = as.Date(c("1994-10-10", "1994-10-03", "1994-10-10", "1994-10-10"))
)
preprocess_for_nowcast(df, "onset_week", "report_week",
  units = "weeks",
  now = as.Date("1994-10-10")
)
#>  Assuming data is linelist-data where each observation is a test. If you are working with count-data set `data_type = "count"`
#> # A tibble: 10 × 5
#>    onset_week .delay report_week     n .tval
#>    <date>      <dbl> <date>      <int> <dbl>
#>  1 1994-09-19      0 1994-09-19      0     1
#>  2 1994-09-19      1 1994-09-26      0     1
#>  3 1994-09-19      2 1994-10-03      0     1
#>  4 1994-09-19      3 1994-10-10      1     1
#>  5 1994-09-26      0 1994-09-26      0     2
#>  6 1994-09-26      1 1994-10-03      0     2
#>  7 1994-09-26      2 1994-10-10      0     2
#>  8 1994-10-03      0 1994-10-03      1     3
#>  9 1994-10-03      1 1994-10-10      1     3
#> 10 1994-10-10      0 1994-10-10      1     4

# Get counts by onset date and report week stratifying by gender and state
df <- data.frame(
  onset_week = sample(as.Date(c("1994-09-19", "1994-10-03", "1994-10-10")), 100, replace = TRUE),
  gender = sample(c("Male", "Female"), 100, replace = TRUE),
  state = sample(c("A", "B", "C", "D"), prob = c(0.5, 0.2, 0.2, 0.1), size = 100, replace = TRUE)
)
df$report_week <- df$onset_week +
  sample(c(lubridate::weeks(1), lubridate::weeks(2)), 100, replace = TRUE)
preprocess_for_nowcast(df, "onset_week", "report_week", c("gender", "state"),
  units = "weeks",
  now = as.Date("1994-09-26")
)
#>  Assuming data is linelist-data where each observation is a test. If you are working with count-data set `data_type = "count"`
#> # A tibble: 8 × 7
#>   onset_week .delay gender state report_week     n .tval
#>   <date>      <dbl> <chr>  <chr> <date>      <int> <dbl>
#> 1 1994-09-19      1 Male   A     1994-09-26      2     1
#> 2 1994-09-19      1 Male   D     1994-09-26      2     1
#> 3 1994-09-19      1 Male   B     1994-09-26      0     1
#> 4 1994-09-19      1 Male   C     1994-09-26      2     1
#> 5 1994-09-19      1 Female A     1994-09-26      2     1
#> 6 1994-09-19      1 Female D     1994-09-26      2     1
#> 7 1994-09-19      1 Female B     1994-09-26      3     1
#> 8 1994-09-19      1 Female C     1994-09-26      5     1