Calculates the potential impact fraction pif or the population attributable fraction paf for a categorical exposure considering an observed prevalence of p and a relative risk (or relative risk parameter) of beta.

Usage

paf(
  p,
  beta,
  var_p = NULL,
  var_beta = NULL,
  rr_link = "identity",
  rr_link_deriv = NULL,
  link = "log-complement",
  link_inv = NULL,
  link_deriv = NULL,
  conf_level = 0.95,
  quiet = FALSE
)

pif(
  p,
  p_cft = rep(0, length(p)),
  beta,
  var_p = NULL,
  var_beta = NULL,
  rr_link = "identity",
  rr_link_deriv = NULL,
  link = "log-complement",
  link_inv = NULL,
  link_deriv = NULL,
  conf_level = 0.95,
  type = "PIF",
  quiet = FALSE
)

Arguments

p

Prevalence (proportion) of the exposed individuals for each of the N exposure levels.

beta

Relative risk parameter for which a standard deviation is available (usually either the relative risk itself or the log of the relative risk, as most RRs, ORs and HRs come from exponential models).

var_p

Estimate of the covariance matrix of p, where the entry var_p[i,j] represents the covariance between p[i] and p[j].

var_beta

Estimate of the covariance matrix of beta, where the entry var_beta[i,j] represents the covariance between beta[i] and beta[j].

rr_link

Link function such that the relative risk is given by rr_link(beta).

rr_link_deriv

Derivative of the link function for the relative risk. If not provided, the function tries to build it automatically from rr_link using Deriv::Deriv().

link

Link function chosen so that the pif confidence intervals stay within the expected bounds.

link_inv

The inverse of link. For example, if link is logit this should be inv_logit.

link_deriv

Derivative of the link function. If not provided, the function tries to build it automatically from link using Deriv::Deriv().

conf_level

Confidence level for the confidence interval (default 0.95).

quiet

Whether to show messages.

p_cft

Counterfactual prevalence (proportion) of the exposed individuals for each of the N exposure levels.

type

Character, either "PIF" (potential impact fraction) or "PAF" (population attributable fraction).

Note

This function assumes that p and beta have been pre-computed from the data and that the individual-level data are not accessible to the researchers. If either the individual-level data on the prevalence of exposure p or the individual-level data behind the risk estimate beta can be accessed, other methods (such as the pifpaf package) should be preferred.

Formulas

This function computes the potential impact fraction and its confidence intervals using Walter's formula:

$$ \textrm{PIF} = \dfrac{ \sum\limits_{i=1}^N p_i \text{RR}_i - \sum\limits_{i=1}^N p_i^{\text{cft}} \text{RR}_i }{ \sum\limits_{i=1}^N p_i \text{RR}_i }, \quad \text{ and } \quad \textrm{PAF} = \dfrac{ \sum\limits_{i=1}^N p_i \text{RR}_i - 1 }{ \sum\limits_{i=1}^N p_i \text{RR}_i } $$

in the case of N exposure categories, which is equivalent to Levin's formula when there is only 1 exposure category:

$$ \textrm{PIF} = \dfrac{ p (\text{RR} - 1) - p^{\text{cft}} (\text{RR} - 1) }{ 1 + p (\text{RR} - 1) } \quad \textrm{ and } \quad \textrm{PAF} = \dfrac{ p (\text{RR} - 1) }{ 1 + p (\text{RR} - 1) } $$
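As a rough illustration of the point estimates (not the package's implementation), the formulas above can be written in a few lines of base R. The helper name `pif_point` is hypothetical. The sketch assumes that `p` lists only the exposed categories, so the unexposed reference group (relative risk 1) is implied; under that assumption Walter's sums reduce to `1 + sum(p * (rr - 1))`.

```r
# Minimal sketch of the PIF/PAF point estimate (no variance, no intervals).
# `pif_point` is a hypothetical helper, not part of the package.
# Assumption: `p` and `rr` cover the exposed categories only; the unexposed
# reference group has relative risk 1 and prevalence 1 - sum(p).
pif_point <- function(p, rr, p_cft = rep(0, length(p))) {
  stopifnot(length(p) == length(rr), length(p) == length(p_cft))
  excess     <- sum(p * (rr - 1))      # excess risk at the observed prevalence
  excess_cft <- sum(p_cft * (rr - 1))  # excess risk under the counterfactual
  (excess - excess_cft) / (1 + excess)
}

# PAF is the special case p_cft = 0 (Levin's smoking figures from the Examples)
paf_levin <- pif_point(p = 0.499, rr = 3.6)

# PIF when the exposed prevalence is halved
pif_half <- pif_point(p = 0.499, rr = 1.6, p_cft = 0.499 / 2)

print(round(100 * c(PAF = paf_levin, PIF = pif_half), 3))
```

With these inputs the two values agree with the point estimates printed in the Examples section (56.473% and 11.521%).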

By default the pif and paf calculations use the log-complement link, which guarantees that the confidence intervals of the impact fractions take valid values (between -Inf and 1). Depending on the application, the following link functions are implemented:

log-complement

To achieve fractions between (-Inf, 1). This is the function f(x) = ln(1 - x) with inverse finv(x) = 1 - exp(x)

logit

To achieve strictly positive fractions in (0,1). This is the function f(x) = ln(x / (1 - x)) with inverse finv(x) = 1 / (1 + exp(-x))

identity

An approximation for not-so-extreme fractions. This is the function f(x) = x with inverse finv(x) = x

hawkins

Hawkins' fraction for controlling variance. This is the function f(x) = ln(x + sqrt(x^2 + 1)) with inverse finv(x) = 0.5 * exp(-x) * (exp(2 * x) - 1)

In general, logit should be preferred if it is known and certain that the fractions can only be positive (i.e. when all relative risks (including CIs) are > 1 and prevalence > 0 and there is an epidemiological / biological explanation).
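The four links listed above can be written out directly in base R to check that each stated inverse really undoes its link. This is only a sketch: the list `links` and its element names `f` and `finv` are illustrative and are not objects exported by the package.

```r
# Link functions and inverses as stated in the text (illustrative names).
links <- list(
  "log-complement" = list(f    = function(x) log(1 - x),
                          finv = function(y) 1 - exp(y)),
  "logit"          = list(f    = function(x) log(x / (1 - x)),
                          finv = function(y) 1 / (1 + exp(-y))),
  "identity"       = list(f    = function(x) x,
                          finv = function(y) y),
  "hawkins"        = list(f    = function(x) log(x + sqrt(x^2 + 1)),
                          finv = function(y) 0.5 * exp(-y) * (exp(2 * y) - 1))
)

x <- 0.56473  # a fraction inside (0, 1), so every link is defined

# Applying finv after f should return x for every link
round_trip <- vapply(links, function(l) l$finv(l$f(x)), numeric(1))
print(round_trip)
```

The hawkins pair is the inverse hyperbolic sine and the hyperbolic sine, so `links$hawkins$f(x)` matches base R's `asinh(x)`.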

Mathematically the variance that is calculated is

$$ \sigma_f^2 = \text{Var}\Big[ f(\textrm{PIF}) \Big] $$

and the intervals are constructed as:

$$ \text{CI} = f^{-1}\Big( f(\textrm{PIF}) \pm Z_{\alpha/2} \cdot \sigma_f \Big) $$
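To make the interval construction concrete, here is a hedged base-R sketch for a single exposure category with the log-complement link. It is a first-order (delta method) calculation written for illustration, not the package's code, and it assumes that p and the relative risk are uncorrelated. All variable names are illustrative.

```r
# Delta-method interval for a one-category PAF on the log-complement scale.
# Assumption: p and rr are independent, so only their variances enter.
p      <- 0.499
rr     <- 3.6
var_p  <- 0.001
var_rr <- 1
conf_level <- 0.95

denom   <- 1 + p * (rr - 1)
paf_hat <- p * (rr - 1) / denom

# f(PAF) = log(1 - PAF) = -log(1 + p * (rr - 1))
link_val <- log(1 - paf_hat)

# Gradient of f(PAF) with respect to (p, rr)
grad <- c(-(rr - 1) / denom, -p / denom)

# sigma_f: standard deviation of f(PAF)
sigma_f <- sqrt(grad[1]^2 * var_p + grad[2]^2 * var_rr)

# CI = f^{-1}( f(PAF) -/+ z * sigma_f ); f is decreasing, so sort the bounds
z  <- qnorm(1 - (1 - conf_level) / 2)
ci <- sort(1 - exp(link_val + c(-1, 1) * z * sigma_f))

print(round(c(sd_link = sigma_f, lower = ci[1], upper = ci[2]), 3))
```

Under these assumptions the sketch gives a link-scale standard deviation of about 0.220 and an interval of roughly 33.0% to 71.7%, in line with the second call in the Examples section.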

Custom link functions can be implemented, as long as they are invertible in the range of interest, by providing the function link, its inverse link_inv, and its derivative link_deriv. If no derivative is provided, the package attempts to compute it symbolically using Deriv::Deriv(); however, there is no guarantee that this will work for non-standard functions (i.e. anything other than logarithmic, trigonometric, or exponential functions).

By default pif and paf use the identity link for the relative risk (rr_link = "identity"), which means that the values of beta are the relative risks themselves and var_beta is the variance of the relative risk. Depending on the source of the relative risk, the following options are available:

identity

Use when beta is the relative risk itself. This is the function f(beta) = beta with inverse finv(rr) = rr = beta

exponential

The exponential function f(beta) = exp(beta) with inverse finv(rr) = log(rr) = beta

As in the previous section, custom link functions can be implemented, as long as they are invertible in the range of interest, by providing the function rr_link and its derivative rr_link_deriv. If no derivative is provided, the package attempts to compute it symbolically using Deriv::Deriv(); however, there is no guarantee that this will work for non-standard functions (i.e. anything other than logarithmic, trigonometric, or exponential functions).
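The role of the exponential option can be illustrated with a small base-R sketch: when beta is a log relative risk, the relative risk is exp(beta), and a first-order approximation of its variance scales var_beta by the squared derivative of the link. The numbers below are hypothetical, and whether the package combines these quantities in exactly this way is an assumption of the sketch.

```r
# Hypothetical inputs: a log relative risk and its variance
beta     <- log(3.6)
var_beta <- 0.05

# Exponential link: rr = exp(beta); its derivative is also exp(beta)
rr_link       <- function(b) exp(b)
rr_link_deriv <- function(b) exp(b)

rr <- rr_link(beta)

# First-order (delta method) variance of the relative risk
var_rr <- rr_link_deriv(beta)^2 * var_beta

print(c(rr = rr, var_rr = var_rr))
```

With the identity option the derivative is 1, so var_beta is used unchanged as the variance of the relative risk.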

Population Attributable Fraction

The population attributable fraction corresponds to the potential impact fraction at the theoretical minimum risk level. It is assumed that the theoretical minimum risk level is a relative risk of 1. If no counterfactual prevalence p_cft is specified, the model computes the population attributable fraction.

Examples

# This example comes from Levin 1953
# Relative risk of lung cancer given smoking was 3.6
# Proportion of individuals smoking was 49.9%
# Calculates PAF (i.e. counterfactual is no smoking)
paf(p = 0.499, beta = 3.6)
#> ! Assuming parameters `p` have no variance Use `var_p` to input their link_variances and/or covariance
#> ! Assuming parameters `beta` have no variance Use `var_beta` to input their link_variances and/or covariance
#> 
#> ── Population Attributable Fraction ──
#> 
#> PAF = 56.473% [95% CI: 56.473% to 56.473%]
#> standard_deviation(paf %) = 0.000
#> standard_deviation(link(paf)) = 0.000

# Assuming that beta and p each have a variance
paf(p = 0.499, beta = 3.6, var_p = 0.001, var_beta = 1)
#> 
#> ── Population Attributable Fraction ──
#> 
#> PAF = 56.473% [95% CI: 32.990% to 71.726%]
#> standard_deviation(paf %) = 9.582
#> standard_deviation(link(paf)) = 0.220

# If the variance is too high, a logistic transform is required.
# The default link generates an invalid (negative) lower bound:
paf(p = 0.499, beta = 3.6, var_p = 0.1, var_beta = 3)
#> 
#> ── Population Attributable Fraction ──
#> 
#> PAF = 56.473% [95% CI: -20.431% to 84.268%]
#> standard_deviation(paf %) = 22.601
#> standard_deviation(link(paf)) = 0.519

# Logit fixes it
paf(p = 0.499, beta = 3.6, var_p = 0.1, var_beta = 3,
    link = "logit", quiet = TRUE)
#> 
#> ── Population Attributable Fraction ──
#> 
#> PAF = 56.473% [95% CI: 17.628% to 88.720%]
#> standard_deviation(paf %) = 22.601
#> standard_deviation(link(paf)) = 0.919

# If the counterfactual were to halve the smoking population
pif(p = 0.499, beta = 1.6, p_cft = 0.499/2, var_p = 0.001,
    var_beta = 1, link = "logit", quiet = TRUE)
#> 
#> ── Potential Impact Fraction ──
#> 
#> PIF = 11.521% [95% CI: 0.746% to 69.285%]
#> standard_deviation(pif %) = 14.833
#> standard_deviation(link(pif)) = 1.455