
Potential Impact fraction and Population Attributable Fraction
Calculates the potential impact fraction (pif) or the population attributable fraction (paf) for a categorical exposure, considering an observed prevalence of p and a relative risk (or relative risk parameter) of beta.
Usage
paf(
  p,
  beta,
  var_p = NULL,
  var_beta = NULL,
  rr_link = "identity",
  rr_link_deriv = NULL,
  link = "log-complement",
  link_inv = NULL,
  link_deriv = NULL,
  conf_level = 0.95,
  quiet = FALSE
)

pif(
  p,
  p_cft = rep(0, length(p)),
  beta,
  var_p = NULL,
  var_beta = NULL,
  rr_link = "identity",
  rr_link_deriv = NULL,
  link = "log-complement",
  link_inv = NULL,
  link_deriv = NULL,
  conf_level = 0.95,
  type = "PIF",
  quiet = FALSE
)
Arguments
- p
Prevalence (proportion) of the exposed individuals for each of the N exposure levels.
- beta
Relative risk parameter for which a standard deviation is available (usually either the relative risk itself or the log of the relative risk, since most RRs, ORs and HRs come from exponential models).
- var_p
Estimate of the covariance matrix of p, where the entry var_p[i,j] represents the covariance between p[i] and p[j].
- var_beta
Estimate of the covariance matrix of beta, where the entry var_beta[i,j] represents the covariance between beta[i] and beta[j].
- rr_link
Link function such that the relative risk is given by rr_link(beta).
- rr_link_deriv
Derivative of the link function for the relative risk. If not supplied, the function tries to build it automatically from rr_link using Deriv::Deriv().
- link
Link function such that the pif confidence intervals stay within the expected bounds.
- link_inv
The inverse of link. For example, if link is logit this should be inv_logit.
- link_deriv
Derivative of the link function. If not supplied, the function tries to build it automatically from link using Deriv::Deriv().
- conf_level
Confidence level for the confidence interval (default 0.95).
- quiet
Whether to show messages.
- p_cft
Counterfactual prevalence (proportion) of the exposed individuals for each of the N exposure levels.
- type
Character string, either "PIF" (potential impact fraction) or "PAF" (population attributable fraction).
Note
This function assumes that p and beta have been pre-computed from the data and that the individual-level data are not accessible to the researchers. If either the individual-level data behind the exposure prevalence p or the individual-level data behind the risk estimate beta can be accessed, other methods (such as the pifpaf package) should be preferred.
Formulas
This function computes the potential impact fraction and its confidence intervals using Walter's formula:
$$ \textrm{PIF} = \dfrac{ \sum\limits_{i=1}^N p_i \text{RR}_i - \sum\limits_{i=1}^N p_i^{\text{cft}} \text{RR}_i }{ \sum\limits_{i=1}^N p_i \text{RR}_i }, \quad \text{ and } \quad \textrm{PAF} = \dfrac{ \sum\limits_{i=1}^N p_i \text{RR}_i - 1 }{ \sum\limits_{i=1}^N p_i \text{RR}_i } $$
in the case of N exposure categories, which is equivalent to Levin's formula when there is only 1 exposure category:
$$ \textrm{PIF} = \dfrac{ p (\text{RR} - 1) - p^{\text{cft}} (\text{RR} - 1) }{ 1 + p (\text{RR} - 1) } \quad \textrm{ and } \quad \textrm{PAF} = \dfrac{ p (\text{RR} - 1) }{ 1 + p (\text{RR} - 1) } $$
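As a quick arithmetic check of the formulas above, the following base R sketch evaluates the point estimates by hand. The helpers pif_point() and paf_point() are hypothetical names used for illustration and are not part of the package. The sketch assumes that p and rr list only the exposed categories and that the unexposed remainder has a relative risk of 1, in which case the sum of p_i RR_i over all categories equals 1 + sum(p * (rr - 1)).

```r
# Point estimates only: no variances and no confidence intervals.
# pif_point() and paf_point() are illustrative helpers, not package functions.
pif_point <- function(p, rr, p_cft = rep(0, length(p))) {
  stopifnot(length(p) == length(rr), length(p) == length(p_cft))
  excess     <- sum(p * (rr - 1))      # excess risk under the observed prevalence
  excess_cft <- sum(p_cft * (rr - 1))  # excess risk under the counterfactual
  (excess - excess_cft) / (1 + excess)
}

# The PAF is the PIF with a counterfactual prevalence of zero
paf_point <- function(p, rr) pif_point(p, rr)

# Single exposure category (Levin's formula)
paf_one  <- paf_point(p = 0.499, rr = 3.6)
pif_half <- pif_point(p = 0.499, rr = 1.6, p_cft = 0.499 / 2)

# Two exposed categories with made-up prevalences and relative risks
pif_multi <- pif_point(p = c(0.2, 0.1), rr = c(1.5, 2.5), p_cft = c(0.1, 0.05))
```

With these inputs, paf_one and pif_half agree with the point estimates printed in the Examples section (56.473% and 11.521%).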
Link functions for the PIF
By default the pif and paf calculations use the log-complement link, which guarantees that the impact fractions' intervals have valid values (between -Inf and 1).
Depending on the application the following link functions are also implemented:
- log-complement
To achieve fractions in (-Inf, 1). This is the function f(x) = ln(1 - x) with inverse finv(x) = 1 - exp(x)
- logit
To achieve strictly positive fractions in (0,1). This is the function f(x) = ln(x / (1 - x)) with inverse finv(x) = 1 / (1 + exp(-x))
- identity
An approximation for not-so-extreme fractions. This is the function f(x) = x with inverse finv(x) = x
- hawkins
Hawkins' fraction for controlling variance. This is the function f(x) = ln(x + sqrt(x^2 + 1)) with inverse finv(x) = 0.5 * exp(-x) * (exp(2 * x) - 1)
In general, logit should be preferred only if it is known with certainty that the fractions can only be positive (i.e. when all relative risks, including their CIs, are > 1, prevalence is > 0, and there is an epidemiological / biological explanation).
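The four link functions listed above can be written out in base R to confirm that each stated inverse undoes its link. This is only a sketch: the list name links and its layout are assumptions made for illustration and say nothing about how the package stores its links internally.

```r
# Each entry pairs a link f with its stated inverse finv.
links <- list(
  "log-complement" = list(
    f    = function(x) log(1 - x),
    finv = function(y) 1 - exp(y)
  ),
  "logit" = list(
    f    = function(x) log(x / (1 - x)),
    finv = function(y) 1 / (1 + exp(-y))
  ),
  "identity" = list(
    f    = function(x) x,
    finv = function(y) y
  ),
  "hawkins" = list(
    f    = function(x) log(x + sqrt(x^2 + 1)),
    finv = function(y) 0.5 * exp(-y) * (exp(2 * y) - 1)
  )
)

# A fraction inside the domain of all four links
x <- 0.56473

# Apply each link and then its inverse; every entry should return x
round_trip <- vapply(links, function(l) l$finv(l$f(x)), numeric(1))
```

The hawkins pair is the inverse hyperbolic sine and the hyperbolic sine, so links$hawkins$f(x) matches base R's asinh(x).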
Mathematically the variance that is calculated is
$$ \sigma_f^2 = \text{Var}\Big[ f(\textrm{PIF}) \Big] $$
and the intervals are constructed as:
$$ \text{CI} = f^{-1}\Big( f(\textrm{PIF}) \pm Z_{\alpha/2} \cdot \sigma_f \Big) $$
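As a worked instance of this interval construction, the base R sketch below applies a first-order delta method to the PAF with the log-complement link. It is a simplified illustration under stated assumptions, not the package's implementation: a single exposure category, beta equal to the relative risk (identity rr_link), and p and beta treated as independent. All variable names are chosen for illustration.

```r
# Inputs: prevalence, relative risk and their variances
p      <- 0.499
rr     <- 3.6
var_p  <- 0.001
var_rr <- 1
conf_level <- 0.95

# Point estimate from Levin's formula
denom   <- 1 + p * (rr - 1)
paf_hat <- p * (rr - 1) / denom

# Log-complement link: f(PAF) = log(1 - PAF) = -log(1 + p * (rr - 1))
f_hat <- log(1 - paf_hat)

# Gradient of f(PAF) with respect to (p, rr)
grad <- c(-(rr - 1) / denom, -p / denom)

# Delta method, assuming p and rr are independent (no covariance term)
sigma_f <- sqrt(sum(grad^2 * c(var_p, var_rr)))

# Interval on the link scale, mapped back with finv(y) = 1 - exp(y).
# The link is decreasing, so sort() restores lower/upper order.
z  <- qnorm(1 - (1 - conf_level) / 2)
ci <- sort(1 - exp(f_hat + c(-1, 1) * z * sigma_f))
```

Under these assumptions sigma_f is about 0.220 and ci is about 0.3299 to 0.7173, in line with the second call in the Examples section.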
Custom link functions can be implemented, as long as they are invertible in the range of interest, by providing the function link, its inverse link_inv and its derivative link_deriv. If no derivative is provided the package attempts to obtain it symbolically using Deriv::Deriv(); however, there is no guarantee that this will work for non-standard functions (i.e. functions that are not logarithmic / trigonometric / exponential).
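Before supplying a custom link together with its inverse and derivative, it can help to check numerically that the three functions agree with each other. The base R sketch below does this for an inverse hyperbolic tangent link, which is a hypothetical choice made for illustration (it maps fractions in (-1, 1) to the real line) and is not one of the links shipped with the package.

```r
# Hypothetical custom link for fractions in (-1, 1)
my_link       <- function(x) atanh(x)
my_link_inv   <- function(y) tanh(y)
my_link_deriv <- function(x) 1 / (1 - x^2)

# Grid of fractions inside the range of interest
x <- seq(-0.9, 0.9, by = 0.1)

# The inverse should undo the link
max_inverse_error <- max(abs(my_link_inv(my_link(x)) - x))

# The hand-written derivative should match a central finite difference
h <- 1e-6
numeric_deriv   <- (my_link(x + h) - my_link(x - h)) / (2 * h)
max_deriv_error <- max(abs(numeric_deriv - my_link_deriv(x)))
```

If both errors are close to zero, the three functions are mutually consistent and are candidates for the link, link_inv and link_deriv arguments.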
Link functions for beta
By default pif and paf use the identity link for beta, which means that the values of beta are the relative risks themselves and the variance var_beta corresponds to the relative risk's variance. Depending on the relative risk's source the following options are available:
- identity
Use when beta is the relative risk itself. This is the function f(beta) = beta with inverse finv(rr) = rr = beta
- exponential
Use when beta is the log of the relative risk. This is the exponential function f(beta) = exp(beta) with inverse finv(rr) = log(rr) = beta
As in the previous section, custom link functions can be implemented, as long as they are invertible in the range of interest, by providing the function rr_link and its derivative rr_link_deriv. If no derivative is provided the package attempts to obtain it symbolically using Deriv::Deriv(); however, there is no guarantee that this will work for non-standard functions (i.e. functions that are not logarithmic / trigonometric / exponential).
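When a relative risk is published with a confidence interval rather than a variance, beta and var_beta on the log scale can be recovered as sketched below in base R. The numbers are made up for illustration, and the sketch assumes the published interval is a symmetric 95% Wald interval on the log scale. The last step shows the first-order delta-method variance on the relative risk scale, which is where the derivative of the exponential link enters.

```r
# Hypothetical published estimate: RR with a 95% confidence interval
rr_hat   <- 3.6
rr_lower <- 2.0
rr_upper <- 6.48

# Parameter on the log scale, so that exp(beta) is the relative risk
beta <- log(rr_hat)

# Standard error of beta from the width of the interval on the log scale
se_beta  <- (log(rr_upper) - log(rr_lower)) / (2 * qnorm(0.975))
var_beta <- se_beta^2

# First-order delta method: Var(exp(beta)) is roughly exp(beta)^2 * Var(beta),
# because the derivative of exp(beta) is exp(beta)
var_rr <- exp(beta)^2 * var_beta
```

The pair (beta, var_beta) corresponds to the exponential link described above, while (rr_hat, var_rr) is the approximate equivalent for the identity link.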
Population Attributable Fraction
The population attributable fraction corresponds to the potential impact fraction at the theoretical minimum risk level. It is assumed that the theoretical minimum risk level is a relative risk of 1. If no counterfactual prevalence p_cft is specified, pif computes the population attributable fraction.
Examples
# This example comes from Levin 1953
# Relative risk of lung cancer given smoking was 3.6
# Proportion of individuals smoking was 49.9%
# Calculates PAF (i.e. counterfactual is no smoking)
paf(p = 0.499, beta = 3.6)
#> ! Assuming parameters `p` have no variance Use `var_p` to input their link_variances and/or covariance
#> ! Assuming parameters `beta` have no variance Use `var_beta` to input their link_variances and/or covariance
#>
#> ── Population Attributable Fraction ──
#>
#> PAF = 56.473% [95% CI: 56.473% to 56.473%]
#> standard_deviation(paf %) = 0.000
#> standard_deviation(link(paf)) = 0.000
# Assuming that beta and p had a variance
paf(p = 0.499, beta = 3.6, var_p = 0.001, var_beta = 1)
#>
#> ── Population Attributable Fraction ──
#>
#> PAF = 56.473% [95% CI: 32.990% to 71.726%]
#> standard_deviation(paf %) = 9.582
#> standard_deviation(link(paf)) = 0.220
# If the variance were too high a logit transform would be required
# The default log-complement link gives a negative lower bound for the interval:
paf(p = 0.499, beta = 3.6, var_p = 0.1, var_beta = 3)
#>
#> ── Population Attributable Fraction ──
#>
#> PAF = 56.473% [95% CI: -20.431% to 84.268%]
#> standard_deviation(paf %) = 22.601
#> standard_deviation(link(paf)) = 0.519
# Logit fixes it
paf(p = 0.499, beta = 3.6, var_p = 0.1, var_beta = 3,
    link = "logit", quiet = TRUE)
#>
#> ── Population Attributable Fraction ──
#>
#> PAF = 56.473% [95% CI: 17.628% to 88.720%]
#> standard_deviation(paf %) = 22.601
#> standard_deviation(link(paf)) = 0.919
# If the counterfactual were to halve the smoking population
pif(p = 0.499, beta = 1.6, p_cft = 0.499/2, var_p = 0.001,
    var_beta = 1, link = "logit", quiet = TRUE)
#>
#> ── Potential Impact Fraction ──
#>
#> PIF = 11.521% [95% CI: 0.746% to 69.285%]
#> standard_deviation(pif %) = 14.833
#> standard_deviation(link(pif)) = 1.455