Title: | Moment-Based Estimation for Hierarchical Models |
---|---|
Description: | Fast moment-based hierarchical model fitting. Implements methods from the papers "Fast Moment-Based Estimation for Hierarchical Models," by Perry (2017) and "Fitting a Deeply Nested Hierarchical Model to a Large Book Review Dataset Using a Moment-Based Estimator," by Zhang, Schmaus, and Perry (2018). |
Authors: | Patrick O. Perry [aut, cre], Timothy Sweetser [ctb], Kyle Schmaus [ctb], Ningshan Zhang [aut, ctb] |
Maintainer: | Patrick O. Perry <[email protected]> |
License: | Apache License (== 2.0) | file LICENSE |
Version: | 0.6 |
Built: | 2024-11-07 04:27:11 UTC |
Source: | https://github.com/patperry/r-mbest |
Fast moment-based hierarchical model fitting. Implements methods from the papers "Fast Moment-Based Estimation for Hierarchical Models," by Perry (2017) and "Fitting a Deeply Nested Hierarchical Model to a Large Book Review Dataset Using a Moment-Based Estimator," by Zhang, Schmaus, and Perry (2018).
Package: | mbest |
Version: | 0.6 |
Date: | 2018-05-24 |
Title: | Moment-Based Estimation for Hierarchical Models |
Authors@R: | c(person(c("Patrick", "O."), "Perry", role = c("aut", "cre"), email = "[email protected]"), person("Timothy", "Sweetser", role = "ctb"), person("Kyle", "Schmaus", role = "ctb"), person("Ningshan", "Zhang", role = c("aut", "ctb"))) |
Maintainer: | Patrick O. Perry <[email protected]> |
Description: | Fast moment-based hierarchical model fitting. Implements methods from the papers "Fast Moment-Based Estimation for Hierarchical Models," by Perry (2017) and "Fitting a Deeply Nested Hierarchical Model to a Large Book Review Dataset Using a Moment-Based Estimator," by Zhang, Schmaus, and Perry (2018). |
Depends: | nlme (>= 3.1-124) |
Imports: | abind, bigmemory, foreach, lme4, logging |
Suggests: | testthat |
LazyData: | Yes |
License: | Apache License (== 2.0) | file LICENSE |
URL: | https://github.com/patperry/r-mbest |
Encoding: | UTF-8 |
Config/pak/sysreqs: | cmake make |
Repository: | https://patperry.r-universe.dev |
RemoteUrl: | https://github.com/patperry/r-mbest |
RemoteRef: | HEAD |
RemoteSha: | a4414178b72d67a6d64c04a34faee96eb03534a4 |
Author: | Patrick O. Perry [aut, cre], Timothy Sweetser [ctb], Kyle Schmaus [ctb], Ningshan Zhang [aut, ctb] |
Index of help topics:
firthglm.fit Fitting Generalized Linear Models with Firth's Bias Reduction fixef Mixed Effects mbest-package Moment-Based Estimation for Hierarchical Models mhglm Fitting Moment Hierarchical Generalized Linear Models mhglm.control Auxiliary for Controlling Moment Heirarchical GLM Fitting mhglm_sim Simulate response patterns model.matrix.mhglm Terms and Model Matrix predict.mhglm Prediction
Basic usage is to call mhglm
.
P. O. Perry (2017) "Fast moment-based estimation for hierarchical models."
N. Zhang, K. Schmaus, and P. O. Perry (2018) "Fitting deeply-nested hierarchical models to a large book review dataset using moment-based estimators."
mhglm
,
fixef.mhglm
,
ranef.mhglm
,
VarCorr.mhglm
,
predict.mhglm
.
Get the fixed effects, random effect variances, and empirical Bayes random effect estimates.
## S3 method for class 'mhglm' fixef(object, ...) ## S3 method for class 'mhglm' ranef(object, condVar = FALSE, ...) ## S3 method for class 'mhglm' vcov(object, ...) ## S3 method for class 'mhglm' VarCorr(x, sigma = 1, ...) ## S3 method for class 'mhglm_ml' fixef(object, ...) ## S3 method for class 'mhglm_ml' ranef(object, condVar = FALSE, ...) ## S3 method for class 'mhglm_ml' vcov(object, ...) ## S3 method for class 'mhglm_ml' VarCorr(x, sigma = 1, ...)
## S3 method for class 'mhglm' fixef(object, ...) ## S3 method for class 'mhglm' ranef(object, condVar = FALSE, ...) ## S3 method for class 'mhglm' vcov(object, ...) ## S3 method for class 'mhglm' VarCorr(x, sigma = 1, ...) ## S3 method for class 'mhglm_ml' fixef(object, ...) ## S3 method for class 'mhglm_ml' ranef(object, condVar = FALSE, ...) ## S3 method for class 'mhglm_ml' vcov(object, ...) ## S3 method for class 'mhglm_ml' VarCorr(x, sigma = 1, ...)
object , x
|
an |
sigma |
a factor by which to scale the random effect variance-covariance matrix. |
condVar |
a logical indicating whether conditional covariance matrices for the random effects should be returned. |
... |
further arguments passed to or from other methods. |
fixef
returnes the fixed effects, while vcov
returns the
variance-covariance matrix of the fixed effect estimates.
VarCorr
returns the random effect covariance matrix. ranef
returns the empirical Bayes random effect estimates.
These functions behave like their counterparts in the nlme package.
fixef
, ranef
, VarCorr
,
from package nlme.
A drop-in replacement for glm.fit
which uses Firth's
bias-reduced estimates instead of maximum likelihood.
firthglm.fit(x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(...), intercept = TRUE, singular.ok = TRUE, ...) firthglm.control(epsilon = 1e-8, maxit = 25, qr.tol = 1e-7, improve.tol = 1e-4, curvature.tol = 0.9, linesearch.method = "linesearch", linesearch.maxit = 20, trace = FALSE)
firthglm.fit(x, y, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(...), intercept = TRUE, singular.ok = TRUE, ...) firthglm.control(epsilon = 1e-8, maxit = 25, qr.tol = 1e-7, improve.tol = 1e-4, curvature.tol = 0.9, linesearch.method = "linesearch", linesearch.maxit = 20, trace = FALSE)
x , y , weights , start , etastart , mustart , offset , family , control , intercept , singular.ok , ...
|
arguments that have the same functions as for |
qr.tol |
tolerance parameter for determining the rank of |
epsilon , maxit
|
convergence parameters for the quasi-Newton method. |
linesearch.method |
line search methods (one of "linesearch", "backtrack", or "blindsearch") |
improve.tol , curvature.tol , linesearch.maxit
|
tolerance parameters for the linesearch procedure. |
trace |
logical indicating if output should be produced for each iteration. |
Firth's modified score function gives rise to estimates with smaller biases than their maximum likelihood counterparts. Unlike the maximum likelihood estimates, if the design matrix is of full rank, then the Firth bias-reduced estimate is finite.
By default, the fitting procedure uses a quasi-Newton optimization method, with a More-Thuente linesearch.
firthglm.fit
returns an object having the same components that a call
to glm.fit
would produce.
Currently, only families with canonical link functions are supported.
Patrick O. Perry
Firth, D. (1993) Bias reduction of maximum likelihood estimates. Biometrika 80, 27-–38.
More, J. J. and Thuente, D. J. (1994) Line search algorithms with guaranteed sufficient decrease. ACM Transactions on Mathematical Software 20 286–307.
logistf
(package logistf) and
brglm
(package brglm) for alternative
implementations of Firth's bias-reduced estimators.
## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) ## Use firthglm.fit instead of glm.fit: glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(), method="firthglm.fit") summary(glm.D93)
## Dobson (1990) Page 93: Randomized Controlled Trial : counts <- c(18,17,15,20,10,20,25,13,12) outcome <- gl(3,1,9) treatment <- gl(3,3) ## Use firthglm.fit instead of glm.fit: glm.D93 <- glm(counts ~ outcome + treatment, family = poisson(), method="firthglm.fit") summary(glm.D93)
mhglm
is used to fit a moment hierarchical generalized linear model of one level.
mhglm_ml
is used to fit a moment hierarchical generalized linear model of arbitrary number of levels
(including one level).
mhglm(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(), model = TRUE, method = "mhglm.fit", x = FALSE, z = FALSE, y = TRUE, group = TRUE, contrasts = NULL) mhglm.fit(x, z, y, group, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, dispersion = NULL) mhglm_ml(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(), model = TRUE, method = "mhglm_ml.fit", x = FALSE, z = FALSE, y = TRUE, group = TRUE, contrasts = NULL) mhglm_ml.fit(x, z, y, group, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE)
mhglm(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(), model = TRUE, method = "mhglm.fit", x = FALSE, z = FALSE, y = TRUE, group = TRUE, contrasts = NULL) mhglm.fit(x, z, y, group, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE, dispersion = NULL) mhglm_ml(formula, family = gaussian, data, weights, subset, na.action, start = NULL, etastart, mustart, offset, control = list(), model = TRUE, method = "mhglm_ml.fit", x = FALSE, z = FALSE, y = TRUE, group = TRUE, contrasts = NULL) mhglm_ml.fit(x, z, y, group, weights = rep(1, nobs), start = NULL, etastart = NULL, mustart = NULL, offset = rep(0, nobs), family = gaussian(), control = list(), intercept = TRUE)
formula , family , data , weights , subset , na.action , start , etastart , mustart , offset , model , contrasts , intercept
|
These arguments
are analogous to the similarly-named arguments for the |
control |
a list of parameters for controlling the fitting
process. For |
method |
the method to be used in fitting the model. The default
method |
x , z , y , group
|
For For For |
dispersion |
If |
These functions are analogues of glm
and
glm.fit
, meant to be used for fitting hierarchical
generalized linear models. A typical predictor has the form
response ~ terms + (reterms | group)
where
response
is the (numeric) response vector, terms
is a
series of terms which specifies a linear predictor for
response
, reterms
is a series of terms with random
coefficients (effects), and group
is a grouping factor; observations
with the same grouping factor share the same random effects.
mhglm
and mhglm.fit
only allow one random effect term,
along with a single level of hierarchy.
mghlm_ml
and mhglm_ml.fit
allow multiple random effect terms so
long as levels of random effects are hierarchically nested. If the random
effect design matrices are the same for each level, a predictor has the form
response ~ terms + (reterms | g1/.../gQ)
. If the random effects design
matrices differ from level to level, colons are used to delineate the nesting
structure; for example,
response ~ fe + (re1 | g1) + (re2 | g2:g1) + (re3 | g3:g2:g1)
.
mhglm
allows ||
in the formula
response ~ terms + (reterms || group)
to indicate that random effects
are independent, that is the random effects covariance matrix has non-zero
value only on its diagonal.
mhglm_ml
currently does not support ||
, to indicate indpenedent
random effects, set control=list(diagcov = TRUE)
.
mhglm
returns an object of class inheriting from "mhglm"
.
mhglm_ml
returns an object of class inheriting from "mhglm_ml"
.
The function summary
can be used to obtain or print a summary
of the results.
The generic accessor functions fixef
, ranef
,
VarCorr
, sigma
, fitted.values
and
residuals
can be used to extract various useful features of the
value returned by mhglm
or mhglm_ml
.
If the moment-based random effect covariance is not positive-semidefinite, then a warning will be issued, and a projection of the estimate to the positive-semidefinite cone will be used instead.
P. O. Perry (2017) "Fast moment-based estimation for hierarchical models."
N. Zhang, K. Schmaus, and P. O. Perry (2018) "Fitting deeply-nested hierarchical models to a large book review dataset using moment-based estimators."
terms.mhglm
, model.matrix.mhglm
, and
predict.mhglm
for mhglm
methods, and the
generic functions fitted.values
, residuals
,
summary
, vcov
, and weights
.
Generic functions fixef
, ranef
,
VarCorr
, and sigma
for features
related to mixed effect models.
glmer
(package lme4) for
fitting generalized linear mixed models with likelihood-based estimates.
library(lme4) ## The following examples are adapted from lme4: (fm1 <- mhglm(Reaction ~ Days + (Days | Subject), gaussian, sleepstudy)) (fm2 <- mhglm(Reaction ~ Days + (Days || Subject), gaussian, sleepstudy)) (gm <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)) ## The following examples are for multilevel models g_data <- mhglm_sim(n = 30, m_per_level = c(10, 5, 2), sd_intercept = c(1, 1, 1), sd_slope = c(1, 1, 1), family = "gaussian", seed = 12345) (m1 <- mhglm_ml(y ~ 1 + x + (1 + x | g1/g2/g3), gaussian, g_data)) # or equivalent (m2 <- mhglm_ml(y ~ 1 + x + (1 + x | g1) + (1 + x | g2:g1) + (1 + x | g3:g2:g1), gaussian, g_data))
library(lme4) ## The following examples are adapted from lme4: (fm1 <- mhglm(Reaction ~ Days + (Days | Subject), gaussian, sleepstudy)) (fm2 <- mhglm(Reaction ~ Days + (Days || Subject), gaussian, sleepstudy)) (gm <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial)) ## The following examples are for multilevel models g_data <- mhglm_sim(n = 30, m_per_level = c(10, 5, 2), sd_intercept = c(1, 1, 1), sd_slope = c(1, 1, 1), family = "gaussian", seed = 12345) (m1 <- mhglm_ml(y ~ 1 + x + (1 + x | g1/g2/g3), gaussian, g_data)) # or equivalent (m2 <- mhglm_ml(y ~ 1 + x + (1 + x | g1) + (1 + x | g2:g1) + (1 + x | g3:g2:g1), gaussian, g_data))
Simulate response patterns for generalized linear models of gaussian
or
binomial
families, with both an intercept and slope covariate. Used
primarily for testing purposes.
mhglm_sim(n, m_per_level, sd_intercept, sd_slope, family = c("gaussian", "binomial"), seed)
mhglm_sim(n, m_per_level, sd_intercept, sd_slope, family = c("gaussian", "binomial"), seed)
n |
an integer scalar, the number of observations at the lowest grouping level. |
m_per_level |
an integer vector, the number of grouping levels nested under the level above. |
sd_intercept |
a numeric vector, the standard deviations of the intercept random effects. |
sd_slope |
a numeric vector, the standard deviations of the slope random effects. |
family |
a character scalar, either |
seed |
a single value, interpreted as an integer, or |
returns a data.frame with design matrix, response, and group levels.
mhglm_sim(n = 2, m_per_level = c(3, 3), sd_intercept = c(1, 2), sd_slope = c(2, 1), family = "gaussian", seed = 123) mhglm_sim(n = 2, m_per_level = c(3, 3), sd_intercept = c(1, 2), sd_slope = c(2, 1), family = "binomial", seed = 123)
mhglm_sim(n = 2, m_per_level = c(3, 3), sd_intercept = c(1, 2), sd_slope = c(2, 1), family = "gaussian", seed = 123) mhglm_sim(n = 2, m_per_level = c(3, 3), sd_intercept = c(1, 2), sd_slope = c(2, 1), family = "binomial", seed = 123)
Auxiliary function for mhglm
fitting. Typically only used
internally by mhglm.fit
, but may be used to construct a
control argument to either function.
mhglm.control(standardize = TRUE, steps = 1, parallel = FALSE, diagcov = FALSE, fit.method = "firthglm.fit", fixef.rank.warn = FALSE, cov.rank.warn = FALSE, cov.psd.warn = TRUE, fit.control = list(...), ...) mhglm_ml.control(standardize = FALSE, steps = 1, parallel = FALSE, diagcov = FALSE, fit.method = "firthglm.fit", fixef.rank.warn = FALSE, cov.rank.warn = FALSE, cov.psd.warn = FALSE, fit.control = list(...), ...)
mhglm.control(standardize = TRUE, steps = 1, parallel = FALSE, diagcov = FALSE, fit.method = "firthglm.fit", fixef.rank.warn = FALSE, cov.rank.warn = FALSE, cov.psd.warn = TRUE, fit.control = list(...), ...) mhglm_ml.control(standardize = FALSE, steps = 1, parallel = FALSE, diagcov = FALSE, fit.method = "firthglm.fit", fixef.rank.warn = FALSE, cov.rank.warn = FALSE, cov.psd.warn = FALSE, fit.control = list(...), ...)
standardize |
logitcal indicating if predictors should be standardized before moment-based fitted |
steps |
number of refinement steps |
parallel |
fit the group-specific estimates in parallel rather than sequentially |
diagcov |
estimate random effect covairance matrix with diagonal approximation |
fit.method |
method for obtaining group-specific effect estimates |
fixef.rank.warn |
if TRUE, print warnings when fixef is unidentifiable |
cov.rank.warn |
if TRUE, print warnings when covariance matrix is unidentifiable |
cov.psd.warn |
if TRUE, print warnings when moment based estimates of covariance matrix is not positive semi-definite |
fit.control |
control parameters for |
... |
arguments to be used to form the |
Setting standardize = TRUE
ensures that the procedure is equivariant,
and generally leads to better estimation performance.
Right now standardize = TRUE
is not allowed for mhglm_ml
.
The steps
argument gives the number of refinement steps for the moment
based parameters. In each step, the previous fixed effect and random effect
covariance matrix estimates are used to weight the subpopulation-specific
effect estimates. In principle, higher values of steps
could lead to
more accurate estimates, but in simulations, the differences are negligible.
A list with components named as the arguments.
mhglm.fit
, the fitting procedure used by
mhglm
.
firthglm.fit
, the default subpopulation-specific fitting method.
library(lme4) # for cbpp data # The default fitting method uses Firth's bias-corrected estimates (gm.firth <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial, control=mhglm.control(fit.method="firthglm.fit"))) # Using maximum likelihood estimates is less reliable (gm.ml <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial, control=mhglm.control(fit.method="glm.fit")))
library(lme4) # for cbpp data # The default fitting method uses Firth's bias-corrected estimates (gm.firth <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial, control=mhglm.control(fit.method="firthglm.fit"))) # Using maximum likelihood estimates is less reliable (gm.ml <- mhglm(cbind(incidence, size - incidence) ~ period + (1 | herd), data = cbpp, family = binomial, control=mhglm.control(fit.method="glm.fit")))
Get the terms or model matrix from an mhglm
object.
## S3 method for class 'mhglm' model.matrix(object, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm_ml' model.matrix(object, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm' terms(x, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm_ml' terms(x, type = c("fixed", "random"), ...)
## S3 method for class 'mhglm' model.matrix(object, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm_ml' model.matrix(object, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm' terms(x, type = c("fixed", "random"), ...) ## S3 method for class 'mhglm_ml' terms(x, type = c("fixed", "random"), ...)
object , x
|
an |
type |
which terms to get (for the fixed or for the random effects). |
... |
further arguments passed to or from other methods. |
predict
gives empirical Bayes predictions of the response, while
sigma
gives the dispersion parameter.
## S3 method for class 'mhglm' predict(object, newdata = NULL, type = c("link", "response"), se.fit = FALSE, na.action = na.pass, ...) ## S3 method for class 'mhglm' sigma(object, ...)
## S3 method for class 'mhglm' predict(object, newdata = NULL, type = c("link", "response"), se.fit = FALSE, na.action = na.pass, ...) ## S3 method for class 'mhglm' sigma(object, ...)
object |
an |
newdata , type , se.fit , na.action
|
these arguments behave as in |
... |
further arguments passed to or from other methods. |
The predict
function gives empirical Bayes posterior mean estimates of
response values. If se.fit = TRUE
, then the conditional variances of
the random effects are used along with the fixed effect variance-covariance
matrix to estimate the standard errors.
The sigma
function gives the square root of the dispersion parameter
or the model; for linear models, this is the error standard deviation.