regmdc is an R package for fitting nonparametric regression models with mixed derivative constraints. The available estimation methods are entirely monotonic regression, Hardy-Krause variation denoising, totally concave regression, MARS via LASSO, and their generalizations. Entirely monotonic regression and Hardy-Krause variation denoising are multivariate generalizations of isotonic regression and total variation denoising, introduced in Fang et al. (2021). MARS via LASSO is a LASSO variant of the usual MARS method, introduced in Ki et al. (2024+). It can be viewed not only as a multivariate generalization of locally adaptive regression splines but also as second-order Hardy-Krause variation denoising. Totally concave regression is a multivariate extension of univariate concave regression based on total concavity (see, e.g., Gal (2008)). A paper on totally concave regression is in preparation.
regmdc mainly consists of the following two generic functions:
- regmdc()
- predict_regmdc()
Given an estimation method, regmdc() builds the model fit to the data by solving the corresponding constrained LASSO problem. For details on the LASSO problem of each estimation method, see, for example, Section 3 of Fang et al. (2021) (for entirely monotonic regression and Hardy-Krause variation denoising) and Section 2 of Ki et al. (2024+) (for MARS via LASSO). Other related research is ongoing, and the resulting papers will be made available in the future. Given the model obtained from regmdc(), predict_regmdc() provides predictions at new data points.
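As a minimal sketch of this two-call workflow (assuming MOSEK and Rmosek are already set up; the data and parameter choices here are purely illustrative):

```r
library(regmdc)

# Toy data: a monotone function of two covariates plus Gaussian noise
set.seed(1)
X <- cbind(runif(50L), runif(50L))
y <- X[, 1] + X[, 2] + 0.1 * rnorm(50L)

# Fit an entirely monotonic regression model with at most
# first-order interactions (s = 1L)
fit <- regmdc(X, y, s = 1L, method = "em")

# Predict at a new point
predict_regmdc(fit, c(0.5, 0.5))
```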
You first need to install MOSEK, a software package for optimization, and then Rmosek, the R interface to MOSEK.
Once both are properly installed, you can install our package by running the following commands in R:
# install.packages("devtools")
devtools::install_github("DohyeongKi/regmdc")
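Before installing regmdc, it may help to confirm that Rmosek can actually reach your MOSEK installation. One quick check (assuming a recent Rmosek release, which exports mosek_version()):

```r
# If this prints a MOSEK version string, Rmosek is configured correctly.
library(Rmosek)
mosek_version()
```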
Here are some examples showing how regmdc can be used. Please refer to the documentation of each function in the package for more details.
library(regmdc)
################################################################################
# (1) Entirely monotonic regression
# (2) Hardy-Krause variation denoising
# (3) Generalization of entirely monotonic regression and Hardy-Krause variation denoising
################################################################################
fstar <- function(x) {(
(x[1] - 0.25 >= 0) + (x[2] - 0.25 >= 0)
+ (x[1] - 0.25 >= 0) * (x[2] - 0.25 >= 0)
)} # the true underlying function
X_design <- expand.grid(rep(list(seq(0, 1, length.out = 10L)), 3L)) # a design matrix
colnames(X_design) <- c("VarA", "VarB", "VarC")
theta <- apply(X_design, MARGIN = 1L, FUN = fstar) # the values of f* at the design points
sigma <- 0.1 # the standard deviation of the Gaussian noise
y <- theta + sigma * rnorm(nrow(X_design)) # an observation vector of a response variable
# Build an entirely monotonic regression model
# em_model <- regmdc(X_design, y, s = 2L, method = "em")
# em_model <- regmdc(X_design, y, s = 2L, method = "em", is_scaled = TRUE)
em_model <- regmdc(X_design, y, s = 2L, method = "em", is_lattice = TRUE)
# em_model <- regmdc(X_design, y, s = 2L, method = "em",
# is_monotonically_increasing = FALSE)
# em_model <- regmdc(X_design, y, s = 2L, method = "em",
# increasing_covariates = c(1L, 2L))
# em_model <- regmdc(X_design, y, s = 2L, method = "em",
# decreasing_covariates = c(3L))
# em_model <- regmdc(X_design, y, s = 2L, method = "em",
# increasing_covariates = c("VarA", "VarB"),
# decreasing_covariates = c("VarC"))
# Build a Hardy-Krause variation denoising model
# hk_model <- regmdc(X_design, y, s = 2L, method = "hk", V = 3.0)
# hk_model <- regmdc(X_design, y, s = 2L, method = "hk", V = 3.0, is_scaled = TRUE)
hk_model <- regmdc(X_design, y, s = 2L, method = "hk", V = 3.0, is_lattice = TRUE)
# Build a generalized model
# emhk_model <- regmdc(X_design, y, s = 2L, method = "emhk", V = 2.0, is_lattice = TRUE,
# variation_constrained_covariates = c(2L))
# emhk_model <- regmdc(X_design, y, s = 2L, method = "emhk", V = 2.0,
# is_monotonically_increasing = FALSE,
# variation_constrained_covariates = c("VarB"))
emhk_model <- regmdc(X_design, y, s = 2L, method = "emhk", V = 2.0,
increasing_covariates = c(1L),
variation_constrained_covariates = c(2L))
# emhk_model <- regmdc(X_design, y, s = 2L, method = "emhk", V = 2.0,
# increasing_covariates = c("VarA"),
# decreasing_covariates = c("VarC"),
# variation_constrained_covariates = c("VarB"))
# Generate predictions at new data points
X_pred <- c(1.0/3, 2.0/3, 1.0/3)
predict_regmdc(em_model, X_pred)
predict_regmdc(hk_model, X_pred)
predict_regmdc(emhk_model, X_pred)
X_pred <- matrix(c(1.0/3, 2.0/3, 1.0/3,
2.0/3, 1.0/3, 2.0/3),
ncol = 3L, byrow = TRUE)
predict_regmdc(em_model, X_pred)
predict_regmdc(hk_model, X_pred)
predict_regmdc(emhk_model, X_pred)
library(regmdc)
################################################################################
# (4) Totally concave regression
# (5) MARS via LASSO
# (6) Generalization of totally concave regression and MARS via LASSO
################################################################################
fstar <- function(x) {(
- max(x[1] - 0.25, 0) - max(x[2] - 0.25, 0)
- max(x[1] - 0.25, 0) * max(x[2] - 0.25, 0)
)} # the true underlying function
X_design <- cbind(runif(100), runif(100), runif(100))
colnames(X_design) <- c("VarA", "VarB", "VarC")
theta <- apply(X_design, MARGIN = 1L, FUN = fstar) # the values of f* at the design points
sigma <- 0.1 # the standard deviation of the Gaussian noise
y <- theta + sigma * rnorm(nrow(X_design)) # an observation vector
# Build a totally concave regression model
tc_model <- regmdc(X_design, y, s = 2L, method = "tc")
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# is_totally_concave = FALSE)
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# concave_covariates = c(1L, 2L))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# convex_covariates = c(3L))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# concave_covariates = c("VarA", "VarB"),
# convex_covariates = c("VarC"))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# extra_linear_covariates = c(3L))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# is_totally_concave = FALSE,
# extra_linear_covariates = c("VarC"))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# extra_linear_covariates = c(2L, 3L))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# concave_covariates = c("VarA"),
# extra_linear_covariates = c("VarC"))
# tc_model <- regmdc(X_design, y, s = 2L, method = "tc",
# number_of_bins = 20L,
# extra_linear_covariates = c(3L))
# Build a MARS via LASSO model
mars_model <- regmdc(X_design, y, s = 2L, method = "mars", V = 3.0)
# mars_model <- regmdc(X_design, y, s = 2L, method = "mars", V = 3.0,
# number_of_bins = 20L)
# mars_model <- regmdc(X_design, y, s = 2L, method = "mars", V = 3.0,
# number_of_bins = c(10L, 20L, 20L))
# mars_model <- regmdc(X_design, y, s = 2L, method = "mars", V = 3.0,
# number_of_bins = c(10L, 20L, NA))
# mars_model <- regmdc(X_design, y, s = 2L, method = "mars", V = 3.0,
# number_of_bins = c(10L, 20L, NA),
# extra_linear_covariates = c("VarC"))
# Build a generalized model
# tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
# variation_constrained_covariates = c(2L))
# tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
# is_totally_concave = FALSE,
# variation_constrained_covariates = c(2L))
tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
concave_covariates = c(1L),
variation_constrained_covariates = c(2L))
# tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
# concave_covariates = c("VarA"),
# convex_covariates = c("VarC"),
# variation_constrained_covariates = c("VarB"))
# tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
# concave_covariates = c(1L),
# variation_constrained_covariates = c(2L),
# extra_linear_covariates = c(3L))
# tcmars_model <- regmdc(X_design, y, s = 2L, method = "tcmars", V = 2.0,
# number_of_bins = 20L,
# concave_covariates = c("VarA"),
# variation_constrained_covariates = c("VarB"),
# extra_linear_covariates = c("VarC"))
# Generate predictions at new data points
X_pred <- c(1.0/3, 2.0/3, 1.0/3)
predict_regmdc(tc_model, X_pred)
predict_regmdc(mars_model, X_pred)
predict_regmdc(tcmars_model, X_pred)
X_pred <- matrix(c(1.0/3, 2.0/3, 1.0/3,
2.0/3, 1.0/3, 2.0/3),
ncol = 3L, byrow = TRUE)
predict_regmdc(tc_model, X_pred)
predict_regmdc(mars_model, X_pred)
predict_regmdc(tcmars_model, X_pred)
[1] Ki, D., Fang, B., and Guntuboyina, A. (2024+). MARS via LASSO. Accepted at Annals of Statistics. Available at https://arxiv.org/abs/2111.11694.
[2] Fang, B., Guntuboyina, A., and Sen, B. (2021). Multivariate extensions of isotonic regression and total variation denoising via entire monotonicity and Hardy-Krause variation. Annals of Statistics, 49(2), 769-792.
[3] Gal, S. G. (2008). Shape-Preserving Approximation by Real and Complex Polynomials. Birkhäuser, Boston.