---
title: "readme"
author: "X. Bry, G. Cornu, F. Mortier and C. Trottier"
date: "25 June 2018"
output:
  md_document:
    variant: gfm
link-citations: true
bibliography: assets/SCGLR.bib
---
<!-- File generated from README.Rmd. Changes must be done from there -->
# SCGLR <img src="man/figures/SCGLR_small.jpg" align="right">
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/SCGLR)](https://cran.r-project.org/package=SCGLR)
## Introduction
**SCGLR** is an open-source implementation of Supervised Component Generalized Linear Regression
[@bry13;-@bry16;-@bry18], which identifies, among a large set of potentially multicollinear predictors, the strong dimensions most predictive of a set of responses.
**SCGLR** is an extension of partial least squares regression (PLSR) to the uni- and multivariate generalized linear framework. PLSR is particularly well suited to analyzing a large array of explanatory variables, and many studies have demonstrated its predictive performance in biological fields such as genetics [@boulesteix07] and ecology [@carrascal09]. PLSR is well adapted to continuous variables: it maximizes the covariance between linear combinations of the dependent variables and linear combinations of the covariates. **SCGLR**, by contrast, is also suited to non-Gaussian outcomes and non-continuous covariates.
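The covariance criterion mentioned above can be illustrated with a few lines of base R. This is only a sketch of the PLSR side of the story, on simulated data with hypothetical names (`X`, `Y`, `B`, `u`, `v`); it is not how **SCGLR** itself builds components. It relies on the standard result that the unit-norm weight vectors maximizing the covariance between a combination of covariates and a combination of responses are the leading singular vectors of the cross-product matrix.

```r
# Illustrative only: first pair of PLS-type components on simulated data.
set.seed(1)
n <- 200
X <- scale(matrix(rnorm(n * 5), n, 5), scale = FALSE)            # centred covariates
B <- matrix(rnorm(5 * 3), 5, 3)                                  # arbitrary coefficients
Y <- scale(X %*% B + matrix(rnorm(n * 3), n, 3), scale = FALSE)  # centred responses

# Leading singular vectors of t(X) %*% Y give the unit-norm weights
# u (covariates) and v (responses) that maximise cov(X u, Y v).
s <- svd(crossprod(X, Y))
u <- s$u[, 1]
v <- s$v[, 1]

f <- X %*% u                 # covariate component
g <- Y %*% v                 # response component
max_cov <- drop(cov(f, g))   # equals the first singular value divided by (n - 1)
```

Any other pair of unit-norm weight vectors yields a covariance no larger than `max_cov`.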
**SCGLR** is a model-based approach that extends PLS [@tenenhaus98], PCA on instrumental variables [@sabatier89], canonical correspondence analysis [@terbraak87], and other related empirical methods, by capturing the trade-off between goodness-of-fit and common structural relevance of explanatory components. The notion of structural relevance was introduced in [@bry15].
**SCGLR** can deal with covariates partitioned into several groups called "themes", plus a group of additional covariates. Each theme is searched for orthogonal components that represent its variables in the model, whereas the additional covariates enter the model directly, without the mediation of a component [@bry18].
**SCGLR** also works for mixed models, using an extension of Schall's algorithm to combine supervised-component regression with GLMM estimation in the multivariate context.
## Installation
``` r
# Install release version from CRAN
install.packages("SCGLR")
# Install development version from GitHub
remotes::install_github("SCnext/SCGLR")
```
## Main functions and work in progress
**SCGLR** is designed to deal with outcomes from multiple distributions: Gaussian, Bernoulli, binomial and Poisson, separately or simultaneously [@bry13]. It can also handle multiple conceptually homogeneous groups of explanatory variables [@bry18].
**SCGLR** is a set of **R** functions illustrated on a floristic data set, _genus_. `scglr` and `scglrTheme` fit the model with one or with several thematic groups of regressors, respectively. `scglrCrossVal` and `scglrThemeBackward` are the corresponding functions for selecting the number of components. `print`, `summary` and `plot` methods are also available for the results of `scglr` and `scglrTheme`.
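A minimal sketch of a single-theme fit on _genus_ might look as follows. It assumes the CRAN release is installed and that the argument names (`formula`, `data`, `family`, `K`, `offset`), the helper `multivariateFormula` and the column names used here (`gen*` responses, `geology`, `surface`) match the installed version; check `?scglr` if they do not. The choice of `K = 2` is arbitrary and only serves the illustration.

```r
library(SCGLR)

# Sample floristic data shipped with the package
data(genus)

# Responses: columns whose names start with "gen"; the rest are covariates
n  <- names(genus)
ny <- n[grep("^gen", n)]
nx <- n[-grep("^gen", n)]

# "surface" is used as an offset and "geology" as an additional covariate,
# so both are removed from the covariates used to build components
nx <- nx[!nx %in% c("geology", "surface")]

# Multivariate formula: responses ~ covariates | additional covariates
form <- multivariateFormula(ny, nx, A = c("geology"))

# One Poisson family per response
fam <- rep("poisson", length(ny))

# Fit with K = 2 components (arbitrary choice for this sketch)
genus.scglr <- scglr(formula = form, data = genus, family = fam,
                     K = 2, offset = genus$surface)

summary(genus.scglr)
```

In practice the number of components would be chosen with `scglrCrossVal` rather than fixed by hand; see its help page for the exact arguments.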
Several extensions are in progress, for instance the inclusion of random effects, which extends **SCGLR** to the generalized linear mixed model framework [@chauvet18;-@chauvet18b], and the Cox regression model.
## Funding
The GAMBAS project is funded by the Agence Nationale de la Recherche (ANR-18-CE02-0025). <https://gambas.cirad.fr/>
<img src="man/figures/logo_gambas.jpg">
## References