StatsModels.jl >v0.7 changes order of parameters #286

Closed
sindresops opened this issue Apr 3, 2023 · 7 comments · Fixed by #287

@sindresops

sindresops commented Apr 3, 2023

My code was breaking because StatsModels.jl changed the default ordering of terms in a FormulaTerm: (Intercept) used to come first, and now it comes last. I was relying on GLM.coef() returning the regression coefficients in that specific order. The new change is good, since the higher-order terms now come first. But just an FYI in case anyone else runs into the same thing.
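For anyone hitting the same thing, one way to avoid depending on coefficient order is to look coefficients up by name instead of by position. This is only a sketch with stand-in vectors: the names and values below are made up for illustration, and `coef_by_name` is a hypothetical helper. With a fitted model, the two vectors would presumably come from `coefnames(model)` and `coef(model)`, which is an assumption here rather than something shown in this thread.

```julia
# Stand-in data: hypothetical coefficient names and values in the two orderings.
names_old  = ["(Intercept)", "x"]
values_old = [-0.26, 2.03]
names_new  = ["x", "(Intercept)"]
values_new = [2.03, -0.26]

# Hypothetical helper: build a name => value lookup so that downstream code
# never indexes coefficients by position.
coef_by_name(names, values) = Dict(zip(names, values))

d_old = coef_by_name(names_old, values_old)
d_new = coef_by_name(names_new, values_new)

# The lookup gives the same answer regardless of the order of the terms.
println(d_old["x"] == d_new["x"])
println(d_old["(Intercept)"] == d_new["(Intercept)"])
```

Indexing by name costs one small Dict per model, but it keeps working if the term order changes again.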

Edit 1: (Added MWE)

using Pkg; using DataFrames; Pkg.add(name="StatsModels",version="0.6.33"); using GLM; x = collect(1:10); y = 2x .+ randn(length(x)); lm(@formula(y~x+1),DataFrame(x=x,y=y));

Returns:
[image: screenshot of the model output]

using Pkg; using DataFrames; Pkg.add(name="StatsModels",version="0.7"); using GLM; x = collect(1:10); y = 2x .+ randn(length(x)); lm(@formula(y~x+1),DataFrame(x=x,y=y));

Returns:
[image: screenshot of the model output]

Other info
Julia version 1.8.0
GLM v1.8.2

@kleinschmidt
Member

Do you have an example you can share?

@sindresops
Author

Apologies, I was being lazy. I have updated the original post with a minimal working example.

@kleinschmidt
Member

Ah wow that's a surprise to me! IMO the earlier behavior was a bug, since we (generally) sort terms by degree even in pre-0.7. In fact, I can't reproduce your example on my machine:

Project setup

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.8.5 (2023-01-08)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(@v1.8) pkg> activate --temp
  Activating new project at `/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs`

(jl_ipptqs) pkg> add DataFrames, GLM, [email protected]
    Updating registry at `~/.julia/registries/Beacon`
    Updating git-repo `https://github.com/beacon-biosignals/BeaconRegistry.git`
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
   Installed FillArrays ──── v1.0.0
   Installed Distributions ─ v0.25.87
    Updating `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs/Project.toml`
  [a93c6f00] + DataFrames v1.5.0
  [38e38edf] + GLM v1.8.2
⌃ [3eaba693] + StatsModels v0.6.33
    Updating `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs/Manifest.toml`
  [49dc2e85] + Calculus v0.5.1
  [d360d2e6] + ChainRulesCore v1.15.7
  [9e997f8a] + ChangesOfVariables v0.1.6
  [34da2185] + Compat v4.6.1
  [a8cc5b0e] + Crayons v4.1.1
  [9a962f9c] + DataAPI v1.14.0
  [a93c6f00] + DataFrames v1.5.0
  [864edb3b] + DataStructures v0.18.13
  [e2d170a0] + DataValueInterfaces v1.0.0
  [b429d917] + DensityInterface v0.4.0
  [31c24e10] + Distributions v0.25.87
  [ffbed154] + DocStringExtensions v0.9.3
  [fa6b7ba4] + DualNumbers v0.6.8
  [1a297f60] + FillArrays v1.0.0
  [59287772] + Formatting v0.4.2
  [38e38edf] + GLM v1.8.2
  [34004b35] + HypergeometricFunctions v0.3.14
  [842dd82b] + InlineStrings v1.4.0
  [3587e190] + InverseFunctions v0.1.8
  [41ab1584] + InvertedIndices v1.3.0
  [92d709cd] + IrrationalConstants v0.2.2
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [692b3bcd] + JLLWrappers v1.4.1
  [b964fa9f] + LaTeXStrings v1.3.0
  [2ab3a3ac] + LogExpFunctions v0.3.23
  [e1d29d7a] + Missings v1.1.0
  [77ba4419] + NaNMath v1.0.2
  [bac558e1] + OrderedCollections v1.6.0
  [90014a1f] + PDMats v0.11.17
  [69de0a69] + Parsers v2.5.8
  [2dfb63ee] + PooledArrays v1.4.2
  [21216c6a] + Preferences v1.3.0
  [08abe8d2] + PrettyTables v2.2.3
  [1fd47b50] + QuadGK v2.8.2
  [189a3867] + Reexport v1.2.2
  [79098fc4] + Rmath v0.7.1
  [91c51154] + SentinelArrays v1.3.18
  [1277b4bf] + ShiftedArrays v2.0.0
  [66db9d55] + SnoopPrecompile v1.0.3
  [a2af1166] + SortingAlgorithms v1.1.0
  [276daf66] + SpecialFunctions v2.2.0
  [82ae8749] + StatsAPI v1.6.0
  [2913bbd2] + StatsBase v0.33.21
  [4c63d2b9] + StatsFuns v1.3.0
⌃ [3eaba693] + StatsModels v0.6.33
  [892a3eda] + StringManipulation v0.3.0
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.10.1
  [efe28fd5] + OpenSpecFun_jll v0.5.5+0
  [f50d1b31] + Rmath_jll v0.4.0+0
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [9fa8497b] + Future
  [b77e0a4c] + InteractiveUtils
  [b27032c2] + LibCURL v0.6.3
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.8.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays
  [10745b16] + Statistics
  [4607b0f0] + SuiteSparse
  [fa267f1f] + TOML v1.0.0
  [a4e569a6] + Tar v1.10.1
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.0.1+0
  [deac9b47] + LibCURL_jll v7.84.0+0
  [29816b5a] + LibSSH2_jll v1.10.2+0
  [c8ffd9c3] + MbedTLS_jll v2.28.0+0
  [14a3606d] + MozillaCACerts_jll v2022.2.1
  [4536629a] + OpenBLAS_jll v0.3.20+0
  [05823500] + OpenLibm_jll v0.8.1+0
  [83775a58] + Zlib_jll v1.2.12+3
  [8e850b90] + libblastrampoline_jll v5.1.1+0
  [8e850ede] + nghttp2_jll v1.48.0+0
  [3f19e933] + p7zip_jll v17.4.0+0
        Info Packages marked with ⌃ have new versions available and may be upgradable.
Precompiling project...
  3 dependencies successfully precompiled in 7 seconds. 54 already precompiled.

julia> using Pkg; using DataFrames; Pkg.add(name="StatsModels",version="0.6.33"); using GLM; x = collect(1:10); y = 2x .+ randn(length(x)); lm(@formula(y~x+1),DataFrame(x=x,y=y))
   Resolving package versions...
  No Changes to `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs/Project.toml`
  No Changes to `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs/Manifest.toml`
[ Info: Precompiling GLM [38e38edf-8417-5370-95a0-9cbb8c7f171a]
StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}}, Matrix{Float64}}

y ~ 1 + x

Coefficients:
─────────────────────────────────────────────────────────────────────────
                 Coef.  Std. Error      t  Pr(>|t|)  Lower 95%  Upper 95%
─────────────────────────────────────────────────────────────────────────
(Intercept)  -0.259833    0.85121   -0.31    0.7680   -2.22273    1.70306
x             2.02703     0.137185  14.78    <1e-06    1.71068    2.34338
─────────────────────────────────────────────────────────────────────────

(jl_ipptqs) pkg> st
Status `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_ipptqs/Project.toml`
  [a93c6f00] DataFrames v1.5.0
  [38e38edf] GLM v1.8.2
⌃ [3eaba693] StatsModels v0.6.33
Info Packages marked with ⌃ have new versions available and may be upgradable.

@kleinschmidt
Member

Ah wow wait nevermind, I misread the report. This is definitely a bug!

kleinschmidt transferred this issue from JuliaStats/GLM.jl Apr 4, 2023

@kleinschmidt
Member

I think the issue is that the constant term is incorrectly assigned the same degree as x. Sorting works correctly with an interaction term like this:

julia> f2 = @formula(y ~ x & z + x + 1)
FormulaTerm
Response:
  y(unknown)
Predictors:
  x(unknown)
  1
  x(unknown) & z(unknown)

(jl_QKrdLJ) pkg> st
Status `/private/var/folders/kg/y0c0ksr56_g800hjvcpx_d4w0000gp/T/jl_QKrdLJ/Project.toml`
  [a93c6f00] DataFrames v1.5.0
  [38e38edf] GLM v1.8.2
  [3eaba693] StatsModels v0.7.0
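
To make the suspected mechanism concrete, here is a toy sketch of sorting terms by degree with a stable sort. It is not the StatsModels.jl implementation: the terms are plain strings, and `degree` and `buggy_degree` are hypothetical helpers that encode an assumed rule (intercept = 0, main effect = 1, interaction = number of variables). It only illustrates how giving the constant term the same degree as `x` would reproduce the ordering shown above.

```julia
# Toy stand-ins for formula terms (plain strings, not StatsModels.jl term types).
terms = ["x & z", "x", "1"]

# Assumed degree rule for this sketch: intercept = 0, main effect = 1,
# interaction = number of variables joined by `&`.
degree(t) = t == "1" ? 0 : count(==('&'), t) + 1

# Hypothetical buggy rule: the constant term gets degree 1, the same as `x`.
buggy_degree(t) = t == "1" ? 1 : degree(t)

# MergeSort is stable, so terms with equal degree keep their written order.
buggy = sort(terms; by=buggy_degree, alg=MergeSort)
fixed = sort(terms; by=degree, alg=MergeSort)

println(buggy)  # ["x", "1", "x & z"]
println(fixed)  # ["1", "x", "x & z"]
```

With the buggy rule, `1` ties with `x` and stays wherever it was written, which is why `y ~ x + 1` ends up with the intercept last.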

@kleinschmidt
Member

@sindresops this is fixed on master now and will be released as StatsModels 0.7.1: JuliaRegistries/General#81005

Thanks for the report!

@sindresops
Author

Thanks for patching!
