Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

StatsModels is incorrectly type pirating show() for empty tuples, breaking tuple display #213

Closed
NHDaly opened this issue Feb 5, 2021 · 8 comments · Fixed by #214
Closed

Comments

@NHDaly
Copy link

NHDaly commented Feb 5, 2021

As you can see here, using StatsModels causes tuples to print as an empty string:

julia> ()
()

julia> using StatsModels
[ Info: Precompiling StatsModels [3eaba693-59b7-5ba5-a881-562e759f1c8d]

julia> ()


julia> @which show(stdout, ())
show(io::IO, terms::Tuple{Vararg{Union{AbstractTerm, Tuple{Vararg{AbstractTerm,N}} where N},N}} where N) in StatsModels at /Users/nathandaly/.julia/packages/StatsModels/UtbVD/src/terms.jl:348

This is because empty tuples are being caught in the definition of show() implemented for NTuples, since an empty tuple is a subtype of every NTuple:

julia> () isa NTuple{N,Int} where N
true

julia> () isa NTuple{N,String} where N
true

The problematic definition is here:

Base.show(io::IO, terms::TupleTerm) = join(io, terms, " + ")

@NHDaly
Copy link
Author

NHDaly commented Feb 5, 2021

I think the fix would be something like defining non-empty tuple types and using those to avoid the piracy? Something like this, perhaps:

const NonEmptyTermOrTerms = Union{AbstractTerm, Tuple{AbstractTerm, Vararg{AbstractTerm}}}
const NonEmptyTupleTerm = Tuple{NonEmptyTermOrTerms, Vararg{NonEmptyTermOrTerms}}

Base.show(io::IO, terms::NonEmptyTupleTerm) = join(io, terms, " + ")

This seems to work properly:

julia> const NonEmptyTermOrTerms = Union{AbstractTerm, Tuple{AbstractTerm, Vararg{AbstractTerm}}}
Union{Tuple{AbstractTerm,Vararg{AbstractTerm,N} where N}, AbstractTerm}

julia> const NonEmptyTupleTerm = Tuple{NonEmptyTermOrTerms, Vararg{NonEmptyTermOrTerms}}
Tuple{Union{Tuple{AbstractTerm,Vararg{AbstractTerm,N} where N}, AbstractTerm},Vararg{Union{Tuple{AbstractTerm,Vararg{AbstractTerm,N} where N}, AbstractTerm},N} where N}

julia> Base.show(io::IO, terms::NonEmptyTupleTerm) = join(io, terms, " + ")

julia> ()
()

julia> struct MyTerm <: AbstractTerm end

julia> (MyTerm())
MyTerm()

julia> (MyTerm(), MyTerm())
MyTerm() + MyTerm()

It looks like there are also a couple other pirating methods there, for + and ~. I think probably we should also then use NonEmptyTupleTerm for those as well, though it will mean that ()+() will no longer work in your system. Probably it would be better to avoid using + on Tuples for this entirely, in order to avoid the type piracy on empty tuples, but that might be a bigger change, so I think this should be left as a follow-up with a bigger discussion.

Thanks much! :)

@kleinschmidt
Copy link
Member

Yeah that's definitely Naughty™, thanks for the detailed report. I think your fix is also right, I'll try that out and see if it causes any problems but I think just replacing the TupleTerm alias should be enough.

@kleinschmidt
Copy link
Member

More impetus to move away from using Tuples to represent collections of terms...

@dpsanders
Copy link

Seems like you just need to define a new type if you want new behaviour.

@NHDaly
Copy link
Author

NHDaly commented Feb 5, 2021

:) Thanks for the quick response and for having a look at this, @kleinschmidt! Sounds great.

But yes, i agree with @dpsanders - it's probably unexpected to overload show for tuples at all, since it breaks expectations for developers. I'd advocate for a new container type here, too 👍

@kleinschmidt
Copy link
Member

I agree with both of you @dpsanders and @NHDaly. I'd actually gotten to mocking up a replacement and getting it to pass tests (I think that's in this branch : https://github.com/JuliaStats/StatsModels.jl/tree/dfk/vector) but never got around to opening a PR because it didn't seem like it was causing any real problems in practice. But now I'm feeling differently about that...

For some context, the original idea way back in 2018 was that using a Tuple as a container would allow the compiler to reason about the types of the consituents and generate very efficient code for row-wise operations that need to be repeated many many times. But the tradeoff is obviously that the compiler has to work very hard all the time and I just don't think it's worth it in hindsight. And as we've seen it causes all kinds of other problems :)

@NHDaly
Copy link
Author

NHDaly commented Feb 5, 2021

Cool! :) Thanks, yeah, that makes sense!

If it makes sense to move away from Tuples, that's an easy solution, but also i don't think that these two things are mutually exclusive. It seems to me that you could still use a custom type for this, but have that type be nothing more than a wrapper around a tuple?

struct NTerms{N}
    terms::NTuple{N,AbstractTerm}
end

or something like that, and then the compiler can still happily get Tuples everywhere, but you eliminate all the piracy concerns. :)

@matthieugomez
Copy link
Contributor

@kleinschmidt I would love tuples to be replaced by vectors — this would help making Julia feel snappy when estimating models.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants