Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix segfault in State::serialize method #664

Conversation

Danielius1922
Copy link
Contributor

The method gets invoked from public API function Conn::ConnectionState but the cipherSuite pointer member might not have been initialized yet. Invoking ConnectionState too early causes a segfault. Issue is fixed by adding a nil check and returning 0 for State::CipherSuiteID if the cipherSuite has not been set.

@daenney
Copy link
Member

daenney commented Aug 2, 2024

I don't think we can do it this way. Cipher suite ID's are registered with IANA, and a suite ID of 0 would be TLS_NULL_WITH_NULL_NULL which is not actually what's happening here. The cipher suite hasn't been set yet, it's not been set to the NULL suite. Outputting the 0 here would be an incorrect representation of the connection state. This is why the cipher suite ID is a pointer to uint16, letting us represent the "not set" state.

I'm also not entirely sure how you'd trigger this condition in practice. Though it's possible to call ConnectionState before a handshake's been completed, doing so sounds like a bug.

@Sean-Der
Copy link
Member

Sean-Der commented Aug 2, 2024

This is possible now because we don't block on Handshaking anymore right?

Maybe we should return a nil state until the handshake is completed?

@daenney
Copy link
Member

daenney commented Aug 2, 2024

Good point. That's probably why this surfaces now. Returning a nil until the handshake completes sounds good to me.

@Sean-Der
Copy link
Member

Sean-Der commented Aug 2, 2024

Nice, sorry mind making that change @Danielius1922 and then should be good to merge!

Mind also adding a simple test on that? I can merge and tag today then!

@daenney
Copy link
Member

daenney commented Aug 2, 2024

Mmm, one complication, ConnectionState returns a State, not a *State. With the current API we can't return nil there, even though clone() does return a *State. We do clone state, so in theory there's no risk to giving the caller a *State instead of a State, it can't be used to modify the state.

Returning a State{} would probably be confusing, so it might be best to change it to return a *State instead.

That way we could modify it so that clone() on State returns nil instead until the handshake is completed. Then ConnectionState() can simply do return c.state.clone().

@Danielius1922
Copy link
Contributor Author

Danielius1922 commented Aug 2, 2024

@daenney The issue occured in this test case - https://github.com/plgd-dev/go-coap/pull/566/files#diff-f1064f702f8a5091e655a7a69fca82d50e0a6f095ac0d733950959310eeac4f9L161 . Going by logs the ConnectionState method gets invoked during flight1 stage.
And it gets triggered like this: Our listerner goroutine accepts the connection and creates a wrapper over Conn which also starts a new goroutine which loops on ReadWithContext (this fires off the Handshake). Once the constructor method returns the onNewConn callback gets invoked which calls Conn::ConnectionState. So this should be very early, but it's possible to trigger the segfault.

Though it seems I'll have to rework some of our types for integration with pion/dtls/v3 anyway, because we have several readers and writers running in different goroutines. For v3 this means that this can trigger multiple concurrent handshakes over the same Conn instance, which doesn't seem to be an intended use case since I get a panic on attempting to close an already closed channel (close(c.decrypted) in conn.go). I guess I'll have to lock with a mutex, since I don't see a way to wait on something if a handshake is already in progress.

@Danielius1922
Copy link
Contributor Author

Nice, sorry mind making that change @Danielius1922 and then should be good to merge!

Mind also adding a simple test on that? I can merge and tag today then!

Sure, let's see what I can do before my weekend starts!

@Danielius1922 Danielius1922 force-pushed the adam/bugfix/fix-segfault-in-state-serialize branch 2 times, most recently from 60943ff to 0069e4a Compare August 2, 2024 19:04
@Sean-Der
Copy link
Member

Sean-Der commented Aug 2, 2024

@Danielius1922 @daenney sorry last thing!

Would it be better to return Serialize, bool? Today someone might not check for a nil pointer and get a crash.

Might be better to have a empty struct. Even if someone misinterprets the values that is better then crashing.

Copy link

codecov bot commented Aug 2, 2024

Codecov Report

Attention: Patch coverage is 68.75000% with 10 lines in your changes missing coverage. Please review.

Project coverage is 80.09%. Comparing base (5a72b12) to head (d041552).

Files Patch % Lines
flight4handler.go 25.00% 5 Missing and 1 partial ⚠️
flight5handler.go 60.00% 1 Missing and 1 partial ⚠️
state.go 85.71% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #664      +/-   ##
==========================================
- Coverage   80.26%   80.09%   -0.17%     
==========================================
  Files         101      101              
  Lines        5335     5355      +20     
==========================================
+ Hits         4282     4289       +7     
- Misses        686      695       +9     
- Partials      367      371       +4     
Flag Coverage Δ
go 80.12% <68.75%> (-0.17%) ⬇️
wasm 64.26% <68.75%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Danielius1922
Copy link
Contributor Author

@Sean-Der As a user myself I would prefer a change that breaks API, but can be easily updated to new API instead of getting an unexpected segfault because of a missing nil check on a return value that has been silently changed to a pointer.

@daenney
Copy link
Member

daenney commented Aug 3, 2024

@Danielius1922 @daenney sorry last thing!

Would it be better to return Serialize, bool? Today someone might not check for a nil pointer and get a crash.

Might be better to have a empty struct. Even if someone misinterprets the values that is better then crashing.

Fair point. Since we've just done v3 we may want to avoid changing ConnectionState() entirely since that breaks major version guarantees, and instead mark it as deprecated? We can introduce a new method on Conn with a signature like func (c *Conn) State (State, bool) instead. Then folks can do the usual:

if state, ok := c.State(); !ok {
    .. loop ..
}

@Sean-Der
Copy link
Member

Sean-Der commented Aug 3, 2024

I think it would be ok to break major version in this case?

We just did a break and this API is unusable because of it. I don’t see a way we can keep it when he has such a deficiency. Crashing or returning bad values.

@daenney
Copy link
Member

daenney commented Aug 3, 2024

That's fair. The current behaviour of ConnectionState is rather broken. So then I guess we can change the signature to return Serialize, bool instead.

@Sean-Der
Copy link
Member

Sean-Der commented Aug 3, 2024

Nice, I will take that on today and get that out ASAP

@Danielius1922 Danielius1922 force-pushed the adam/bugfix/fix-segfault-in-state-serialize branch from 0069e4a to 0aea148 Compare August 5, 2024 11:43
The method gets invoked from public API function Conn::ConnectionState
but the cipherSuite pointer member might not have been initialized yet.
Invoking ConnectionState too early causes a segfault.
Issue is fixed by changing the return type of Conn::ConnectionState from
State to (State, bool) and returning (State{}, false) if the cipherSuite
has not been set.
@Danielius1922 Danielius1922 force-pushed the adam/bugfix/fix-segfault-in-state-serialize branch from 0aea148 to d041552 Compare August 5, 2024 11:44
@Danielius1922
Copy link
Contributor Author

OK, so I modified ConnectionState to return (State, bool). I modified serialize to return in case the cipherSuite is not set, but if you'd prefer the version with the completed handshake check I'll modify it today.

@Sean-Der Sean-Der merged commit f3e8a9e into pion:master Aug 6, 2024
13 of 15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants