Mass Timeouts #2374

kevincoleman · 2023-09-21T23:16:12Z

kevincoleman
Sep 21, 2023
Collaborator

This one’s obviously a big topic, and a lot of discussion has happened on the forums, but also among devs. This discussion is an attempt to distill a lot of the complexities so we can hope to find a path forward.

The Mass Timeout UX problem

When a user times out of multiple games in a short time frame (a.k.a. a “mass timeout”), ranking changes from the timed-out games do not take effect. This creates an inconsistent experience for players who expect to see their rank change after winning by timeout. Winning a game and not being awarded a rank boost for it undermines users’ trust in the solidity of the ranking system, and is a genuinely frustrating experience (players are often very proud of their ranks).

The current solution protects against failing KPI number 2 (see below) by making sure a player who mass-times-out doesn’t see a huge rank hit. However, it does so at the direct expense of KPI number 1. While it does solve for one KPI, it’s debatable from a fairness standpoint. It subtly penalizes multiple well-meaning players for the mistakes of one or, worse, for the deliberate tactic of a devious player (imagine a player who knows about mass timeouts and strings along all the games they are losing, then times them all out at once to avoid the rank hit).
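For readers who want the current behavior spelled out, here is a minimal Python sketch based only on how it is described in this thread: the first loss in a burst of timeouts is rated, and the rest do not affect ranks. The one-hour window, the `TimeoutLoss` type, and the function name are hypothetical. The thread does not say how OGS actually detects a mass timeout.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed value for illustration only; the real threshold is not stated in this thread.
MASS_TIMEOUT_WINDOW = timedelta(hours=1)


@dataclass
class TimeoutLoss:
    game_id: int
    timed_out_at: datetime


def split_rated_and_annulled(losses, window=MASS_TIMEOUT_WINDOW):
    """Return (rated, annulled) game ids for one player's timeout losses.

    The first timeout in a burst is rated normally. Any further timeout
    within `window` of the previous one is treated as part of the same
    mass timeout and does not affect ranks.
    """
    rated, annulled = [], []
    previous = None
    for loss in sorted(losses, key=lambda item: item.timed_out_at):
        if previous is not None and loss.timed_out_at - previous <= window:
            annulled.append(loss.game_id)
        else:
            rated.append(loss.game_id)
        previous = loss.timed_out_at
    return rated, annulled


# Three timeouts within ten minutes, then an unrelated one three days later.
start = datetime(2023, 9, 21, 12, 0)
losses = [
    TimeoutLoss(1, start),
    TimeoutLoss(2, start + timedelta(minutes=5)),
    TimeoutLoss(3, start + timedelta(minutes=10)),
    TimeoutLoss(4, start + timedelta(days=3)),
]
rated, annulled = split_rated_and_annulled(losses)
print(rated, annulled)
```

In this sketch the opponents in games 2 and 3 are the ones who win without a rank change, which is the KPI 1 cost described above.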

KPIs for UX success:

1. Number of players who are (justly) complaining about their ranks not changing after a timeout win (lower is better)
2. Number of cases of players who rank-sandbag, intentionally or unintentionally, as a result of a mass-timeout (lower is better)
3. Keeping the overall rank pool stable on average, to keep players’ ranks accurate to other outside metrics (no change on average is ideal)

Proposed Solutions

Possible Solution Number 1

Keep things as they are, and accept a failure of KPI 1. To do this we’d need to continue to ignore the community’s complaints on the issue, and we’d accept that this UX subtly damages the experience of players who receive unfair rank judgements when their opponents mass time out. In this case we may want to properly disclose the situation, with something like a flag/tag on mass timeout games, linking to a description of why ranking did not occur for the annulled game.

Possible Solution Number 2

Remove the mechanic that keeps mass-timeout games from affecting ranks. This seems like the intuitive route for the well-meaning players who would justly want to see their ranks increase when their opponent mass-times-out. It does come with the side effect of heavily affecting a single player’s rank when they mass time out, which may be followed by disengagement from OGS or rank-sandbagging as they attempt to regain their rank. While this solution feels “just”, it does damage the experience of the entire OGS ecosystem, because at any time some percentage of players would just be trying to climb back to their rightful ranks after a mass timeout.

Possible Solution Number 3

Apply rank boons for winners in a mass-timeout situation, but leave the rank of the loser unaffected. This seems like a nice idea because it tries to feel fair to winners while not overly penalizing losers (preventing sandbagging). However, it comes at the expense of KPI 3, because one person’s gain is no longer offset by another person’s loss. It would also be possible to directly abuse this to avoid ranking down when you’re losing, or to rank up other accounts (imagine keeping a 1 dan account around for intentionally mass timing out against other weaker accounts, thereby ranking them up quickly and easily). Over time this would potentially inflate the entire rank pool, making players appear on average a bit stronger than they are. And, frankly, it doesn’t properly punish someone for abandoning multiple games. There should be accountability. After all, there are people on the other end of the games.
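To make the KPI 3 concern concrete, here is a small Python sketch of how a one-sided update adds rating points to the pool. It uses plain Elo as a stand-in because Elo is zero-sum when both sides are updated, which makes the drift easy to see. It is not OGS's rating system, and `pool_drift`, the K-factor, and the ratings are all made up for illustration.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def pool_drift(loser, winners, penalize_loser, k=32.0):
    """Change in the pool's total rating after `loser` times out against every winner.

    All games are rated against the loser's pre-timeout rating.
    """
    gains = [k * (1.0 - expected_score(w, loser)) for w in winners]
    loser_change = -sum(gains) if penalize_loser else 0.0
    return sum(gains) + loser_change


# Made-up ratings: one strong account timing out against five weaker ones.
strong, weaker = 2100.0, [1500.0, 1600.0, 1700.0, 1800.0, 1900.0]
symmetric = pool_drift(strong, weaker, penalize_loser=True)     # like option 2
asymmetric = pool_drift(strong, weaker, penalize_loser=False)   # like option 3
print(f"symmetric drift:  {symmetric:+.1f}")
print(f"asymmetric drift: {asymmetric:+.1f}")
```

Under these assumptions the symmetric case leaves the pool total unchanged, while the asymmetric case adds every point the winners gain. Whether the same holds for the real rating system would need to be checked against real data.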

Possible Solution Number 4

This idea could be considered a hybrid of 2 and 3, and would be more complex to implement. In this solution we would let ranks be affected as normal in the case of a mass-timeout, but instead of just heavily penalizing the player who timed out we’d give their account a special status. This status would basically be a way of letting them provisionally interact at their previous rank until they’ve sufficiently proved (through gameplay) that they deserve to keep it. This way the only player suffering consequences for the timeout is the person who made the mistake and timed out, but the consequences aren’t so dire that they abandon OGS or heavily damage the rank pool’s integrity by sandbagging. The details of this are not even close to decided, but they may look something like this:

After a mass timeout (of, say, 5 games or more), a player would be flagged as provisionally ranked at their pre-timeout rank. For example, suppose they are 1 dan and the timeouts would collectively bring them down to 5 kyu. The player would then be able to see and accept games listed for 1 dan players, and if they play sufficiently well (over, say, 3 games) to show that they have 1 dan ability, the provisional status would be removed. While they’re in this state their account would be visibly flagged as having recently timed out, so other players could choose not to accept their games if they’re worried about that. The flag may serve as a deterrent for those who would attempt to grief the system somehow.

In this model, if a provisional player mass timed out again they’d just be ranked down. Likewise if they didn’t prove their rank by winning games within a reasonable time period they’d just get ranked down, not left provisional.
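A rough Python sketch of the provisional flow described above, to show how few states it actually needs. Everything here is hypothetical: the `Account` fields, the function names, and in particular the rule that "playing sufficiently well" means winning all three proving games, which is a simplification chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional

PROVING_GAMES = 3  # the post's "say, 3 games" placeholder


@dataclass
class Account:
    rank: str                            # rank shown to others and used for matchmaking
    provisional: bool = False            # visible "recently timed out" flag
    fallback_rank: Optional[str] = None  # rank the timed-out games would have produced
    proving_games_left: int = 0


def _rank_down(account, rank):
    """Drop the account to `rank` and clear the provisional state."""
    account.rank = rank
    account.provisional = False
    account.fallback_rank = None
    account.proving_games_left = 0


def on_mass_timeout(account, rank_after_losses):
    """Called once the caller has detected a mass timeout (say, 5+ games)."""
    if account.provisional:
        _rank_down(account, rank_after_losses)  # repeat offence: just rank down
        return
    account.provisional = True
    account.fallback_rank = rank_after_losses
    account.proving_games_left = PROVING_GAMES


def on_rated_game(account, won):
    """Count one finished game towards proving the pre-timeout rank."""
    if not account.provisional:
        return
    if not won:
        _rank_down(account, account.fallback_rank)
        return
    account.proving_games_left -= 1
    if account.proving_games_left == 0:
        account.provisional = False
        account.fallback_rank = None


def on_proving_deadline(account):
    """Called when the (unspecified) proving period runs out."""
    if account.provisional:
        _rank_down(account, account.fallback_rank)
```

With the 1 dan example from above, `on_mass_timeout(Account(rank="1d"), "5k")` leaves the account at 1d but flagged. Three calls to `on_rated_game(account, won=True)` clear the flag, while a loss, a second mass timeout, or the deadline ranks the account down.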

Moving forward

To implement a solution to this problem, we’ll probably need to first gather some baseline data about the complaint count, the sandbagging frequency, and the rank pool. That way we can substantiate that our fix has had its intended effect. That said, I think the reason this has been such a hot item is that we all already suspect the problem is real, so the data is mostly useful as a baseline for showing that our solution fixed something.

Testing this issue could be tricky, as it’s not something we can simply test with a few people in dev. It would need to be tested by us, functionally, and then validated by the community’s use of it, hopefully with an improvement in the KPIs above.

anoek · 2023-09-22T00:00:57Z

anoek
Sep 22, 2023
Maintainer

2 was a problem in the past. 4 sounds overly complicated.

3 is generally preferable all around I think. Before implementing that though, someone needs to go through the data here: https://github.com/online-go/goratings , which houses the rating system test harness along with real world data, and do some case studies and analysis on what effect that would have on the rating system. My suspicion is that it might not matter much at all these days as Glicko is already asymmetric, but it's important to go through the motions here to make sure. Basically what we're looking for is sanity checking some known affected players and looking at the overall rank distribution before and after, and in general making sure everything still looks ok and we're not seeing sustained rank inflation.
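As a rough idea of what the before/after comparison could look like, here is a small Python sketch that summarizes the shift between two sets of final ratings. It assumes the ratings from each run have already been loaded into plain `{player_id: rating}` dicts by whatever means. `inflation_report` is a hypothetical helper, not part of the goratings repo, and the numbers are made up.

```python
from statistics import mean, median


def inflation_report(baseline, candidate):
    """Compare final ratings (player id -> rating) from two runs.

    Only players present in both runs are compared; raises
    statistics.StatisticsError if there are none.
    """
    shared = sorted(set(baseline) & set(candidate))
    deltas = [candidate[p] - baseline[p] for p in shared]
    return {
        "players": len(shared),
        "mean_shift": mean(deltas),
        "median_shift": median(deltas),
        "share_rated_higher": sum(d > 0 for d in deltas) / len(shared),
    }


# Made-up ratings: "baseline" is the current mitigation, "candidate" is the change under test.
baseline = {"a": 1500.0, "b": 1600.0, "c": 1700.0, "d": 1800.0}
candidate = {"a": 1510.0, "b": 1600.0, "c": 1720.0, "d": 1790.0}
report = inflation_report(baseline, candidate)
print(report)
```

A mean or median shift that stays clearly above zero across runs would be the "sustained rank inflation" to watch for. Case studies of known affected players would still need to be done by hand.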

6 replies

anoek Sep 22, 2023
Maintainer

They take the hit for the first loss in the current system. Honestly, abuse hasn't really been a problem, so I'm not that concerned with it. That said, I don't mind exploration, but any ideas need to be tested in that ratings repo so we can measure the impact.

kevincoleman Sep 22, 2023
Collaborator Author

That makes sense. It’s great that abuse hasn’t been a problem. Maybe from a UX perspective we can do something that actually calls attention to the mass timeout (for the timed out player), and which spins it as positive. Something like a notification: “We noticed you let a lot of games time out recently. We protected your rank for you by only letting one game loss affect your rank. Please be more accountable to the people you’re playing games with.”

kevincoleman Sep 22, 2023
Collaborator Author

We’d probably want to denote which games were annulled but didn’t change the player’s rank due to mass timeout (only for the timeout player), as has been discussed as a previous improvement.

anoek Sep 22, 2023
Maintainer

“We’d probably want to denote which games were annulled but didn’t change the player’s rank due to mass timeout (only for the timeout player), as has been discussed as a previous improvement.”

Yeah, that's actually already been rolled out as of a couple of days ago. It's not retroactive though.

kevincoleman Sep 22, 2023
Collaborator Author

Oh! I thought it was primarily for the benefit of the person who didn’t get awarded the rank boon so they’d be less confused. I envisioned it as only being denoted for the timer-outer.

anoek · 2023-10-24T12:27:49Z

anoek
Oct 24, 2023
Maintainer

FYI I did some testing on asymmetrically applying the rating, as well as on just applying all ratings without the mass timeout mitigation. Both definitely still produced notably worse results and in general inflated the ratings more than is comfortable, so unfortunately I think we're kind of stuck with our current solution. (Although messaging around what's going on could most certainly be improved.)

1 reply

kevincoleman Oct 30, 2023
Collaborator Author

Hmmm. Would you be willing to elaborate? I apologize for my lack of knowledge in this area, but maybe you could help explain for me?

Asymmetrical rating changes are when we award a rank boost to the timeout winner and don't demote the timeout loser (like option 3), right?

And then symmetrical rating changes are just the same thing that happens at the end of a non-mass-timeout game currently (so this is like option 2), yes?

I can see how asymmetrical changes would bloat the rank pool, but then it shouldn't be much more than if the mass-time-outer just lost those games (because the only bloat would be in letting the mass-time-outer keep their rank). This never really felt like the right solution to me anyway, as players shouldn't be able to escape their rank hit for losing games simply by stringing them along until they find a good time to mass timeout.

I'm struggling to see how symmetrical application of rank changes bloats the rank pool. Yes, MTO offenders would take big rank hits, but their opponents are just legitimately winning games. If our rank pool can't handle several people winning games in a short period then what even is going on? As I understand it the drawback to option 2 (the symmetrical rank changes) is that it can be demotivating to MTO offenders and could set them up to sandbag. Am I missing something here?

^ This brings me to my next point. If this whole problem is created because we want to decrease sandbagging maybe we should consider measures specifically aimed at decreasing sandbagging. If we could sufficiently mitigate sandbagging behavior directly then we could go back to symmetrical rank changes and everyone would be happy. Would that not work?

When I say “mitigate sandbagging” I realize this sounds kind of like punishing sandbagging behavior—but really I mean something restorative not punitive. Option 4 aims at this, but I agree it's fairly complex. What about brainstorming simpler ways to mitigate sandbagging (I really would like to explore those options)?

What brought me back to this idea was the thought of polling the community on the question: “if someone forgets and times out on a bunch of games, whose rank(s) should suffer? Theirs or their opponents’?” I think we'd get a pretty unanimous “theirs should!” I think this might even be the same answer if followed up by asking “and you know you'd get x% more sandbaggers on OGS, right?”

While I don't believe democracy always creates the best UX, I do think it's a pretty good gauge for determining expected behavior.

As an example of how this undermines OGS: just recently I found myself hoping a stronger player might time out and thereby give me the win. Silly to admit, but there it is. Then I remembered that if they did, there'd be the very real possibility it would be a mass timeout and my rank wouldn't really benefit, and it immediately felt icky. Unfair. Just knowing that can happen makes me less attached to and less proud of my rank, and therefore less invested in earning a good rank. Granted, many people don't know that can happen, but regardless it does happen, and then they get a little crash course on how our rating system feels broken sometimes.

I realize I'm new here, and may just get steamrolled, but I guess I just want to say that I'm not happy with the current way it works.
