WarGear / Forum - Viewing thread: Updating CP:


Tue 25th Feb 2014 18:11 #101 / 114

SquintGnome
Standard Member

Rank

Brigadier General

Rank Posn

#35

Join Date

Jun 11

Location

Posts

546

As Hugh mentioned before, the best way to 'visualize' the effects of a new scoring system is to make available the database of completed games. Then, you can apply a new rating algorithm(s) to the completed games and compare side by side the rankings that would be assigned by various systems. We don't need to guess what would happen, we can compare the results and agree on what seems best for the site (with Tom's input and approval, of course).
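A rough sketch of what that side-by-side comparison could look like in Python. Everything here is an assumption for illustration: the game format (a winner plus a list of losers), the starting rating of 1000, and the two stand-in systems `elo_pairwise` and `flat_points`. Neither is WarGear's actual scoring formula.

```python
from collections import defaultdict

def elo_pairwise(ratings, winner, losers, k=32.0):
    """Stand-in system: treat a multiplayer win as a 1v1 win over each loser."""
    winner_before = ratings[winner]
    gain = 0.0
    for loser in losers:
        # Standard Elo expected score of the winner against this loser.
        expected = 1.0 / (1.0 + 10 ** ((ratings[loser] - winner_before) / 400.0))
        delta = k * (1.0 - expected)
        ratings[loser] -= delta
        gain += delta
    ratings[winner] = winner_before + gain

def flat_points(ratings, winner, losers, points=10.0):
    """Stand-in system: the winner takes a fixed number of points from each loser."""
    for loser in losers:
        ratings[winner] += points
        ratings[loser] -= points

def replay(games, update, start=1000.0):
    """Replay completed games in order and return players sorted best to worst."""
    ratings = defaultdict(lambda: start)
    for winner, losers in games:
        update(ratings, winner, losers)
    return sorted(ratings, key=ratings.get, reverse=True)

if __name__ == "__main__":
    # Hypothetical completed games: (winner, [losers])
    games = [("a", ["b", "c"]), ("a", ["b"]), ("b", ["c"])]
    print("pairwise Elo:", replay(games, elo_pairwise))
    print("flat points: ", replay(games, flat_points))
```

Any candidate system could be dropped in as another `update` function, so the same game history produces one ranking per system to compare.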
Sat 1st Mar 2014 06:17 #102 / 114

M57
Standard Member

Rank

Brigadier General

Rank Posn

#73

Join Date

Apr 10

Location

Posts

5083

I got lost by the bottom of this blog post, but I found it to be a very accessible introduction to TrueSkill for us higher-math-phobic types. It starts right at the beginning.

http://www.moserware.com/2010/03/computing-your-skill.html

This kind of system is absolutely what we should be using. Things like K-factors will stop the whole "It's not fair that I can lose 100 points just because I lost to a really low ranked player" debate. Players quickly rise to equilibrium. Scores are much more stable and accurate. We just have too many geeks and nerds playing here not to use something like this.

I'm sure it's a bit of work for Tom, but I'd say the trickiest part of implementing a system like this will be explaining it to members. The math will be too complicated for most, but I found some of the graphics on the moserware site to be excellent at summarizing its functionality.

The four graphs with (Bright Beginner & Eric) and (Eric & Natalia) spoke volumes to me. That math may only show the tip of the TrueSkill iceberg, but that's all I needed to see to be sold. The Wiki could include a similar explanation with visuals of whatever system is used.

Card Membership - putting the power of factories in your hand.

Edited Sat 1st Mar 06:22
Sat 1st Mar 2014 08:40 #103 / 114

Amidon37
Prime

Rank

General

Rank Posn

#3

Join Date

Feb 10

Location

Posts

1869

The guy who wrote the page M linked to has written code for it also.

https://github.com/moserware/Skills

This sounds real cool -
Sat 1st Mar 2014 11:32 #104 / 114

berickf
Premium Member

Rank

Brigadier General

Rank Posn

#72

Join Date

Jan 12

Location

Posts

822

One problem that is stated in the article is that this algorithm is for 1v1 or one team versus another team. As quoted from the article, "Elo was explicitly designed for two players. Efforts to adapt it to work for multiple people on multiple teams have primarily been unsophisticated hacks." It doesn't even address multiple player non-team games, so, an algorithm that addresses the typical WarGear game - many non teamed players - might be preferable?

I'm all for progressiveness but we'd still need a round peg for a round hole, yes?
Sat 1st Mar 2014 11:54 #105 / 114

Amidon37
Prime

Rank

General

Rank Posn

#3

Join Date

Feb 10

Location

Posts

1869

The article is stating why Elo is not good for multi-player games, as it (Elo) was designed for two-player games.

TrueSkill is a system for multi-player games. It (TrueSkill) is a round peg for our round hole.
Sat 1st Mar 2014 12:46 #106 / 114

berickf
Premium Member

Rank

Brigadier General

Rank Posn

#72

Join Date

Jan 12

Location

Posts

822

Amidon37 wrote:
The article is stating why Elo is not good for multi-player games, as it (Elo) was designed for two-player games.

TrueSkill is a system for multi-player games. It (TrueSkill) is a round peg for our round hole.

Oh, but all the examples he gave were also for 1v1 or one team versus another. So I agree that he is saying that TrueSkill is a generalization of Elo and explicitly deals with the integration of newbies into the system, but given his examples, I still don't see how it addresses multi-player games. Can you explain this to me so I can better understand how it would work for WarGear-type games?

Thanks.
Sat 1st Mar 2014 13:48 #107 / 114

M57
Standard Member

Rank

Brigadier General

Rank Posn

#73

Join Date

Apr 10

Location

Posts

5083

berickf wrote:
One problem that is stated in the article is that this algorithm is for 1v1 or one team versus another team. As quoted from the article, "Elo was explicitly designed for two players. Efforts to adapt it to work for multiple people on multiple teams have primarily been unsophisticated hacks." It doesn't even address multiple player non-team games, so, an algorithm that addresses the typical WarGear game - many non teamed players - might be preferable?

I'm all for progressiveness but we'd still need a round peg for a round hole, yes?

Here is a calculator that demonstrates that it works for multiple player games. Set for "8 player free for all" with 1 person on a team and set places 2 through 8 to draw. Move around the teams to simulate different scenarios.

Sat 1st Mar 2014 14:42 #108 / 114

btilly
Standard Member

Rank

Colonel

Rank Posn

#86

Join Date

Jan 12

Location

Posts

294

The version of TrueSkill described in that article is only suited to games with 2 teams.

It wouldn't be hard to generalize it for full multiplayer. And I am sure people have done so. And I am sure that it works.

In fact, by "wouldn't be hard" I mean that I could easily produce a version that does. What you need is a model of what happens in a multiplayer game.

Elo and TrueSkill both assume that if I know the ratings of the two players, then a simple normal distribution on the difference in ratings predicts the probability of success. An easy type of game that definitely works that way is one where your performance in a game is your skill + a random component that is based on a normal distribution. The real game doesn't look like that, but it is a simple model and is close enough to provide useful predictions. (A key idea for applying math to the real world is to create BS toy models that, while wrong, work well enough to be useful.)

That easy type of game has a trivial multiplayer version: multiple people show up with their random numbers based on luck + skill, and the best number wins. The distribution of likelihood of winning with this game is more complicated than the 2 player version, but if you know everyone's skills, we can work it out. Real games don't work like that, but it is probably close enough for argument's sake.
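The toy model above is easy to simulate. This is a minimal Python sketch, not a rating system: it only estimates each player's chance of winning when performance is skill plus normally distributed luck and the best number wins. The skill values, the noise level `noise_sd`, and the function name are all made up for illustration.

```python
import random

def simulate_win_probabilities(skills, noise_sd=1.0, trials=20000, seed=0):
    """Estimate each player's chance of winning under the toy model.

    In each simulated game, a player's performance is
    skill + Normal(0, noise_sd), and the highest performance wins.
    """
    rng = random.Random(seed)
    wins = [0] * len(skills)
    for _ in range(trials):
        performances = [s + rng.gauss(0.0, noise_sd) for s in skills]
        winner = max(range(len(skills)), key=performances.__getitem__)
        wins[winner] += 1
    return [w / trials for w in wins]

if __name__ == "__main__":
    # Hypothetical skills for a 4 player free-for-all.
    print(simulate_win_probabilities([1.0, 0.5, 0.0, 0.0]))
```

A rating system built on this model would run the logic in reverse: given who actually won, adjust the skill estimates so the observed results become more likely.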

Now with that said, TrueSkill can be applied to that multiplayer version. And you'll get something that undoubtedly works well. Understanding it is hard. And it is complex. But it will work well.

However I wouldn't blame Tom for deciding that it is something which HE is not willing to do. I personally don't have the energy in my life to create it either. But if Hugh and Tom agreed to let Hugh create it, I'd be happy to discuss with Hugh and review what he did. And if the two of us agreed that this rating system was created right, Tom may be willing to integrate it into his site.

So it could happen. But it really is going to take volunteering and stepping up to the plate.
Sat 1st Mar 2014 15:32 #109 / 114

berickf
Premium Member

Rank

Brigadier General

Rank Posn

#72

Join Date

Jan 12

Location

Posts

822

btilly wrote:
The version of TrueSkill described in that article is only suited to games with 2 teams.

It wouldn't be hard to generalize it for full multiplayer. And I am sure people have done so. And I am sure that it works.

In fact, by "wouldn't be hard" I mean that I could easily produce a version that does. What you need is a model of what happens in a multiplayer game.

Elo and TrueSkill both assume that if I know the ratings of the two players, then a simple normal distribution on the difference in ratings predicts the probability of success. An easy type of game that definitely works that way is one where your performance in a game is your skill + a random component that is based on a normal distribution. The real game doesn't look like that, but it is a simple model and is close enough to provide useful predictions. (A key idea for applying math to the real world is to create BS toy models that, while wrong, work well enough to be useful.)

That easy type of game has a trivial multiplayer version: multiple people show up with their random numbers based on luck + skill, and the best number wins. The distribution of likelihood of winning with this game is more complicated than the 2 player version, but if you know everyone's skills, we can work it out. Real games don't work like that, but it is probably close enough for argument's sake.

Now with that said, TrueSkill can be applied to that multiplayer version. And you'll get something that undoubtedly works well. Understanding it is hard. And it is complex. But it will work well.

However I wouldn't blame Tom for deciding that it is something which HE is not willing to do. I personally don't have the energy in my life to create it either. But if Hugh and Tom agreed to let Hugh create it, I'd be happy to discuss with Hugh and review what he did. And if the two of us agreed that this rating system was created right, Tom may be willing to integrate it into his site.

So it could happen. But it really is going to take volunteering and stepping up to the plate.

Thanks, btilly. Your saying that "Understanding it is hard. And it is complex." made me decide that my understanding it isn't so important then. You agreed that the article didn't address multiplayer, as I thought, but said as a mathematician that it is possible, so I'll trust you... and Hugh... and any other hardcore mathematicians who might want to construct a TrueSkill model specific to WarGear-type games.

M57, not sure if you were supposed to have a link in your response, or if you were saying that there was a link in the article that I missed for inputting variables for multiplayer TrueSkill? If I missed it, I'm sorry.
Sat 1st Mar 2014 21:48 #110 / 114

btilly
Standard Member

Rank

Colonel

Rank Posn

#86

Join Date

Jan 12

Location

Posts

294

Important note. When I looked it up, I found on Wikipedia that TrueSkill is both patented and trademarked. Therefore I'd recommend avoiding the term, and probably not implementing it exactly.

The Glicko system achieves similar goals, but lacks both of those drawbacks.

Both systems rely on having ratings, and rating uncertainties. In both the uncertainty increases over time. One of the features of how Microsoft uses TrueSkill for ranking that I like is that your ranking is based on rating - 3 * uncertainty. Thus people who played just a few times do not have a good chance of getting a top ranking because their rating is considered uncertain. And inactive people have their uncertainty slowly increase and therefore their ranking decreases over time.
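The "rating - 3 * uncertainty" idea is simple enough to show directly. This is a small Python sketch with made-up numbers; the `Player` class and the function names are illustrative assumptions, not part of any TrueSkill or Glicko library.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    rating: float       # current skill estimate
    uncertainty: float  # how unsure the system is about that estimate

def conservative_score(player, k=3.0):
    """Rank on a pessimistic estimate: rating minus k times the uncertainty."""
    return player.rating - k * player.uncertainty

def leaderboard(players, k=3.0):
    """Sort players from best to worst by conservative score."""
    return sorted(players, key=lambda p: conservative_score(p, k), reverse=True)

if __name__ == "__main__":
    players = [
        Player("newcomer", rating=35.0, uncertainty=8.0),  # few games played
        Player("veteran", rating=30.0, uncertainty=1.0),   # many games played
    ]
    for p in leaderboard(players):
        print(p.name, conservative_score(p))
```

Here the newcomer has the higher raw rating, but the veteran ranks first (27.0 against 11.0) because the newcomer's rating is still uncertain.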

The one thing that TrueSkill has thought about which Glicko did not is how to divide credit/blame for team games. That's an important thing to address, but should be possible to address without stepping on the TrueSkill patent. (And definitely do not use the TrueSkill name - trademarks are a defend it or lose it proposition, so these days there are automated programs to identify small websites that infringe on trademarks. Trying to "fly under the radar" is a good way to get a threatening letter from a lawyer, so might as well save yourself the grief up front.)

Both systems handle multiplayer games through hacks.
Sat 1st Mar 2014 21:58 #111 / 114

M57
Standard Member

Rank

Brigadier General

Rank Posn

#73

Join Date

Apr 10

Location

Posts

5083

berickf wrote:
M57, not sure if you were supposed to have a link in your response, or if you were saying that there was a link in the article that I missed for inputting variables for multiplayer TrueSkill? If I missed it, I'm sorry.

oops..

http://boson.research.microsoft.com/trueskill/rankcalculator.aspx

Set for "8 player free for all" with 1 person on a team and set places 2 through 8 to draw. Move around the teams to simulate different scenarios.

Sun 2nd Mar 2014 21:42 #112 / 114

Hugh
Standard Member

Rank

Lieutenant General

Rank Posn

#13

Join Date

Nov 09

Location

Posts

869

The Microsoft version does have a multiplayer version. The link M57 posted is the one I saw long ago. It is proprietary, but the link is about a guy who wrote an open source implementation, presumably based on Microsoft's own description of the algorithm. AND he had no fear in mentioning it by name! Glicko is public domain and modifiable to multiplayer, so that is safer.

Previously, the only issue was getting data and permission from Tom, because I certainly had the will and excitement to code, research, run tests of different systems, and make the sell showing differences between systems. But I too may not have that kind of time right now, so if btilly created it, I'd be happy to discuss it with him and review what he did.

As an aside - a different sort of volunteer might engage in the following simple project: write a "webcrawler" to compile the win/loss data of the site. In Python there's a way to get the source of a web page as a string, and string processing tools strong enough to detect the various winners and losers.
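A minimal sketch of that idea using only the Python standard library. The HTML pattern is a guess made up for illustration (the real game pages would need to be inspected to find the actual markup), and any crawling would of course need Tom's permission first.

```python
import re
from urllib.request import urlopen

# Hypothetical markup: assumes each result appears on the page as
# <span class="winner">NAME</span> or <span class="loser">NAME</span>.
RESULT_RE = re.compile(r'<span class="(winner|loser)">([^<]+)</span>')

def fetch_page(url):
    """Download a page and return its source as a string."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def parse_results(html):
    """Pull winner and loser names out of a page's source."""
    winners, losers = [], []
    for role, name in RESULT_RE.findall(html):
        (winners if role == "winner" else losers).append(name.strip())
    return winners, losers

if __name__ == "__main__":
    # Made-up sample instead of a live request, so this runs offline.
    sample = ('<span class="winner">Alice</span>'
              '<span class="loser">Bob</span>'
              '<span class="loser">Carol</span>')
    print(parse_results(sample))
```

For anything beyond a quick experiment, a real HTML parser would be more robust than a regular expression, but string matching is enough to get started.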

Edited Sun 2nd Mar 21:42
Sun 2nd Mar 2014 22:40 #113 / 114

Ozyman
Shelley, not Moore

Rank

Brigadier General

Rank Posn

#40

Join Date

Nov 09

Location

Posts

3448

Hugh wrote:
As an aside - a different sort of volunteer might engage in the following simple project: write a "webcrawler" to compile the win/loss data of the site. In Python there's a way to get the source of a web page as a string, and string processing tools strong enough to detect the various winners and losers.

Something like this really interests me. I've been wanting to do more detailed analysis of boards, and if this webcrawler was general enough it might be possible to do something like that. For example - in invention, I'd love to know if one civ wins more often than others, etc.

I don't think I'd have time to do it immediately, but over the course of the next couple of months it might be a possibility.
Tue 8th Apr 2014 01:46 #114 / 114

Babbalouie
Premium Member

Rank

Brigadier General

Rank Posn

#47

Join Date

Nov 13

Location

Posts

172

In support of the 4 rank system (or 5 as per Cona Chris), I would like to note that the lowest player on the Global Ranking system is a Lieutenant.