  1. #101 / 114
    Standard Member SquintGnome
    Rank
    Major General
    Rank Posn
    #30
    Join Date
    Jun 11
    Location
    Posts
    546

    As Hugh mentioned before, the best way to 'visualize' the effects of a new scoring system is to make the database of completed games available.  Then you can apply one or more new rating algorithms to the completed games and compare, side by side, the rankings that each system would assign.  We don't need to guess what would happen; we can compare the results and agree on what seems best for the site (with Tom's input and approval, of course).
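
    A minimal sketch of that side-by-side comparison, assuming the completed games can be exported as (winner, losers) records.  The record format, the player names, and both scoring rules below are hypothetical stand-ins, not the site's actual data or formulas.

```python
from collections import defaultdict

# Hypothetical completed-game records: (winner, [losers]).
GAMES = [
    ("alice", ["bob", "carol"]),
    ("bob", ["alice"]),
    ("alice", ["carol", "dave"]),
    ("dave", ["bob", "carol"]),
]

def flat_points(games, start=1000, stake=20):
    """Toy system A: the winner takes a fixed stake from every loser."""
    ratings = defaultdict(lambda: float(start))
    for winner, losers in games:
        for loser in losers:
            ratings[loser] -= stake
            ratings[winner] += stake
    return dict(ratings)

def pairwise_elo(games, start=1000, k=32):
    """Toy system B: score each game as the winner beating each loser under Elo."""
    ratings = defaultdict(lambda: float(start))
    for winner, losers in games:
        for loser in losers:
            # Standard Elo expected score for the winner against this loser.
            gap = ratings[loser] - ratings[winner]
            expected = 1.0 / (1.0 + 10 ** (gap / 400))
            delta = k * (1.0 - expected)
            ratings[winner] += delta
            ratings[loser] -= delta
    return dict(ratings)

def ranking(ratings):
    """Player names from best to worst; ties are broken alphabetically."""
    return sorted(ratings, key=lambda player: (-ratings[player], player))

def side_by_side(games, systems):
    """One row per rank position, one column per rating system."""
    columns = [ranking(system(games)) for system in systems]
    return list(zip(*columns))

for position, row in enumerate(side_by_side(GAMES, [flat_points, pairwise_elo]), 1):
    print(position, *row)
```

    Any candidate system can be added as another function with the same signature, so the same game history is replayed under each one and the resulting columns can be read across.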


  2. #102 / 114
    Brigadier General M57
    Rank
    Brigadier General
    Rank Posn
    #65
    Join Date
    Apr 10
    Location
    Posts
    4720

    I got lost by the bottom of this blog post, but I found it to be a very accessible introduction to TrueSkill for us higher-math-phobic types.  It starts right at the beginning.

    http://www.moserware.com/2010/03/computing-your-skill.html

    This kind of system is absolutely what we should be using.  Things like K-factors will stop the whole "It's not fair that I can lose 100 points just because I lost to a really low-ranked player" debate.  Players quickly rise to equilibrium.  Scores are much more stable and accurate.  We just have too many geeks and nerds playing here not to use something like this.

    I'm sure it's a bit of work for Tom, but I'd say the trickiest part of implementing a system like this will be explaining it to members.  The math will be too complicated for most, but I found some of the graphics on the moserware site to be excellent at summarizing its functionality.

    The four graphs with (Bright Beginner & Eric) and (Eric & Natalia) spoke volumes to me. That math may only show the tip of the TrueSkill iceberg, but that's all I needed to see to be sold.  The Wiki could include a similar explanation with visuals of whatever system is used.

    Card Membership - putting the power of factories in your hand.
    Edited Sat 1st Mar 06:22

  3. #103 / 114
    Prime Amidon37
    Rank
    General
    Rank Posn
    #3
    Join Date
    Feb 10
    Location
    Posts
    1566

    The guy who wrote the page M57 linked to has also written code for it.

    https://github.com/moserware/Skills

    This sounds really cool.


  4. #104 / 114
    Premium Member berickf
    Rank
    Brigadier General
    Rank Posn
    #53
    Join Date
    Jan 12
    Location
    Posts
    822

    One problem stated in the article is that this algorithm is for 1v1 or one team versus another team.  As quoted from the article, "Elo was explicitly designed for two players. Efforts to adapt it to work for multiple people on multiple teams have primarily been unsophisticated hacks."  It doesn't even address multi-player non-team games, so an algorithm that addresses the typical WarGear game (many non-teamed players) might be preferable?

    I'm all for progressiveness but we'd still need a round peg for a round hole, yes?


  5. #105 / 114
    Prime Amidon37
    Rank
    General
    Rank Posn
    #3
    Join Date
    Feb 10
    Location
    Posts
    1566

    The article is stating why Elo is not good for multi-player games, as it (Elo) was designed for two-player games.

    TrueSkill is a system for multi-player games.  It (TrueSkill) is a round peg for our round hole.


  6. #106 / 114
    Premium Member berickf
    Rank
    Brigadier General
    Rank Posn
    #53
    Join Date
    Jan 12
    Location
    Posts
    822

    Amidon37 wrote:

    The article is stating why Elo is not good for multi-player games, as it (Elo) was designed for two-player games.

    TrueSkill is a system for multi-player games.  It (TrueSkill) is a round peg for our round hole.

    Oh, I read it that way because all the examples he gave are also for 1v1 or one team versus another.  So I agree that he is saying that TrueSkill is a generalization of Elo and explicitly deals with the integration of newbies into the system, but given his examples, I still don't see how it addresses multi-player games.  Can you explain this to me so I can better understand how it would work for WarGear-type games?

    Thanks.


  7. #107 / 114
    Brigadier General M57
    Rank
    Brigadier General
    Rank Posn
    #65
    Join Date
    Apr 10
    Location
    Posts
    4720

    berickf wrote:

    One problem stated in the article is that this algorithm is for 1v1 or one team versus another team.  As quoted from the article, "Elo was explicitly designed for two players. Efforts to adapt it to work for multiple people on multiple teams have primarily been unsophisticated hacks."  It doesn't even address multi-player non-team games, so an algorithm that addresses the typical WarGear game (many non-teamed players) might be preferable?

    I'm all for progressiveness but we'd still need a round peg for a round hole, yes?

    Here is a calculator that demonstrates that it works for multiple player games.  Set for "8 player free for all" with 1 person on a team and set places 2 through 8 to draw.  Move around the teams to simulate different scenarios.

    Card Membership - putting the power of factories in your hand.

  8. #108 / 114
    Premium Member btilly
    Rank
    Colonel
    Rank Posn
    #75
    Join Date
    Jan 12
    Location
    Posts
    293

    The version of TrueSkill described in that article is only suited to games with 2 teams.

    It wouldn't be hard to generalize it for full multiplayer.  And I am sure people have done so.  And I am sure that it works.

    In fact, by "wouldn't be hard" I mean that I could easily produce a version that does.  What you need is a model of what happens in a multiplayer game.

    Elo and TrueSkill both assume that if I know the ratings of the two players, then a simple normal distribution on the difference in ratings predicts the probability of success.  An easy type of game that definitely works that way is one where your performance in a game is your skill plus a random component drawn from a normal distribution.  The real game doesn't look like that, but it is a simple model and is close enough to provide useful predictions.  (A key idea for applying math to the real world is to create BS toy models that, while wrong, work well enough to be useful.)

    That easy type of game has a trivial multiplayer version: multiple people show up with their random numbers based on luck + skill, and the best number wins.  The distribution of the likelihood of winning this game is more complicated than in the 2-player version, but if you know everyone's skills, you can work it out.  Real games don't work like that, but it is probably close enough for argument's sake.
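
    The toy model above can be sketched as a small simulation.  This is only an illustration under stated assumptions: the skill values, the shared noise level `noise_sd`, and the function names are made up for the example, and it is not TrueSkill itself.

```python
import math
import random
from statistics import NormalDist

def win_probabilities(skills, noise_sd=1.0, trials=20000, seed=0):
    """Estimate each player's chance of winning a free-for-all.

    Each player's performance is skill plus normally distributed luck,
    and the highest performance wins.
    """
    rng = random.Random(seed)
    wins = [0] * len(skills)
    for _ in range(trials):
        performances = [rng.gauss(skill, noise_sd) for skill in skills]
        wins[performances.index(max(performances))] += 1
    return [count / trials for count in wins]

def two_player_win_probability(skill_a, skill_b, noise_sd=1.0):
    """Closed form for two players: the performance gap is normal with
    mean skill_a - skill_b and standard deviation noise_sd * sqrt(2)."""
    return NormalDist().cdf((skill_a - skill_b) / (noise_sd * math.sqrt(2)))

# Hypothetical 4-player game: one strong player and three equal ones.
print(win_probabilities([1.0, 0.0, 0.0, 0.0]))
# The 2-player case can be checked against the closed form.
print(win_probabilities([1.0, 0.0])[0], two_player_win_probability(1.0, 0.0))
```

    With more than two players there is no equally simple closed form, which is why the sketch falls back on simulation for the multiplayer case.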

    Now, with that said, TrueSkill can be applied to that multiplayer version.  And you'll get something that undoubtedly works well.  Understanding it is hard.  And it is complex.  But it will work well.

    However I wouldn't blame Tom for deciding that it is something which HE is not willing to do.  I personally don't have the energy in my life to create it either.  But if Hugh and Tom agreed to let Hugh create it, I'd be happy to discuss with Hugh and review what he did.  And if the two of us agreed that this rating system was created right, Tom may be willing to integrate it into his site.

    So it could happen.  But it really is going to take volunteering and stepping up to the plate.


  9. #109 / 114
    Premium Member berickf
    Rank
    Brigadier General
    Rank Posn
    #53
    Join Date
    Jan 12
    Location
    Posts
    822

    btilly wrote:

    The version of TrueSkill described in that article is only suited to games with 2 teams.

    It wouldn't be hard to generalize it for full multiplayer.  And I am sure people have done so.  And I am sure that it works.

    In fact, by "wouldn't be hard" I mean that I could easily produce a version that does.  What you need is a model of what happens in a multiplayer game.

    Elo and TrueSkill both assume that if I know the ratings of the two players, then a simple normal distribution on the difference in ratings predicts the probability of success.  An easy type of game that definitely works that way is one where your performance in a game is your skill plus a random component drawn from a normal distribution.  The real game doesn't look like that, but it is a simple model and is close enough to provide useful predictions.  (A key idea for applying math to the real world is to create BS toy models that, while wrong, work well enough to be useful.)

    That easy type of game has a trivial multiplayer version: multiple people show up with their random numbers based on luck + skill, and the best number wins.  The distribution of the likelihood of winning this game is more complicated than in the 2-player version, but if you know everyone's skills, you can work it out.  Real games don't work like that, but it is probably close enough for argument's sake.

    Now, with that said, TrueSkill can be applied to that multiplayer version.  And you'll get something that undoubtedly works well.  Understanding it is hard.  And it is complex.  But it will work well.

    However I wouldn't blame Tom for deciding that it is something which HE is not willing to do.  I personally don't have the energy in my life to create it either.  But if Hugh and Tom agreed to let Hugh create it, I'd be happy to discuss with Hugh and review what he did.  And if the two of us agreed that this rating system was created right, Tom may be willing to integrate it into his site.

    So it could happen.  But it really is going to take volunteering and stepping up to the plate.

    Thanks btilly.  Your saying that "Understanding it is hard.  And it is complex." made me decide that my understanding it isn't so important.  You agreed that the article didn't address multiplayer, as I thought, but said as a mathematician that it is possible, so I'll trust you, and Hugh, and any other hardcore mathematicians who might want to construct a TrueSkill model specific to WarGear-type games.

    M57, I'm not sure if you meant to include a link in your response, or if you were saying that there was a link in the article that I missed for inputting variables for multiplayer TrueSkill.  If I missed it, I'm sorry.


  10. #110 / 114
    Premium Member btilly
    Rank
    Colonel
    Rank Posn
    #75
    Join Date
    Jan 12
    Location
    Posts
    293

    Important note.  When I looked it up, I found on Wikipedia that TrueSkill is both patented and trademarked.  Therefore I'd recommend avoiding the term, and probably not implementing it exactly.

    The Glicko system achieves similar goals, but lacks both of those drawbacks.

    Both systems rely on having ratings, and rating uncertainties.  In both the uncertainty increases over time.  One of the features of how Microsoft uses TrueSkill for ranking that I like is that your ranking is based on rating - 3 * uncertainty.  Thus people who played just a few times do not have a good chance of getting a top ranking because their rating is considered uncertain.  And inactive people have their uncertainty slowly increase and therefore their ranking decreases over time.
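
    A small sketch of the ranking rule described above (rating minus 3 times uncertainty).  The player values are invented for illustration, and the `widen_uncertainty` rule for inactive players is an assumption loosely modelled on how Glicko grows its rating deviation; neither is taken from Microsoft's implementation.

```python
import math

def conservative_score(rating, uncertainty, k=3.0):
    """Score used for ranking: rating minus k times the uncertainty."""
    return rating - k * uncertainty

def widen_uncertainty(uncertainty, idle_periods, drift=0.5):
    """Assumed inactivity rule: variance grows by drift**2 per idle period."""
    return math.sqrt(uncertainty ** 2 + idle_periods * drift ** 2)

def leaderboard(players):
    """Sort (name, score) pairs from highest to lowest conservative score."""
    scored = [(name, conservative_score(r, u)) for name, (r, u) in players.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical players: name -> (rating, uncertainty).
PLAYERS = {
    "veteran": (28.0, 1.5),   # many games played, small uncertainty
    "newcomer": (32.0, 7.0),  # high rating, but only a few games
    "regular": (26.0, 2.0),
}

for name, score in leaderboard(PLAYERS):
    print(f"{name:10s} {score:6.2f}")
```

    In this example the newcomer has the highest raw rating but the lowest ranking score, which is the behaviour the post describes: a few lucky games are not enough to top the list.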

    The one thing that TrueSkill has thought about which Glicko did not is how to divide credit/blame for team games.  That's an important thing to address, but should be possible to address without stepping on the TrueSkill patent.  (And definitely do not use the TrueSkill name - trademarks are a defend it or lose it proposition, so these days there are automated programs to identify small websites that infringe on trademarks.  Trying to "fly under the radar" is a good way to get a threatening letter from a lawyer, so might as well save yourself the grief up front.)

    Both systems handle multiplayer games through hacks.


  11. #111 / 114
    Brigadier General M57
    Rank
    Brigadier General
    Rank Posn
    #65
    Join Date
    Apr 10
    Location
    Posts
    4720

    berickf wrote:

    M57, I'm not sure if you meant to include a link in your response, or if you were saying that there was a link in the article that I missed for inputting variables for multiplayer TrueSkill.  If I missed it, I'm sorry.

    oops..

    http://boson.research.microsoft.com/trueskill/rankcalculator.aspx

    Set for "8 player free for all" with 1 person on a team and set places 2 through 8 to draw.  Move around the teams to simulate different scenarios.

     

    Card Membership - putting the power of factories in your hand.

  12. #112 / 114
    Standard Member Hugh
    Rank
    Lieutenant General
    Rank Posn
    #8
    Join Date
    Nov 09
    Location
    Posts
    869

    The Microsoft version does have a multiplayer version. The link M57 posted is the one I saw long ago. It is proprietary, but the link is about a guy who wrote an open source implementation, presumably based on Microsoft's own description of the algorithm. AND he had no fear in mentioning it by name! Glicko is public domain and modifiable to multiplayer, so that is safer.

    Previously, the only issue was getting data and permission from Tom, because I certainly had the will and excitement to code, research, run tests of different systems, and make the sell by showing the differences between systems. But I too may not have that kind of time right now, so if btilly created it, I'd be happy to discuss it with him and review what he did.

    As an aside - a different sort of volunteer might engage in the following simple project: write a "webcrawler" to compile the win/loss data of the site. In Python there's a way to get the source of a web page as a string, and string processing tools strong enough to detect the various winners and losers.
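
    A rough sketch of what such a crawler could look like, using only the Python standard library.  The HTML pattern below is hypothetical: the site's real page markup would have to be inspected first and the regular expression adjusted to match it, and permission to crawl would still be needed.

```python
import re
from urllib.request import urlopen

def fetch_page(url, timeout=10):
    """Download a page and return its source as a string."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

# HYPOTHETICAL markup: one table row per player with a Won/Lost cell.
PLAYER_ROW = re.compile(
    r'<tr class="player">\s*<td>(?P<name>[^<]+)</td>\s*'
    r'<td>(?P<result>Won|Lost)</td>\s*</tr>'
)

def parse_results(html):
    """Return (winners, losers) found in one game page's source."""
    winners, losers = [], []
    for match in PLAYER_ROW.finditer(html):
        name = match.group("name").strip()
        if match.group("result") == "Won":
            winners.append(name)
        else:
            losers.append(name)
    return winners, losers

# Parsing a made-up page; a real run would pass fetch_page(game_url) instead.
SAMPLE = """
<table>
  <tr class="player"><td>alice</td><td>Won</td></tr>
  <tr class="player"><td>bob</td><td>Lost</td></tr>
  <tr class="player"><td>carol</td><td>Lost</td></tr>
</table>
"""
print(parse_results(SAMPLE))
```

    Keeping the download and the parsing in separate functions means the parser can be checked against saved pages without hitting the site on every run.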

    Edited Sun 2nd Mar 21:42

  13. #113 / 114
    Premium Member Ozyman
    Rank
    Brigadier General
    Rank Posn
    #43
    Join Date
    Nov 09
    Location
    Posts
    3089

    Hugh wrote:

    As an aside - a different sort of volunteer might engage in the following simple project: write a "webcrawler" to compile the win/loss data of the site. In Python there's a way to get the source of a web page as a string, and string processing tools strong enough to detect the various winners and losers.

    Something like this really interests me.  I've been wanting to do more detailed analysis of boards, and if this webcrawler were general enough, it might be possible to do something like that.  For example, in Invention, I'd love to know if one civ wins more often than others.

     

    I don't think I'd have time to do it immediately, but over the course of the next couple of months it might be a possibility.


  14. #114 / 114
    Premium Member Babbalouie
    Rank
    Major General
    Rank Posn
    #28
    Join Date
    Nov 13
    Location
    Posts
    156

    In support of the 4 rank system (or 5 as per Cona Chris), I would like to note that the lowest player on the Global Ranking system is a Lieutenant.

