First, a question: What formula (or kind of formula) do you use for your ranking system?
I suggest using something similar to ELO (many gaming organizations and web sites use a variant of this) instead of the linear formula Warfish used. Many of the criticisms of that system can be traced back to that choice.
-Hugh
And what should be here? http://www.wargear.net/help/display/Rankings
hi Hugh
I did consider ELO after your comments on the WF board, and I also looked at Microsoft's TrueSkill ranking system (http://research.microsoft.com/en-us/projects/trueskill/details.aspx). The problem is that ELO, Glicko, TrueSkill and most other ranking systems are geared towards two players in a game (not surprisingly, as ELO etc. are used to rank chess games). The maths for more than two players starts getting pretty hairy. I found a couple of examples on the net of the calculations (long discussion of the merits here: http://www.tomshardware.com/forum/75151-13-realize-multiplayer-game-anyway) and didn't really fancy implementing it. Also, one complaint about some of the more complex systems like TrueSkill is that it's actually possible to lose ranking points even if you win a game against certain combinations of player rating.
I then thought about what the ranking system really needed to achieve, i.e. zero sum behaviour, transparent and easily calculable, rewarding better players for beating weaker players etc etc and the system used by Warfish and Conquer Club already does exactly this.
The behaviour on WF that I wanted to prevent here was using the ranking score alone to determine medals, as this discourages players from continuing to play a board once they have reached the top scoring threshold. To do this I have implemented a points system based on position, i.e. only the top 10 players get championship points (equivalent to medals on WF).
To get around the problem whereby the first players on a new board could get lots of points just by virtue of having played first (and being first on the top 10 list) I've added thresholds that you must exceed to get the maximum available points for your placing. For example:
The top ten players get the following championship points awarded: 20-15-12-10-8-6-4-3-2-1
But each of the top ten players must also score higher than the threshold to get these points:
1. 1500+ score - 20 points
2. 1450+ score - 15 points
3. 1400+ score - 12 points
4. 1350+ score - 10 points
5. 1300+ score - 8 points
6. 1250+ score - 6 points
7. 1200+ score - 4 points
8. 1150+ score - 3 points
9. 1100+ score - 2 points
10. 1050+ score - 1 point
So, to get 20 points, the player must have a ranking score > 1500 AND be placed first on the board ranking.
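One way to read the rule above is that a player's award is the smaller of "points for my place" and "points my score unlocks". Here is a minimal Python sketch of that reading. The function names are made up for illustration, and treating "1500+" as "1500 or more" is an assumption (the sentence above says "> 1500"). Tie-breaking is left out.

```python
# Points by place and the minimum score needed to unlock each award,
# taken from the two lists above.
PLACE_POINTS = [20, 15, 12, 10, 8, 6, 4, 3, 2, 1]
THRESHOLDS = [1500, 1450, 1400, 1350, 1300, 1250, 1200, 1150, 1100, 1050]


def points_for_score(score):
    """Largest award whose threshold the score meets (0 if below 1050)."""
    for threshold, points in zip(THRESHOLDS, PLACE_POINTS):
        if score >= threshold:  # assumption: "1500+" means 1500 or more
            return points
    return 0


def championship_points(scores):
    """Map each player to the points awarded, given {player: score}.

    Ties are not handled here; equal scores fall back to sort order.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    awards = {}
    for place, (player, score) in enumerate(ranked):
        place_points = PLACE_POINTS[place] if place < len(PLACE_POINTS) else 0
        # The award is capped both by placing and by score threshold.
        awards[player] = min(place_points, points_for_score(score))
    return awards
```

Under this reading, the first player to win on a brand new board is placed first but sits below 1050, so gets nothing until the score catches up.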
Other than that there is now both a ranking score for each board and a global ranking score (i.e. a score which is updated for every non-test / non-team game you play on every board).
Plus there is the 'G rating' which is a normalised rating based on how many games you expect to win on a game of a particular size. A G rating of > 1 means you win more games than average, < 1 is below average.
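The exact G rating formula isn't spelled out here, so the following is only a guess at the general shape: actual wins divided by the wins an average player would expect, where an average player is assumed to win 1 in N of their N-player games. The function name and the input format are hypothetical.

```python
def g_rating(games):
    """Normalised win rate for one player.

    games: list of (num_players, won) tuples, one per completed game.
    Assumption: an average player wins 1/num_players of the time.
    """
    expected_wins = sum(1.0 / num_players for num_players, _ in games)
    actual_wins = sum(1 for _, won in games if won)
    if expected_wins == 0:
        return 0.0  # no games played yet
    return actual_wins / expected_wins
```

With this definition a player who wins exactly as often as chance would predict comes out at 1, and winning a 6-player game counts for more than winning a 2-player game.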
Anyway it's early days as regards the rating system so if you have any suggestions for better ways the above can be implemented please let me know and we can work something out.
ps Yertle - something like the above should be on the Rating help page :)
I like it, does what I think Warfish does fairly well and expands with the need to potentially keep playing a map, which is great.
It could also produce much higher numbers, which could discourage some new players from even attempting to break in... have you run any of these numbers against the current Warfish rankings? What would the top 5ish there be with these numbers? (If no one finds these I may try it soon just for kicks.)
tom wrote:Plus there is the 'G rating' which is a normalised rating based on how many games you expect to win on a game of a particular size. A G rating of > 1 means you win more games than average, < 1 is below average.
Hey neat, I did something similar with my script and called it Normalized Wins. Yours was 1.92 =]
Although of the players on the front page, TheVictor beat you with a 2.22. Hehe.
Don't ask what mine is -.-
Yertle - do you mean it should be more than a top 10 - more like top 20? I haven't run the numbers against Warfish, I suspect the top 5 would be very similar though.
IRoll11 - I didn't realize your script did that as well... I had a look at the code but I got scared and had to hide behind the couch.
[edit] actually I think I misunderstood you Yertle - you mean that top players are going to end up with huge scores, presumably up to 2x for the top players and that will discourage players from trying to beat them?
What about a minimum number of completed games before you can be included in ranking?
I just clicked on the ranking tab to see that the number one player has played all of 1 game total, thus giving him a win percentage of 100%. He dethroned the previous #1 ranked player, who has completed 40 games but only managed to win 26 of those (not bad, if you ask me), giving him a 65% win ratio.
I think a 10-game minimum should help reduce this sort of injustice.
tom wrote:[edit] actually I think I misunderstood you Yertle - you mean that top players are going to end up with huge scores, presumably up to 2x for the top players and that will discourage players from trying to beat them?
That. Although having the full list to see where you stand for the overall site may help that initial "fear".
tom wrote:The maths for more than two players starts getting pretty hairy,
One simplifying assumption I like about the WF rating calculation is that it treats the result of a multiplayer game as a sequence of 2-player games in which the winner won each game and each loser only lost one game to that winner.
Using this idea, you can take any 2-player based rating system and convert it into this style of multiplayer system. My guess is the complicated multiplayer systems you looked at weren't games of a "winner-take-all" (no props for 2nd place!) style like Risk. Such games would require something more complicated. For winner-take-all, the WF simplifying assumption is probably accurate enough.
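To make the "sequence of 2-player games" idea concrete, here is a rough Python sketch. The names are invented for illustration, `flat_delta` is just a placeholder rule, and whether each pairing should use pre-game ratings or running totals is an assumption (pre-game ratings are used here).

```python
def apply_multiplayer_result(ratings, winner, losers, pairwise_delta):
    """Return new ratings after a winner-take-all game.

    pairwise_delta(winner_rating, loser_rating) is any 2-player update rule
    that returns the points moved from the loser to the winner.
    """
    updated = dict(ratings)
    for loser in losers:
        # Assumption: every pairing uses the pre-game ratings.
        delta = pairwise_delta(ratings[winner], ratings[loser])
        updated[winner] += delta
        updated[loser] -= delta
    return updated


def flat_delta(winner_rating, loser_rating):
    """Placeholder rule for illustration only: a fixed 10-point transfer."""
    return 10
```

Any 2-player rule can be dropped in as `pairwise_delta`, and the total number of points in the game stays the same because each loser pays exactly what the winner receives.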
My main objection is WF's "updating function". Every system can be viewed as:
New Rating = Old Rating + [Formula based on rating data]
For Warfish that formula is something like (Loser's Rating/Winner's Rating)*20.
There are some bad properties that result from this. The main one I object to is that it is hard to reach the point of diminishing returns against the beginners, so it is better for your rating to avoid anyone with skill (most people don't actively do this, but it happens "automatically" and is generally an undesirable property of a rating system).
Suppose, for example, you have a 1400 rating on a map and you have an 80% win rate against the typical 1000 rated player (this example can be extrapolated to multiplayer). Now, given someone rated 1200, they are very unlikely to be anywhere near as bad as the 1000 players, but let's say you still enjoy a 60% edge. Your expected value is to gain 1 point against the 1200 player, but it's about 6 points to beat up a beginner under the WF rating formula.
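The arithmetic behind those two figures can be checked with a few lines of Python. This assumes the WF update really is (Loser's Rating/Winner's Rating)*20 and that a loss costs you the same amount the winner gains. The function names are just for illustration.

```python
def wf_delta(winner_rating, loser_rating, scale=20):
    """Points moved from loser to winner under the assumed WF formula."""
    return scale * loser_rating / winner_rating


def expected_gain(my_rating, opp_rating, win_prob):
    """Average rating change per 2-player game against this opponent."""
    gain_if_win = wf_delta(my_rating, opp_rating)
    loss_if_lose = wf_delta(opp_rating, my_rating)
    return win_prob * gain_if_win - (1 - win_prob) * loss_if_lose


print(round(expected_gain(1400, 1000, 0.8), 2))  # about 5.83
print(round(expected_gain(1400, 1200, 0.6), 2))  # about 0.95
```

So the 1400 player averages roughly six times as many points per game from the 1000 rated opponents as from the 1200 rated one.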
A less important objection for WF, but possibly a very serious problem for your system, is that it takes a lot of time to reach equilibrium. A quick glance at the Malta rankings reveals that the top player is 1835, with a 65% win percentage in 3-player games (161 games played!), while Ando has a 70% win percentage (20 games played!!!), mostly more than 3 players, and a 1601 rating. So, to catch up to the top spot is really more a matter of effort - I doubt that even in a 3-player setting this player is THAT much better, if at all, than Ando.
One of the goals of any ranking system is to measure skill, not effort. Now, under any system, I would hope Ando would have to play more than 20 games to keep the top spot, but it shouldn't come down to "Who can beat up the beginners AND accumulate more games played?"
Now, I'm not suggesting that you choose lightly and grab an Elo or Glicko formula from the net and plug it in for the WF formula, because likely you'd want to test against data. An Elo update formula, while hairier than WF's, is not that bad:
L = loser's rating, W = winner's rating
New Rating = Old Rating + 32*[(10 ^ (L/400)) / (10 ^ (L/400) + 10 ^ (W/400))]
(for the winner, minus that expression for the loser - note the winner can only gain)
The parameter 32 is how much variation is desired for a given game (this is WF's 20). The 10 and the 400 mean that if a player is ranked 400 points higher than another player, he has a 10 to 1 probability advantage over the lower ranked player. This makes sense for go/shogi/chess, but not for WF. My guess is that something more like 3 and 200 makes sense: A player ranked 200 points higher has a 3 to 1 probability edge. Again, data and experiments might be good here.
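For reference, the update formula above in Python, with the 32, the 10 and the 400 exposed as parameters so other values such as 3 and 200 can be tried. The function and parameter names are mine, not from any existing code.

```python
def elo_delta(winner_rating, loser_rating, k=32, base=10, scale=400):
    """Points the winner gains (and the loser gives up) for one result.

    A player rated `scale` points higher is treated as having `base`-to-1
    odds of winning; k caps how much a single game can move a rating.
    """
    loser_strength = base ** (loser_rating / scale)
    winner_strength = base ** (winner_rating / scale)
    return k * loser_strength / (loser_strength + winner_strength)
```

With the defaults, two equally rated players exchange 16 points, while a player who beats someone rated 400 points lower gains 32/11, a little under 3. With base=3 and scale=200, beating someone rated 200 points lower is worth 32/4 = 8.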
Anyway, sorry for the long post - my aim was to convince you that there are serious issues with the WF ranking system and that the system can be improved (though I do have more to say!)
-Hugh
A quick note on Toaster's post: A lot of organizations deal with the high rating of someone who has played a small number of games by marking them as having a "provisional rating". With provisional ratings, the player can't receive an official ranking until he has played enough games. Glicko has a built-in way of dealing with this (by keeping track of how "uncertain" a rating is and using that uncertainty in the update function's calculations). There are other ways of dealing with provisional/uncertain rankings (e.g. by altering the 32 parameter mentioned in the previous post as games accumulate) as well. -H
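A small sketch of both ideas: a provisional flag and a game weight that shrinks as games accumulate. Every number here is a hypothetical placeholder (the 10-game cutoff just echoes Toaster's suggestion), and this is not how Glicko does it; Glicko tracks an explicit uncertainty value instead.

```python
PROVISIONAL_GAMES = 10  # hypothetical cutoff, echoing the 10-game suggestion


def is_provisional(games_played, minimum=PROVISIONAL_GAMES):
    """True while a player has too few games for an official ranking."""
    return games_played < minimum


def k_factor(games_played, k_start=64, k_floor=32, taper_games=20):
    """Hypothetical schedule: shrink linearly from k_start to k_floor.

    New players move quickly towards their real level, then settle at
    the usual per-game weight once taper_games have been played.
    """
    if games_played >= taper_games:
        return k_floor
    fraction = games_played / taper_games
    return k_start - (k_start - k_floor) * fraction
```

The ranking page would then list only players for whom `is_provisional` is false, and the update formula would use `k_factor(games_played)` in place of the fixed 32.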
tom wrote:So, to get 20 points, the player must have a ranking score > 1500 AND be placed first on the board ranking.
Question:
Player A has a Ranking of 1550
Player B has a Ranking of 1525
These players are 1st and 2nd on the Ranking list for the board. Do both receive 20 CPs (Championship Points) or does Player A get 20 and Player B get 15 (even though both are above the 1500+ and in the Top 10)?
Yertle wrote:tom wrote:So, to get 20 points, the player must have a ranking score > 1500 AND be placed first on the board ranking.
Question:
Player A has a Ranking of 1550
Player B has a Ranking of 1525
These players are 1st and 2nd on the Ranking list for the board. Do both receive 20 CPs (Championship Points) or does Player A get 20 and Player B get 15 (even though both are above the 1500+ and in the Top 10)?
I think I found my answer here: katekat
So both players would receive 20 CPs, correct? It is possible then to have 10 players with a score of 1500+ and all would be at 20 CPs, but if an 11th joined that group, then one of those players would drop back down to 0 CPs (and no awards smaller than 20 CPs would be handed out).
By the way, the Rankings tabs of maps aren't getting updated.
Yertle wrote:Player A has a Ranking of 1550
Player B has a Ranking of 1525
These players are 1st and 2nd on the Ranking list for the board. Do both receive 20 CPs (Championship Points) or does Player A get 20 and Player B get 15 (even though both are above the 1500+ and in the Top 10)?
No, only the top player will get 20 points. If you have 10 players above 1500 then they get 20-15-12-10 etc points depending on their relative scores.
I'll fix the rankings table, thanks for that.
tom wrote:No, only the top player will get 20 points. If you have 10 players above 1500 then they get 20-15-12-10 etc points depending on their relative scores.
What if 10 players have a score between 1450 and 1500? Will they all get 15 CPs? (If so wouldn't it be most altruistic to stay in this range and be as close to 1499 as possible? I assume this would never really occur though.)
What if all 10 are between 1200 and 1250 (like here in which both Player A and Player B are between there and both get 4 CPs)?
What if 1 player is at 1500+, and 9 are between 1450-1500, is there one player with 20 then the next 9 all get 15, or does it scale down there too?
Or does it start scaling down once more than 10 people hit the 1050+ threshold?
Thanks!
Look at it this way... if you have a score between 1450 and 1500 the MAXIMUM you can possibly get is 15 points - even if you are ranked #1.
If there is someone with a score one point higher than you then you will both get 15 points. If there are 2 people with a score higher than you then you will only get 12 points as you are then the third placed player.
Does that make sense?
tom wrote: Look at it this way... if you have a score between 1450 and 1500 the MAXIMUM you can possibly get is 15 points - even if you are ranked #1.
If there is someone with a score one point higher than you then you will both get 15 points. If there are 2 people with a score higher than you then you will only get 12 points as you are then the third placed player.
Does that make sense?
Okay, I think I've wrapped my head around this... a bit more complicated than I first thought, not sure if that's bad or still good :P.
It's just supposed to be a way to introduce competition amongst top ranking players to be #1, not just above a certain threshold, so that they don't play a map, then walk away and never play it again because doing so risks losing champ points.
But it also stops the problem where a new board is introduced and the first person who plays and wins immediately gets the number one position on the leaderboard and 20 champ points.
Hugh - sorry for the slow reply to your suggestion regarding ELO rating. Potentially I could add ELO scores in alongside the existing ranking system and we could evaluate at a later date which one is the preferred system.
It wouldn't be hard to run through the game database in historical sequence and generate an additional set of ranking figures according to a different formula.
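Something along these lines, as a rough Python sketch. The game record layout, the function names and the 1000 starting rating are assumptions for illustration, not the real database schema.

```python
def replay_history(games, pairwise_delta, start_rating=1000):
    """Recompute ratings from scratch with a different update rule.

    games: iterable of (timestamp, winner, losers) tuples.
    pairwise_delta(winner_rating, loser_rating) returns the points moved
    from each loser to the winner.
    """
    ratings = {}
    for _, winner, losers in sorted(games, key=lambda game: game[0]):
        for player in [winner, *losers]:
            ratings.setdefault(player, start_rating)
        before = dict(ratings)  # every pairing uses pre-game ratings
        for loser in losers:
            delta = pairwise_delta(before[winner], before[loser])
            ratings[winner] += delta
            ratings[loser] -= delta
    return ratings
```

Running it once per candidate formula gives a set of figures that can be compared side by side with the existing ranking.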
What if two people have the same score of 1525 (Ranked #1), do they both receive 20 CPs?
If there is someone below them at 1520 (Ranked #2) do they receive 15 or 12?
OK, so in this case the tie-breaker is based on the player's G rating (= normalized win ratio); then, if that is also equal, on the timestamp of the last completed game played.
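As a sketch, that ordering is just a sort on three keys. The field names are hypothetical, and which way the timestamp breaks the tie isn't stated above; this assumes the earlier last game ranks higher.

```python
def rank_players(players):
    """Order players for the board ranking.

    players: list of dicts with 'name', 'score', 'g_rating', 'last_game'.
    Sort by score (high first), then G rating (high first), then the
    last completed game timestamp (assumption: earlier ranks higher).
    """
    return sorted(
        players,
        key=lambda p: (-p["score"], -p["g_rating"], p["last_game"]),
    )
```

Because every pair of players ends up in a definite order, two players on 1525 would be placed #1 and #2 rather than sharing #1.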
And the G Rating is only a global stat, correct? Meaning you don't have a G Rating per board?