  1. #1 / 19
    Standard Member SquintGnome
    Rank
    Brigadier General
    Rank Posn
    #35
    Join Date
    Jun 11
    Location
    Posts
    546

    I wanted to continue my thoughts on developing a more meaningful luck statistic. I will start by saying that I think the current number is meaningful, but it lacks full significance unless it is related to something that puts it in context, perhaps as a ratio to the number of rolls or to an expected result.

    For example, if you roll 3v1 100 times you should get about 66 kills.  If you get 70 kills instead, this will yield a luck of +4.  This is meaningful, but what if you roll 3v1 1000 times expecting 660 kills and get 664?  This will give a +4 luck stat also, but I think we can see that in the first case you were ‘luckier’.  This is why I believe you need a ratio to put it in perspective.

    So, what should be used? To start with I would say that you should not use number of rolls, even
    though I used it in my example above. Here is why:


    Scenario   Iterations   Attack dice rolled   Expected kills   Actual kills   Luck
    2 v 1      150          300                  87               91             +4
    3 v 1      100          300                  66               70             +4
    3 v 2      100          300                  108              112            +4


    In the case above, you can see that (luck / dice rolled) gives the same 1.3% luck percentage for three very different scenarios. This demonstrates the ineffectiveness of this ratio.
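
    A minimal sketch of the arithmetic behind that table. The function and variable names are illustrative only; the numbers are the ones from the table.

```python
# Rows from the table: (scenario, attack dice rolled, expected kills, actual kills)
scenarios = [
    ("2 v 1", 300, 87, 91),
    ("3 v 1", 300, 66, 70),
    ("3 v 2", 300, 108, 112),
]

def luck_per_die(expected_kills, actual_kills, dice_rolled):
    """Luck (actual minus expected kills) divided by attack dice rolled."""
    return (actual_kills - expected_kills) / dice_rolled

ratios = {name: luck_per_die(exp, act, dice) for name, dice, exp, act in scenarios}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%}")  # every scenario prints 1.3%
```

    All three scenarios collapse to the same value, so this ratio cannot tell them apart.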

    Instead, another ratio to consider is luck / expected kills.  This should be more meaningful since it compares expectations to results.

    Scenario   Iterations   Expected kills   Actual kills   Luck   Luck / expected kills
    2 v 1      150          87               91             +4     +4.6%
    3 v 1      100          66               70             +4     +6.1%
    3 v 2      100          108              112            +4     +3.7%

     

    To me, this ratio is more substantive.  There are some other subtleties I would like to discuss later, but I want to handle one issue at a time.  Let me know what everyone thinks.
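
    For anyone who wants to try the ratio on their own numbers, here is a small sketch of luck / expected kills. The names are illustrative; the inputs are the three rows above.

```python
def luck_ratio(expected_kills, actual_kills):
    """Luck (actual minus expected kills) as a fraction of expected kills."""
    return (actual_kills - expected_kills) / expected_kills

# scenario -> (expected kills, actual kills), taken from the table
rows = {"2 v 1": (87, 91), "3 v 1": (66, 70), "3 v 2": (108, 112)}
luck_ratios = {name: luck_ratio(exp, act) for name, (exp, act) in rows.items()}

# Rank scenarios from luckiest to least lucky under this ratio
ranking = sorted(luck_ratios, key=luck_ratios.get, reverse=True)
for name in ranking:
    print(f"{name}: {luck_ratios[name]:+.1%}")
```

    Unlike luck / dice rolled, this ratio separates the three scenarios, with 3 v 1 ranked luckiest and 3 v 2 least lucky.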



     


  2. #2 / 19
    Standard Member Toto
    Rank
    Brigadier General
    Rank Posn
    #45
    Join Date
    Jan 10
    Location
    Posts
    733

    Agreed with SG. We need something better than the mere luck stat. The ratio luck / expected kills may not be perfect for mathematicians, but it sounds like a good improvement, and I would be happy to have it displayed along with the present luck stat.

    Two Eyes for An Eye, The Jaw for A Tooth
    Edited Sun 13th Nov 16:08 [history]

  3. #3 / 19
    Standard Member SquintGnome
    Rank
    Brigadier General
    Rank Posn
    #35
    Join Date
    Jun 11
    Location
    Posts
    546

    To answer your questions M,

    1. I understand that the luck stat was never meant to be a ratio.  I am not being critical of the stat, I am only suggesting an additional one.

    2. I have demonstrated why I feel that making it a ratio of iterations or number of rolls is ineffective. If a statistic gives the same result for different scenarios, then at best it can be improved and at worst it can be misleading.

    3. Yes, I think the second scenario is luckier than the first. In fact, I feel that luck / expected kills will properly rank the 'luckiness' of different scenarios. I think the example demonstrates that. These games are all about making rolls based on expectations. Using expectations in the ratio accounts for the riskiness of rolls: rolling 1v2 and rolling 3v1 have very different probabilistic expectations.

    Edited Sun 13th Nov 16:14 [history]

  4. #4 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Oops! --- SORRY -- deleted my post while trying to edit it.

    It should be possible to play WG boards in real-time ..without the wait, regardless of how many are playing.
    https://sites.google.com/site/m57sengine/home

  5. #5 / 19
    Standard Member Toto
    Rank
    Brigadier General
    Rank Posn
    #45
    Join Date
    Jan 10
    Location
    Posts
    733

    I don't want to hijack this thread, but I don't think it should be possible to edit and/or delete a post once someone has replied to it. It could indeed be very misleading if you change what you have said after someone has agreed with it, for example. Tom?

    Edited Sun 13th Nov 16:28 [history]

  6. #6 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083


    The number of attack dice thrown is a much more relevant comparison in my opinion:

    Scenario   Iterations   Attack dice rolled   Expected kills   Actual kills   Luck
    2 v 1      150          300                  87               91             +4
    3 v 1      100          300                  66               70             +4
    2 v 1      100          n/a                  58               62             +4
    3 v 1      100          n/a                  66               70             +4

    Using your proposed method on the two equal-iteration rows makes more sense to me, and the numbers suggest that they are comparable:

    2 v 1     6.9%

    3 v 1     6.1%

    These are ratios, as you have stated, but presenting them as percentages is very misleading. Consider what happens if you win every battle in the above 3 v 1 scenario:

    3 v 1     50%


  7. #7 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Toto wrote:

    I don't want to hijack this thread, but I don't think it should be possible to edit and/or delete a post once someone has replied to it. It could indeed be very misleading if you change what you have said after someone has agreed with it, for example. Tom?

    Yeah, I agree it can be a problem, but I see it as a necessary evil. You get something like 20 or 30 minutes to edit your post.. I do it all the time for errors and typos.   Start another thread if you think it should be changed.  I like it the way it is.  This is the first time I've deleted a post by accident.

    Edited Sun 13th Nov 16:40 [history]

  8. #8 / 19
    Standard Member SquintGnome
    Rank
    Brigadier General
    Rank Posn
    #35
    Join Date
    Jun 11
    Location
    Posts
    546

    I don't feel that a 50% result is misleading in the example you give. 50% means you had 50% more kills than expected; in other words, you were incredibly lucky.

    I think that in an average game anything over 5% would be very lucky.  Over the course of many games anything over 1% would be lucky.  Over hundreds of games anything over 1% might suggest something isn't working right in the random number generator.

    Edited Sun 13th Nov 16:55 [history]

  9. #9 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Consider a different kind of comparison.

    Scenario   Dice    Attack dice rolled   Expected kills   Actual kills   Luck
    #1         3 v 2   300                  108              112            +4
    #2         3 v 2   1500                 540              560            +20

     

    Imagine if the scenario #1 results occurred 5 times in a row. In other words, you played 5 games and came out with a +4 in each game.  Scenario #2 seems luckier to me.  Not 5 times luckier, but luckier nonetheless.

    Scenario   Proposed ratio (luck / expected kills)
    #1         3.7%
    #2         3.7%

    Clearly these two scenarios are not comparable yet they produce the same ratio.  However, I do like your idea of using expected kills in the denominator because we are measuring expectations.

    I would offer that a more relevant comparison would be to use my idea of taking the square root of the denominator (in this case, expected kills). This produces a pseudo-Z-score, where 95% of all scores should be lower than 2.0 in absolute value and 68% should be lower than 1.0. Scores over 3.2 should be very rare indeed, and scores above 4.0 should be all but nonexistent.

    With the above scenarios we get pseudo-Zs of 0.38 and 0.86 respectively. These numbers suggest that while scenario #2 is luckier than scenario #1, neither score is unusual, which seems pretty reasonable to me.
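
    A short sketch of the pseudo-Z calculation for the two scenarios, assuming it is simply luck divided by the square root of expected kills as described above. The function name is illustrative.

```python
import math

def pseudo_z(luck, expected_kills):
    """Luck divided by the square root of expected kills (the proposed pseudo-Z)."""
    return luck / math.sqrt(expected_kills)

pz1 = pseudo_z(4, 108)    # scenario #1, about 0.38
pz2 = pseudo_z(20, 540)   # scenario #2, about 0.86

print(f"#1: {pz1:.2f}  #2: {pz2:.2f}  ratio: {pz2 / pz1:.2f}")
```

    Five times the luck over five times the expected kills scales the pseudo-Z by sqrt(5), about 2.24, which matches the "luckier, but not 5 times luckier" intuition.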

    Edited Sun 13th Nov 17:26 [history]

  10. #10 / 19
    Standard Member SquintGnome
    Rank
    Brigadier General
    Rank Posn
    #35
    Join Date
    Jun 11
    Location
    Posts
    546

    Yes, I agree with what you are saying in the sense that maintaining the same luck % over a longer period of time is luckier.  I think we touched on this before. 

    To get a fuller picture of how lucky you are, you need the luck % and the period of time (or number of rolls) over which it occurred. I used the analogy of baseball before. This concept is the same as hitting percentage: saying someone is batting .300 doesn't tell the whole story. It is more impressive if someone is batting .300 at the end of the season than after the first game.

    I think your idea is a good way to add that time factor to the stat. I need to educate myself a bit more about your proposal. For those who may not be familiar with statistics, it may be easier to say 'luck of 5% after 1000 rolls'. That is a longer presentation of the same info, but it may be simpler to grasp. I'm not sure; I have to think on it some more.

     If you have the time and inclination can you give an example with a wider range of results building on the same scenario.  For example, start with what is above for scenario 1 and 2 and add other scenarios with more and less dice rolled adding a column with your stat to show how it would vary.


  11. #11 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Here's an overly simple but quite clear explanation of what the z-score represents:

    http://www.youtube.com/watch?v=1xhCL5m4nI0

    Here's a graphic off of wikipedia that is more precise..

    Normal_distribution_and_scales.gif


  12. #12 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    SquintGnome wrote:

     If you have the time and inclination can you give an example with a wider range of results building on the same scenario.  For example, start with what is above for scenario 1 and 2 and add other scenarios with more and less dice rolled adding a column with your stat to show how it would vary.

    You can do this easily enough. Just find some scores that you think are very lucky or unlucky and compute the pseudo-Z. Here's one:

    http://www.wargear.net/games/player/34732  Click on the shamrock..

    The Expected number of Kills will be the # of kills minus the luck stat

    Here's the winner's scores

    Kills               Deaths              Luck
    1386 (935 + 451)    1231 (792 + 439)    38.24 (6.85 + 31.39)

    Doing the math

    38/{sqrt(1386 - 38.24)} = 1.03, or approximately 1 standard deviation from the mean of 0.  Were this an actual z-score, you would expect a number greater than +38 only 16% of the time.  This was a totally random example from one of my games and it doesn't seem too far off considering there were over 2600 armies to be lost.

    N.B.   P-Z scores  of 1.07 on defense, and 0.17 while attacking.

    Edited Sun 13th Nov 18:47 [history]

  13. #13 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    M57 wrote:
    38/{sqrt(1386 - 38.24)} = 1.03, or approximately 1 standard deviation from the mean of 0.  Were this an actual z-score, you would expect a number greater than +38 only 16% of the time. 

    BTW, If I calculate this score using my original proposal of using the square root of the total number at risk, I'd get a P-Z of 0.75, which doesn't seem nearly as reasonable.  This is why I like your idea of using the "expected #" in the denominator.
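
    The two denominators can be compared side by side with a sketch like the one below. It assumes "total number at risk" means kills plus deaths, and it uses the unrounded luck of 38.24, so the first value comes out slightly higher than with the rounded 38.

```python
import math

kills, deaths, luck = 1386, 1231, 38.24  # winner's totals from the post above

expected_kills = kills - luck       # expected kills = actual kills minus luck
armies_at_risk = kills + deaths     # ASSUMED reading of "total number at risk"

pz_expected = luck / math.sqrt(expected_kills)   # roughly 1.0
pz_at_risk = luck / math.sqrt(armies_at_risk)    # roughly 0.75

print(f"sqrt(expected kills): {pz_expected:.2f}")
print(f"sqrt(armies at risk): {pz_at_risk:.2f}")
```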


  14. #14 / 19
    Shelley, not Moore Ozyman
    Rank
    Brigadier General
    Rank Posn
    #40
    Join Date
    Nov 09
    Location
    Posts
    3449

    I don't understand it, but I like M57's idea.


  15. #15 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Ozyman wrote:

    I don't understand it, but I like M57's idea.

    It doesn't have to be that hard to understand. Maybe I can simplify it.

    By definition, the norm, or mean (average) of luck stats is 0, so we can say that any luck stat that is more or less than zero "deviates" from the norm.  If we assign a ratio of 4/108 to a score of +4 when there are 108 expected kills, the resulting number tells us nothing about how likely or unlikely that score is.

    Obviously, the larger the range of the population, the larger the deviation from the norm can be.   A luck stat of +20 would be more uncommon when we are only expecting 108 kills, but a luck stat of +20 is nothing special if there are 2000 expected kills.

    A Z-score would have us divide our individual luck stat score (+4) by a number below which we expect the scores of a certain percentage of the population to fall. In this case, because the sqrt of 108 = 10.4, we are going to "estimate" that 68% of the population of luck stats will fall between -10.4 and +10.4. So we will divide +4 by 10.4 to get our Pseudo Z-Score. Using this method, a luck stat of 10.4 will yield a P-Z of 1.

    If there are 2000 expected kills, my proposed method would "estimate" that 68% of the population will have scores between -44.7 and +44.7 (the square root of 2000). This would mean that only 16% of the time would we expect to see a luck stat greater than 44.7. A luck stat of -44.7 would give you a P-Z score of -1.0, and a luck stat of -89.4 would give you a P-Z of -2.0. Looking at the chart, we would expect only about 2% of Z-scores to be less than -2.0, so in this case we expect about 2% of luck stat scores to be less than -89.4.
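
    The bands and tail percentages can be checked with a few lines of Python. This is only a sketch: it treats the pseudo-Z as if it were a true standard normal score, which is the assumption being made in this post, and the helper names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sigma_band(expected_kills):
    """Half-width of the estimated 68% band: sqrt(expected kills)."""
    return math.sqrt(expected_kills)

for expected in (108, 2000):
    print(f"{expected} expected kills: 68% band is +/-{one_sigma_band(expected):.1f}")

within_one = normal_cdf(1.0) - normal_cdf(-1.0)   # about 0.68
above_one = 1.0 - normal_cdf(1.0)                 # about 0.16
below_minus_two = normal_cdf(-2.0)                # about 0.023

print(f"{within_one:.1%} within 1, {above_one:.1%} above +1, {below_minus_two:.1%} below -2")
```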

    Clear as mud?

    Edited Mon 14th Nov 07:47 [history]

  16. #16 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    The idea of using Z-scores is not new.  It has been discussed many times and is viewed by the mathematicians on the site (Alpha and Hugh) as the best way to evaluate luck from game to game.  Here's a related thread.

    http://www.wargear.net/forum/showthread/1458p1/poor_luck_champion

    The good news is that in the above thread, Hugh suggests that it is not out of the realm of reasonability to derive an actual z-score (initial concerns were that the calculations would tax the system too much).

    Suffice it to say the distribution curve for 3v2 dice is different from that of 3v1, 2v1, etc., which means that the actual formula for coming up with a precise denominator is quite complex. On the other hand, Alpha suspected that the differences in the curves are minuscule, meaning that in most games, where 3v2 dice are much more commonly used, the z-score differences will be negligible.


  17. #17 / 19
    Standard Member SquintGnome
    Rank
    Brigadier General
    Rank Posn
    #35
    Join Date
    Jun 11
    Location
    Posts
    546

    To give an example of what has been discussed, below I have listed my last 33 Wargear Warfare games that were 1v1. The summary results are in the last row: a -4.5% luck percentage, as I have proposed, and a -1.98 pseudo-Z, as proposed by M57 (about 2 standard deviations from the mean). Either way you slice it, I would call that 'bad luck' for that string of games. The 1949.5 expected kills would have been over about 6000 rolls (estimated).

    Game #   Luck     Kills   Expected kills   Luck %   Luck deviation
    1        -10.30   45      55.3             -18.6%   -1.39
    2        -2.48    9       11.5             -21.6%   -0.73
    3        -2.74    57      59.7             -4.6%    -0.35
    4        -3.32    51      54.3             -6.1%    -0.45
    5        -6.90    19      25.9             -26.6%   -1.36
    6        3.00     38      35.0             8.6%     0.51
    7        -3.12    98      101.1            -3.1%    -0.31
    8        1.71     13      11.3             15.1%    0.51
    9        -4.82    72      76.8             -6.3%    -0.55
    10       -5.61    47      52.6             -10.7%   -0.77
    11       0.50     82      81.5             0.6%     0.06
    12       -1.12    53      54.1             -2.1%    -0.15
    13       -13.94   51      64.9             -21.5%   -1.73
    14       -0.08    24      24.1             -0.3%    -0.02
    15       -2.16    93      95.2             -2.3%    -0.22
    16       -7.71    29      36.7             -21.0%   -1.27
    17       -9.92    27      36.9             -26.9%   -1.63
    18       -2.16    57      59.2             -3.7%    -0.28
    19       -2.51    101     103.5            -2.4%    -0.25
    20       12.11    95      82.9             14.6%    1.33
    21       -22.99   137     160.0            -14.4%   -1.82
    22       -3.35    73      76.4             -4.4%    -0.38
    23       -5.62    67      72.6             -7.7%    -0.66
    24       4.04     40      36.0             11.2%    0.67
    25       -4.08    71      75.1             -5.4%    -0.47
    26       -3.41    16      19.4             -17.6%   -0.77
    27       4.83     92      87.2             5.5%     0.52
    28       9.49     116     106.5            8.9%     0.92
    29       3.48     31      27.5             12.6%    0.66
    30       -8.44    63      71.4             -11.8%   -1.00
    31       -5.04    34      39.0             -12.9%   -0.81
    32       6.64     20      13.4             49.7%    1.82
    33       -1.46    41      42.5             -3.4%    -0.22
    Total    -87.48   1862    1949.5           -4.5%    -1.98
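
    The summary row can be reproduced from the per-game luck and kills with a sketch like this one. It assumes expected kills = actual kills minus luck, as stated earlier in the thread; the variable names are illustrative.

```python
import math

# (luck, actual kills) for the 33 games listed in the table
games = [
    (-10.30, 45), (-2.48, 9), (-2.74, 57), (-3.32, 51), (-6.90, 19),
    (3.00, 38), (-3.12, 98), (1.71, 13), (-4.82, 72), (-5.61, 47),
    (0.50, 82), (-1.12, 53), (-13.94, 51), (-0.08, 24), (-2.16, 93),
    (-7.71, 29), (-9.92, 27), (-2.16, 57), (-2.51, 101), (12.11, 95),
    (-22.99, 137), (-3.35, 73), (-5.62, 67), (4.04, 40), (-4.08, 71),
    (-3.41, 16), (4.83, 92), (9.49, 116), (3.48, 31), (-8.44, 63),
    (-5.04, 34), (6.64, 20), (-1.46, 41),
]

total_luck = sum(luck for luck, _ in games)
total_kills = sum(kills for _, kills in games)
total_expected = total_kills - total_luck  # expected = actual - luck

luck_pct = total_luck / total_expected                 # luck / expected kills
pseudo_z = total_luck / math.sqrt(total_expected)      # luck / sqrt(expected kills)

print(f"luck {total_luck:.2f}, expected {total_expected:.1f}, "
      f"luck % {luck_pct:.1%}, pseudo-Z {pseudo_z:.2f}")
```

    Note that the totals are computed from summed luck and summed expected kills, not by averaging the per-game percentages or deviations.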


  18. #18 / 19
    Standard Member M57
    Rank
    Brigadier General
    Rank Posn
    #73
    Join Date
    Apr 10
    Location
    Posts
    5083

    Negative luck scores in 24 of 33 games. Impressive! The standard deviation of a coin flip with 33 tries is 2.87 [sqrt(33/4)]. In your case you had a negative luck stat 24 times. With an expected mean of 16.5, you are about 2.6 standard deviations out in that dept as well.
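
    A sketch of this sign check, assuming each game is independently as likely to end with negative luck as positive (a fair-coin assumption that the thread does not establish). For a binomial count the standard deviation is sqrt(n p (1-p)), which is sqrt(n/4) for a fair coin. The count of negative games is taken from the luck column of the table in the previous post.

```python
import math

n_games = 33
negatives = 24  # games with a negative luck stat, counted from the table

def sign_test(k, n, p=0.5):
    """Mean, standard deviation, z-score and exact upper tail for k successes in n."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    z = (k - mean) / sd
    # Exact binomial probability of seeing k or more successes
    tail = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
    return mean, sd, z, tail

mean, sd, z, tail = sign_test(negatives, n_games)
print(f"mean {mean}, sd {sd:.2f}, z {z:.2f}, P(at least {negatives}) = {tail:.4f}")
```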

    Edited Mon 14th Nov 20:19 [history]

  19. #19 / 19
    Standard Member Hugh
    Rank
    Lieutenant General
    Rank Posn
    #13
    Join Date
    Nov 09
    Location
    Posts
    869

    The pseudo-z score is an interesting idea, but remember that the goal is to use the z-score to calculate and display a percentile. Everyone can interpret "your rolling is only better than 10% of the possibilities". The variance for rolling n times with a success rate of p is np(1-p), so the standard deviation is sqrt(np(1-p)). You are dividing by sqrt(n) and saying it's similar to a z-score. Proportional, true, but it would give inaccurate percentile scores when you convert. Different dice give different values of p, so fixing p isn't great either. For 3v2 or 2v2 dice, the variance formula is different, so the problem of using sqrt(n) there is even worse.
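
    To make the gap concrete, here is a sketch comparing the pseudo-z with the binomial z-score for the 3v1 example from the first post (100 rolls, 66 expected kills, 70 actual). It assumes p = 0.66, read off that example, and that the pseudo-z divides by the square root of expected kills as proposed earlier in the thread.

```python
import math

n_rolls, p, actual_kills = 100, 0.66, 70  # 3v1 example; p = 0.66 is taken from the thread

expected_kills = n_rolls * p
luck = actual_kills - expected_kills

pseudo_z = luck / math.sqrt(expected_kills)        # divides by sqrt(n * p)
binomial_sd = math.sqrt(n_rolls * p * (1 - p))     # sqrt(n * p * (1 - p))
true_z = luck / binomial_sd

print(f"pseudo-z {pseudo_z:.2f} vs binomial z {true_z:.2f}")
```

    The two values differ by the constant factor sqrt(1 - p), so they are proportional for a fixed dice type but would convert to different percentiles.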

    That said, I ended up asking a decent number of stats-type people and finally got a definitive and surprisingly simple answer: the distribution we are after is the sum of normal distributions. The result is a distribution whose mean is the sum of the means and whose variance is the sum of the variances. We implicitly use the sum of means bit. It wouldn't be hard to implement a sum of variances to calculate z-scores and then percentiles to show the user. (BTW, someone may have already suggested this is the way to do it, but I can't find it.)

    If I understand the technique correctly, the normal for approximating 3v2 and 2v2 rolls (3 possible outcomes per roll) has a slightly more complicated variance calculation. The 1v2, 3v1, 2v1 and 1v1 rolls all have two outcomes, so their variance is np(1-p). For calculations, we'll just need the variance of a single roll, and we can update the running variance by adding it with each roll.
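
    A rough sketch of how the running totals might look, not the site's implementation. The per-roll kill probabilities below are assumptions for standard six-sided dice with ties going to the defender; the site's actual dice rules may differ, and all names are hypothetical.

```python
import math

# ASSUMED per-roll distributions of defender armies killed (standard dice,
# ties to the defender). Keys are kills in one roll, values are probabilities.
KILL_DIST = {
    "3v1": {0: 441 / 1296, 1: 855 / 1296},
    "3v2": {0: 2275 / 7776, 1: 2611 / 7776, 2: 2890 / 7776},
}

def roll_mean_var(dist):
    """Mean and variance of kills for a single roll."""
    mean = sum(k * p for k, p in dist.items())
    var = sum((k - mean) ** 2 * p for k, p in dist.items())
    return mean, var

def luck_percentile(rolls, actual_kills):
    """Sum per-roll means and variances, then convert the z-score to a percentile."""
    total_mean = 0.0
    total_var = 0.0
    for roll_type in rolls:
        m, v = roll_mean_var(KILL_DIST[roll_type])
        total_mean += m
        total_var += v
    z = (actual_kills - total_mean) / math.sqrt(total_var)
    percentile = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal approximation
    return total_mean, z, percentile

# Example: 100 rolls of 3v2 with 112 kills, as in the first post's table
expected, z, pct = luck_percentile(["3v2"] * 100, 112)
print(f"expected {expected:.1f}, z {z:.2f}, percentile {pct:.0%}")
```

    Because the totals are simple sums, a game could keep two running numbers (mean and variance) and add to them on every roll, whatever mix of dice is used.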

    I'll start a new thread later today (hopefully) with what I think the details will be on the 3v2,2v2 variance calculations and how we could implement these ideas to calculate a percentile that measures "how unlucky your rolling was". I imagine PHP has a z-score to percentile function built in.

    The probability of missing a 1/N event in N tries approaches 1/e as N gets large. I just wanted to put that in a signature.
