QSSIM Round 2
Robert Parker
December 1997

All dates are listed in YYMMDD format.

---Tourney data---

The data set consists of about 110,000 games played between about 3700
players over a four year period (921100 to 960900).

---Robots---

32 of the top players have been replaced by robots with fixed, known
abilities. The players are listed here, along with
  - their average opponent rating for games between 950600 and 960600
  - the number of games played within this period
  - their replacement robots

Group1:
  EDLEY,JOE       AvgOppRating= 1960.02  n=111  Robot0
  MEAD,JEREMIAH   AvgOppRating= 1953.70  n= 64  Robot1
  CAPPELLETTO,BR  AvgOppRating= 1956.13  n=118  Robot2
  LOGAN,ADAM      AvgOppRating= 1939.23  n= 78  Robot3
  LERMAN,JERRY    AvgOppRating= 1928.56  n=171  Robot4
  HALPER,EDWARD   AvgOppRating= 1943.74  n= 66  Robot5
  GOLDSTEIN,CHAR  AvgOppRating= 1939.36  n= 94  Robot6
  LUNDEGAARD,BOB  AvgOppRating= 1922.26  n= 98  Robot7
  LUND,RICHARD    AvgOppRating= 1924.58  n= 57  Robot8
  POLLOCK,ROBIN   AvgOppRating= 1918.14  n= 90  Robot9
  GRAHAM,MATT     AvgOppRating= 1905.14  n=118  Robot10
  SCHONBRUN,LEST  AvgOppRating= 1911.86  n=213  Robot11
  BOYS,DAVID      AvgOppRating= 1910.12  n=102  Robot12
  SHERMAN,JOEL    AvgOppRating= 1893.26  n=200  Robot13
  TIEKERT,RON     AvgOppRating= 1908.90  n=101  Robot14
  GIBSON,DAVID    AvgOppRating= 1902.27  n= 56  Robot15

Group2:
  BARON,MIKE      AvgOppRating= 1892.52  n= 95  Robot16
  KRAMER,JIM      AvgOppRating= 1881.59  n=134  Robot17
  ALEXANDER,STEV  AvgOppRating= 1886.61  n=144  Robot18
  MORRIS,PETER    AvgOppRating= 1883.74  n= 54  Robot19
  HERSOM,RANDY    AvgOppRating= 1885.95  n=193  Robot20
  GEARY,JIM       AvgOppRating= 1873.58  n=165  Robot21
  DAY,DARRELL     AvgOppRating= 1877.48  n=144  Robot22
  ODOM,LISA       AvgOppRating= 1874.81  n=135  Robot23
  CROWE,ROBERT    AvgOppRating= 1880.08  n= 63  Robot24
  EPSTEIN,PAUL    AvgOppRating= 1877.42  n=113  Robot25
  LUEBKEMANN,JOH  AvgOppRating= 1867.63  n= 97  Robot26
  WEINSTEIN,IAN   AvgOppRating= 1859.50  n=121  Robot27
  PRATT,DANIEL    AvgOppRating= 1857.69  n= 54  Robot28
  POLATNICK,STEV  AvgOppRating= 1845.47  n=130  Robot29
  COHEN,IRA       AvgOppRating= 1814.86  n= 97  Robot30
  LENNON,CHRISTO  AvgOppRating= 1772.82  n= 84  Robot31

These 32 robots are divided into two groups of 16 as indicated above.
Note that Group1 generally plays a stronger schedule than Group2.

Each game involving a robot is simulated, with
  P(Win) = CumulativeLogistic(parameter=B, diff=RobotAbility-OpponentRating).
The parameter B has been estimated from the '92-'96 data to be B=1/156.

---Method---

a. Randomly assign the 16 abilities 2100, 2090, ... 1950 to the 16 robots
   in Group1. Do the same for Group2. This creates two "teams" of robots,
   both with the same set of abilities and hence the same playing strength.
b. In all '92-'96 games, replace the 32 players with their robots.
c. Sequentially rate games from '92 to 950600. (That's June 0, 1995.)
d. Start Qualifying Period.
e. Sequentially rate games from 950600 to 960600.
f. End Qualifying Period.
g. Calculate the following QS methods:
    1. OPRmleHI (iteration for robots only, ratings curve stddev =
       estimated from data, Opp Strength = Max rating during
       Qualifying Period)
    2. OPR (no iteration, ratings curve stddev = same as in ratings
       formula)
    3. OPRmle (no iteration, ratings curve stddev = estimated from data)
    4. OPRmleI (iteration for robots only, ratings curve stddev =
       estimated from data)
    5. IOPR (iteration for all players who have #games >= 50, ratings
       curve stddev = same as in ratings formula)
    6. IOPRmle (same as IOPR, but ratings curve stddev = estimated
       from data)
    7. HI (Peak rating during Qualifying Period)
    8. AVG3_g (Average of 3 ratings, weighted by number of games played
       in each period.)
    9. AVG3_123 (Avg of 3 ratings, weights 1,2,3 for successive periods.)
   10. PAR (Peak average rating: NSA's 1997 WSC qualifying system.)
   11. RAT (Current rating at end of Qualifying Period)

---Statistics---

Three statistics are calculated:

1. Kendall's tau, a measure of the correlation between the known robot
   ranks and the QS-assigned ranks. Higher is better.
2. n-out-of-10: The number of the known Top 10 robots who are ranked
   among the Top 10 by the QS. Higher is better.
3. The Rank Sum Difference (RSD): The robots are ranked 1,..,32. The sum
   of the ranks for Group1 is calculated. The same is done for Group2.
   Then RSD = Group2 sum - Group1 sum. A lower rank sum is better, hence
   a positive RSD indicates that Group1 was ranked better than Group2.

---Points about the simulation---

Robot abilities are kept fixed over the entire 4-year period.

Of the 8695 games involving Robots over the 4-year period, 958 were
Robot vs. Robot.

The Qualifying Period was 950600 to 960600. Each robot played at least
50 games within that period. (The exact numbers of games played are
listed above.)

The problem of assigning initial ratings for the Robots was avoided:
Each was treated as a new player starting in '92.

---A note on calculation of OPR---

I've actually used the logistic distribution instead of the normal
distribution as the ratings curve in the calculation of OPR. There is
very little difference in the rankings from the two methods. Using both
methods to calculate OPR for WSC97, we see three sets of rankings that
are affected. With 80 players ranked, the differences between the two
methods are:
  #'s 23 and 24 are switched,
  #'s 44, 45, and 46 are jumbled, and
  #'s 39 and 40 are switched.

---Conclusions---

Here are averages and 90% confidence intervals for the statistics for
each of the 11 methods.

                  tau               n-out-of-10          Rank Difference
          Upper  Lower    Ave   Upper  Lower    Ave    Upper   Lower     Ave   Favors
OPRmleHI  0.591  0.581  0.586   7.163  7.019  7.091    -3.71   -8.46   -6.09   2
OPR172    0.595  0.586  0.590   7.154  7.011  7.083     2.86   -1.79    0.53   -
OPRmle    0.594  0.585  0.590   7.157  7.013  7.085    11.14    6.43    8.78   1
OPRmleI   0.596  0.586  0.591   7.157  7.013  7.085     7.22    2.47    4.84   1
IPR       0.591  0.581  0.586   7.150  7.005  7.078    -6.24  -11.00   -8.62   2
IPRmle    0.591  0.581  0.586   7.151  7.006  7.079    -4.44   -9.20   -6.82   2
HI        0.590  0.580  0.585   7.158  7.015  7.086     4.77    0.04    2.40   1
AVG3_g    0.622  0.613  0.618   7.361  7.219  7.290     5.16    0.75    2.96   1
AVG3_123  0.628  0.616  0.622   7.456  7.284  7.370     5.14   -0.74    2.20   -
PAR       0.619  0.606  0.612   7.352  7.158  7.255     7.64    1.06    4.35   1
RAT       0.637  0.628  0.633   7.551  7.411  7.481     9.06    4.91    6.98   1

The "Favors" column tells which of Group1 (strong schedules) or Group2
(weaker schedules) is favored by the QS. A "-" means the 90% confidence
interval for the Rank Difference includes zero, so neither group is
favored.

Comments: later.
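
The win-probability rule, the random ability assignment (Method step a),
and the three statistics described above can be sketched in Python as
follows. This is only an illustration under stated assumptions, not the
code used for the simulation: CumulativeLogistic is assumed to mean
1/(1+exp(-B*diff)); Kendall's tau is computed as a plain "tau-a" in which
tied pairs count as zero (the text does not say how the tied robot
abilities across the two groups were handled); and all function names
are hypothetical.

```python
import math
import random

B = 1 / 156  # logistic parameter quoted in the text


def win_probability(robot_ability, opponent_rating, b=B):
    # ASSUMPTION: CumulativeLogistic means 1 / (1 + exp(-b * diff)).
    diff = robot_ability - opponent_rating
    return 1.0 / (1.0 + math.exp(-b * diff))


def simulate_game(robot_ability, opponent_rating, rng):
    # Returns True if the robot wins one simulated game.
    return rng.random() < win_probability(robot_ability, opponent_rating)


def assign_abilities(rng):
    # Method step a: shuffle 2100, 2090, ..., 1950 over one group of 16 robots.
    abilities = list(range(2100, 1940, -10))
    rng.shuffle(abilities)
    return abilities


def kendall_tau(abilities, qs_ranks):
    # ASSUMPTION: tau-a, (concordant - discordant) / number of pairs,
    # with tied pairs contributing zero.
    n = len(abilities)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A higher ability should receive a numerically lower rank.
            prod = (abilities[i] - abilities[j]) * (qs_ranks[j] - qs_ranks[i])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def top_n_overlap(abilities, qs_ranks, n=10):
    # Number of the known top-n robots that the QS also ranks in its top n.
    by_ability = sorted(range(len(abilities)), key=lambda i: -abilities[i])
    known_top = set(by_ability[:n])
    qs_top = {i for i, rank in enumerate(qs_ranks) if rank <= n}
    return len(known_top & qs_top)


def rank_sum_difference(qs_ranks, groups):
    # RSD = Group2 rank sum - Group1 rank sum; positive favors Group1.
    group1 = sum(r for r, g in zip(qs_ranks, groups) if g == 1)
    group2 = sum(r for r, g in zip(qs_ranks, groups) if g == 2)
    return group2 - group1
```

For example, win_probability(2156, 2000) is 1/(1+exp(-1)), about 0.731,
and rank_sum_difference([1, 3, 2, 4], [1, 1, 2, 2]) is (2+4) - (1+3) = 2,
which under the convention above would mean Group1 was ranked better.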