NEW RANKING ( with the news! )

Archive of the old Parsimony forum. Some messages couldn't be restored.

Post by Carlos E.A. Drake » 07 Jul 2000, 20:37

Posted by: Carlos E.A. Drake at 07 July 2000 21:37:26:
ONE WINBOARD RANKING

07/jul/00 by Carlos E.A. Drake
Rating DRAKE2.PGN
Drake Winboard tournaments 01 to 28 and other tests, 02/99 to 07/00
Pentium 133 ~ 16 MB RAM and AMD K2 550 ~ 64 MB RAM
Several levels and various versions of programs
60 programs; 3101 games
Only freeware or shareware :
( NOT Voyager, Nimzo, Zarkov, NUTx, Patzer, Diep, JRCP, Insomniac, Stobor, etc. )

gnuchess 4.077 is different from GnuChess5 5.0 because the latter is a rewritten program and has
different authors
Gromit 2.1 and 2.2 are different from Gromit30, for the same reason
Not yet included, because of technical difficulties or other problems: Rival.

Base ELO 2200 - Minimum 0 games

ELO by ELOStat 1.1

 #  Program         Elo    +    -  Games   Score  Av.Op.   Draws

 1  Crafty         2436   28   43    359  66.4 %    2318  14.8 %
 2  LGoliath       2418   30   38    366  63.1 %    2324  18.6 %
 3  Amy            2386   42   45    208  59.1 %    2322  21.2 %
 4  tcb            2358   34   31    364  53.4 %    2334  20.6 %
 5  AnMon          2354   36   37    318  55.0 %    2319  14.5 %
 6  Bionic         2350   47   57    164  60.1 %    2279  12.8 %
 7  Phalanx        2343   36   33    344  51.9 %    2330  17.2 %
 8  Gromit30       2322  192  206     15  56.7 %    2275   6.7 %
 9  Sos            2316  181  181     15  50.0 %    2316  20.0 %
10  Yace           2316   68   65     96  43.2 %    2363  15.6 %
11  Comet          2296   34   34    359  44.6 %    2334  16.7 %
12  LaPetite       2287  168  224     16  65.6 %    2175   6.2 %
13  Zchess         2279   55   56    141  54.3 %    2249  12.1 %
14  Bringer        2266   49   56    154  49.7 %    2269  14.9 %
15  Ant            2248   48   57    151  49.3 %    2253  17.9 %
16  Francesca      2244  174  179     16  40.6 %    2310  18.8 %
17  gnuchess       2229   63   60    120  51.2 %    2220   9.2 %
18  Exchess        2228   62   71    104  57.2 %    2178   8.7 %
19  ldb            2220   64   67    105  54.8 %    2187  10.5 %
20  Gromit         2202   56   64    120  49.6 %    2205  14.2 %
21  Arasan         2197   88   87     57  55.3 %    2161  15.8 %
22  inmi           2178   71   63     98  41.3 %    2238  13.3 %
23  Chop           2177   64   73     92  48.9 %    2185  15.2 %
24  GnuChess5      2175   91   87     56  42.9 %    2225  14.3 %
25  Knightx        2143   77  109     55  67.3 %    2018  14.5 %
26  Dragon         2118   82   89     59  59.3 %    2052  16.9 %
27  Amyan          2101   66   84     77  64.9 %    1994  18.2 %
28  Fortress       2101   78   74     78  44.9 %    2136  10.3 %
29  SSEchessII     2080   80   71     82  43.3 %    2127   8.5 %
30  Gullydeckel    2075  184  162     18  38.9 %    2153  11.1 %
31  Olithink       2048   84   68     83  40.4 %    2116   8.4 %
32  Faile          1984   73   61    102  41.2 %    2045   9.8 %
33  Tristram       1978  102   60     81  29.6 %    2128   9.9 %
34  DChess         1960  105   66     71  32.4 %    2087   8.5 %
35  Averno         1949   76   78     76  46.7 %    1972  11.8 %
36  Monik          1941   82   80     66  43.9 %    1983  15.2 %
37  Crux           1930   64   56    112  38.8 %    2009  20.5 %
38  Cilian         1919  110   57     84  27.4 %    2088   7.1 %
39  Skaki          1897  101   73     67  37.3 %    1987   6.0 %
40  Tscp           1887   76   86     61  45.9 %    1915  23.0 %
41  Green Light    1878  174  139     24  41.7 %    1937   0.0 %
42  Snail          1874  109  101     41  41.5 %    1934  14.6 %
43  Zephyr         1869  110  117     37  56.8 %    1821  10.8 %
44  Colchess       1862   78   82     64  44.5 %    1900  20.3 %
45  Storm          1839  192  237     11  40.9 %    1903  27.3 %
46  Sjeng          1839  100   86     54  40.7 %    1904  11.1 %
47  Freyr          1768  156  120     27  35.2 %    1874  11.1 %
48  Bestia         1745  135  162     18  38.9 %    1823  33.3 %
49  LarsenVB       1741  146  125     23  56.5 %    1696  26.1 %
50  Chessterfield  1693  140  110     29  31.0 %    1832  20.7 %
51  Wincraft       1692  191  178     17  44.1 %    1733   5.9 %
52  Mint           1653  173  134     19  23.7 %    1856  26.3 %
53  Ozwald         1644  206  192     15  43.3 %    1690   6.7 %
54  Mfchess        1632  250   67     47  11.7 %    1983   2.1 %
55  Raffaela       1624  155   74     45  18.9 %    1877  15.6 %
56  Noonian        1608  142  153     18  33.3 %    1728  33.3 %
57  Pierre         1576  253   61     53   9.4 %    1969   3.8 %
58  Trynyty        1572  128  219     13  46.2 %    1599  46.2 %
59  Golem          1462  204  149     17  26.5 %    1639  17.6 %
60  Rikus          1191  390  154      7   7.1 %    1636  14.3 %
--  Rival
Carlos E.A. Drake
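
As a rough cross-check of the list above, the Elo column can be approximately reproduced from the Score and Av.Op. columns with the standard logistic Elo relation. This is only a sketch: the post does not say how ELOStat computes its ratings, so the formula and the function name `performance_elo` are assumptions. The four sample rows are copied from the list.

```python
import math


def performance_elo(avg_opponent: float, score_fraction: float) -> float:
    """Estimate a rating from the average opponent rating and the score fraction.

    Assumption: the usual logistic Elo curve with a 400-point scale.
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return avg_opponent + 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


# (program, listed Elo, score in %, average opponent rating), copied from the list
rows = [
    ("Crafty", 2436, 66.4, 2318),
    ("LGoliath", 2418, 63.1, 2324),
    ("Sos", 2316, 50.0, 2316),
    ("Rikus", 1191, 7.1, 1636),
]

for name, listed, score_pct, avg_op in rows:
    estimate = performance_elo(avg_op, score_pct / 100.0)
    print(f"{name:<10} listed {listed}  estimate {estimate:7.1f}")
```

For these rows the estimate lands within a few points of the listed value, which suggests the list is internally consistent even where the game counts are small.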
 

Re: hmm, interesting...

Post by WYx » 08 Jul 2000, 07:06

Posted by: WYx at 08 July 2000 08:06:11:
In reply to: NEW RANKING ( with the news! ), posted by Carlos E.A. Drake at 07 July 2000 21:37:26:
Program Elo + - Games Score Av.Op. Draws
1 Crafty : 2436 28 43 359 66.4 % 2318 14.8 %
2 LGoliath : 2418 30 38 366 63.1 % 2324 18.6 %
3 Amy : 2386 42 45 208 59.1 % 2322 21.2 %
4 tcb : 2358 34 31 364 53.4 % 2334 20.6 %
5 AnMon : 2354 36 37 318 55.0 % 2319 14.5 %
6 Bionic : 2350 47 57 164 60.1 % 2279 12.8 %
7 Phalanx : 2343 36 33 344 51.9 % 2330 17.2 %
8 Gromit30 : 2322 192 206 15 56.7 % 2275 6.7 %
9 Sos : 2316 181 181 15 50.0 % 2316 20.0 %
10 Yace : 2316 68 65 96 43.2 % 2363 15.6 %
11 Comet : 2296 34 34 359 44.6 % 2334 16.7 %
12 LaPetite : 2287 168 224 16 65.6 % 2175 6.2 %
13 Zchess : 2279 55 56 141 54.3 % 2249 12.1 %
14 Bringer : 2266 49 56 154 49.7 % 2269 14.9 %
15 Ant : 2248 48 57 151 49.3 % 2253 17.9 %
16 Francesca : 2244 174 179 16 40.6 % 2310 18.8 %
17 gnuchess : 2229 63 60 120 51.2 % 2220 9.2 %
18 Exchess : 2228 62 71 104 57.2 % 2178 8.7 %
19 ldb : 2220 64 67 105 54.8 % 2187 10.5 %
20 Gromit : 2202 56 64 120 49.6 % 2205 14.2 %
21 Arasan : 2197 88 87 57 55.3 % 2161 15.8 %
22 inmi : 2178 71 63 98 41.3 % 2238 13.3 %
23 Chop : 2177 64 73 92 48.9 % 2185 15.2 %
24 GnuChess5 : 2175 91 87 56 42.9 % 2225 14.3 %
25 Knightx : 2143 77 109 55 67.3 % 2018 14.5 %
26 Dragon : 2118 82 89 59 59.3 % 2052 16.9 %
27 Amyan : 2101 66 84 77 64.9 % 1994 18.2 %
28 Fortress : 2101 78 74 78 44.9 % 2136 10.3 %
29 SSEchessII : 2080 80 71 82 43.3 % 2127 8.5 %
30 Gullydeckel : 2075 184 162 18 38.9 % 2153 11.1 %
31 Olithink : 2048 84 68 83 40.4 % 2116 8.4 %
32 Faile : 1984 73 61 102 41.2 % 2045 9.8 %
33 Tristram : 1978 102 60 81 29.6 % 2128 9.9 %
34 DChess : 1960 105 66 71 32.4 % 2087 8.5 %
35 Averno : 1949 76 78 76 46.7 % 1972 11.8 %
36 Monik : 1941 82 80 66 43.9 % 1983 15.2 %
37 Crux : 1930 64 56 112 38.8 % 2009 20.5 %
38 Cilian : 1919 110 57 84 27.4 % 2088 7.1 %
39 Skaki : 1897 101 73 67 37.3 % 1987 6.0 %
40 Tscp : 1887 76 86 61 45.9 % 1915 23.0 %
41 Green Light : 1878 174 139 24 41.7 % 1937 0.0 %
42 Snail : 1874 109 101 41 41.5 % 1934 14.6 %
43 Zephyr : 1869 110 117 37 56.8 % 1821 10.8 %
44 Colchess : 1862 78 82 64 44.5 % 1900 20.3 %
45 Storm : 1839 192 237 11 40.9 % 1903 27.3 %
46 Sjeng : 1839 100 86 54 40.7 % 1904 11.1 %
47 Freyr : 1768 156 120 27 35.2 % 1874 11.1 %
48 Bestia : 1745 135 162 18 38.9 % 1823 33.3 %
49 LarsenVB : 1741 146 125 23 56.5 % 1696 26.1 %
50 Chessterfield : 1693 140 110 29 31.0 % 1832 20.7 %
51 Wincraft : 1692 191 178 17 44.1 % 1733 5.9 %
52 Mint : 1653 173 134 19 23.7 % 1856 26.3 %
53 Ozwald : 1644 206 192 15 43.3 % 1690 6.7 %
54 Mfchess : 1632 250 67 47 11.7 % 1983 2.1 %
55 Raffaela : 1624 155 74 45 18.9 % 1877 15.6 %
56 Noonian : 1608 142 153 18 33.3 % 1728 33.3 %
57 Pierre : 1576 253 61 53 9.4 % 1969 3.8 %
58 Trynyty : 1572 128 219 13 46.2 % 1599 46.2 %
59 Golem : 1462 204 149 17 26.5 % 1639 17.6 %
60 Rikus : 1191 390 154 7 7.1 % 1636 14.3 %
??????????? 2241 !!!
????????????? 2000 !
:????????? minimum 1900 !
??

Carlos, this list is poor IMHO!
regards
WYx
WYx
 

Re: hmm, interesting...

Post by Pedro Eckmann » 09 Jul 2000, 00:06

Posted by: Pedro Eckmann at 09 July 2000 01:06:59:
In reply to: Re: hmm, interesting..., posted by WYx at 08 July 2000 08:06:11:
Green Light Chess seems to play fair chess, but has time-control problems. That might be the reason for its low ELO in a Winboard tournament. Here are two examples of what I am talking about.
[Event "Computer chess game"]
[Site "HOME"]
[Date "2000.07.08"]
[Round "1"]
[White "EXchess"]
[Black "glc-wb"]
[Result "1-0"]
[TimeControl "600"]
1. e4 g6 2. d4 d6 3. Nf3 Nf6 4. Nc3 Bg7 5. Bb5+ c6 6. Bd3 O-O 7. Bg5 Bg4 8.
Be3 Qb6 9. Rb1 Bxf3 10. gxf3 Qc7 11. O-O e5 12. dxe5 dxe5 13. Qc1 Nh5 14.
Ne2 Rd8 15. Rd1 Nd7 16. f4 c5 17. Bb5 Nhf6 18. Ng3 Qa5 19. a4 a6 20. Bd2
Qc7 21. Bd3 c4 22. Bf1 Nc5 23. fxe5 Nfxe4 24. Bxc4 Bxe5 25. b3 Rxd2 26.
Rxd2 Bf4 27. Nxe4 Nxe4 28. Rd7 Bxh2+ 29. Kg2 Qxd7 30. Kxh2 Nd2 31. Ra1 Qd4
32. Kg3 Ne4+ 33. Kh3 Qxf2 34. Qg1 Qf3+ 35. Kh2 Rd8 36. Bxf7+ Qxf7 37. Rf1
Rd2+ 38. Kh3 Qe6+ 39. Qg4 Rh2+ 40. Kxh2 Qxg4 41. b4 Qe2+ 42. Kg1 Ng5 43.
Rf6 Kg7 44. Rf2 Nh3+ 45. Kg2 Qxf2+ 46. Kxh3 h5 47. c4 g5 48. c5
{White wins on time} 1-0
[Event "Computer chess game"]
[Site "HOME"]
[Date "2000.07.08"]
[Round "1"]
[White "glc-wb"]
[Black "Franwb"]
[Result "0-1"]
[TimeControl "600"]
1. Nf3 Nf6 2. e3 d5 3. Be2 Nc6 4. O-O Bg4 5. c4 dxc4 6. Bxc4 e6 7. h3 Bh5
8. Bb5 Bd6 9. Bxc6+ bxc6 10. d3 O-O 11. Qe2 Rb8 12. e4 Nd7 13. Nbd2 f5 14.
d4 c5 15. e5 Be7 16. Qe3 a6 17. Qe2 cxd4 18. Qc4 Kh8 19. Nxd4 Nxe5 20. Qxe6
Bd6 21. Qd5 Nf3+ 22. Kh1 Nxd4 23. Qxd4 Rb4 24. Qc3 Qf6 25. Qxf6 Rxf6 26.
Re1 Rb8 27. Kg1 Rg6 28. Re3 Bd1 29. Nc4 f4 30. Rc3 f3 31. g4 Be2 32. Nxd6
cxd6 33. b3 d5 34. Be3 h5 35. g5 d4 36. Bxd4 Rxg5+ 37. Kh1 Rd8 38. Be3 Rg2
39. Re1 Bb5 40. a4 Be2 41. Rc5 g6 42. Rcc1 Kh7 43. b4 Rd7 44. b5 axb5 45.
a5 Bc4 46. a6 Rd8 47. Bb6 Ra8 48. a7 g5 49. Rg1 Kg6 50. Rxg2 fxg2+ 51. Kxg2
Kf5 52. f3 Kf4 53. Bc7+ Kf5 54. Bb8 h4 55. Kf2 Kf6 56. Ke3 Be6 57. Rb1 Bxh3
58. Rxb5 Bd7 59. Rc5 h3 60. Kd4 Kg6 61. Rc7 Bf5 62. Rc6+ Kg7 63. Rc3 Kf7
64. Ke5 Bd7 65. Rc1 Kg6 66. Kd6 Bf5 67. Rc6 Kh7 68. Ke5 Bd3 69. Rc3 Bf1 70.
Rc7+ Kg6 71. Rc6+ Kh7 72. Rc2 Kg6 73. f4 Bg2 74. f5+ Kh5 75. Rc8 h2 76.
Rh8+ Kg4 77. Rxh2 Bc6 78. f6 Be8 79. Rh7 Bg6 80. f7 Bxf7 81. Rxf7 Kh4 82.
Kf5 Kh3 83. Rg7 Kg2 84. Ke4 g4 85. Rxg4+ Kf2 86. Rg5 Ke2 87. Rg2+ Ke1 88.
Rc2 Kd1 89. Rf2 Kc1 90. Rg2 Kd1 91. Rb2 Kc1 92. Rb7 Kc2 93. Be5 Kd2 94.
Rd7+ Kc2 95. Rh7 Kb3 96. Rb7+ Kc4 97. Bb8 Kc5 98. Rc7+ Kb4 99. Rc6 Ka5 100.
Kd4 Kb5 101. Kd5 Kb4 102. Rc7 Kb5 103. Rb7+ Ka6 104. Rb3 Ka5 105. Bc7+ Ka4
106. Rb7 Ka3
{Black wins on time} 0-1
Regards
Pedro
Pedro Eckmann
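
Time forfeits like the two above can be picked out of a PGN file automatically. The following is a minimal sketch using only the standard library. It assumes the result comment has exactly the form `{White wins on time}` or `{Black wins on time}` as in the games above; the function name `time_forfeits` is hypothetical, and the sample text keeps only a few headers and the first moves of each game.

```python
import re

# Shortened copies of the two games above (most headers and moves omitted).
SAMPLE_PGN = '''[Event "Computer chess game"]
[White "EXchess"]
[Black "glc-wb"]
[Result "1-0"]
[TimeControl "600"]
1. e4 g6 2. d4 d6
{White wins on time} 1-0
[Event "Computer chess game"]
[White "glc-wb"]
[Black "Franwb"]
[Result "0-1"]
[TimeControl "600"]
1. Nf3 Nf6 2. e3 d5
{Black wins on time} 0-1
'''

TAG_RE = re.compile(r'^\[(\w+) "(.*)"\]$')
TIME_RE = re.compile(r'\{(White|Black) wins on time\}')


def time_forfeits(pgn_text):
    """Return the name of the player who lost on time, one entry per such game."""
    losers = []
    tags = {}
    in_moves = False
    for line in pgn_text.splitlines():
        line = line.strip()
        tag = TAG_RE.match(line)
        if tag:
            if in_moves:  # a tag line after movetext starts a new game
                tags, in_moves = {}, False
            tags[tag.group(1)] = tag.group(2)
            continue
        if line:
            in_moves = True
            hit = TIME_RE.search(line)
            if hit:
                # The side named in the comment won, so the other side lost on time.
                loser_colour = "Black" if hit.group(1) == "White" else "White"
                losers.append(tags.get(loser_colour, "?"))
    return losers


print(time_forfeits(SAMPLE_PGN))
```

Counting how often each engine appears in the returned list gives a quick view of which programs lose on the clock rather than on the board.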
 

Re: NEW RANKING ( with the news! )

Post by Gian-Carlo Pascutto » 09 Jul 2000, 12:06

Posted by: Gian-Carlo Pascutto at 09 July 2000 13:06:12:
In reply to: NEW RANKING ( with the news! ), posted by Carlos E.A. Drake at 07 July 2000 21:37:26:
Program Elo + - Games Score Av.Op. Draws
32 Faile : 1984 73 61 102 41.2 % 2045 9.8 %
40 Tscp : 1887 76 86 61 45.9 % 1915 23.0 %
46 Sjeng : 1839 100 86 54 40.7 % 1904 11.1 %

Nonsense.
Gian-Carlo Pascutto
 

Re: NEW RANKING ( with the news! )

Post by pete » 09 Jul 2000, 12:19

Posted by: pete at 09 July 2000 13:19:00:
In reply to: Re: NEW RANKING ( with the news! ), posted by Gian-Carlo Pascutto at 09 July 2000 13:06:12:
Program Elo + - Games Score Av.Op. Draws
32 Faile : 1984 73 61 102 41.2 % 2045 9.8 %
40 Tscp : 1887 76 86 61 45.9 % 1915 23.0 %
46 Sjeng : 1839 100 86 54 40.7 % 1904 11.1 %

Nonsense.
To say it frankly: IMHO the only "nonsense" here is your comment.
Carlos E.A. Drake tests under very specific conditions, and some engines do better than usual while a few do worse.
Most engines are in the same order as in, say, Mr Corbit's calibration runs, btw.
"Nonsense" would only fit if you believed the test was somehow faked, and this doesn't seem probable.
If it is about the "ELO", only the differences are important, not the absolute values.
I think this list is quite a valid bullet/blitz list if you take out those engines that have difficulties with the little RAM Mr Drake has in his computer.
Also, not everyone has an Athlon800; I think it is interesting how programs do on slower computers.
And in fact I don't understand at all the emotional answers he has to bear with every time he posts new results.
pete
pete
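
The point that only rating differences matter can be illustrated with the standard Elo expected-score formula. This is a sketch under the assumption that the usual logistic curve with a 400-point scale applies to this list; the function name `expected_score` is hypothetical, and the two ratings are the Faile and Sjeng values from the list.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


faile, sjeng = 1984, 1839  # ratings as listed in the original post

# Shifting every rating by the same amount (for example, choosing a base
# other than 2200) leaves the predicted score unchanged.
for shift in (0, -200, 300):
    prediction = expected_score(faile + shift, sjeng + shift)
    print(f"shift {shift:+5d}: expected score {prediction:.3f}")
```

All three lines print the same expected score, because the formula depends only on `rating_b - rating_a`.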
 

Re: NEW RANKING ( with the news! )

Post by Mogens Larsen » 09 Jul 2000, 13:00

Posted by: Mogens Larsen at 09 July 2000 14:00:55:
In reply to: Re: NEW RANKING ( with the news! ), posted by pete at 09 July 2000 13:19:00:
To say it frankly: IMHO the only "nonsense" here is your comment.
Carlos E.A. Drake tests under very specific conditions, and some engines do better than usual while a few do worse.
Most engines are in the same order as in, say, Mr Corbit's calibration runs, btw.
"Nonsense" would only fit if you believed the test was somehow faked, and this doesn't seem probable.
If it is about the "ELO", only the differences are important, not the absolute values.
I think this list is quite a valid bullet/blitz list if you take out those engines that have difficulties with the little RAM Mr Drake has in his computer.
Also, not everyone has an Athlon800; I think it is interesting how programs do on slower computers.
And in fact I don't understand at all the emotional answers he has to bear with every time he posts new results.
I agree. The comparative strength of some programs changes with computer specifications, so "nonsense" isn't a correct evaluation of the ELO list by Mr. Drake.
Best wishes...
Mogens
Mogens Larsen
 

Re: NEW RANKING ( with the news! )

Post by Gian-Carlo Pascutto » 09 Jul 2000, 14:54

Posted by: Gian-Carlo Pascutto at 09 July 2000 15:54:41:
In reply to: Re: NEW RANKING ( with the news! ), posted by pete at 09 July 2000 13:19:00:
Program Elo + - Games Score Av.Op. Draws
32 Faile : 1984 73 61 102 41.2 % 2045 9.8
40 Tscp : 1887 76 86 61 45.9 % 1915 23.0
46 Sjeng : 1839 100 86 54 40.7 % 1904 11.1

Nonsense.
To say it frankly: IMHO the only "nonsense" here is your comment.
Carlos E.A. Drake tests under very specific conditions, and some engines do better than usual while a few do worse.
If it is about the "ELO", only the differences are important, not the absolute values.
I think this list is quite a valid bullet/blitz list if you take out those engines that have difficulties with the little RAM Mr Drake has in his computer.
Also, not everyone has an Athlon800; I think it is interesting how programs do on slower computers.
And in fact I don't understand at all the emotional answers he has to bear with every time he posts new results.
pete
If 'very specific conditions' means that the configuration of some engines
was totally messed up, then yes, you are right.

I was not looking at ELO values, just the respective order of the 3 engines
I mentioned.
Then why include them, as they do nothing but skew the scores of the
other programs ?
FYI, all of my development is done on a Cyrix120. But that has nothing to
do with the point anyway.
There was a very specific reason why I chose those three programs. Sjeng is
an improved version of Faile, and is at least on par for ANY time control
you will try. Both engines are considerably stronger than TSCP.
The new Faile is definitely stronger than SSEChess too, but from the list
there is _no_ indication whatsoever about which version of which engine
was used.
The reason why I am complaining is that this is the second rating list I see
here that was obviously run with several engines seriously handicapped. If
the poster of the list can't even properly set up a chess engine, how am I
supposed to trust any result he posts? We all know testing and comparing
chess engines is a tricky matter.
I spend a lot of time trying to get my program better. Objective measurements
of its strength, like *good* rating lists, help me to determine my progress.
But if I see someone posting data that has no bearing on reality, it is no
help at all, and gives a false indication of an engine's progress. So I WILL
complain.
Gian-Carlo Pascutto
 

Re: NEW RANKING ( with the news! )

Post by pete » 09 Jul 2000, 15:28

Posted by: pete at 09 July 2000 16:28:28:
In reply to: Re: NEW RANKING ( with the news! ), posted by Gian-Carlo Pascutto at 09 July 2000 15:54:41:
I can't follow: I see that the results are surprising when compared to your results, but I see no indication that the configuration is messed up.
There is information missing from the list, which makes it less trustworthy than it could be; here I agree. We don't know which versions were used, we don't know which time control was chosen, and we can't look at the games as far as I know. That is indeed quite a few things missing. So: "nonsense" is still nonsense, I think, but I would indeed take the list with a grain of salt too if I were you.
Yes, here I agree completely. I had the impression, though, that it was not that clear which programs were badly handicapped, except for Gromit3.0; so if you know of a program that shouldn't be included, telling Mr Drake would probably be helpful.
See above: the only reason I complained about your post was that I found it useless to say "Nonsense" without explaining _why_ and explaining what should be done better.
In most tournaments I looked at I found a few settings or decisions I didn't agree with. In fact I only take tournaments seriously where I know exactly which versions played (with versions not changing during the tournament) and with which settings; best of all is if the games are provided too (or they are my own ones ;-) ).
This is not possible most of the time, but that doesn't make the results nonsense; it just means a lower level of reliability. In fact the only tournament I have heard of that follows most of my wishes is the Corbit "Battle of the Crowns" (and even there a few things in the calibration were odd ;-) ).

Complaining is fine with me; but we have someone who played many games hoping to get good results, probably wishing to help authors with exactly that, and a poor "Nonsense" doesn't make much sense to me either :-).
pete
pete
 

Errors?

Post by Aaron » 09 Jul 2000, 16:19

Posted by: Aaron at 09 July 2000 17:19:52:
In reply to: Re: NEW RANKING ( with the news! ), posted by pete at 09 July 2000 16:28:28:
There is information missing in the lists which makes it less trustable as it could be; here I agree. We don't know about the versions used, we don't know about the time control chosen, we can't have a look at the games as far as I know. This is indeed quite a few things missing.
Also, not everyone has an Athlon800; I think it is interesting how programs do on slower computers.
There was a very specific reason why I chose those three programs. Sjeng is
an improved version of Faile, and is at least on par for ANY timecontrol
you will try. Both engines are considerably stronger than TSCP.
The new Faile is definetely stronger than SSEChess too, but from the list
there is _no_ indication whatsoever about which version of which engine
was used.
The reason why I am complaining is that this is the second rating list I see
here that was obviously run with several engines seriously handicapped.
See above : the only reason I complained about your post was that I found it useless to say "Nonsense" without explaining _why_ and explaining what should be done better .
I think it's made up of results from a lot of different games and time controls. Also, the reason why no version numbers are included is that all games by a program, regardless of version, are counted under one entry.
So Faile 0.6 to Faile 1.4 are one entry. This could explain why some of the recently upgraded engines, whose newer versions are much stronger, show little progress.
I have no problems with such an approach, since usually the differences in strength from version to version are not great. But I don't like mixing games with very different time controls, e.g. G/5 with G/30.
Green Light Chess's poor rating probably results from playing games at G/5 etc. GLC can't play that.

Hmm, but he has a relatively fast AMD K2 550 as well. So there is even more mixing.

See above. But Sjeng is "supposed" to be better than TSCP. I don't know, though; the order of the top few engines looks reasonable to me (some may think Gromit3 and SOS, both with large hash requirements, are under-rated, but that's a small thing, and GLC of course).
It's the order of the lowest-ranked engines that looks a little weird, even to someone like me who never tests such engines. But if I wanted to quibble, I could say the same for the top 12 engines.

But from what I have seen, the weaker engines have lots of problems with rule interpretation, timing and other bugs, such that they can sometimes lose in winning positions, make illegal moves, etc. Depending on how you handle such things, results can get skewed a lot.
Also, notice that Dann Corbit himself stated that he has little idea who will win the lower divisions of the BOC runs, which probably means that there is quite a lot of uncertainty there.



Second?
Aaron
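
The uncertainty in the lower half of the list can be made concrete by comparing the error margins of the disputed engines. This is a sketch, assuming the + and - columns are the upper and lower margins of a confidence interval around each rating (the post does not define them); the function names are hypothetical, and the numbers are copied from the list.

```python
from itertools import combinations


def interval(elo, plus, minus):
    """Return (lower, upper) rating bounds from the Elo, + and - columns."""
    return (elo - minus, elo + plus)


def overlaps(a, b):
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# (Elo, +, -) as listed in the original post
entries = {
    "Faile": (1984, 73, 61),
    "Tscp": (1887, 76, 86),
    "Sjeng": (1839, 100, 86),
}

for (name_a, row_a), (name_b, row_b) in combinations(entries.items(), 2):
    a, b = interval(*row_a), interval(*row_b)
    verdict = "overlap" if overlaps(a, b) else "are separated"
    print(f"{name_a} {a} and {name_b} {b} {verdict}")
```

Overlapping intervals do not show that two engines are equally strong; they only mean that this list, by itself, cannot reliably order them.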
 

Re: Errors?

Post by Gian-Carlo Pascutto » 09 Jul 2000, 17:04

Posted by: Gian-Carlo Pascutto at 09 July 2000 18:04:21:
In reply to: Errors?, posted by Aaron at 09 July 2000 17:19:52:
I have no problems with such an approach, since usually the differences in strength from version to version are not great.
See above : the only reason I complained about your post was that I found it
useless to say "Nonsense" without explaining _why_ and explaining what should
be done better .
The reason why I am complaining is that this is the second rating list I see
here that was obviously run with several engines seriously handicapped.
Second?
Not for the top engines maybe, but for the bottom half this can be quite
different.
I remember seeing another 'WinBoard rating list' before that had exactly the
same flaws. So I didn't feel like repeating all of this yet another time.
Gian-Carlo Pascutto
 

Re: Errors?

Post by pete » 09 Jul 2000, 21:00

Posted by: pete at 09 July 2000 22:00:53:
In reply to: Errors?, posted by Aaron at 09 July 2000 17:19:52:
Hi Aaron,
It is difficult to answer your message, as you created a fascinating patchwork out of my post and Gian-Carlo's in your answer :-)
So briefly:
a.) I don't think anything you said is wrong. Rating lists including various versions sure can be of a lot of interest.
b.) When I was talking about which results interest _me_ most, this was only my personal opinion; in fact this was what I wanted to point out: interests differ between people (I am very picky, for example); when you see a list or a tournament that is set up in a way which is of limited interest to you, so be it.
c.) I think if someone takes much time to do testing and you think something he does is dead wrong, it is better to inform the person in a constructive way, so he can think about his decisions again and decide whether he wants to make changes or not. I am under the impression that _all_ the people running tournaments or doing similar tests are doing the best they know and are able to.
In fact I think you and Gian-Carlo agree here anyway :-)
Have a nice day.
pete
pete
 

