Throughout the history
of this site, my intention has been to create a ranking system
that does what most people want it to do - show the current strengths of
teams as accurately as possible. It should be noted before I go further
that no ranking system will ever reach 100% accuracy, as this
would mean it was static and teams could never change position. The
whole idea of a ranking system is to create a dynamic list of teams
whose order changes over time based on the results of the matches they
play.
1) A team's start
rating must be estimated, introducing a great deal of subjectivity
every time a new team enters the ranking.
In recent elo rankings,
for example, Papua New Guinea have been rated higher than Luxembourg.
It is debatable whether Papua New Guinea could hold
France or Germany to
a 3-0 or 4-0 defeat, and equally debatable whether Luxembourg would
lose 9-0 or 10-0 against Australia. I would certainly think Luxembourg
belonged above Papua New Guinea on current strength, but the cumulative
effect of not rewarding teams for 'good' defeats, and of rewarding teams
for poor wins against lowly opposition, means that teams like Luxembourg
will always be held to ransom by the confederation to which they belong.
If Luxembourg played in the OFC, for example, they would have a much
higher elo rating. This kind of situation led me to seek another way to
rank the teams.
Without discarding
the elo system altogether (it has its merits), I now use a modified
version of it on the site. There are three main differences between
The Roon Ba elo formula and the standard elo formula. First, matches
are divided into only 2 types - competitive (for regional, continental
and world tournaments) and non-competitive (for friendly matches and
friendly tournaments). A competitive match has twice the weight of
a non-competitive match. Second, the formula has been altered
slightly to allow teams to gain points for losing 'well' and to lose
points for winning 'badly'. This ensures that the rating change
reflects what the result of the match actually shows, namely that the
two teams are closer in strength than their ratings suggested. The
third difference, and perhaps the part that was most difficult to
implement, is the automatic assignment of a start rating for each
team.

Only one team is given a start rating. This can be any number; I have
generally chosen around 2500 points so as to avoid negative points
totals at the bottom of the ranking. From this one team's start rating,
all other start ratings can be calculated. As a simple example, say
Team A is given a start rating of 2500 points and, in their first match,
they draw 1-1 with Team B at a neutral venue. Team B's start rating will
thus be 2500, and this rating will be used in the calculation for their
next match. If Team B had instead lost 1-0 to Team A, they would have a
lower start rating (let's say 2400). In this way, there is hardly any
subjective input at all.

The elo system is applied to the entire database of results (from 1872
onwards), and the calculation is repeated for 101 iterations to increase
the accuracy.

One further modification is that a team's start rating is dynamic, not
fixed, until that team has played 10 matches. The elo formula for that
team does not start until after its 10th game. Until then, the team
receives a start rating from each game it plays, and its rating is the
sum of these start ratings divided by the number of games played so far.
This allows a team to settle into a more realistic position before its
rating is fixed.
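
The averaging of start ratings over a team's first 10 matches can be
sketched in Python as follows. This is only an illustration, not the
actual Roon Ba formula: the names `NewTeam` and `implied_start_rating`
are hypothetical, the figure of 100 points per goal is an assumption
loosely based on the 2500/2400 example above, and home advantage is
ignored.

```python
from dataclasses import dataclass, field
from typing import List

PROVISIONAL_GAMES = 10    # from the text: the start rating is dynamic for 10 matches
POINTS_PER_GOAL = 100.0   # ASSUMPTION: illustrative value, not the site's real figure


def implied_start_rating(opponent_rating: float, goal_diff: int) -> float:
    """Start rating suggested by a single result at a neutral venue.

    goal_diff is goals scored minus goals conceded by the new team.
    ASSUMPTION: a simple linear offset per goal stands in for the real mapping.
    """
    return opponent_rating + POINTS_PER_GOAL * goal_diff


@dataclass
class NewTeam:
    """Hypothetical container for a team that is still settling into the ranking."""
    name: str
    start_ratings: List[float] = field(default_factory=list)

    @property
    def is_provisional(self) -> bool:
        # True until the team has played its 10th game.
        return len(self.start_ratings) < PROVISIONAL_GAMES

    def record_match(self, opponent_rating: float, goal_diff: int) -> None:
        # Each early game contributes one more start rating to the average.
        if not self.is_provisional:
            raise ValueError("start rating is fixed; the regular elo update applies")
        self.start_ratings.append(implied_start_rating(opponent_rating, goal_diff))

    @property
    def rating(self) -> float:
        # Sum of all start ratings so far, divided by the number of games played.
        if not self.start_ratings:
            raise ValueError("no matches played yet")
        return sum(self.start_ratings) / len(self.start_ratings)


team_b = NewTeam("Team B")
team_b.record_match(2500, 0)    # 1-1 draw with a 2500-rated team
team_b.record_match(2500, -1)   # 1-0 defeat to a 2500-rated team
print(team_b.rating)            # average of the two start ratings
```

Under these assumptions, the draw suggests 2500 and the defeat 2400, so
the team's rating after two games is their average, 2450. Once the 10th
game has been recorded, `is_provisional` becomes false and the rating
would be handed over to the normal elo update.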