Not as good as you think - Guild rankings reconsidered

With two documentaries out or upcoming on the subject of end-game raiding, I think now would be a good time to present some work on an alternative guild ranking I developed this summer with Ted Castronova's guidance and support. If you're a WoW raider, then no doubt you follow one or both sets of rankings on guildox and wowprogress.  These ranks are based on a somewhat esoteric formula, but the world first kill of the final boss usually puts your guild on top. But if we're to make team-gaming a viable arena for economics research, we need to learn how guilds do what they do, and how well they do it.

To get at this information, one needs more than a "race to the top" ranking; one needs a model of boss kill production, and a technique to measure efficiency. There is a whole literature in economics that is concerned with measuring efficiency in production. The methodology I used to develop this ranking for guilds is quite involved, even for the expert (which I do not claim to be). Since this is a blog, I hope you'll pardon my decision to present results first for the TL;DR crowd. Those who are interested can then go on to read about the methods I used several paragraphs down.

I collected data on unique boss kills, the time the guild spent raiding, and the gear of each guild member. The guilds in my sample were the top 39 US 25-man guilds as measured by their wowprogress ranking in Tier 11. The data were gathered between July 12th and July 18th. The first set of estimates concerns the importance of a few key variables for progression. The values directly across from the variable names are the estimates, and the values in the parentheses are standard errors of those estimates. To understand the value next to Raiding Man-Hours, read it as "A one percent increase in the man-hours the guild spends raiding leads to a 0.658% increase in boss kills." For gear level, the implication is that a one percent increase in the average item level of raiders' gear leads to a 0.342% increase in boss kills for a guild. In general, guilds on PvP servers have 20% fewer kills than their competitors on PvE servers, holding everything else constant. I'll explore this last result later on.
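For readers who want to see the arithmetic behind that reading, here is a minimal Python sketch. It uses only the two elasticities quoted above; the function name and the 10% scenario are illustrative assumptions, not part of the original analysis.

```python
# Elasticities quoted in the post (log-log model).
ELASTICITY_HOURS = 0.658  # raiding man-hours
ELASTICITY_GEAR = 0.342   # average item level


def kill_multiplier(hours_ratio, gear_ratio):
    """Predicted ratio of boss kills when inputs change by the given ratios.

    In a log-log model, scaling an input by r scales output by r ** elasticity.
    """
    return hours_ratio ** ELASTICITY_HOURS * gear_ratio ** ELASTICITY_GEAR


# Hypothetical scenario: 10% more raiding time with unchanged gear.
pct_change = (kill_multiplier(1.10, 1.0) - 1.0) * 100.0
print(round(pct_change, 2))
```

Under these numbers a 10% increase in raiding time predicts roughly 6.5% more boss kills, and because the two elasticities sum to one, doubling both inputs doubles predicted kills.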

Ranking of 39 top-end raiding guilds, according to estimated level of technical efficiency: 7/18/2011

What I really want to talk about is the "Implied TE", where TE stands for technical efficiency. The value displayed in this first table is the average technical efficiency. It means that, on average, guilds are 85.7% as efficient as they could be, all other things equal.

In the second table, you'll find the ranking of the 39 US guilds considered in this study according to their level of technical efficiency. At first, there's nothing obviously different here from what you might see on wowprogress for US guilds. But if you look closer, you'll see that the US's top ranked guilds are nowhere near the most efficient. Specifically, consider the positions of vodka, Premonition, and Blood Legion, respectively the 1st, 2nd, and 3rd ranked US guilds on wowprogress. It turns out that when you really dig into the data, you find that these three guilds, in particular, use a LOT of manpower to get to their goal. In one week, vodka's top 30 contributors spent roughly 700 hours in the Firelands, while Premonition and Blood Legion spent roughly 600 and 550 hours, respectively. Enigma, the guild estimated to be the most efficient, spent about half the time, roughly 330 hours.

If we take these results at face value, what we find is that there are two kinds of top-tier US guilds: those that depend on manpower to brute force their way through raids, and those that finesse their way through. The labor that the less efficient guilds put in to make up the difference is nearly double that required by the most efficient guilds.

So, should you be skeptical of these findings? Absolutely. For one, this was a very small sample of US 25-man guilds, and a snapshot in time, at that. A second, bigger snag in these data is that on July 11th or 12th, videos came out for Majordomo, the sixth boss. During the week under study, each of the four guilds at the top of the second table ranking took out Majordomo. The availability of the video may have significantly reduced the amount of raiding time required to finish him. Consequently, it may be that the efficiency rankings for these guilds are inflated compared to their top-ranked brethren, vodka, Blood Legion, and Premonition. After controlling for this possibility, Blood Legion popped to the top of my ranking, but vodka and Premonition remained in the low-middle of the pack.

A third reason for skepticism is more subtle: It could be that I'm comparing apples to oranges. It's possible that those guilds that can't realistically shoot for world first, and know it, may instead change their strategies in a way that shoots for top 10 in their region (for instance). If that's the case, then there could be some unobserved differences in the members of my sample that I'm simply assuming away. Indeed, the sample I chose was not intended to be representative of anything except for the top 40 25-man US guilds in the previous tier of raiding, so the results should not be thought to apply beyond that population.

Whatever the problems with the analysis, there was one singularly robust finding: that being on a PvP server consistently and significantly reduced progression by roughly 20% (different models gave ranges between 18% and 22%). I have two hypotheses about this finding. First, it could be that guilds on PvP servers aren't focused solely on raiding during progression. Perhaps PvE guilds and their members use non-raiding time to strategize and acquire resources, while PvP guilds spend it ganking n00bs or raiding cities. But I think the more likely reason for the difference is that, during progression, resources are more expensive on PvP servers than PvE servers, on average. That is, if gathering is less efficient due to threats from the opposing faction, progression will, on average, be reduced. I'd love to hear feedback on this finding specifically.

The analysis will continue. I'm gearing up to expand beyond the US upon release of tier 13, and I'll be gathering information at regular intervals rather than just getting a single snapshot. I'll also try combining this information with auction house data, which is now available in vast, regularly updated quantities through the new Blizzard Community API (look it up if you want to do WoW research). If you have any suggestions, comments, or questions, I'd love to hear them!

-------------------------------------------------------------------------------------------------------

Model, data, and methods

I used a stochastic frontier model, or SFM. You can start at the wikipedia page for a basic introduction. If you want to know more (a lot more), go here.

The model of production I used says that the number of boss kills, Y, is a function of labor (L) and gear (K) inputs, and of the type of server (P). Taking logs of Y, K, and L, the model is: ln(Y) = b + a[ln(L)] + (1-a)[ln(K)] + d*P + e. This is a very simple, Cobb-Douglas type production function with constant returns to scale imposed: the exponents on labor and capital are constrained to sum to one. If a server is PvP, then P = 1, otherwise P = 0. I wanted to estimate the values of b, a, and d. The first parameter is a catch-all for unmeasured variation; it has no real meaning. The second parameter, a, answers the question "On average, what is the mixture of raid hours and gear used to produce boss kills?" Note that a is restricted to take on values between 0 and 1, implying that both more labor and more/better gear can increase boss kills, but that each does so at a diminishing rate (without a concomitant increase in the other). The value of d answers the question "On average, if a guild is on a PvP server, by what percentage is its number of boss kills increased or decreased?"
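As a rough illustration of the functional form, the deterministic part of the model can be written as a small Python function. The parameter values below are placeholders chosen for the example (b = 0, and d picked so that the PvP effect is exactly -20%); they are assumptions, not the reported estimates, and the guild inputs are made up.

```python
import math


def log_kills(hours, gear, pvp, b, a, d):
    """Deterministic part of ln(Y) = b + a*ln(L) + (1-a)*ln(K) + d*P."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0 and 1")
    p = 1 if pvp else 0
    return b + a * math.log(hours) + (1.0 - a) * math.log(gear) + d * p


def pvp_percent_effect(d):
    """Exact percentage change in kills implied by the dummy coefficient d."""
    return (math.exp(d) - 1.0) * 100.0


# Hypothetical parameters: only a = 0.658 comes from the post.
b, a, d = 0.0, 0.658, math.log(0.8)

# Two otherwise identical, made-up guilds: 500 raid hours, item level 378.
pve = log_kills(500.0, 378.0, False, b, a, d)
pvp = log_kills(500.0, 378.0, True, b, a, d)
print(math.exp(pvp - pve))    # ratio of predicted kills, PvP over PvE
print(pvp_percent_effect(d))  # the same gap expressed as a percentage
```

The point of the second helper is that a dummy coefficient in a log model converts to a percentage through exp(d) - 1, so a 20% reduction corresponds to d of about -0.223 rather than -0.20.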

The most important part of an SFM is the error term, e, which is decomposed into two randomly distributed values so that e = v - u. v is a regular error term with mean zero, and u is drawn from a positive, half-normal distribution, so that it is always greater than or equal to zero. The idea is that by estimating the distribution of u, and by estimating each guild's position in that distribution, you can get an idea of which guilds follow best practices (u ~ 0), and which do not (u > 0). A simple exponential transformation on the value of u for each guild gives a number between 0 and 1, where 1 is no technical inefficiency and 0 is complete technical inefficiency (output would be zero). These values are what you saw in the second table, above.
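To show what the decomposition looks like in practice, here is a minimal simulation sketch in Python. The standard deviations and the seed are arbitrary assumptions for illustration. In a real estimation u is not observed directly and has to be inferred from each guild's residual, which this sketch does not attempt.

```python
import math
import random


def simulate_composed_error(sigma_v, sigma_u, rng):
    """Draw e = v - u, with v ~ N(0, sigma_v^2) and u half-normal."""
    v = rng.gauss(0.0, sigma_v)
    u = abs(rng.gauss(0.0, sigma_u))  # absolute value of a normal is half-normal
    return v - u, u


def technical_efficiency(u):
    """Map inefficiency u >= 0 to a score in (0, 1]; u = 0 gives exactly 1."""
    return math.exp(-u)


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [simulate_composed_error(0.1, 0.2, rng) for _ in range(1000)]
scores = [technical_efficiency(u) for _, u in draws]
print(min(scores), max(scores))
```

For scale, a guild with u of about 0.154 would score about 0.857 on this measure.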

The data were gathered from July 12th to July 18th, and I recorded the progression of the guild - the total number of bosses in Firelands they had killed - at the end of that period. Thanks to the new Blizzard Community API and some clumsy automation, I was able to take high-frequency snapshots of the gear and location of member-characters of 39 of the top US raiding guilds. (I didn't look at EU because of time constraints; the next phase will include more localization.) If a character was observed in the same place for two successive passes of the scanner, then I assumed they spent the interim in that zone. For each guild, I chose the 30 characters that spent the most time in Firelands and added up the total time they spent there. For those same 30, I obtained values for average equipped item level at several intervals. I took the maximum of those values for each character, and used the mean of those maximum values as a measure of overall gear quality for the guild. These two variables constituted measures of each guild's labor and capital inputs to raiding. (The reported values are based on the top 30 contributors for each guild, but I found that the estimates were generally insensitive to changes in the level of aggregation.) I also recorded whether the guild was on a PvP or PvE server.
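The aggregation described above can be sketched in a few lines of Python. The snapshot layout (time in hours, character name, zone, item level) is a hypothetical format invented for this example; it is not the structure returned by the Blizzard Community API, and this is not the scanner code used for the study.

```python
from collections import defaultdict


def guild_inputs(snapshots, zone="Firelands", top_n=30):
    """Return (labor_hours, mean_max_item_level) for one guild.

    snapshots: iterable of (time_hours, character, zone, item_level) tuples.
    """
    by_char = defaultdict(list)
    for t, name, z, ilvl in snapshots:
        by_char[name].append((t, z, ilvl))

    hours, max_ilvl = {}, {}
    for name, rows in by_char.items():
        rows.sort(key=lambda r: r[0])  # order scans by time
        total = 0.0
        # Credit the interval only when two successive scans agree on the zone.
        for (t0, z0, _), (t1, z1, _) in zip(rows, rows[1:]):
            if z0 == zone and z1 == zone:
                total += t1 - t0
        hours[name] = total
        max_ilvl[name] = max(ilvl for _, _, ilvl in rows)

    top = sorted(hours, key=hours.get, reverse=True)[:top_n]
    if not top:
        raise ValueError("no snapshots supplied")
    labor = sum(hours[n] for n in top)
    capital = sum(max_ilvl[n] for n in top) / len(top)
    return labor, capital


# Tiny made-up example with two characters scanned every half hour.
sample = [
    (0.0, "A", "Firelands", 370), (0.5, "A", "Firelands", 372),
    (1.0, "A", "Orgrimmar", 372),
    (0.0, "B", "Firelands", 365), (0.5, "B", "Firelands", 365),
    (1.0, "B", "Firelands", 366),
]
print(guild_inputs(sample, top_n=2))
```

Changing top_n is the quickest way to check how sensitive the two inputs are to the level of aggregation.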

Estimates were obtained using the 'frontier' command in Stata, with constraints. The model is a log-log type, which means that the coefficients are elasticities of output; that is, they measure the percentage change in the output for a 1% change in the input. Hence the interpretation proposed above.
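For the curious, the following Python sketch shows the kind of objective such an estimator maximizes: the log-likelihood of the normal/half-normal composed error e = v - u in its standard textbook parameterization. It is not the Stata code used for this post, the function names are my own, and it takes the residuals as given rather than estimating b, a, and d.

```python
import math


def norm_logcdf(z):
    """Log of the standard normal CDF, floored to avoid log(0) in the far tail."""
    p = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return math.log(max(p, 1e-300))


def frontier_loglik(residuals, sigma_v, sigma_u):
    """Log-likelihood of e = v - u, v ~ N(0, sigma_v^2), u half-normal(sigma_u)."""
    sigma = math.sqrt(sigma_v ** 2 + sigma_u ** 2)
    lam = sigma_u / sigma_v
    total = 0.0
    for e in residuals:
        total += (
            0.5 * math.log(2.0 / math.pi)
            - math.log(sigma)
            - 0.5 * (e / sigma) ** 2
            + norm_logcdf(-e * lam / sigma)
        )
    return total


# Residuals below the frontier are more likely than equally sized ones above it.
print(frontier_loglik([-0.5], 1.0, 1.0) > frontier_loglik([0.5], 1.0, 1.0))
```

When sigma_u shrinks toward zero the expression collapses to the ordinary normal log-likelihood, which is a handy sanity check on any implementation.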

I must emphasize that this was a preliminary analysis. There are more than a few reasons to question the model, the methods, the data, and the results. What I reported, however, had enough merit to be made public so that I could hear some ideas from the raiding crowd, who I hope will read this.

To the extent that I can, I'll be happy to provide answers to methodological questions, as well.

Edit: Fixed typos in the methods section.

Not as good as you think - Guild rankings reconsidered by Isaac Knowles, unless otherwise expressly stated, is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

2 Responses to “Not as good as you think - Guild rankings reconsidered”

  1. Is it possible to control for the amount of time spent just roaming around or socializing?

    • The short answer is "no". The idea behind choosing these guilds in particular is that they are competitive and so will want to minimize roaming and socializing within the instance where they produce boss kills. A failure to do so would enter into the inefficiency term.

      Having said that, there is something to be said for time spent outside the instance producing other inputs to raiding, such as flasks. Again, I hope that, by construction, the sample of guilds is such that no guild would EVER not have flasks up. If I'm granted that, then a variable of 'flasks used' should be systematically related to the amount of time spent in the instance by the guild members (since each flask lasts exactly 2 hours). In that case, there will be no differentiating effect of flask construction/use, and so it can be excluded from the production function.

      The one omitted variable that could be extremely important is whether a subset of the guild's members do any sort of custom programming for the guild, like a guild-specific Deadly Boss Mods-type addon to coordinate the fights. That would make a world of difference when comparing production. I'm going to make a strong effort to get self-reported information on this variable from each guild during the second phase.
