Review Scores are Shit

Liz Finnegan Legacy Author

Published: Jun 3, 2016 09:15 pm

The video game industry houses some of the most innovative, impactful content our generation has ever seen. Each individual creation contributes to a field that consistently marries the best of technology and art with the hopes of evoking emotion – be it through the story or the gameplay, be that emotion sadness or amusement, it is the ultimate goal of creators to have their individual product mean something. To make its mark on the industry. But who, and what, determines which games are worthy of leaving such a mark?

One of my jobs here at The Escapist is to review video games. This is, without any doubt, one of my favorite “obligations.” In order to be a responsible adult, in order to pay my bills and buy my children Christmas presents and be able to afford vacations, I have to play video games and share my opinions on them with the world. There are many people who hate their jobs, who lament going to work every day. I’m hardly one of them. However, as much as I love this particular function of my job, there is one thing that I cannot stand – the moment when I have to actually assign a score to the game I’m discussing. The moment where I determine, on a scale of 1-to-5, the exact worth of someone’s sweat, work, dreams, and achievements. The moment that I, as one of a select few arbiters of numeric-based value, coldly grade the work which was slaved over, which people emotionally killed themselves to create. Why should my opinion matter more than those of the players, the gamers, the non-scorers?

Easy. It shouldn’t.

My opinion is valuable to me, and it may very well be interesting to you. I may tell you something you hope to know. Maybe you’re interested in how, in my personal assessment, the story flows, or how the game runs, or whether or not there are any technical issues that drastically impair one’s ability to enjoy the game on a specific platform. Outside of a review I may expand on my opinion in a more appropriate setting (you know, one labeled “opinion”), where I can delve into some of the specifics of what I liked, or disliked, enough to dedicate time to penning an entire separate article about it. But with reviews, there’s always that damned number. After writing 1000 words or more of my personal opinion of a specific game, I then assign a number – as if there’s any “correct” or “incorrect” way of creating such things.

Teachers will tell you if you spelled a word incorrectly, if your grammar is poor, or if you failed at correctly finding the value of x. These are absolutes – facts determined by wise people and often memorized by less wise people. There is a “right” answer, and thus, a “wrong” answer. An appropriate grade is then determined, based on an absolute scale: the number of correct answers out of the number of overall questions. If you take the very same test in any classroom across the United States, you will receive the same grade based on your performance. The teacher’s personal opinions should not factor into the grade you receive – and when this does happen, it makes national news and often ends with some sort of official statement or disciplinary action against the aforementioned teacher. That is hardly the case in this industry.

Few publications grade video games using similar criteria. For example, Kill Screen focuses on “the interaction between games and culture,” actively avoiding the word “gameplay,” per the publication’s review policy. This is neither right nor wrong, as different people want different things out of their reviews. Here at The Escapist, we tend to focus a bit more on gameplay than, say, the presumed mental state of the protagonist. Again, this makes us neither right nor wrong. It does, however, lead to a significant discrepancy in review scores across publications.

If you follow reviews in this industry at all, it’s more likely than not that you’ve come across the word “formulaic.” It’s a pretentious word, one that serves the goal of showing the reviewer as a scholar rather than of informing readers of the merits of the game itself. Formulaic. Any franchise using similar mechanics across multiple installments is guilty of being “formulaic.” It is nearly always used as a negative – but why? If it isn’t broken, must it be fixed? This is a matter of taste – some find solace in the knowledge that they can pick up, say, Uncharted 4, and easily jump into the game, as they have played previous installments. They know they’ll be scaling mountains and killing enemies and finding treasure and repeatedly almost dying in some really amazing cutscenes. And that’s okay. Should a game’s quality be measured by the past or present? By what has come before, or by what sits in front of you at the moment? There’s no right answer – but it is an area in which everyone differs in their opinion. Again, we are likely to see a review score discrepancy.

OpenCritic recently introduced a new feature, the “Mighty Man” feature. A representative stated that gamers “have different expectations for the same general score ranges.” For example, many will view an 89 as an amazing game, while others will consider it “not that good” because it isn’t a 90. The new update breaks games down into tiers: Mighty, with an average score of 85 or higher; Strong, with an average score of 75 to 84; Fair, with an average score of 70 to 74; and Weak, with an average score of 69 or below. One would assume that, on a scale of 1 to 100, something “Fair” would fall roughly in the middle. Neither strong nor weak. The middle. And yet anything below a 70 is often deemed “poor,” despite falling well above the middle of the scoring scale, because many look at it like they do a school grade. Others do not. With so many different publications utilizing vastly different criteria for determining the score of a game, how does the process of scoring, and aggregating scores, truly benefit anyone? Does an average score have any worth?

At the start of this article, I spoke of video games as a marriage of technology and art. They are software and stories, code and emotion, framerates and discovery. To grade the technology is simple: how does the game itself operate? How does the character move? Are there any glitches? Do the controls make sense? Are you able to connect easily, or are you frequently dropped from the game? To discuss art, though, is significantly less objective and comes down to a matter of personal taste. Kill Screen likes to discuss the cultural influences that games have – and that’s fine. But when those scores are added alongside scores from publications that focus more on gameplay and technical mechanics, is the average score of a game appropriately represented? Everyone’s scale is different, therefore any aggregation is – excuse my language – absolute bullshit.

Some people will spend more time grading a game for what they wish it was than for what it actually is. Missing a health bar, for example, is an obvious omission that negatively impacts a player’s ability to strategize and fully enjoy the game. Taking issue with a female character wearing a bikini, however, comes down to personal taste on how the reviewer wants the characters to look. Is it right to deduct points based on your presumption of how characters should be portrayed, instead of how well the portrayals chosen by the creators are executed? Is it wrong? The answer will differ depending on who you ask and that, my friends, is a real problem when scores are involved.

It’s virtually impossible to remove all bias from reviewing. I tend to favor “retro-inspired” indie titles, and have to force myself to slow down and thoroughly analyze my opinion on such games. Am I forgiving a multitude of technical sins due to my appreciation of something artistic? Am I, perhaps, being too generous with the score that I’ve assigned because, despite how we score games here, many people view 3 out of 5 stars as poor? I know I can be fair in my overall assessment, but am I, perhaps, too charitable with score assignments because of the perceived quality based off a stupid number? I wish I had an honest answer to that question.

Aggregate sites that compile critic reviews rarely capture the value of a game accurately, particularly within several days, weeks, or even months of that game’s release. Once upon a time, you would purchase a cartridge and that was it – the game was good or bad, the end. Review scores today rarely reflect the dynamic nature of games – ones that add new content of varying quality, or ones with a major flaw that is patched shortly after release. The scores on The Escapist, and the majority of other websites, reflect the opinion of an individual regarding the quality of a game at the time of release. A game could grow stronger – or weaker – over time, and yet such cases are rarely re-assessed to reflect those changes. To do so, we would have to dedicate a significant amount of time to replaying hundreds of games to see if certain issues have been remedied. It’s not reasonable, but to not do so is also a bit unfair – to the creators and to the consumers alike.

It’s my job to inform, to editorialize, to critique, and I absolutely love it. But I hate – hate – assigning scores to reviews. Because, really, does my little number matter to you at all? It really shouldn’t.