The Hard Problem



Most videogames follow Elvis’s dictum to the extreme: “A little less conversation, a little more action.” With the notable exception of Bioware’s games and their branching-dialogue brethren, the player’s ability to relate to other characters in the game is limited to the weapon he’s carrying. Talk may be cheap, but conversations are hard.

Why? The first problem is content creation. Even assuming you take the Bioware style of branching dialogue trees as a given, it’s a lot of time and work to write and script all of those interactions.

A second problem is that even after all this time, we haven’t found anything better than what Bioware has done. Earlier approaches, such as the full-text conversations of Infocom/Eliza games or the hot/cold mood responses of the first X-Files videogame, haven’t really caught on.

A third problem is that many players don’t care. They want to skip through Bioware’s dialogue trees as quickly as possible and “get on with the game.” They don’t really see all that talking as gameplay, and dialogue fatigue is a common complaint about this approach.

So it’s time to throw all that out and reconsider: what do we want to accomplish with conversations in games, how can we make them more interesting to more players, and is there an approach that would improve on what Bioware has done?

Why Talk in Games?
For games that include some kind of interactive conversations, there are generally two purposes: to express your character and to present the story. It’s these purposes that have led us to the kinds of good/evil responses and expository dialogue that characterize the Bioware approach.

Games are about solving problems: how to fit those Tetris blocks together, how to take out a Covenant squad in Halo without exhausting your energy shield, how to manipulate time ingeniously in Braid. Our toolkit of player actions is informed by those problems. In Tetris we have basic movement controls to guide each block. In Halo we have a large number of buttons and joysticks to give us tactical breadth. In Braid we have a joystick, a jump button, and a succession of time-control abilities.

Even in Bioware games, conversations are represented by just two controls: a button to initiate a conversation (which usually also opens doors and interacts with objects such as switches) and the ability to select one conversational response from a list provided by the game. Moreover, the player’s ability to use that interaction button is greatly restricted. You can rarely interact in combat, for example, and typically a given character’s conversation is limited to quests. Once that character’s quests are complete, they have nothing more to say.

In short, there is no arbitrary ability to use conversation as a universal part of the player’s toolkit for playing a game. Talking is never on par with shooting. That’s a glaring omission in our player toolkit and it’s time we fixed it.

I’ve got one example. A decade ago, Sierra released an innovative PC squad shooter called SWAT 3: Close Quarters Battle. (Its big innovation was that it did realtime squad command very, very well; Rainbow Six – then circa Rogue Spear – hadn’t made that leap yet.) In addition to the now-usual tactical shooter controls, SWAT 3 added something else: a Verbal Command button. When you pushed it, your character would bark out one of several contextual phrases including “Drop your weapon!”, “Hands in the air!”, “Get down on the ground!” and so on. You could hit this button whenever and as frequently as you wanted.


Amazingly, it worked. Depending on the enemy’s morale, you could get a bad guy to drop his gun, put his hands up, and kneel on the floor. Then you could press another button to handcuff him and summon backup to escort the guy out of the situation. Whenever I encountered an enemy and I had cover, I’d bark orders at him. If he didn’t comply, I’d open fire. After a few shots, or if I downed one of his partners, I’d give him another order, and sometimes then he’d comply. If I failed to handcuff him soon enough, though, he might get back up and grab his gun.

This was great. I was actually able to get bad guys to surrender instead of just shooting them. It was incredibly satisfying and also benefitted me tactically, since I could take enemies out of the picture without exposing myself or my teammates to danger. But this approach didn’t catch on. I haven’t seen this kind of verbal intimidation mechanic elsewhere.

However, that feature had its drawbacks. It did not express character, nor did it present story. Yet this always-available form of talking had a lot of potential.

Now I think we can answer the question: why talk in games? Because just like dual-wielding and leaning around a corner and every other control we’ve added over the years, talking can expand our player toolkit and give players more ways to solve problems.

How Can This Be Interesting?
I think SWAT 3 shows the way here, because the best reason to talk in a game is probably persuasion/coercion. As a player in a game trying to solve a problem, you need to marshal every resource the game provides – and other characters are definitely resources. When we add the ability to arbitrarily persuade or coerce other characters into doing useful things, you can solve more problems in more ways.

We see the persuasion/coercion model in Bioware games, where many quests require you to convince someone to agree to something. But outside of quests, the model doesn’t exist.

Here are some examples where adding talking as a core control could improve gameplay:

Action Games: This one’s easy. Halo, for example, already does a nice job with morale for the Grunts, minor enemies who will flee if they suffer too many losses. Sometimes their incidental dialogue even addresses their fear of you, the legendary human warrior who has killed so many of their kind. Think of the action movie trope where the tough hero faces the cowardly no-name thugs and sends them running away with a growled threat. A more interesting example would be persuading frightened civilians to do what you want: take cover and hide, for example, or follow you, or give you a first aid pack.

Fighting Games: With all the moves and combos in fighting games, it’s surprising there aren’t controls for verbal threats and yells. Adding verbal elements to attack combos would be great, so that a flying kick with a “HII-YAH!” would actually do more damage, and a roar of defiance could help break a hold. A single button for verbals with contextual phrases would be a blast and would definitely help to express character. It could even express story: imagine that when you press the button while across the room from your opponent, the contextual phrase is, “I will avenge my father’s death!” or “How dare you attack our peaceful village!”

RPGs: Haggle with shopkeepers. Intimidate enemies in combat. Disclose plot points you’ve recently learned to persuade reluctant allies (“Soylent Green is people!”) or make your foes panic (“I am the Kwisatz Haderach!”). If nothing else, your character can make a quip or deliver a catchphrase appropriate to the context purely for entertainment or to put a fresh spin on a common situation. (If you’re playing Uncharted, wouldn’t you like a “Be a Smartass” button so Drake could crack wise when you felt like it?) In all cases, talking is how you work your will on those you meet.


How Do We Do It Better?
The Bioware approach can only be as rich as the development work put into plotting out any given conversation. Each conversation and all of its possible options have to be fully scripted.

But in the kinds of genres games typically explore, a lot of dialogue doesn’t need to be that rich. Tough-guy quips, snarled threats, or urgent requests for aid are commonplace and often don’t even need to be contextualized.

Let’s not try to simulate a conversation or pretend we’re going to return to the full-text days of Infocom/Eliza. Instead, think about it from a problem-solving, dramatic approach.

Each NPC is a problem. When you encounter them, they are not doing what you want them to be doing. Your goal is to persuade/coerce them into compliance, with a risk that they may instead defy you. You decide whether the risk is worth the reward.

This suggests that a given NPC has three states: default, compliance, and defiance. We can imagine these as points on a continuum, with “default” in the middle. Your job is to move the NPC from default to compliance. (In fact, I ripped that off from EA’s first Godfather game, where you could physically intimidate people into paying you protection money. The more you pushed them, the more money they’d give you, but if you pushed them too far they’d flip out and attack. It was a fun, quick, and universal mechanic.)
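To make the continuum concrete, here is a minimal sketch in Python. The numeric range, the thresholds, and the names `Disposition`, `COMPLY_AT`, and `DEFY_AT` are all my own assumptions for illustration; the only thing taken from the idea above is that default sits in the middle with compliance and defiance at either end.

```python
from dataclasses import dataclass

# Hypothetical thresholds on an assumed -100..100 scale.
COMPLY_AT = 50
DEFY_AT = -50


@dataclass
class Disposition:
    """An NPC's position on the defiance/default/compliance continuum."""
    meter: int = 0  # 0 is "default"; positive leans toward compliance

    @property
    def state(self) -> str:
        if self.meter >= COMPLY_AT:
            return "compliance"
        if self.meter <= DEFY_AT:
            return "defiance"
        return "default"

    def push(self, amount: int) -> str:
        """Shift the meter, clamp it to [-100, 100], and return the new state."""
        self.meter = max(-100, min(100, self.meter + amount))
        return self.state


npc = Disposition()
print(npc.push(30))  # default
print(npc.push(30))  # compliance
```

A single integer is probably the simplest thing that could work here; a real game might want decay over time or a point of no return, but those are design choices this sketch leaves open.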

So now let’s rip off Fable II.

In Fable II, you have a large library of physical gestures you can make that are divided into different categories, such as Social, Humorous, and Romantic. If an NPC likes humor, their opinion of you improves if you dance like a chicken. If they find you attractive, they are turned on by displays of strength or flirtation.

Let’s take from Fable II the idea that you have groupings of expression forms. And let’s say there are only two: Persuasion and Coercion. Persuasion generally means you’re appealing to a sense of duty or morality or responsibility. Coercion generally means you’re using intimidation or bribery. (Please note that I don’t consider these as good/evil mappings. Both approaches should have their uses.)

Within each of those two forms, let’s say there are ten different expressions. Unlike the gestures of Fable II, these are actual dialogue. On screen, they’re presented as Bribe, Threaten, Implore, etc. Each one has a set of stock phrases that are randomly chosen each time, although context can override these with custom phrases.
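The stock-phrase lookup with a context override might be sketched like this. The expression labels come from the paragraph above, but every phrase in the table, along with the names `STOCK_PHRASES` and `pick_phrase`, is a placeholder I made up for illustration.

```python
import random

# Placeholder phrase tables keyed by expression label (all phrases invented).
STOCK_PHRASES = {
    "Bribe": ["There's money in it for you.", "Name your price."],
    "Threaten": ["Walk away while you still can!", "Don't make me ask twice!"],
    "Implore": ["Please, I need your help!", "You're the only one who can do this."],
}


def pick_phrase(expression, context_overrides=None, rng=random):
    """Return the context's custom phrase if it has one, else a random stock phrase."""
    if context_overrides and expression in context_overrides:
        return context_overrides[expression]
    return rng.choice(STOCK_PHRASES[expression])


# Ordinary encounter: any stock Threaten line will do.
print(pick_phrase("Threaten"))

# Scripted scene: the level supplies its own line for Threaten only.
scene = {"Threaten": "Let the hostage go, now!"}
print(pick_phrase("Threaten", scene))
```

The point of the override table is that writers only have to touch the scenes that matter; everything else falls back to the stock lines.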

When you encounter an NPC, that NPC may have one or several different interactions available. Typical interactions available with incidental NPCs, such as a farmer in his field, might include Information, Combat Assistance, and Healing. These basic interactions would be duplicated on many NPCs and would be available in the appropriate context.

You choose an interaction type and then begin choosing expressions. Not all expressions are available for all NPCs. Behind the scenes, the NPC corresponds to one of many personality types. Each type is defined by a list of allowable expressions paired with modifiers to the NPC’s state on the compliance/default/defiance continuum. In addition, each type has a bias modifier for each type of interaction. Most personality types, for example, may be easily shifted towards Compliance when the interaction type is Information, but are easily shifted towards Defiance when the interaction type is Combat Assistance.
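Here is one way the personality data could look, as a rough Python sketch. I'm assuming the expression modifier and the interaction bias simply add together, which is only one of several reasonable ways to combine them; the `Personality` class, the "Villager" type, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Personality:
    """A personality type: allowed expressions with modifiers, plus per-interaction bias."""
    name: str
    expression_mods: dict   # allowed expression -> shift on the continuum
    interaction_bias: dict  # interaction type -> extra shift (assumed additive)

    def shift_for(self, interaction, expression):
        """Return the total shift, or None if this type does not allow the expression."""
        if expression not in self.expression_mods:
            return None
        bias = self.interaction_bias.get(interaction, 0)
        return self.expression_mods[expression] + bias


# Hypothetical type: easy to move on Information, resistant on Combat Assistance.
VILLAGER = Personality(
    name="Villager",
    expression_mods={"Implore": 20, "Threaten": -25},
    interaction_bias={"Information": 10, "Combat Assistance": -15},
)

print(VILLAGER.shift_for("Information", "Implore"))        # 30
print(VILLAGER.shift_for("Combat Assistance", "Implore"))  # 5
print(VILLAGER.shift_for("Information", "Bribe"))          # None
```

Positive numbers push toward compliance and negative ones toward defiance, so the same Implore lands much harder when you are only asking for information.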

A really simple type, such as a Town Guard, may only have two interactions: Information and Combat Assistance. Because of his role, he is biased towards compliance in both. He’s not meant to be a very interesting character so he probably only has a couple of allowable expression types.


That helpful farmer has a few more allowed expression types. He will generally respond well to persuasion appeals for information, but coercion attempts will turn him off. He’s not going to be compliant on Combat Assistance unless you either appeal to his sense of self-preservation (because the enemies are visible) or threaten him: help you or die.

The gangster with the hostage has even more allowed expression types because he represents a very specific problem for the player to solve. His type is biased towards defiance for all interactions. However, there are a few key persuasion expressions that have big compliance modifiers, like self-preservation, and it may be that his personality type is very susceptible to bribery. Your challenge in this situation is to test different expressions on him and figure out what works before you’ve pushed him into defiance.

Imagine this scenario. A gang of thugs shows up to cause trouble. You duck behind a dumpster and exchange gunfire. While reloading, you bark out a coercion attempt at all of them using an Intimidation expression. You hear your character yell, “You guys messed with the wrong guy!” (A stock phrase, not anything custom-scripted for this scene.) Interaction meters appear above their heads and most of them shift strongly towards defiance; they yell back stuff like “Screw you!”. But one thug doesn’t: his meter slides towards compliance and he replies with something like, “Nuh-nuh-no we didn’t! We’ll get you, man!” He’s a coward. The next time you shoot one of the thugs, you initiate an interaction directly with the coward and make a persuasion attempt with a self-preservation expression. “It’s not too late to get out of here before you get killed!” The thug gets up and runs away and you’ve just helped even the odds.
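That scenario can be walked through in a few lines of Python. Treat it as a sketch under stated assumptions: the two thug types, their numbers, the `FLEE_AT` threshold, and the `broadcast` helper are all invented here to show the flow, not a description of any shipped system.

```python
# Hypothetical shifts per thug personality; positive leans toward compliance.
THUG_TYPES = {
    "hardened": {"Intimidation": -30, "Self-Preservation": 5},
    "coward": {"Intimidation": 20, "Self-Preservation": 40},
}
FLEE_AT = 50  # assumed threshold at which a thug complies by running away


def broadcast(meters, types, expression):
    """Apply one expression to every NPC in earshot; return the new meters."""
    return {
        npc: meter + THUG_TYPES[types[npc]].get(expression, 0)
        for npc, meter in meters.items()
    }


types = {"thug_a": "hardened", "thug_b": "hardened", "thug_c": "coward"}
meters = {npc: 0 for npc in types}

# Shout a threat at the whole gang and see whose meter moves your way.
meters = broadcast(meters, types, "Intimidation")
wavering = [npc for npc, meter in meters.items() if meter > 0]
print(wavering)  # ['thug_c']

# Follow up on the coward alone with a self-preservation appeal.
target = wavering[0]
meters[target] += THUG_TYPES[types[target]]["Self-Preservation"]
print(meters[target] >= FLEE_AT)  # True
```

The broadcast is what makes this fun: one cheap shout doubles as reconnaissance, and the meters over their heads tell you who is worth talking to next.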

Is This Practical?
Mechanically, yes. You build your lists of interaction types, expressions, and personalities and the system does its thing. That’s one of the benefits of a universal mechanic over custom interactions.

This is, however, a big new feature. It’s a long way from free. I think it would do very well situated into a personality-driven action game like Uncharted or Splinter Cell – in fact, Splinter Cell has had a fairly universal coercion mechanic to make prisoners talk, although there’s not much to it. In these games almost all the conversation content can be universal rather than custom-contextual. In this approach, conversation is a tactical tool to help you defeat enemies, manage allies and neutrals, and occasionally gain information, rather than a vast custom-context conversation system.

In a Bioware-style game, this approach could cover the tactical scenarios as well as ordinary tasks like shopkeeper haggling. For quest conversations, you may still want to take the fully scripted dialogue-tree approach to better deliver characterization and exposition.

Is It Fun?
Well, I think so. I would love to crack wise to random NPCs and see how they’d react. I’d feel more immersed in the world and I would have a way to interact with it besides shooting guns. I think the more mechanical, gamey approach would make conversations more interesting and accessible to the text-averse. And once a couple of games adopted this feature and other designers started treating it as part of the player toolkit, we’d start seeing some really innovative uses for it.

I for one would rather have a little more conversation with my action . . .

John Scott Tynes was born in Memphis and while he has been to Graceland, he has also been to Graceland Too. He has never, however, had a peanut butter and banana sandwich.
