The Complexity of Control Schemes

What does it take to craft a “good” control scheme for a video game?

Different people have their own subjective answers to this question.

Throughout my professional life I have heard people claim:

  • “Jump” should always be on the “Xbox A” button.
  • “Interactions” should always be handled by the “Xbox X” button.
  • “Xbox LT” should always be the aim-down sights button.

It’s not only gamers. I’ve also heard game designers approach this topic based on what personally “feels right” to them. But the topic is far too complex to be approached in this manner.

In this article, I will detail the different types of techniques game designers can use to design control schemes. Let’s go through them one by one…

  • Physical Limits
  • Action Types
  • Immersion
  • Accessibility
  • Relatability

NOTE:

I will only refer to the Xbox controller layout to avoid any nomenclature confusion between other platform controllers.

This article will also focus more on console examples than PC or mobile, but the guidelines stated here should be helpful for all control devices available on the market.

Physical Limits

Throughout history, video game control schemes have grown by taking reference from each other.

Running, jumping, and climbing controls were all very similar to each other. But Mirror’s Edge, one of the first popular parkour games, had a unique problem to solve.

Being a first-person parkour game, camera control played a major part in the player navigation experience. It is hard to take your thumb off the right analog stick and press the commonly known jump button “Xbox A” to jump around.

Mirror’s Edge had an outlandishly unique solution for its time…

They changed their jump button from the traditionally known “Xbox A” to the “Xbox LB” button. With this change, players can simultaneously control their character AND camera while jumping.

Dying Light, the spiritual successor to Mirror’s Edge, followed its predecessor’s example and shifted the jump button to “Xbox RB”, letting players jump whenever they want while still using their thumbs to move the camera freely.
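
The reasoning behind the Mirror’s Edge and Dying Light change can be sketched as a small check: two inputs clash when the same finger has to operate both. The finger assignments below are my assumption of a standard grip, and the names `FINGER_FOR_INPUT` and `can_use_together` are made up for illustration. This is not how either game implements its controls.

```python
# Hypothetical mapping from gamepad inputs to the finger that usually operates
# them in a standard grip. These assignments are assumptions for illustration.
FINGER_FOR_INPUT = {
    "Left Stick": "left thumb",
    "Right Stick": "right thumb",
    "Xbox A": "right thumb",
    "Xbox B": "right thumb",
    "Xbox X": "right thumb",
    "Xbox Y": "right thumb",
    "Xbox LB": "left index",
    "Xbox RB": "right index",
    "Xbox LT": "left index",
    "Xbox RT": "right index",
}


def can_use_together(*inputs: str) -> bool:
    """Return True if no two of the given inputs need the same finger."""
    fingers = [FINGER_FOR_INPUT[name] for name in inputs]
    return len(fingers) == len(set(fingers))


# Jump on "Xbox A" competes with the camera for the right thumb...
print(can_use_together("Left Stick", "Right Stick", "Xbox A"))   # False
# ...while jump on "Xbox LB" or "Xbox RB" leaves both thumbs on the sticks.
print(can_use_together("Left Stick", "Right Stick", "Xbox LB"))  # True
print(can_use_together("Left Stick", "Right Stick", "Xbox RB"))  # True
```

A check like this only flags finger clashes in one assumed grip. Players who hold the controller differently (a claw hold, for example) would need a different table.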

HAND LIMITATIONS:

Not all fingers on our hands are designed to be strong, reactionary, or precise. 

  • Thumb and index fingers are considered to be the strongest, most reactionary, and most precise.
  • Middle fingers are considered to be reactionary, but not so precise.
  • And finally, the ring and pinky are considered to be weak and imprecise.

This is just how able human bodies function. And we have to understand these physical limitations before we design control schemes for a given control device.

CONTROL DEVICE ERGONOMICS:

There are many different types of control devices available on the market to play video games: classic platform controllers (Xbox / PS / Nintendo), PC (Keyboard & Mouse), Mobile (Touch Pads), and even third-party controllers like a Fight Stick.

If we apply the hand limitations to how a player can hold a controller, we get:

  • Controllers depend on thumb and index fingers for most, if not all, main interactions of the game.
  • Secondary actions are usually led by the middle finger.
  • Background actions are done by adjusting the arm. 

For example, a button such as “Space Bar” on a keyboard is easier to press with a thumb because of the size and the distance of the button to the finger, as compared to a button such as “M”, which requires players to readjust their arm to hit the button.

If we apply the hand limitations to how a player can use a keyboard & mouse, we get:

  • Main interactions depend on index and middle finger.
  • Secondary actions are usually left for thumb, ring, and pinky finger. 
  • Background actions are to be done by adjusting arm.

And if we apply the hand limitations to how a player can hold a touch pad, we get:

  • Main interactions depend on the thumbs.
  • Different mobile games depend on different UX options to enable more types of interactions with thumbs, like gestures, or on-screen buttons (because you can’t really use any other fingers).

  • Some devices will be easier to deal with, like a keyboard, because of the abundance of keys easily available at one’s fingertips.
  • But some others might not, like a touch pad, where almost all interactions are based on thumbs exclusively.

Variables:

The control device ergonomic restrictions specified are not definitive. These assumptions are based on how we developers think players normally hold their control devices.

Players can also shift how they hold control devices, which can improve how secondary or background actions work. But it can throw off our perception of how we can specifically design controls for them.

There are also third-party controllers, which make it even harder for developers to predict how people with different physical skillsets and preferences like to play.

But these variations are not the norm. I usually account for these variations through accessibility options.

Action Types

Now we have seen many mentions of:

  • Main Action
  • Secondary Action
  • Background Action

But what are they? How do we define them?

We can look at them from two perspectives.

All gameplay actions need to be categorized according to their “frequency of usage” and “need”.

Frequency of Usage:

When dodge is the most frequent action players trigger, and they are expected to dodge in a split second, the control also needs to be positioned accordingly.

PlatinumGames made it so that Bayonetta can initiate a dodge with the “Xbox RT” button, while being able to attack at the same time with the “Xbox X” button.

I follow these categories to understand and categorize gameplay actions based on their frequency of usage:

  • Main Type
    • These are omnipresent, and the most frequent action inputs the player can rely on.
    • For example, Movement, Shooting, or Jumping.

  • Secondary Type
    • Action inputs used to queue in between omnipresent actions.
    • For example, Grenade, Dodging, or Reload.

  • Background Type
    • Rare, the kind of actions that might require the player to readjust their arm.
    • For example, Items, or Switching Weapons.

All of these examples vary from game to game.
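
As a rough sketch of how these categories could be derived from playtest data, the snippet below counts how often each action appears in an input log and buckets it by its share of all inputs. The function name `classify_by_frequency`, the thresholds, and the sample log are all assumptions made up for illustration. The right cut-off values will vary from game to game.

```python
from collections import Counter


def classify_by_frequency(input_log, main_share=0.20, secondary_share=0.05):
    """Bucket each action into main / secondary / background by usage share.

    input_log is a list of action names, one entry per triggered input.
    The two thresholds are illustrative assumptions, not established values.
    """
    counts = Counter(input_log)
    total = sum(counts.values())
    tiers = {}
    for action, count in counts.items():
        share = count / total
        if share >= main_share:
            tiers[action] = "main"
        elif share >= secondary_share:
            tiers[action] = "secondary"
        else:
            tiers[action] = "background"
    return tiers


# Made-up playtest log: 100 inputs in total.
sample_log = (
    ["move"] * 50 + ["shoot"] * 30 + ["reload"] * 10
    + ["grenade"] * 8 + ["switch_weapon"] * 2
)
print(classify_by_frequency(sample_log))
```

With this sample log, movement and shooting land in the main tier, reload and grenade in the secondary tier, and weapon switching in the background tier.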

This is an example frequency figure. The brightest red color in the figure represents main actions, the darkest red represents background actions.

Need:

Evaluating the “need” is done on a case-by-case basis. The core rule to remember is:

For example:

  • Placing a “reactionary” dodge action near the pinky finger on a keyboard would make it harder to execute than placing it near the thumb / index fingers on a controller.
  • A “hold” action suits the pinky finger, as players don’t have to be strong or precise to execute the action successfully.
    • Sprint on keyboard is a good example (as it’s a big key, and players don’t need to readjust their arm).
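
The two examples above can be sketched as a lookup that compares what an action needs with what a finger is good at. The trait table follows the hand limitations listed earlier in this article. The need table and the names `FINGER_TRAITS`, `ACTION_NEEDS`, and `finger_suits_action` are my own assumptions for illustration.

```python
# Finger traits, following the hand limitations listed earlier in the article.
FINGER_TRAITS = {
    "thumb": {"strong", "reactionary", "precise"},
    "index": {"strong", "reactionary", "precise"},
    "middle": {"reactionary"},
    "ring": set(),
    "pinky": set(),
}

# What each kind of action asks of a finger (an illustrative assumption).
ACTION_NEEDS = {
    "reactionary": {"reactionary"},  # e.g. a split-second dodge
    "hold": set(),                   # e.g. sprint: no strength or precision needed
}


def finger_suits_action(finger: str, need: str) -> bool:
    """Return True if the finger has every trait the action needs."""
    return ACTION_NEEDS[need] <= FINGER_TRAITS[finger]


print(finger_suits_action("pinky", "reactionary"))  # False
print(finger_suits_action("thumb", "reactionary"))  # True
print(finger_suits_action("pinky", "hold"))         # True
```

This only captures the finger side of the problem. Key size and distance, like the Sprint key example, would still need to be judged case by case.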

In another example, in Super Mario Odyssey, the left stick is the king. Players will be moving their left stick 90% of the time compared to any other input.

The camera, on the other hand, works around character actions. The right stick is not an input that players need to actively control, as they don’t need to move the camera too much to align jumps.

Mario’s jump is not a reactionary input like in Mirror’s Edge. Players don’t need to adjust the camera and press the jump button at once.

This makes sense, as different games have different challenges. What worked in Mirror’s Edge might not be the best solution for every game out there.

Personal Experience:

No matter how many buttons we have on a control device, humans don’t have enough fingers to press all of them simultaneously.

Compared to the controller version of the game, PC makes active use of 2 hands, increasing the number of main & secondary actions possible to execute simultaneously.

In our example, on a gamepad, players can:

  • Control the player character (Left Stick)
  • Apply boost OR turn camera (Xbox B Button OR Right Stick)

  • It’s harder to move the camera while the player is actively applying boost, because the player has to take their thumb off the boost button to control the camera (implying that both actions cannot be performed simultaneously).

  • Due to the nature of analog sticks, the ship rotation was also smoother because of how smoothly a stick moves from its starting point to the edges (even on irregular hand configurations, for example with a claw hold).

On a keyboard, players can simultaneously:

  • Control the player character (WASD)
  • Apply boost AND Turn camera (Shift AND Mouse)

  • PC can apply all interactions specified simultaneously.

  • The quick boundless snappy nature of the mouse also broke how smoothly the ship was supposed to rotate.

To make both platforms feel similar to each other, we had to readjust how the ship rotated, exclusively for PC, to avoid snapping issues (although it’s not perfect, it was still better than leaving it as it is).
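
I won’t go into the exact fix here, but one common way to tame snappy mouse input is to cap how fast the ship can turn toward the direction the player is aiming at. The sketch below shows that general idea only. The function name `step_ship_yaw`, the parameters, and the values are assumptions for illustration, not the code we shipped.

```python
def step_ship_yaw(current_yaw: float, target_yaw: float,
                  max_turn_rate: float, dt: float) -> float:
    """Turn current_yaw toward target_yaw by at most max_turn_rate * dt degrees.

    Angles are in degrees, max_turn_rate is in degrees per second, and dt is
    the frame time in seconds. The result is wrapped into [0, 360).
    """
    # Shortest signed difference between the two angles, in [-180, 180).
    diff = (target_yaw - current_yaw + 180.0) % 360.0 - 180.0
    # Largest turn allowed this frame, applied in either direction.
    max_step = max_turn_rate * dt
    step = max(-max_step, min(max_step, diff))
    return (current_yaw + step) % 360.0


# A mouse flick asks for a 90 degree turn, but the ship only turns 60 degrees
# in this half-second step.
print(step_ship_yaw(0.0, 90.0, max_turn_rate=120.0, dt=0.5))  # 60.0
```

With a cap like this, a fast mouse flick and a full stick push both end up limited to the same maximum turn speed, which is one way to bring the two platforms closer together.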

But even after the fix, PC controls were still superior in terms of how many interactions players could perform simultaneously.

Immersion

When Spider-Man is saving a civilian from being crushed, it’s not just Spider-Man struggling to save civilians, it’s the player as well.

In Death Stranding, when the player character, Sam, picks up a cargo case with his right hand, the player also holds the “Xbox RT” button to feel the cargo case in hand.

It’s a game designer’s responsibility to remove the physical barrier between the controller and the screen.

How the character moves, how the camera frames the action, and the actions the character can perform all contribute to increasing immersion in the game’s atmosphere.

Controllers:

It’s not just video games. Console controllers have also evolved over time to improve immersion.

Back in the day, controllers didn’t even have analog sticks.

Within the Nintendo ecosystem, the Nintendo 64 controller’s analog stick greatly improved how “responsive” Mario felt to control in Super Mario 64.

Camera controls were still an unsolved equation at the time. In Spyro the Dragon (1998), players were able to move the camera (left / right) by pressing the trigger buttons (Xbox LT / Xbox RT).

But eventually, as time passed, video games aligned on providing full camera control on the right analog stick.

For example, the remake, Spyro Reignited Trilogy, uses the right analog stick as the camera control input.

It’s not only analog sticks. Back in the day, the PS1 had blocky shoulder buttons. But with the evolution of first-person shooters and racing games, controllers have also evolved to support the trigger buttons (Xbox LT / RT).

With trigger-shaped buttons and advanced haptics, controllers have now gotten to a point where, when a character pulls the string of a bow, players also feel the pull of the string.

And oh, here’s a good video which provides insight on why controllers are the way they are:

Accessibility

This is more of a “controls” topic than a “control scheme” topic, but I guess I can chime in a bit here as well.

Not every player can be super precise to hit frame-perfect parry windows. Games with complex controls will push away players who might be otherwise interested in playing them.

Similarly, although understated, games with simple controls will also push away players who might be looking for a good physical challenge.

For example, in Prince of Persia: The Lost Crown, the developers at some point provided players with the option to always “guard” and negate any damage incoming.

But this decision felt at odds with the overall style they were aiming for with the game.

They wanted the main character, Sargon, to be aggressive, and subsequently, the players also to be more aligned with his psyche. So they removed guarding altogether in favor of an exclusive parry button.

Although this change makes the game inaccessible to certain players, it is in line with the direction of the game.

Physical Barriers:

Accessibility can be approached on two fronts.

  • Games can be made physically accessible from the get-go, so that everyone can play without any further adjustments.

  • Or, games can provide options to let everyone enjoy the gameplay experience.

The developers of Prince of Persia: The Lost Crown have opted for the second front. They provided multiple accessibility options to enable players of multiple skill levels to still enjoy the gameplay experience.

As another example, in Spider-Man, not all players need to be physically capable of repeatedly tapping a button to feel like a superhero during quick time event (QTE) sequences. Players can also toggle an option to hold inputs instead.
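
A tap-or-hold option like this can be sketched as a single toggle that changes how the same QTE progress bar fills up. The class name `QuickTimeEvent`, its fields, and the fill values below are hypothetical and chosen for illustration. This is not how Spider-Man implements it.

```python
from dataclasses import dataclass


@dataclass
class QuickTimeEvent:
    hold_to_fill: bool = False          # hypothetical accessibility toggle
    progress: float = 0.0               # 0.0 = just started, 1.0 = completed
    fill_per_tap: float = 0.25          # progress gained per button tap
    fill_per_second_held: float = 0.5   # progress gained per second of holding

    def update(self, pressed_this_frame: bool, held: bool, dt: float) -> bool:
        """Advance the QTE by one frame and return True once it is complete."""
        if self.hold_to_fill:
            # Hold mode: progress grows for as long as the button stays down.
            if held:
                self.progress += self.fill_per_second_held * dt
        elif pressed_this_frame:
            # Default mode: every fresh press adds a fixed chunk of progress.
            self.progress += self.fill_per_tap
        self.progress = min(self.progress, 1.0)
        return self.progress >= 1.0


# Holding the button for four half-second steps completes the QTE.
qte = QuickTimeEvent(hold_to_fill=True)
print([qte.update(pressed_this_frame=False, held=True, dt=0.5) for _ in range(4)])
```

Both modes still ask the player for a physical input to finish the sequence. Only the kind of effort changes.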

There’s a treasure trove of information out there to understand how a game can be made more physically accessible to play, if it’s not already designed for it from the ground-up:

https://gameaccessibilityguidelines.com


A certain level of physical engagement is always to be expected when a player is experiencing a video game.

The objective of accessibility options should never be to outright skip, or remove physical interactions, but to smoothen the gameplay experience to a point where everyone can interact and enjoy the game.

The level of enjoyment doesn’t have to be equal for every player out there.

Different players of different skill levels will always find ways to make the gameplay experience meaningful for themselves. That’s the power of the medium. We need to embrace it, not try to normalize the playing field.

Good Example:

In Bayonetta, players can physically experience a daunting action game with one finger alone.

Bad Example:

In Skull and Bones, players can skip the crafting mini-game altogether because the feature is not accessible for all players.

Removing physical interactions altogether should be avoided. It goes against the purpose of the interactive nature of video games.


Players can take longer to learn and master input heavy games like Arma 3.

But on the other end, in Mario Odyssey, action controls are duplicated (jump & cap throw) to easily make them accessible to different types of audiences with different skill levels.

Similar to Mario Odyssey, there are also games that focus on making the gameplay experience accessible for EVERYONE from the ground-up, like Call of Duty (COD).
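
Duplicated controls can be sketched as a binding table where one action accepts more than one button. The button assignments below are my own illustration written in Xbox terms, and the names `BINDINGS` and `actions_for_pressed` are hypothetical.

```python
# Each action lists every button that can trigger it (illustrative layout).
BINDINGS = {
    "jump": {"Xbox A", "Xbox B"},
    "cap_throw": {"Xbox X", "Xbox Y"},
}


def actions_for_pressed(pressed, bindings=BINDINGS):
    """Return the set of actions triggered by the currently pressed buttons."""
    pressed = set(pressed)
    return {action for action, buttons in bindings.items() if buttons & pressed}


print(actions_for_pressed({"Xbox B"}))  # {'jump'}
print(actions_for_pressed({"Xbox Y"}))  # {'cap_throw'}
```

Because either button works, players can pick whichever one sits more comfortably under their thumb.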

The main objective of COD boils down to this basic loop:

  • You see an enemy
  • You take aim
  • You shoot

In the newer iterations of COD, with reduced time to kill (TTK), increased aim assistance, increased hip fire accuracy, and other non-control facing gameplay systems, the game became highly accessible for many players of different skill levels.

As long as players have good spatial awareness, have a basic level of skills to control the character, and move the camera, they will be able to have fun with the game.

But control schemes can also go a step below COD… like with Candy Crush:

With no time pressure to perform, the only physical skills players need are finger swipe gestures.

The discussion around this subject is filled with more nuance than what I have touched upon in this article. I’m not going to go in-depth about all of its intricacies. Maybe I’ll tackle this topic some other time.

Relatability

Throughout history, players have grown their control mental models by playing different video games.

For example, in the past players didn’t really understand how to handle both the left & right analog sticks at once.

In Resident Evil 4 from 2005, the character controls get locked when you aim to shoot an enemy. This was not only used to enhance the horror gameplay experience, but it also helped players easily aim their weapon to shoot at targets.

But by Resident Evil 4 Remake (2023), players have mastered analog stick controls to the point where they can move the character WHILE aiming their weapon.

What used to be the norm has now become clunky and outdated.

In the Remake, moving while aiming is not only a nice-to-have feature, but it’s also an essential skill check. This has become the new norm.


There are many guidelines about how to design / tweak control schemes in this article, but sometimes, relatability is the most important factor in designing control schemes.

Relatable controls help players easily form a physical and cognitive connection to a control scheme.

For example, if players know how to sprint in COD, players can also seamlessly transfer over their skill on how to perform the action in other FPS games which share its control scheme (like Halo).

In another Halo example, players can move the left stick to accelerate a vehicle, just as they would move the character on ground.

As a counter-example, in Dark Souls 3, players can initiate a jump by pressing the “Xbox L3” button while running.

The layout makes the action hard to use, very situational, and outright uncomfortable.

Elden Ring, the spiritual successor to the Dark Souls series, has shifted the jump to the “Xbox A” button, bringing it in line with other action-adventure titles.

Sometimes picking the controls players are most used to is the correct solution. The reason why game designers feel it’s “right” is because that’s how it’s always been “traditionally” done.

But if done right… your new control scheme can change the industry and inspire other games to follow suit.

Take Ghost of Tsushima, where players can perform actions, interact with objects, and talk to NPCs with the “Xbox RT” button.

This change lets you do all of them WHILE being able to turn your camera. It has become a staple for the franchise going forward.

That’s it for this time around. Hope you folks enjoyed reading through this post. I’ll see you all on my next one ✌
