Wednesday, October 16, 2024

Not Just a Game: Why AI Is Being Taught Chess, Poker, and The Sims


The Sims, Dota, or poker may not be just entertainment: scientists use games to teach artificial intelligence systems to solve real-world problems. How does this happen?

About the author: Alexander Panov, PhD in Physics and Mathematics, Director of the AI Cognitive Systems Laboratory at the AIRI Institute and Director of the Center for Cognitive Modeling at MIPT.

At the dawn of artificial intelligence, scientists needed criteria to assess its level of development. It was the middle of the 20th century, the peak of chess's popularity, and it was the ability to play chess that became an indicator of the technology's "intelligence." Of course, at that time AI was not powerful enough to solve chess problems, so it was tested on simpler games and tasks, such as proving theorems in plane geometry.

Any game is a small closed world with its own rules, which makes it a great testing ground for experimenting with and probing the properties of AI. Let's look at which skills are trained in games of different types, and how.

Chess and Go

The ability to play chess was long considered the benchmark for the development of artificial intelligence. Work on the first algorithm capable of competing with humans in this game, IBM's chess supercomputer Deep Blue, began in the late 1980s. And in 1997, Deep Blue won a match against the then world champion Garry Kasparov.

Chess, like Go (a logic board game that originated in ancient China), develops and tests two key abilities of AI systems: predicting an opponent's moves in advance, and solving logical problems. These skills are needed in many other areas where AI is applied: in robotics, in automation, in forecasting the results of marketing strategies, and so on. For example, the MCTS algorithm, which was tested on chess and is used in some chess engines, formed the basis of a system for planning robot movements in warehouse automation. Of course, training AI on chess also serves narrower purposes, such as training chess players and entertainment.

Most often, artificial intelligence systems are trained on a large corpus of played games. The algorithm analyzes them and then improves further by playing against itself.

There are also systems that learn from scratch, without ready-made material, such as AlphaGo Zero and MuZero from DeepMind. For this, the Monte Carlo tree search (MCTS) method can be used, which builds a tree of possible consequences of each move. The algorithm "plays out" each line to the end, remembering which actions lead to a successful outcome and which lead to a negative one. This is how extensive statistics are accumulated, on the basis of which the AI makes decisions in future games.
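The core idea of accumulating playout statistics can be sketched in a few lines. Below is a deliberately simplified "flat" Monte Carlo search for tic-tac-toe: each candidate move is scored by the fraction of random playouts it wins, with no tree expansion or selection policy as in full MCTS. All names and numbers here are illustrative:

```python
import random

# Lines that win tic-tac-toe (rows, columns, diagonals) on a 0-8 board.
LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def random_playout(board, player):
    """Play random moves to the end; return 'X', 'O', or None for a draw."""
    board = board[:]
    while True:
        w = winner(board)
        if w:
            return w
        moves = [i for i, cell in enumerate(board) if cell is None]
        if not moves:
            return None
        board[random.choice(moves)] = player
        player = "O" if player == "X" else "X"

def best_move(board, player, n_playouts=200):
    """Score each legal move by how often random playouts from it end in a win."""
    other = "O" if player == "X" else "X"
    scores = {}
    for move in (i for i, cell in enumerate(board) if cell is None):
        trial = board[:]
        trial[move] = player
        wins = sum(random_playout(trial, other) == player for _ in range(n_playouts))
        scores[move] = wins / n_playouts
    return max(scores, key=scores.get)

random.seed(0)
board = ["X", "X", None, "O", "O", None, None, None, None]
print(best_move(board, "X"))  # X completes the top row: move 2
```

Full MCTS adds to this a search tree and a selection rule (typically UCB) that focuses playouts on promising branches, but the statistics-gathering principle is the same.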

Poker

Poker is a game of high uncertainty. That is why it is used to teach AI algorithms to make decisions, calculate risks, and develop successful strategies under uncertainty. Trained algorithms are applied in similar environments: for example, AI can solve stock-trading problems, assembling an investment portfolio and managing it without falling below a given profitability threshold. Another example is advertising tools that choose which ad to show a user so that he or she is most likely to click on it.

Of course, training such algorithms requires a very large sample of played games. The advantage is that poker hands are shorter than chess games, so enumerating them does not take much time.

Predicting human behavior is a complex, non-trivial task. It is usually done using reinforcement learning, supplemented by heuristics (rules of thumb) for evaluating the actions of other players. In this approach, the agent learns to make decisions by receiving rewards or penalties for its actions and improves its strategies based on this experience. Combining reinforcement learning with heuristics yields a more adaptive and intelligent AI that can operate effectively under uncertainty and cope with the complexities of multiplayer games and situations.
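The reward-and-penalty loop described above can be shown with a minimal sketch of tabular Q-learning on a toy five-cell corridor. The environment, states, and reward values are invented purely for illustration; real poker agents use far more sophisticated methods:

```python
import random

# Toy corridor: states 0..4, start in the middle. Reaching state 4 gives a
# reward of +1; walking into the dead end at state 0 gives a penalty of -1.
N_STATES = 5
ACTIONS = [-1, +1]          # step left, step right
alpha, gamma, eps = 0.5, 0.9, 0.2

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    if nxt == N_STATES - 1:
        return nxt, 1.0, True    # reward for reaching the goal
    if nxt == 0:
        return nxt, -1.0, True   # penalty for the dead end
    return nxt, 0.0, False

random.seed(1)
for episode in range(500):
    s, done = 2, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        best_next = 0.0 if done else max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# After training, the greedy policy in the middle state steps toward the goal.
print(max(ACTIONS, key=lambda a: Q[(2, a)]))
```

The heuristics the article mentions would enter such a loop as hand-crafted adjustments to the reward or to the action choice, steering the same learning mechanism with expert knowledge.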

The Sims

In 2023, a team of researchers from Stanford University and Google created a virtual town called Smallville inside The Sims and populated it with 25 characters generated using GPT. The characters began to behave in such a way that real people had a hard time distinguishing them from real players.

By genre, The Sims is a life simulator. It is used not so much to train AI as to evaluate its ability to imitate human behavior. For this, large language models (LLMs) are used, pre-trained on vast amounts of text. We encounter these models, for example, in ChatGPT: they allow the algorithm to hold a conversation and answer questions in a human-like way.

In this setup, The Sims essentially just translated the LLM's text output into character actions. It is as if we wrote to ChatGPT: "Imagine you're John Smith, you work as an insurance agent, you live in such-and-such a house, you wake up on Monday morning, what do you do?" and it responded: "I'll make myself some coffee, take a shower, go to work." And so on for every character.
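That translation step can be caricatured in code. The sketch below hard-codes the model's reply and maps it to simulator actions with naive keyword matching; the verbs, action names, and parsing rule are all made up for illustration, and a real agent system grounds language in actions far more carefully:

```python
# Hypothetical mapping from verbs in the model's reply to simulator actions.
# In a Smallville-style setup the reply would come from a GPT call; here it
# is hard-coded so the sketch is self-contained.
ACTION_VERBS = {
    "make": "USE_KITCHEN",
    "take": "USE_BATHROOM",
    "go": "LEAVE_HOUSE",
}

def plan_to_actions(reply: str) -> list[str]:
    """Split the reply into clauses and pick the first matching verb in each."""
    actions = []
    for clause in reply.lower().replace("i'll", "").split(","):
        for verb, action in ACTION_VERBS.items():
            if verb in clause.split():
                actions.append(action)
                break
    return actions

reply = "I'll make myself some coffee, take a shower, go to work."
print(plan_to_actions(reply))  # ['USE_KITCHEN', 'USE_BATHROOM', 'LEAVE_HOUSE']
```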

The Sims helps test LLMs as communicating autonomous agents and reveal what they lack before immersing them in a real environment: for example, building a voice assistant on top of one that will talk to clients over the phone, or an LLM assistant that helps us plan our actions, schedule the day, and handle everyday tasks. LLMs are also used in robotics: with the help of text descriptions, a robot learns to perform certain actions in different situations.

StarCraft and Dota

While in chess, Go, and poker the AI system plays as a single actor, in StarCraft it must control several units and coordinate their actions with one another. Precisely this coordination of the strategies of different units complicates the AI's task: they must shoot at their opponents, not at each other.

In real life, there are many such tasks. A simple example is managing the traffic lights in a given area. To reduce traffic jams, each light must be switched taking into account, first, the traffic situation and, second, the operation of the other lights (for example, to launch a "green wave" for cars).
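The "green wave" itself is simple arithmetic: offset each light's green phase by the travel time from the previous intersection, so a car moving at the target speed hits every light on green. A toy sketch (the function name and numbers are illustrative, not from any real traffic system):

```python
def green_wave_offsets(distances_m, speed_kmh):
    """Green-phase start times (seconds) for a chain of traffic lights.

    distances_m: distance from each light to the next one down the road.
    speed_kmh:   the speed the wave is timed for.
    """
    speed_ms = speed_kmh / 3.6          # km/h -> m/s
    offsets, t = [0.0], 0.0
    for d in distances_m:
        t += d / speed_ms               # travel time to the next light
        offsets.append(round(t, 1))
    return offsets

# Four lights separated by three 250 m blocks, timed for 50 km/h:
print(green_wave_offsets([250, 250, 250], 50))  # [0.0, 18.0, 36.0, 54.0]
```

An RL controller for a real intersection would have to learn such offsets (and much more) from live traffic data rather than from a fixed formula, which is exactly the coordination problem StarCraft-style training targets.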

Dota is very similar to StarCraft in that agents also cooperate there. Dota has fewer multi-agent specifics, since we control a single hero, but we do so while taking into account the actions of the other players: we do not control them, but we track what they do and whether they belong to our team or the enemy's. It is also interesting that the game was slightly adapted for training AI systems, with a special simplified map developed for the purpose.

Doom and Minecraft

At first glance, Doom is an ordinary shooter in which the player explores locations, kills opponents, and upgrades their defenses. But the game has a feature that makes it interesting for training AI systems: unlike all the previous games, in Doom the player receives only a picture as input, namely what the character sees in front of them.

In chess, the player is given a description of the playing field, i.e. coded numerical information. In The Sims, it is text; in StarCraft and Dota, it is a vector description of the environment. In other words, in all cases, the algorithm receives already processed information. In Doom, training is based exclusively on the picture: the system processes visual information, remembers it, and builds its behavior based on it.
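A common first step for such pixel-input agents is to shrink the raw frame before it reaches the network. The sketch below shows a generic grayscale-and-downsample preprocessing step; this is a typical recipe for pixel-based agents, not the exact pipeline used with Doom:

```python
import numpy as np

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Reduce an RGB frame to a small grayscale array in [0, 1]."""
    gray = frame.mean(axis=2)                  # average the RGB channels
    small = gray[::4, ::4]                     # naive 4x downsampling by striding
    return (small / 255.0).astype(np.float32)  # scale to [0, 1] for the network

# A fake 240x320 RGB frame standing in for a game screenshot:
frame = np.random.randint(0, 256, size=(240, 320, 3), dtype=np.uint8)
print(preprocess(frame).shape)  # (60, 80)
```

Everything the agent knows about the world must then be extracted from arrays like this one, which is what makes the picture-only setting so much harder than a coded board description.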

This skill transfers well to navigation tasks, such as training self-driving vehicles. In addition, the Doom environment helps test intrinsic-motivation methods, which reward the algorithm for exploring the environment and moving around locations. This, too, is used in robotics.

Another sandbox video game popular among scientists is Minecraft. Unlike the other games discussed here, it has no fixed rules or plot; the player's imagination and actions are limited only by the conventions of its world. You can assemble any object, use it to extract a new resource, build something, and so on. All this makes the platform a useful test environment for advanced reinforcement learning methods. The complexity of this "universe" lets researchers study how an AI agent can navigate, manipulate objects, and interact with the world. The skills systems acquire in Minecraft environments can be applied primarily in robotics, navigation, and decision-making under uncertainty.

The Future of Games for AI Training

Games are used by all teams working in reinforcement learning (RL) as benchmarks to evaluate the quality of their models. Often, these research groups do not specialize in specific games. Researchers do not even need to know the rules of the game or play it themselves, since the agent is expected to learn everything on its own.

Of course, game tasks are always simpler than real ones. The closed nature of the game world both facilitates learning and limits it, since a game cannot fully reproduce the real world, with its greater risks and uncertainties. Researchers and game developers therefore create ever more complex simulators that pose tasks for algorithms as close to real ones as possible. For now, though, solving complex problems in the real world still requires human participation.
