Playing MOBA game using Deep Reinforcement Learning — part 2

Dohyeong Kim
Dec 14, 2021
image from wallpapercave

In the last post, we learned how to train an agent for a simple MOBA game using Deep Reinforcement Learning. In this post, I am going to explain what we need to know before applying the same method to Dota2.

Code for this post can be found here: https://github.com/kimbring2/MOBA_RL/blob/main/dota2/env_test.py

You just need to run the DotaService and that code together on the same PC.

Training Environment

Unlike Derk training, each headless Dota2 environment requires more than 1GB of RAM. Therefore, it is better to dedicate a separate PC to running the environments, because DRL training usually benefits from many environments running in parallel. Of course, that PC does not need a GPU because the training code does not run there.
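As a rough sizing aid, the memory figure above can be turned into an upper bound on how many environments one PC can host. This is only a back-of-the-envelope sketch: the 1GB default is the lower bound mentioned in the text (the real footprint is higher), and `reserved_gb` is a hypothetical allowance for the operating system.

```python
import math


def max_environments(total_ram_gb: float,
                     per_env_gb: float = 1.0,
                     reserved_gb: float = 2.0) -> int:
    """Upper bound on the number of headless environments that fit in RAM.

    per_env_gb is an assumed per-environment footprint; measure your own.
    reserved_gb is a hypothetical allowance for the OS and other processes.
    """
    if per_env_gb <= 0:
        raise ValueError("per_env_gb must be positive")
    usable_gb = total_ram_gb - reserved_gb
    if usable_gb <= 0:
        return 0
    return math.floor(usable_gb / per_env_gb)
```

For example, `max_environments(16)` bounds a 16GB machine at 14 environments under these assumptions; measuring the actual memory use of one environment gives a more realistic number.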

My Training Setting for Dota2

The PC that runs Seed RL needs a GPU because the TensorFlow training code runs there. When both PCs are behind the same router, the DotaService and Seed RL can communicate over a socket as long as we know the IP address of the other PC.
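Before starting a long training run, it can help to confirm that the training PC can actually reach the environment PC over the network. The sketch below uses only the Python standard library and is not part of the repository; the IP address is a made-up example, and the port is an assumption that you should replace with the one your DotaService instance listens on.

```python
import socket

# Hypothetical address of the environment PC on the local network.
DOTASERVICE_HOST = "192.168.0.10"
# Assumed port; use the port your DotaService instance actually listens on.
DOTASERVICE_PORT = 13337


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `is_reachable(DOTASERVICE_HOST, DOTASERVICE_PORT)` from the Seed RL PC only tells you that something accepts connections on that port, not that it is the DotaService, but it quickly rules out a wrong IP address or a firewall problem.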

Network Structure

The first difference between Dota2 and Derk is that each hero has its own unique properties, such as abilities and stats. This means that we need to write different code for each hero.

Differences Between Dota2 Heroes

Shadow Fiend, for example, has a total of 4 non-passive abilities. Therefore, the action network for abilities has 4 outputs.

Network of the Shadow Fiend

On the other hand, the network for Omniknight needs only 3 outputs.

Network of the Omniknight

The observation network does not need to change, because all heroes receive observations in the same form.
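One way to keep the per-hero differences in a single place is a lookup table from hero to castable abilities, from which the size of the ability head follows. The sketch below is plain Python rather than the TensorFlow model from the repository, and the ability lists are my assumption about which abilities are counted as non-passive; only the counts (4 and 3) come from the text above.

```python
# Castable (non-passive) abilities per hero.
# Assumption: these lists are illustrative and not taken from the repository.
NON_PASSIVE_ABILITIES = {
    "Shadow Fiend": [
        "Shadowraze (near)",
        "Shadowraze (medium)",
        "Shadowraze (far)",
        "Requiem of Souls",
    ],
    "Omniknight": [
        "Purification",
        "Heavenly Grace",
        "Guardian Angel",
    ],
}


def ability_head_size(hero: str) -> int:
    """Number of outputs the ability action head needs for this hero."""
    return len(NON_PASSIVE_ABILITIES[hero])


def pick_ability(hero: str, logits: list) -> str:
    """Greedy choice: name of the ability with the highest logit."""
    abilities = NON_PASSIVE_ABILITIES[hero]
    if len(logits) != len(abilities):
        raise ValueError("logits must have one entry per castable ability")
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return abilities[best_index]
```

With such a table, the final layer of the ability head can be built with `ability_head_size(hero)` units, while the observation encoder stays shared across heroes.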

Managing Items and Abilities

Second, a Dota2 hero can choose among various items and abilities during the game.

Dota2 example of item and ability

Furthermore, each item has a different target type and activation method.

Tango item of Dota2

For example, the Tango is the most basic item and can be purchased at the store at the start of the game. A hero can use it on a nearby tree to regenerate health.

The hero can purchase and use Tango items as shown in the video below.

Example of using the Tango item
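Because each item expects a different kind of target, the code that issues a "use item" command has to branch on the target type. The following is a minimal sketch of that idea. The `TargetType` enum, the lookup table, and the dictionary-based action are all hypothetical and do not mirror the DotaService protobuf messages.

```python
from enum import Enum


class TargetType(Enum):
    NO_TARGET = "no_target"
    TREE = "tree"
    POSITION = "position"


# Hypothetical table; verify the target type of each item in your game version.
ITEM_TARGET_TYPE = {
    "item_tango": TargetType.TREE,          # used on a nearby tree
    "item_tpscroll": TargetType.POSITION,   # cast toward a map location
    "item_magic_wand": TargetType.NO_TARGET,
}


def build_use_item_action(item: str, *, tree_id=None, position=None) -> dict:
    """Build a plain-dict action, checking that the required target is given."""
    target_type = ITEM_TARGET_TYPE[item]
    action = {"item": item, "target_type": target_type.value}
    if target_type is TargetType.TREE:
        if tree_id is None:
            raise ValueError(f"{item} needs a tree_id")
        action["tree_id"] = tree_id
    elif target_type is TargetType.POSITION:
        if position is None:
            raise ValueError(f"{item} needs a position")
        action["position"] = tuple(position)
    return action
```

For the Tango, `build_use_item_action("item_tango", tree_id=42)` yields an action that carries the tree to eat, and calling it without a `tree_id` fails early instead of sending an invalid command to the game.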

Each ability also has a different target type and activation method.

Shadowraze ability of Dota2

Shadowraze is the basic ability of the Shadow Fiend hero. It does not require a target. Instead, the hero casts it according to the distance to the enemy.

It is best to use that ability when an enemy hero or creep is within its range, as shown in the video below.

Example of using the Shadowraze ability
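Choosing a Shadowraze by distance can be sketched as picking the variant whose impact point lies closest to the enemy. The numbers below are assumptions (impact distances of 200, 450, and 700 units and a 250-unit radius) that change between patches, and the function also assumes the hero is already facing the enemy.

```python
# Assumed values; they differ between game patches, so treat them as placeholders.
SHADOWRAZE_CAST_DISTANCES = (200.0, 450.0, 700.0)  # near, medium, far
SHADOWRAZE_RADIUS = 250.0


def pick_shadowraze(distance_to_enemy: float):
    """Return 0 (near), 1 (medium), or 2 (far), or None if nothing can hit.

    Assumes the hero faces the enemy, so only the distance matters.
    """
    if distance_to_enemy < 0:
        raise ValueError("distance must be non-negative")
    # Pick the variant whose impact centre is closest to the enemy.
    best_index = min(
        range(len(SHADOWRAZE_CAST_DISTANCES)),
        key=lambda i: abs(SHADOWRAZE_CAST_DISTANCES[i] - distance_to_enemy),
    )
    miss_by = abs(SHADOWRAZE_CAST_DISTANCES[best_index] - distance_to_enemy)
    if miss_by > SHADOWRAZE_RADIUS:
        return None
    return best_index
```

A rule like this can serve as a scripted baseline, or as an action mask that hides the Shadowraze outputs of the network when no variant can reach the enemy.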

A Dota2 hero can upgrade low-level items into a higher-level one by using the recipe system.

Item upgrade system of Dota2

The video below shows how to obtain the Magic Wand from a recipe.

Example of obtaining the magic wand from recipe
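The recipe system can be modeled as a check on whether the inventory contains every component of the target item. The sketch below uses a `Counter` as a toy inventory. The Magic Wand components listed here (one Magic Stick, two Iron Branches, and the recipe scroll) are an assumption that depends on the game patch, and the function names are hypothetical.

```python
from collections import Counter

# Assumed recipe; the components of an item can change between patches.
RECIPES = {
    "item_magic_wand": Counter({
        "item_magic_stick": 1,
        "item_branches": 2,
        "item_recipe_magic_wand": 1,
    }),
}


def can_combine(inventory: Counter, target: str) -> bool:
    """True if the inventory holds every component of the target item."""
    return all(inventory[name] >= count
               for name, count in RECIPES[target].items())


def combine(inventory: Counter, target: str) -> Counter:
    """Return a new inventory with the components replaced by the target."""
    if not can_combine(inventory, target):
        raise ValueError(f"missing components for {target}")
    # Counter subtraction drops entries whose count falls to zero.
    result = inventory - RECIPES[target]
    result[target] += 1
    return result
```

An agent can use `can_combine` to decide which component to buy next: whatever is still missing from the recipe is the next purchase.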

Unlike the Derk game, where the map is small, the distance between the starting point and the battle point in Dota2 is long.

Town Portal Scroll of Dota2

Therefore, the hero should use the Town Portal Scroll to join and leave battles quickly. The hero receives one TP scroll at the beginning of the game, and after it is used, more must be purchased from the store.

The video below shows how to return quickly from the battle point to the starting point using the TP scroll.

Example of using the Town Portal Scroll

The Town Portal Scroll is usually used to escape from an emergency situation. To buy items quickly without moving back to the starting point every time, the hero can use the Courier to deliver items.

Courier of Dota2

If the hero is not near the store when you buy an item, the item is placed in the stash. The Courier can either retrieve the item from the stash or buy the item at the secret shop instead, and then give it to the hero.

The video below shows how a hero at the battle point obtains an item using the Courier, without moving to the starting point.

Example of using the Courier
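The stash and Courier behavior described above can be summarized in a toy model: a purchase goes to the inventory when the hero is near the shop and to the stash otherwise, and a Courier delivery moves the stash contents to the hero. The class and function names are hypothetical, and the sketch ignores inventory slot limits and the Courier's travel time.

```python
from dataclasses import dataclass, field


@dataclass
class HeroItems:
    inventory: list = field(default_factory=list)
    stash: list = field(default_factory=list)


def purchase(items: HeroItems, item: str, hero_near_shop: bool) -> None:
    """Place a bought item in the inventory or, if the hero is away, the stash."""
    if hero_near_shop:
        items.inventory.append(item)
    else:
        items.stash.append(item)


def courier_deliver(items: HeroItems) -> int:
    """Move everything from the stash to the inventory; return the count moved."""
    delivered = len(items.stash)
    items.inventory.extend(items.stash)
    items.stash.clear()
    return delivered
```

In an agent, the same two steps become two separate actions, "purchase item" and "courier: transfer stash items", so the policy must learn to issue the second one after buying away from the shop.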

In MOBA games, some heroes are mainly in charge of attacking, and other heroes with support abilities can assist them.

Purification ability of Omniknight

For example, the Omniknight hero has an ability that restores the HP of a teammate so that the teammate can keep fighting.

The following video shows an example of restoring the HP of a teammate hero.

Example of using the Purification
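A unit-targeted support ability such as Purification needs a rule for choosing its target. One simple heuristic, sketched below, is to heal the living ally with the lowest health ratio among those within cast range. The `Ally` record and the `cast_range` parameter are hypothetical; the real cast range should be read from the game state.

```python
import math
from dataclasses import dataclass


@dataclass
class Ally:
    name: str
    x: float
    y: float
    health: float
    health_max: float


def pick_heal_target(caster_xy, allies, cast_range: float):
    """Return the living ally in range with the lowest health ratio, or None."""
    cx, cy = caster_xy
    in_range = [
        ally for ally in allies
        if ally.health > 0
        and math.hypot(ally.x - cx, ally.y - cy) <= cast_range
    ]
    if not in_range:
        return None
    return min(in_range, key=lambda ally: ally.health / ally.health_max)
```

A learned policy may find better targets than this heuristic, but the same range and liveness checks are still useful for masking invalid targets.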

Conclusion

In this post, we saw how to use the functions related to the items and abilities of Dota2. In the next post, I will explain how to combine those functions with the Deep Reinforcement Learning method from the previous post.

Dohyeong Kim

I am a Deep Learning researcher. Currently, I am trying to make an AI agent for various situations such as MOBA, RTS, and Soccer games.