How to Build a Deep Learning Agent for Minecraft with Code: Tutorial 1
In Minecraft, the agent must collect basic materials, craft tools from them, and use those tools to gather more advanced materials until it can finally mine a diamond. This kind of long, multi-step task is a useful benchmark for testing the performance of an artificial intelligence agent.
You can find all the code for this post at https://github.com/kimbring2/minecraft_ai.
Latest Research on Minecraft Agents
Since the first release of MineRL, several studies using Deep Learning have been published. We will look at two of them that perform well and have source code available.
The first study, ‘Sample Efficient Reinforcement Learning through Learning From Demonstrations in Minecraft’, applies the Scalable Distributed Deep-RL method.
The second study, ‘Hierarchical Deep Q-Network from Imperfect Demonstrations in Minecraft’, separates crafting actions from moving actions.
The first method fails to mine a diamond, but it requires only one network, so it uses little memory. The second method succeeds in mining a diamond, but it has to create multiple networks, so it uses more memory than the first.
Inspired by these two studies, we can build a single neural network that is responsible for the agent's movement. This network is first trained with supervised learning on the per-task observation data extracted with the second study's method. It is then trained further with the training method of the first study. The item-crafting part of the agent uses the action data extracted for each task.
Learning-Based model
First, we create a network for A2C (Advantage Actor-Critic). This network receives an observation as input and returns a policy and a value, which are used for action selection and training.
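To make the roles of the two outputs concrete, here is a minimal sketch in plain Python. It is not the network from the repository: it assumes the network has already produced a list of policy logits and a scalar value for one observation, and the function names `softmax`, `select_action`, and `a2c_losses` are hypothetical helpers chosen for illustration.

```python
import math
import random

def softmax(logits):
    """Turn raw policy logits into action probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(policy_logits, rng=random):
    """Sample one action index from the policy distribution."""
    probs = softmax(policy_logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def a2c_losses(policy_logits, value, action, ret):
    """Single-step A2C loss terms.

    `ret` is the (discounted) return observed after taking `action`.
    The advantage is treated as a constant in the policy term.
    """
    probs = softmax(policy_logits)
    advantage = ret - value
    policy_loss = -math.log(probs[action]) * advantage
    value_loss = 0.5 * advantage ** 2
    return policy_loss, value_loss

# Example: two actions with equal logits, value estimate 0.5, return 1.5.
policy_loss, value_loss = a2c_losses([0.0, 0.0], value=0.5, action=0, ret=1.5)
```

The policy output is used twice: once to sample an action and once, together with the value output, to compute the training loss.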
We can train the learning-based model on human expert data with supervised learning. In addition, reinforcement learning, which uses the rewards obtained from the environment, can be applied.
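The two training signals can be sketched as follows. This is an illustrative example under simple assumptions, not the training code of the repository: the supervised part is shown as a cross-entropy loss between the policy and the expert's action (often called behavior cloning), and the reinforcement part is shown as the discounted return computed from environment rewards. The function names and the discount factor `gamma` are hypothetical.

```python
import math

def behavior_cloning_loss(batch_probs, expert_actions):
    """Mean cross-entropy between policy probabilities and expert actions.

    batch_probs: list of probability lists, one per observation.
    expert_actions: list of action indices chosen by the human expert.
    """
    losses = [-math.log(probs[a]) for probs, a in zip(batch_probs, expert_actions)]
    return sum(losses) / len(losses)

def discounted_returns(rewards, gamma=0.99):
    """Discounted return for every step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns

# Supervised signal: how well the policy matches the expert.
bc_loss = behavior_cloning_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1])

# Reinforcement signal: returns computed from environment rewards.
returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
```

In this sketch, the returns from `discounted_returns` are what an A2C update would compare against the value output of the network.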
Rule-Based model
To mine a diamond in Minecraft, the agent has to collect and craft items in sequence.
For example, the item sequence in one human expert demonstration is log, planks, crafting_table, planks, stick, wooden_pickaxe, crafting_table, dirt, cobblestone, coal, dirt, coal, cobblestone, coal, stone_pickaxe, torch, crafting_table, iron_ore, furnace, iron_ingot, iron_pickaxe, wooden_pickaxe, iron_ingot, iron_pickaxe, iron_ingot, furnace, stone, stone, stone. The key point is that the agent acts differently in each subtask. If there is only one agent network, we need to add a signal that distinguishes the observations of each subtask, such as an extra one-hot encoded layer.
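One possible way to add such a signal is sketched below. The subtask list is a hypothetical subset of the item names in the sequence above, and the observation is assumed to be a flat list of features; a real agent would more likely concatenate the one-hot vector with the output of its image encoder.

```python
# Hypothetical subtask vocabulary, taken from item names in the sequence above.
SUBTASKS = [
    "log", "planks", "stick", "crafting_table", "wooden_pickaxe",
    "cobblestone", "stone_pickaxe", "iron_ore", "furnace",
    "iron_ingot", "iron_pickaxe",
]

def one_hot(subtask, subtasks=SUBTASKS):
    """Return a vector with 1.0 at the index of `subtask` and 0.0 elsewhere."""
    vec = [0.0] * len(subtasks)
    vec[subtasks.index(subtask)] = 1.0
    return vec

def add_subtask_signal(observation_features, subtask):
    """Append the one-hot subtask vector to a flat observation feature list."""
    return list(observation_features) + one_hot(subtask)

# The same observation features get a different suffix for each subtask.
features = [0.2, 0.7]
obs_for_log = add_subtask_signal(features, "log")
obs_for_planks = add_subtask_signal(features, "planks")
```

With this input, a single network can learn different behavior for each subtask even when the raw observations look alike.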
Moving actions such as “attack”, “back”, “camera”, “forward”, “jump”, “left”, “right”, “sneak”, and “sprint” need to be learned by the neural network because they make up most of the data. Crafting actions such as “place”, “equip”, “craft”, “nearbyCraft”, and “nearbySmelt”, however, are rarer and more fixed than moving actions. It is therefore better to replay them in the order in which they appear in the data.
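The split between learned and rule-based actions can be sketched like this. The demonstration format, a list of `(action_name, argument)` pairs, is a simplification assumed for illustration and is not the raw MineRL data layout; `split_demonstration` is a hypothetical helper.

```python
MOVING_ACTIONS = {
    "attack", "back", "camera", "forward", "jump",
    "left", "right", "sneak", "sprint",
}
CRAFTING_ACTIONS = {"place", "equip", "craft", "nearbyCraft", "nearbySmelt"}

def split_demonstration(steps):
    """Split one demonstration into moving steps and ordered crafting steps.

    steps: list of (action_name, argument) pairs in time order.
    Moving steps become training data for the network; crafting steps
    are kept in their original order so they can be replayed as rules.
    """
    moving, crafting = [], []
    for name, arg in steps:
        if name in CRAFTING_ACTIONS:
            crafting.append((name, arg))
        elif name in MOVING_ACTIONS:
            moving.append((name, arg))
        else:
            raise ValueError(f"unknown action: {name}")
    return moving, crafting

# A made-up fragment of a demonstration.
demo = [
    ("forward", 1), ("attack", 1), ("craft", "planks"),
    ("forward", 1), ("nearbyCraft", "wooden_pickaxe"),
]
moving, crafting = split_demonstration(demo)
```

Here `crafting` keeps the order `craft planks`, then `nearbyCraft wooden_pickaxe`, which is the part the rule-based model would follow.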
Conclusion
In this post, we explored how to create an AI agent for Minecraft using Deep Learning. Because Minecraft is so complex, training an agent well requires far more techniques than a simple game like Pong.
In the next post, we will combine these techniques to create an agent that can mine a diamond.