Smart Dog for Minecraft - Aalborg Universitet
Kim Arnold Thomsen, Rasmus D. C. Dalhoff-Jensen
June 10, 2014
Title: Smart Dog for Minecraft
Subject: Reinforcement Learning
Semester: Spring Semester 2014
Project group: sw103f14
Department of Computer Science
Aalborg University
Selma Lagerlöfs Vej 300
DK-9220 Aalborg Øst
Telephone +45 9940 9940
Telefax +45 9940 9798
Participants: Rasmus D.C. Dalhoff-Jensen, Kim A. Thomsen
Supervisor: Manfred Jaeger
Number of copies: 4
Number of pages: 109
Number of numbered pages: 101
Number of appendices: 9 pages
Completed: June 10, 2014
Synopsis: This report argues for the benefit of combining tabular reinforcement learning with feature-based reinforcement learning, so that agents can exhibit specific behaviour in specific situations within a general environment that is impractical to express without features. The report describes three different approaches to doing so, as well as an implementation of such a system in the game Minecraft. It also describes a set of tests showing that two of the three approaches are beneficial in such a scenario.
The content of this report is freely accessible. Publication (with source reference) may only take place with the acknowledgement of the authors of this report.
Contents
Chapter 1 Word Definitions and Abbreviations
I Introduction
Chapter 2 Introduction to Smart Dog
Chapter 3 The Minecraft World
3.1 The technical aspects
3.1.1 Minecraft Forge
Chapter 4 Problem Domain
4.1 Problem Specification
4.1.1 Develop an Intelligent Agent
4.1.2 Enable the Agent to Learn an Optimal Behaviour for a Given Situation
4.1.3 Enable Knowledge Transfer between Different Situations
Chapter 5 Our Thesis
II Theory
Chapter 6 Reinforcement Learning
6.1 Markov Decision Process
6.2 Optimal Policy
6.3 ε-greedy
6.4 Q-Values
Chapter 7 Tabular Q-Learning
7.1 Size Factor
Chapter 8 Feature-based Q-Learning
8.1 Features
8.2 Q-values
8.3 Weights
8.3.1 Weight Adjustments
8.3.2 Gradient Descent
Chapter 9 The Combined Approach
9.1 Basics
9.2 Decision Making
9.3 Approaches for Q-value Updates
9.3.1 Separate Update
9.3.2 Unified-Q Update
9.3.3 Unified-a Update
9.4 Transferring Knowledge
III Components
Chapter 10 Overview of the system
10.1 Why do we use Minecraft?
10.2 Conceptual System Structure
10.3 System Flow Chart
10.4 The EntitySmartDog Class
10.5 The Following Chapters
10.5.1 Source Code
Chapter 11 State and State Space
11.1 State-Attributes and Features
11.2 Possible Actions
11.3 Changes from previous semester
Chapter 12 Virtual Real-Time Perspective
12.1 Using the onLivingUpdate Method
12.2 Synchronisation
12.3 Changes in The Entity Smart Dog Class
Chapter 13 Sensory Module
13.1 Checking for State-Changes
13.2 Obtaining a state
13.2.1 Health
13.3 Collecting Rewards
13.4 Changes from Previous Semester
Chapter 14 Decision Making Module
14.1 Making Decisions
14.2 Updating Q-values
14.2.1 Separate Update
14.2.2 Unified-Q Update
14.2.3 Unified-a Update
14.3 Changes since Previous Semester
Chapter 15 Actuator Module
15.1 Actions and Possible Actions
15.1.1 Wait
15.1.2 Activate Button
15.1.3 Move to block
15.1.4 Pick Up Edible/Non-Edible Item
15.1.5 Drop Item
15.1.6 Eat item
15.2 Performing an Action
15.3 Changes Since Previous Semester
Chapter 16 Knowledge Base
16.1 Q-Learning Document Handler and Feature Document Handler
16.2 Changes Since Previous Semester
Chapter 17 Testing Component
17.1 Analysing the Test Results
IV Testing
Chapter 18 The First Potion Test
18.1 Test Environment
18.2 Expectations
18.3 Test Factors
18.4 Results
18.5 Discussion
Chapter 19 Second Potion Test
19.1 Test Environment
19.2 Expectations
19.3 Test Factors
19.4 Results
19.5 Discussion
Chapter 20 Testing with Food
20.1 Test Environment
20.2 Expectations
20.3 Test Factors
20.4 Results
20.5 Discussion
Chapter 21 Testing Knowledge Transfer
21.1 Test Environment
21.2 Expectations
21.3 Test Factors
21.4 Results
21.5 Discussion
Chapter 22 Test Evaluation
V Evaluation
Chapter 23 Discussion and Conclusion
23.1 Conclusion
Chapter 24 Reflection
Bibliography
VI Appendix