


Localisation through supervised learning

Lagriffoul Fabien

ens03fll@cs.umu.se

January 11, 2005

20 credits

Umeå University

Department of Computing Science

SE-901 87 UMEÅ

SWEDEN

Abstract

This thesis describes an odometry method based on supervised learning. Sensory data is collected from the robot’s shaft encoders and position data from a ceiling camera. This data is then used for training a Feed Forward Neural Network. An appropriate representation of this data, that leads to better learning, is discussed. The system is then extended with additional sensory input, independent of the wheels. Experimentation shows that, properly trained with such additional sensory information, the system performs better than classical odometry, especially in situations where the wheels are slipping.

Contents

I. Introduction

1. Odometry and robotics
2. Approach taken in the present thesis
3. Thesis outline

II. Odometry basics

1. Odometry sensing
2. Systematic error
3. Non systematic error
4. Improving odometry with additional sensory input
5. What is different in the presented approach ?

III. Experimental set up

1. The robot
2. Software
3. The arena
4. The supervisor
5. What data is needed ?

IV. Extracting position data from the video camera

1. The Khepera’s hat
2. Video processing
3. Space warping
4. Accuracy

V. Neural networks basics

1. Which kind of neural net ?
2. Description of a Multi Layer Neural Network
3. Mean-squared error, training, epoch
4. Overfitting / Generalisation
5. Consistency of data

VI. Preliminary experiments with neural nets

1. Classical odometry with Khepera
2. Experiment 1
3. Further tests
4. Importance of uniform distribution in the training set
5. Experiment 2 : a new representation of data
6. Results
7. Post analysis of the mappings
8. Conclusion

VII. Experiments with the physical robot

1. The synchronisation problem
2. A new representation of data (2)
3. Results
4. Conclusions

VIII. An additional input

1. K213 vision turret
2. Can a range-finder make odometry ?
3. Method
4. Results
5. Experiment with slippage

IX. Conclusion and future work

X. Appendix

1. Forward kinematics
2. Experiment with different training sets

XI. References

Introduction

1 Odometry and robotics

Odometry is the general ability of a system to estimate its own position. It is therefore an issue of primary importance in autonomous mobile robotics. Although it is not always necessary to know the robot’s position to reach a certain goal, it is useful in many applications such as path following and map building.

The basic idea of odometry is to retrieve information from different sensors and process it to estimate the position of the robot. The classical example is to determine how much the wheels have turned using shaft encoders, and then to combine these values with geometrical features of the vehicle (distance between the wheels, wheel diameter) to determine the movement. This basic approach provides good results in indoor environments, but becomes inaccurate in real-world applications because of irregularities of the ground and slippage of the wheels. The common approach to overcome these limitations consists of building a model of the error and correcting the odometry using maximum likelihood (see e.g. [Chong & Kleeman]).

2 Approach taken in the present thesis

In this thesis, the features of the vehicle are not assumed to be known. Instead, the approach is to find the relationship between the sensor values and the movement of the robot. This relationship can be found without any prior knowledge about the geometry of the vehicle, by supervised learning techniques. This learning will be achieved by training a neural network with data collected in real navigation situations.

This method has two major advantages. The first one is simplicity. Indeed, no complicated model of the robot is needed; whether it has two wheels or six legs, the principle remains the same. The second advantage is robustness : information from many different sensors can be combined in the neural net, so that if one of them is ineffective or noisy, the system still gives good results. The drawback of the method is that the performance of the neural network depends highly on how it is trained. There is no well-defined method for this; it is more a matter of empirical rules and experimentation.

3 Thesis outline

Chapter II describes current odometry techniques, from the sensor level to the algorithmic level. It also addresses the limitations of odometry and the current methods used to overcome them. Finally, it shows in what respect the approach taken in this thesis differs from existing techniques.

Chapter III describes the experimental setup : hardware, software, and equipment used to conduct this work.

Chapter IV focuses on a tool of primary importance for the work, since it is used as a supervisor for all the real experiments : the video camera. Video processing and accuracy are discussed.

Chapter V gives an overall view of artificial neural networks. The aim is to make the reader understand which kind of neural network is suitable for this application, and to introduce the concepts that are useful for evaluating and improving such a system.

Chapter VI presents a first experiment conducted with the Khepera simulator. The goal was to get familiar with neural networks before experimenting with the real robot. This first try resulted in useful hints that saved time in the later experiments.

Chapter VII describes the experiments with the real robot, and shows how the “real world” constraints are handled.

Chapter VIII describes the extension of the system with an additional sensor. The results are presented and compared to classical odometry in normal and sliding situations (using the simulator).

Chapter IX gives the conclusion.

Chapter X gives the mathematical basis of classical odometry, and the detailed setup of experiments conducted in Chapter VI.

Odometry basics

This chapter aims at presenting the basics of odometry and showing the most common sources of error in odometry systems. It also presents methods that improve odometry by using additional sensor information, and shows in what respect the approach taken in this thesis is different.

1 Odometry sensing

In its basic form, odometry uses information from wheel rotation only; it is then called “dead reckoning”. The most common technique to retrieve the rotation of the wheels is to use optical incremental encoders :

[pic]

Figure II.1 - The optical incremental encoder

The disc is coupled to the wheels or directly to the motor. Counting the resulting pulses makes it possible to calculate the rotation of the wheels. The user manual of the Khepera (the robot used in this thesis, see section III.1) explains :

“Every wheel is moved by a DC motor coupled with the wheel through a 25:1 reduction gear. An incremental encoder is placed on the motor axis and gives 24 pulses per revolution of the motor. This allows a resolution of 600 pulses per revolution of the wheel that corresponds to 12 pulses per millimetre of path of the robot.”

This means that for each pulse counted, the wheel moves by 1/12 ≈ 0.08 mm. From this value and from the geometric features of the robot, one can derive equations that tell how the robot moves. Those equations, known as the forward kinematics equations, are different for each kind of robot. The forward kinematics equations of the Khepera robot are given in Appendix 1.

2 Systematic error

The systematic error in the basic form of odometry is mainly caused by imperfections in the construction of the robot. The problem with this kind of error is that it accumulates constantly along the trajectory of the robot. The origin of these errors is however well identified [Borenstein & Feng, 1996] :

• Unequal wheel diameters

• Average of both wheel diameters differs from nominal diameter

• Misalignment of wheels

• Uncertainty about the effective wheelbase (due to bad wheel contact with the floor)

• Limited encoder resolution, Limited encoder sampling rate

These errors are easy to model and can be corrected by Borenstein’s UMBmark method ([Borenstein & Feng, 1994]). It consists of making the robot move along a certain path, and measuring the difference between the theoretical arrival point and the actual arrival position of the robot. The measured error, combined with the model, enables calculation of two calibration factors (one for the diameter of the wheels and one for the wheelbase), which can be incorporated directly in the odometry algorithm, as sketched below. The idea is illustrated in Figure II.2.
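As an illustration, such calibration factors could enter the odometry computation as sketched here (the names cL, cR, Eb and their exact form are illustrative assumptions, not the exact UMBmark formulas) :

mm_per_tick = 1/12;              % encoder resolution (see section II.1)
sL = cL * dL * mm_per_tick;      % cL, cR : per-wheel corrections compensating
sR = cR * dR * mm_per_tick;      %          unequal wheel diameters
b  = Eb * b_nominal;             % Eb : correction of the effective wheelbase
dtheta = (sR - sL) / b;          % corrected change of heading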

[pic]

figure II.2 – The test path

3 Non systematic error

Non-systematic error is due to unpredictable interactions between the robot and its environment ([Borenstein & Feng, 1996]) :

• Travel over uneven floors

• Travel over unexpected objects on the floor

• Wheel-slippage due to slippery floors, over-acceleration, fast turning (skidding), external forces (interaction with external bodies), non-point wheel contact with the floor

Contrary to systematic errors, non-systematic errors are not reproducible, and it is thus impossible to correct them by a calibration method. The most common approach is then to make a statistical model of the error, and to reduce this error by using additional sensors. There are almost as many methods as there are sensors. For instance, [Schroeter et al., 2003] use orientation information from patterns on the floor to correct the heading of the robot, and [Nagatani et al.] use optical flow information.

4 Improving odometry with additional sensory input

When systematic errors have been corrected, the only way to further improve the accuracy of odometry is to use additional sensors to enhance the information from the shaft encoders. The problem is to fuse information from different sources, i.e. given two different measurements, how to determine which one is more reliable, and to what extent. This dilemma is addressed by considering that each sensor reading contains noise, so that the values can be statistically characterized by their mean (μ) and variance (σ²), as shown in Figure II.3.

[pic]

Figure II.3 – Model of error

There are many methods to fuse information from different sensors : Maximum Likelihood Estimation [Nagatani et al.], Kalman Filter [Hakyoung et al.], and Neural Networks [Colla et al.] are common techniques. [Moshe et al.] present a review of these techniques.
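As an illustration of the maximum-likelihood idea for two measurements of the same quantity, each reading can be weighted by the inverse of its variance (a standard result for Gaussian noise; the variable names are mine) :

% x1, x2 : two measurements of the same quantity, with variances v1, v2
x_fused = (x1/v1 + x2/v2) / (1/v1 + 1/v2);   % more weight to the less noisy sensor
v_fused = 1 / (1/v1 + 1/v2);                 % variance of the fused estimate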

The sensors used for additional inputs can be grouped in two categories, depending on whether they get information relative to the robot itself (proprioceptive sensors) or relative to the external environment (exteroceptive sensors) :

Examples of proprioceptive sensors

- GPS

- Accelerometer

- Gyroscope

- Inclinometer

- Compass

Examples of exteroceptive sensors

- Range Finders (Infrared, Laser)

- Radar (US, Radio waves)

- Camera (Stereo Vision, Optical flow)

Information from the wheel encoders can also be fused with higher-level information, e.g. landmarks, beacons, or maps. This kind of information is generally a rough estimate of the location; these methods are thus rather a way to “recalibrate” odometry at known places [Palamas et al.] in order to reset the accumulated drift.

5 What is different in the presented approach ?

First, the presented approach doesn’t use any mechanical model of the robot or of the error. Instead, the systematic and non-systematic errors are modelled by the neural network during learning.

Second, it also differs in the way additional sensory data is handled. Other methods (see section II.4) first calculate odometry information separately from the different sensors. Those different estimations are then combined using some fusion method (see Figure II.4). In this thesis, both the calculation of odometry and the fusion are performed at the same time in the neural network (Figure II.5).

Figure II.4 – Odometry calculation is done before fusion

Figure II.5 – Odometry calculation and fusion are done at the same time

Experimental set up

1 The robot

[pic]

Figure III.1 - The Khepera robot

For all experiments, a Khepera robot was used. It is a two-wheeled robot endowed with all the sensors necessary to achieve navigation. For the present purpose, only the shaft encoders are used. The first idea was to use the Vision Turret (a simple camera mounted on top of the Khepera) in combination with the shaft encoders, but this idea was abandoned after some time (see section VIII.1). The robot is connected to a PC via a serial connection, which allows reading the sensor values and sending the navigation commands at a rate of 30 Hz (this rate drops to 13 Hz when the Vision Turret is used).

2 Software

All the software parts are implemented with MATLAB. The programming language is easy to work with and a number of powerful tools are available, such as :

• Neural network toolbox

• Image processing toolbox

• A Khepera simulator : Kiks ([Nilsson, 2001])

These tools motivated the choice of MATLAB as the development platform, because they cover exactly the needs of this thesis. Implementing a neural network and its learning algorithms in C is not simple, but MATLAB does it in one line of code. A Khepera simulator isn’t a real Khepera, but it saves time in many cases. Having those tools gathered in the same software greatly simplifies the implementation.

3 The arena

[pic]

Figure III.2 – The arena

The Khepera moves in a 50 x 80 cm arena. The walls are covered with black and white stripes in order to ensure a good quality of the pictures taken by the camera. The arena also provides a flat floor suitable for the Khepera’s small wheels (the Khepera is definitely an indoor robot).

In one corner of the arena, a small PCB with two LEDs (Light Emitting Diodes) was placed. This board is wired to the Khepera, so that it can be switched on and off through a MATLAB command. The role of this LED is to help in the later synchronisation of the data, since part of the data comes from a video camera (see Figure III.3 and section VII.1).

[pic]

Figure III.3 – The PCB with LEDs

4 The supervisor

This section describes a key component for the experiments conducted in this thesis. Indeed, as mentioned in the introduction, the aim is to find a relationship between the sensor values and the movement of the robot. The sensor values are already read via the serial connection. It is also necessary to get the position of the robot at any time, with an acceptable accuracy. The ideal solution would be to have a GPS and a digital compass embedded in the Khepera, but the Khepera is too small and GPS accuracy isn’t suitable for such a small area (furthermore, GPS does not work indoors).

The solution chosen for this thesis was therefore to use a video camera. The camera used can record 25 frames/s and has a resolution of 720 x 576 pixels. As the arena is 500 x 800 millimetres, the resolution is roughly one pixel per millimetre (this is a rough estimation; it is discussed in more detail in section IV.4). Of course, video processing is required to extract the real position of the robot from the video. This processing and its accuracy are detailed in the next chapter.

The setup is summarized in Figure III.4.

[pic]

Figure III.4 – The experimental setup

5 What data is needed ?

In order to train a neural net, we need a set of data called training set. The training set is made of two distinct sets called input set and target set. During training, the neural net should learn to associate those two sets, i.e. given a certain input value, it should output the corresponding target value.

In the present case, we want the neural network to learn a relationship between the sensor values and the movement of the robot. This means that somehow we need the shaft encoder values in the input set, and somehow we need the position of the robot in the target set. I say “somehow” because these data can be represented in many different forms, and the choice of this form considerably influences the success of the training. This issue is explained in detail in chapter VI.

Extracting position data from the video camera

Video camera images are used to provide target values for the Khepera’s exact position. This section describes the techniques and algorithms used to achieve this.

1 The Khepera’s hat

The goal of the video processing is to get the pose of the Khepera. The pose is made of three coordinates : x, y, and θ (Figure IV.1).

[pic] Figure IV.1

A special “hat” has been built for the Khepera. It is a square piece of black cardboard, on which two patches are glued : one green and one red. If one can get the position of those patches, then it is easy to calculate the pose : x and y are given by the midpoint between the patches, and θ is computed as follows :

[pic] Figure IV.2

[pic] [pic] [pic] (4.1)
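A minimal sketch of this computation in MATLAB (which patch marks the front of the robot is an assumption here) :

% x_red, y_red, x_green, y_green : patch centres in millimetres
x     = (x_red + x_green) / 2;                      % midpoint between the patches
y     = (y_red + y_green) / 2;
theta = atan2(y_green - y_red, x_green - x_red);    % heading, assuming green marks the front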

Regarding colours, some tests with the video camera revealed that red and green appear most clearly in the movie. The video camera can also be set to different modes corresponding to different lighting conditions. In the present case, it turns out that the “sports mode” gives the best results, i.e. when one looks carefully at the movie, the patches appear contrasted and with low noise (see Figure IV.3).

2 Video processing

To achieve the video processing, the movie is first loaded into MATLAB. The goal is to track the position of the two patches in each picture, but as the picture is large, it is unfeasible to scan all the pixels of each frame. To cope with this problem, a 60 x 60 pixel window around each patch is manually defined at the beginning of processing. Then, assuming that the robot does not move too fast between frames, the positions of the windows are updated so that the patches stay in the centre of their respective windows, as shown in Figure IV.3. In this way, the algorithm searches for coloured pixels only in small areas, which makes it both faster and insensitive to noise.

[pic]

Figure IV.3 – The windows used for scanning pixels

Another problem is that the Khepera itself and its wires produce a lot of red and green pixels. The hat was therefore made much larger than the Khepera in order to mask those unwanted pixels. As a result, the windows used by the algorithm only contain dark pixels and a coloured patch in the centre. Figure IV.4 shows what the algorithm “sees” through the windows :

[pic]

Figure IV.4 – What the algorithm “sees”

These two techniques make the algorithm both faster and more robust to noise. The algorithm then becomes simple :

(For each frame)

1. Convert the frame from the RGB to the HSV colour space (robust to light changes)

2. Scan the red window and store the (x, y) values of each red pixel (using a threshold)

3. Calculate the mean of these values (= geometric center of the red patch)

4. Store this point in a table

5. Update the position of the red window on this point

6. Repeat 2-5 with green instead of red

7. Goto next frame

The result of this is two tables with the (x, y) coordinates of the centres of the red and green patches, from which it is easy to get x, y, and θ (see equations (4.1)).

|Xred |Yred |

|… |… |

|… |… |

|… |… |

|Xgreen |Ygreen |

|… |… |

|… |… |

|… |… |

Figure IV.5 – Tables resulting from the algorithm
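As an illustration, one iteration of steps 1-5 for the red patch could be sketched as follows (the window handling and the colour thresholds are assumptions) :

% 'win' holds the row/column ranges of the current 60 x 60 window around the red patch
sub  = frame(win.rows, win.cols, :);            % crop the window from the current frame
hsv  = rgb2hsv(sub);                            % 1. RGB -> HSV (robust to light changes)
mask = hsv(:,:,1) < 0.05 & hsv(:,:,2) > 0.5;    % 2. threshold the "red" pixels (assumed values)
[r, c] = find(mask);
centre = [mean(c), mean(r)];                    % 3. geometric centre of the patch, in (x, y)
redTable(end+1, :) = centre + [win.cols(1), win.rows(1)] - 1;   % 4. store in image coordinates
% 5. the window is then re-centred on this point for the next frame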

3 Space warping

At this point, we have the position of the patches in pixels, but we need it in millimetres. The picture formed in the camera is a projection of the real world, depending on the position of the camera with respect to the arena. This problem is known as perspective projection [Wolberg, 1990]. The Image Processing Toolbox included in MATLAB allows doing this transformation easily. Basically, it uses 4 input points and 4 base points to determine the transformation (see Figure IV.6).

Figure IV.6 – Perspective transformation : input points (from the movie) are mapped onto base points (from reality)

Four blue patches were added in the corners of the arena, and the dimensions of the resulting rectangle were measured by hand : these are the base points. Then, the positions in pixels of those patches are manually retrieved in the first frame of the movie (assuming that the camera doesn’t move during the experiment) : these are the input points. From these points, the transformation can be calculated by the built-in function cp2tform. The resulting transformation can then be applied to the tables obtained from the video processing (Figure IV.5), using the MATLAB function tformfwd. In this way, the real (x, y, θ) coordinates of the Khepera can be estimated for each frame of the recorded film.
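The corresponding calls might look like this (a sketch; the variable names are mine) :

% input_points : pixel coordinates of the four corner patches in frame 1 (4 x 2)
% base_points  : their hand-measured positions in millimetres (4 x 2)
tform      = cp2tform(input_points, base_points, 'projective');
patches_mm = tformfwd(patches_pixels, tform);   % apply to the patch tables (N x 2)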

4 Accuracy

In order to estimate the accuracy of the method described above, another method, at least as accurate, is required. The accuracy of the method described above is around 1 millimetre, so one has to find a way to compute the position of the robot along a path with an accuracy of at least 1 millimetre, and this has to be done 25 times per second. Forward kinematics does not reach this accuracy, and using a ruler while the robot is moving was out of the question. Another idea is to make the robot move in a straight line and check whether the video processing results in a straight line as well, but one still has to measure this line (not easy), and the wheels can slide. Moreover, the Khepera never moves in a perfectly straight line.

Therefore, the following method was devised. Instead of using a video from the camera, one can use 3D graphics software : 3D Studio Max. With this software, it is possible to design 3D objects, place them in a scene, add lighting and cameras, and render a video of the result. So a virtual copy of the arena, the Khepera with its hat, and three lights were designed. Some noise was also added to the colour of the patches so that the final pictures are not too “perfect”. The advantage is that all the movements of objects can be controlled exactly. For instance, in Figure IV.7 the parameters for the trajectory were (-100, -100) → (100, 100) → (100, 0) → (0, 0).

The resulting animated video film was fed into the video processing algorithm described in section IV.2. In this way, the true trajectory and orientation of the virtual Khepera are fully known, and it is easy to compare these values with the result of the video processing. Figure IV.7 shows the true trajectory (straight line), and the result of the video processing (smooth line).

[pic]

Figure IV.7 – Accuracy of video based positioning

As we can see, the accuracy is really good, but noise occurs. The noise can be removed by smoothing (i.e. applying a low-pass filter), but too much smoothing is known to result in a loss of information. This is clearly shown in Figure IV.8 : accuracy is gained in the straight parts, but lost in the corners. However, such sharp corners never occur in real trajectories. The smoothing used in Figure IV.8.b has been used for the remaining experiments.

[pic] [pic]

(a) (b)

Figure IV.8 – Video processing: rough (a) and smoothed (b)
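The low-pass filtering mentioned above can be as simple as a moving average; a sketch (the window length is an assumption) :

w = 5;                              % assumed window length, in frames
b = ones(1, w) / w;                 % moving-average kernel
x_smooth = filter(b, 1, x_track);   % smoothed x positions
y_smooth = filter(b, 1, y_track);   % smoothed y positions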

Using such smoothing results in an accuracy between 0 and 1 mm, as shown in Figure IV.9. Of course, pictures from the real camera contain more noise, but this effect is reduced since the centre of each patch is the mean of about 100 pixels (see Figure IV.4). The overall accuracy should then remain of the order of magnitude of 1 millimetre or less.

[pic]

Figure IV.9 – Accuracy of video processing

Neural networks basics

1 Which kind of neural net ?

Since the first neuron model [McCulloch & Pitts, 1943], many kinds of neural networks with different architectures and learning methods have been developed. Because each one has its own features, a preliminary overview was necessary to decide which one was the most suitable for the present thesis.

Neural networks can be divided into two categories according to their learning method : in supervised learning, the network has to learn a relationship between its inputs and its outputs (attribute-value) through the presentation of a set of examples. In unsupervised learning, the data is presented to the network, which finds by itself “clusters” or “classes” inside the data. Here is a list of the main types of neural networks and their domains of application [Personnaz & Rivals, 2003], [Hellström, 1998] :

Networks for supervised learning :

• Perceptron : Classification for linearly separable problems

• Multi-Layer Perceptron (MLP) : Classification and function approximation of any continuous function

• Radial Basis Function Network (RBF) : Similar applications to the MLP, but a different activation function (see next section)

• Recurrent Neural Network : Temporal series problems

Networks for unsupervised learning :

• Competitive Learning Networks : Clustering of data

• Kohonen Self Organizing Maps (SOFM) : Clustering of data

In this thesis the aim is to learn odometry from collected data, which is typically supervised learning. The neural network will be used to approximate the odometry function; therefore an MLP was chosen, but an RBF could have been used as well.

2 Description of a Multi Layer Neural Network

[pic] Figure V.1 – A Multi-Layer Perceptron with its input, hidden, and output layers

This kind of neural network is made of elementary units called neurons, interconnected by synapses. The neurons are organized in several layers : one input layer, one output layer, and one or several hidden layers. Different ways of connecting the layers are possible, but most often the layers are fully connected. The synapses are characterized by their weights.

[pic] Figure V.2

The neuron first calculates its activation A as a weighted sum of its inputs (RBF networks use another function) :

A = Σi wi xi

where the wi are the synaptic weights and the xi the inputs (a bias term may be included). The output S of the neuron is the result of applying a function called the transfer function to the activation value. Thus the output of a single neuron can be written as :

S = f(A)

where f (also called the threshold function) can be of different types, resulting in different properties for the neural network. In the present thesis, the hyperbolic tangent sigmoid function was used, f(A) = tanh(A) :

Figure V.3

The neural network processes the data by propagating the values (which can be thought of as a vector) from its input layer to its output layer, applying this computation at each neuron. One says that one simulates the network with an input vector.
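For a network with a hyperbolic tangent hidden layer and a linear output layer (the architecture used later in this thesis), one simulation thus amounts to the following sketch, with weight matrices W1, W2 and bias vectors b1, b2 :

hidden = tanh(W1 * input + b1);   % hidden layer : weighted sums passed through tanh
output = W2 * hidden + b2;        % output layer : linear ('purelin')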

I will now introduce general notions about neural nets, which are useful to understand the next parts.

3 Mean-squared error, training, epoch

The term Mean-Squared Error (MSE) is often used to evaluate the performance of a neural network. MATLAB uses this error function by default during training, but any other error function can be used. The training is an iterative process working as follows (this is the back-propagation algorithm; other methods exist, e.g. Boltzmann, Cauchy) :

1. Take the first vector of the input dataset : I1

2. Present I1 on the input layer

3. Propagate I1 through the neural network → output value O1

4. Calculate the error E = T1 - O1 (T1 is the target value)

5. Back-propagate the error, i.e. adjust the synaptic weights so that this error decreases

6. Loop to step 1 with the next vector, until the whole training set has been processed

When the whole training set has been processed, one says that one epoch has been achieved. It generally takes several epochs for the network to converge, i.e. for the MSE to reach a low value.

4 Overfitting / Generalisation

Those concepts give a more qualitative idea about the training. Indeed, a small MSE doesn’t mean that the neural net has learned properly. The following figures illustrate this (in the 1-dimensional case) :

[pic] Figure V.4

Figure V.4 represents a given training set. If one simulates the trained neural net with input data, one can observe two different behaviours. In Figure V.5, the neural network fits the training data (•) exactly, but performs very poorly on data it has not learned : that is over-fitting.

[pic] [pic]

Figure V.5 Figure V.6

Figure V.6 shows the opposite behaviour. The training data is not matched perfectly, but the overall response is better : the neural network can to some extent guess the output values even for data that were not in the training set. That is generalization.

Most often, over-fitting occurs when there are too many neurons in the hidden layer, or because of too intensive training with similar data (too many epochs). One method to avoid this is the “early stopping” technique. It consists of using both a training set and a test set during training. For each epoch, the MSE is computed on the test set, and its variations are monitored. During training, the MSE always decreases for the training set, but for the test set there comes a time when the MSE starts increasing : this is an indication that over-fitting occurs, i.e. the neural network generalizes less. The training is then stopped :

[pic]

Figure V.7 – MSE versus epochs : training is stopped when the MSE on the test set starts increasing
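With the MATLAB toolbox used in this thesis, early stopping can be sketched by training in short bursts and monitoring the MSE on a separate test set (the variable names and the burst length are assumptions) :

net = newff(minmax(Input), [5 3], {'tansig' 'purelin'});
net.trainParam.epochs = 5;                          % train in bursts of 5 epochs
bestTestMse = Inf;
for burst = 1:100
    net = train(net, Input, Target);                % continue training from current weights
    testOut = sim(net, TestInput);
    testMse = mean(mean((TestTarget - testOut).^2));
    if testMse > bestTestMse                        % MSE on the test set starts increasing :
        break;                                      % stop training (Figure V.7)
    end
    bestTestMse = testMse;
end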

5 Consistency of data

Neural networks are powerful tools for processing data, because they are able to find relationships between sets of data of high dimension, when sometimes not even an expert is able to formulate an empirical rule. In the banking domain, for example, how does one formulate the relationship that exists between [gender, age, profession, annual income, number of children, type of car] and [suitable for bank loan] for a customer ? A statistical analysis of historical data can lead to a mathematical model of the problem. A neural network can find this relationship through learning from the same data. Both methods can work, provided only one thing : the data has to be consistent, i.e. a relationship must really exist.

Another way to understand this is to consider inconsistency : there is no relationship between the colour of the Khepera and the speed of its left wheel, for example. This example is obviously inconsistent, but sometimes the inconsistency is subtler to detect. It can be hidden in the way the problem is formulated, i.e. in the choice of the variables used to represent inputs and outputs.

There are some methods (variance analysis, polynomial approximation) to help in the choice of the variables. The next chapter addresses this problem by showing how several modifications of the problem formulation were necessary to finally come up with a consistent representation of the data.

Preliminary experiments with neural nets

In this section, results from experiments with a robot simulation are presented. A neural network is used to model the odometry function, and different mappings are tested. The “correct” position is computed using the classical odometry function.

1 Classical odometry with Khepera

The Khepera has two shaft encoders, one for each wheel. These two sensors increment or decrement two tick counters according to the rotation of the wheels; 1 tick corresponds to a displacement of 0.08 mm of the wheel. Implementing odometry consists of repeatedly using these counters in order to update the pose of the robot.

[pic]

Figure VI.1

The pose of the robot is made of three values (x, y, θ), where x and y are the absolute cartesian coordinates, and θ the orientation of the robot, as shown in Figure VI.1. Generally, the absolute values of the counters are not relevant, so we are more interested in their variations during one iteration of the algorithm. Let us denote these variations dL and dR for the left and right wheel respectively.

The implementation of odometry for a typical robotic application can be done as follows :

Global Pose

Pose = [x0, y0, θ0]                    // initialise pose
...
[dL, dR] = read_encoders               // get encoder variations
[dx, dy, dθ] = Odometry(dL, dR, θ)     // calculate displacement
Pose = Pose + [dx, dy, dθ]             // update pose
...                                    // moving commands

where the function Odometry() is a direct implementation of the forward kinematics equations (see Appendix 1). The basis for these equations is that the Khepera always moves along an arc of a circle, because of its geometry. The centre and radius of this arc can be found using dL, dR, and some geometric features of the Khepera (diameter of the wheels, distance between the wheels, shaft encoder step).
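As an illustration, such an Odometry() function typically looks like the following sketch for a differential-drive robot (the wheelbase value and the sign convention are assumptions; the exact equations are in Appendix 1) :

function [dx, dy, dtheta] = Odometry(dL, dR, theta)
  MM_PER_TICK = 1/12;                % ~0.08 mm of wheel travel per encoder pulse
  WHEELBASE   = 53;                  % assumed distance between the wheels (mm)
  sL = dL * MM_PER_TICK;             % left wheel travel
  sR = dR * MM_PER_TICK;             % right wheel travel
  ds     = (sL + sR) / 2;            % distance travelled by the robot centre
  dtheta = (sR - sL) / WHEELBASE;    % change of heading
  dx = ds * cos(theta + dtheta/2);   % midpoint approximation of the arc
  dy = ds * sin(theta + dtheta/2);
end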

2 Experiment 1

The first experiments with modelling the kinematics are not done using the real Khepera, but the Khepera Simulator Kiks ([Nilsson, 2001]). Thus the supervisor is not the video camera, but the classical odometry function, which gives good results with the simulator.

The idea with the modeling is to replace the line :

[dx, dy, dθ] = Odometry(dL, dR, θ)     (in the program example above)

by :

[dx, dy, dθ] = Simulate(MyNeuralNet, [dL, dR, θ])

This means that MyNeuralNet has to be trained with (dL, dR, θ) as input values and (dx, dy, dθ) as target values. One can also say that MyNeuralNet is trained to realise the mapping : (dL, dR, θ) → (dx, dy, dθ)

So a program was developed that moves the Khepera along a simple S-curve and, at the same time, records the values needed for learning. Around 700 values were recorded for a short run. This set of data was used to train several neural networks with 2, 3, 4, 5, 6, 8, and 10 neurons respectively in the hidden layer. The creation and training of a neural network is easy in MATLAB :

net = newff(minmax(Input), [5 3], {'tansig' 'purelin'});
        % [5 3] : number of neurons in the hidden / output layer
        % {'tansig' 'purelin'} : activation functions in the hidden / output layer

net = train(net, InputSet, OutputSet);
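In the pseudocode above, the call Simulate(MyNeuralNet, ...) corresponds to MATLAB's sim function; a sketch :

out = sim(net, [dL; dR; theta]);   % one column per sample -> [dx; dy; dtheta]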

This last command runs the training, and a graph showing the current value of the MSE (see section V.3) is displayed. Direct observation of this graph during training is a first indication of the “performance” of the neural net, even if it tells nothing about the generalisation capability or a possible over-fitting problem. Figure VI.2 gives an example of what can be observed when training a neural network that converges : the MSE decreases to a very small value.

[pic]

Figure VI.2

After several runs with different neural networks, it was noticed that maximum performance could be reached with 5 neurons. Beyond this number, adding neurons has no effect on the performance.

This trained 5-neuron neural network was used in order to check whether it was suitable for replacing the classical odometry function, as described above. The Khepera was moved along the same kind of S-curve, and the poses given both by the odometry function and by the neural network were recorded for later comparison. The result of the experiment is shown in Figure VI.3.

[pic]

Figure VI.3 – The neural net does not perform as well as classical odometry

These results are far from being enough to perform odometry, but still encouraging. It was then decided to test the same neural network on another kind of trajectory to check if it was able to generalize. The results are disastrous, as shown in Figure VI.4.

[pic]

Figure VI.4 – The neural net is not capable of generalizing for a different trajectory

3 Further tests

This first experiment showed the need for a systematic method for working with neural nets. There are many parameters to test :

• Number of neurons in the hidden layer

• Number of epochs

• Learning method

• Quality and quantity of the training set

• Representation of the data

Testing all these parameters with all their possible values is a huge amount of work. It will thus not be detailed here. The final choices made for the real experimentation are the result of a long experiment presented in Appendix 2, which consists of 9 different data sets used both for training and testing.

4 Importance of uniform distribution in the training set

According to theory, a training set should be statistically representative of the data the neural net will encounter during normal use. A classical example in classification is a neural network trained to recognize hand-written postal codes on letters. Since the postal codes statistically contain more “1” and “0” than other digits, an ideal training set should be built with the same proportions [Personnaz & Rivals, 2003]. This principle is confirmed by the results of the experiment presented in Appendix 2, in which different curves are used as training data and compared. This experiment shows that the robot performs better by using data from a real trajectory instead of artificially generated, uniformly distributed data.

However, the shapes of the curves used for training do not fully represent the training data. One important piece of information is hidden : the speed. Indeed, the same trajectory can be run at different speeds, and the speed can change during the trajectory. So the performance of odometry may decrease in situations where the speed is different from the speed used for training.

It is therefore interesting to look at the distribution of speeds in the datasets used for training. The speed of the Khepera can be obtained from the tick values dL and dR of the training set by using the forward kinematics equations of the Khepera (Appendix 1).

[pic]

Figure VI.5

The forward kinematics equations give the relationship :

V = (Vl + Vr) / 2     (1)

which can be seen geometrically in Figure VI.5. Vl and Vr can be expressed as a function of dL, dR and the step δ of the encoders :

Vl = δ dL

Vr = δ dR     (δ = 0.08 mm / tick)

Let T denote the cycle time over which dL and dR are counted. Thus :

V = δ (dL + dR) / (2T)     (2)

V has been calculated using formula (2) and the dL, dR values of the training set “A” used in Appendix 2. The resulting distribution of V is shown in Figure VI.6 : it is far from uniform.

[pic]

Figure VI.6 – The distribution of speeds (number of occurrences in the training set) is not uniform
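A sketch of this check, with dL, dR and the cycle time T taken from the training set (column vectors, or a scalar T) :

delta = 0.08;                        % mm per encoder tick
V = delta * (dL + dR) ./ (2 * T);    % speed of each sample, formula (2)
hist(V, 50);                         % distribution of the speeds (Figure VI.6)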

The other training sets B, C, D, E lead to the same conclusion.

Explanation : this is due to the algorithm used for the navigation of the robot, which produces its own characteristic range of speeds. We can see in Figure VI.6 that two distinct ranges of values are produced. Indeed, the navigation algorithm used to produce the training data is known as the “bug algorithm”, which basically switches between two behaviours :

• “Move to goal” when nothing is between the robot and the goal. The robot then runs quite fast towards the goal : this is the second peak observed in Figure VI.6.

• “Follow the wall” when a wall is between the robot and the goal. The robot then carefully avoids touching the wall, at a lower speed : this is the first peak in Figure VI.6.

This phenomenon is not a problem if one uses the same algorithm for building the training set and for navigating. But if one uses different algorithms for training and navigating, it is likely that the network will not perform well, because of the statistical difference in the occurrence of dL and dR.

For this reason, the previous navigation algorithm (the bug algorithm) was replaced by a “potential field” algorithm. Since it is not the topic of this thesis, the details of this algorithm are not presented here, but Figure VI.7 depicts such a potential field. As we can see, the field changes gradually along the trajectory, resulting in smooth variations of the speed and thus in a more uniform distribution of speeds.

[pic]

Figure VI.7 – Example of potential field

5 Experiment 2 : a new representation of data

The choice of the data representation is the most important factor in improving the results. It is related to the problem of consistency of data, addressed in section V.5. As previously mentioned, inconsistency can be avoided through the choice of the variables used to formulate the problem. The following perfectly illustrates this.

First of all, we can observe some redundancy in the mapping used in the previous experiments :

(dL, dR, θ) → (dx, dy, dθ)

Indeed, given a certain (dL, dR), the robot always performs the same arc of a circle, regardless of the value of θ :

[pic]

Figure VI.8 – The movement is actually independent of θ

This means that θ is not relevant as regards the length and the “shape” of the movement, but it is nevertheless needed to get the position in the absolute coordinate system (X, Y). That is why the mapping (dL, dR) → (dx, dy, dθ) cannot work. A new idea is to remove the x and y components from the right part of the mapping, and compute them later, as follows :

1) (dL, dR) → movement

2) current pose (x, y) + movement → new pose (x, y)

These considerations lead to the possible mapping :

(dL, dR) → (R, dθ), where R is the instantaneous turning radius (see Figure VI.9)

[pic]

Figure VI.9

The problem is that when the trajectory approaches a straight line, R approaches infinity, which is hard for the neural net since its output is bounded by the transfer functions of the neurons. An alternative is to replace R by C = 1 / R, the instantaneous curvature of the trajectory, but the same problem arises when the robot turns around its own centre (R → 0). The representation of the data is still not perfect.

One solution that avoids these singular values is to come back to a vectorial representation of the movement, but one independent of the absolute coordinate system (X, Y). So why not use the Khepera’s own coordinate system, as shown in Figure VI.10.

[pic]

Figure VI.10 - The movement is expressed in the Khepera’s own coordinate system

The new coordinate system is given by (1) the axis of the wheels and (2) the heading of the robot. The elementary move [dU, dV] is thus always expressed with respect to the previous position. The mapping thus becomes (dL, dR) → (dU, dV, dθ). It is both independent of θ and free of singular values. The only drawback is that (dU, dV) has to be converted to (dx, dy) in order to update the pose of the robot. This simply consists of a vector rotation, and can be done by replacing the lines

[dx, dy, dθ] = Simulate(MyNeuralNet, [dL, dR, θ]);     (see section VI.2)

Pose = Pose + [dx, dy, dθ];

by :

[dU, dV, dθ] = Simulate(MyNeuralNet, [dL, dR]);   % new mapping

P = [ sin(θ)  cos(θ) ;                            % rotation matrix
     -cos(θ)  sin(θ) ];

dUV = [dU; dV];                                   % rotation of (dU, dV)
dXY = P * dUV;                                    % to (dx, dy)

Pose = Pose + [dXY(1), dXY(2), dθ];               % update Pose

This change in the representation of the data is not just a game with mathematical symbols. As shown in the next section, it actually improves the performance of the neural network by a factor of 100.

6 Results

At this point, it was decided to make a new experiment combining all these new ideas. The trajectories used for training were created with a potential field algorithm, providing various movements at various speeds and resulting in a training set of 1500 values. The new mapping (dL, dR) → (dU, dV, dθ) was used.

During the training, a huge difference could already be observed. With the previous method, a usual value for the MSE at the end of training was around 0.01; now it reached 10^-6 to 10^-7. When implemented in the navigation algorithm, it is impossible to distinguish the trajectory given by classical odometry from the one given by the neural network. Moreover, only 3 neurons were needed instead of 5 to achieve this.

This new representation of data also improved the ability to generalize. When trained with poor training sets (see Appendix 2), the neural net performs almost perfectly when tested with all the other ones.

Since the high accuracy made it impossible to compare curves visually as a way of measuring performance, another error function had to be used. The function used calculates the average of all elementary errors along the path. The elementary error is defined for each elementary step as the difference between the classical odometry displacement and the neural odometry displacement, divided by the classical odometry displacement, as shown in Figure VI.11.

[pic]

Figure VI.11 – The error is calculated by averaging elementary errors

The elementary error is defined as : e_i = || D_classical,i - D_neural,i || / || D_classical,i ||

where D_neural is the vector [dU, dV] computed by the neural network, and D_classical the corresponding vector given by classical odometry.

The overall error is defined as : E = (1 / N) Σ_i e_i     (N = number of steps along the path)

The result is an average error of 0.1%, which corresponds to 1 mm on a trajectory of 1 meter (in the simulator). This error is small and zero-centred as well, which suggests that such a neural odometry could be used on long paths, with low drift.
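Computed over the whole path, this error measure can be sketched as follows (Dclassical and Dneural are N x 2 matrices of [dU dV] displacements) :

e = sqrt(sum((Dclassical - Dneural).^2, 2)) ./ sqrt(sum(Dclassical.^2, 2));
E = mean(e);     % average relative error along the path (about 0.1% here)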

7 Post analysis of the mappings

The results of experiment 2 are much better than those of experiment 1 (see section VI.2). This points out the importance of having a good representation of the data, and suggests that an earlier analysis of the variables (see section V.5) could have led to an appropriate representation more quickly. This section presents the polynomial approximation method, which consists of finding an approximation polynomial that fits a given set of data. The coefficients of the polynomial are a rough indication of how much each input variable is involved in the mapping. This method was applied to the mappings addressed in the previous sections :

(dL, dR, θ) → (dx, dy, dθ)     in section VI.2 (the first mapping)

(dL, dR) → (R, dθ)     and

(dL, dR) → (dU, dV, dθ)     in section VI.5 (the final mapping)

The first step is to normalize the data, to remove any scale effect due to the units used. Then, a matrix X containing the input values and their higher-degree terms, and a matrix Y containing the output values, are built as follows :

X = [1  dL  dR  θ  dL²  dR²  θ²  dL·dR  dL·θ  dR·θ]     (in the case of degree 2)

Y = [dx  dy  dθ]

The coefficients of the approximation polynomial are given by the matrix A according to:

Y = AX

A is then computed using the least-squares method. The resulting coefficients give an indication of how much the input variables are involved in the output result. The following tables present the results for the three mappings.
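A sketch of this analysis for the first mapping, written in column convention (Y = X*A instead of Y = AX; dL, dR, theta, dx, dy, dtheta are normalized column vectors) :

X = [ones(size(dL)) dL dR theta dL.^2 dR.^2 theta.^2 dL.*dR dL.*theta dR.*theta];
Y = [dx dy dtheta];
A = X \ Y;       % least-squares coefficients, one column per output variable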

Table 1 : Results for the mapping (dL, dR, θ) → (dx, dy, dθ)

|      |dx   |dy   |dθ   |
|1     |100  |21.6 |-0.2 |
|dL    |-0.1 |-0.4 |-2.4 |
|dR    |-1.2 |-1   |2.4  |
|θ     |-1.4 |-0.5 |0    |
|dL²   |0    |0    |0    |
|dR²   |0    |0    |0    |
|θ²    |0    |0    |0    |
|dL·dR |0    |0    |0    |
|dL·θ  |0    |0    |0    |
|dR·θ  |0    |0    |0    |

In Table 1, we can see that dx and dy are much more related to the constant term 1 than to dL, dR, and θ. This indicates a weak relationship and confirms the analysis of section VI.5. However, dθ is highly related to dL and dR; the signs of the coefficients even reveal that dθ is proportional to (dL - dR), which is in agreement with the forward kinematics equations (see Appendix 1, equation 3.3).

Table 2 : Results for the mapping (dL, dR) → (R, dθ)

|      |R    |dθ   |
|1     |-100 |0    |
|dL    |2.3  |-0.5 |
|dR    |0.6  |0.5  |
|dL²   |0    |0    |
|dR²   |0    |0    |
|dL·dR |0    |0    |

In Table 2, one can notice the same relationship between dθ and (dL - dR), and we also see that R is only weakly related to dL and dR. This can be explained by the fact that a given radius R can be the result of different (dL, dR) pairs, as shown in Figure VI.12.

[pic]

Figure VI.12 – The mapping (dL, dR) → R is many-to-one

Table 3 : Results for the mapping (dL, dR) → (dU, dV, dθ)

|      |dU   |dV   |dθ   |
|1     |10.6 |6.2  |8.6  |
|dL    |-1   |17.5 |-100 |
|dR    |0.6  |17.5 |99.7 |
|dL²   |-0.6 |0    |0    |
|dR²   |0.6  |0    |0    |
|dL·dR |0    |0    |0    |

The final mapping shows a good correlation between the inputs and dV, dθ. Regarding dU, the correlation seems a bit weaker, but the presence of degree-2 terms with significant values indicates that a relationship nevertheless exists.

8 Conclusion

In this chapter, neural networks have been used to model the odometry of a simulated Khepera robot. As a conclusion, I would say that the good results are obtained because a good representation of the data was chosen. An appropriate representation of the data not only improves the accuracy of the network, but also reduces the number of neurons (which reduces training and processing time) and improves generalization.

The polynomial approximation method gives good assistance in the choice of variables. It should be applied systematically before testing different training sets with several neural networks. This method saves a lot of time when the input variables are numerous and/or when the model is complex.

The good performance observed here is not really surprising, since the convergence theorem [Cybenko, 1989] proves that an MLP can approximate any continuous function to any accuracy, provided a good training set is available. In the preliminary experiments, the training set provided was actually ideal, since each target value was computed by an odometry function based on the forward kinematics equations, i.e. the supervisor was exact and noiseless. In the next chapter, things will be slightly different since the training data will come from the real world : noise in the sensors, limited accuracy of the video acquisition, time shift due to synchronisation. How will this affect the performance of the neural network ?

Experiments with the physical robot

In this chapter, the real Khepera robot is used. The experimental setup described in chapter III is used to collect data and train the robot. The results are compared to classical odometry.

1 The synchronisation problem

This problem can be summarized as follows : the video camera provides a movie with 25 frames per second, while the readings of the shaft encoder values are done at a frequency varying between 9 and 12 Hz (due to errors occurring during communication through the serial protocol). Because of this constantly changing rate, synchronisation cannot be done on the first frame only; it has to be done all along the experiment. That is why a visual mark has been used : a LED wired to the robot, which can be switched on and off from MATLAB and is visible in the movie. In this way, video and shaft encoder values can be synchronised through the blinks of this LED :

Figure VII.1 – Experimental Setup

In reality :

[pic]

Figure VII.2 – One picture from the movie

The area of the LED is identified manually at the beginning of the video processing; then the average intensity of this area is evaluated for each frame. A threshold function determines whether the LED is on or off, as shown in Figure VII.3.

[pic]

Figure VII.3 - light intensity of the LED

Figure VII.4 shows the data collected by MATLAB (Table 1) and the data resulting from the video processing (Table 2). Both sets contain information about the state of the LED. This information enables us to match the rows where the LED is on; the rows in between are matched using linear interpolation :

|dL |dR |Clock |LED on |

| 125 |14 |0 |1 |

|80 |22 |110 |0 |

|45 |36 |232 |0 |

|60 |58 |412 |0 |

|68 |70 |530 |1 |

|99 |84 |645 |0 |

|… |… |… |0 |

|… |… |… |0 |

|… |… |… |1 |

|… |… |… |0 |

|… |… |… |0 |

|… |… |… |0 |

|… |… |… |0 |

|… |… |… |1 |

|LED on |X |Y |θ |

|1 |455 |55 |1.653 |

|0 |514 |74 |1.612 |

|0 |520 |80 |1.599 |

|0 |535 |82 |1.567 |

|0 |540 |95 |1.540 |

|0 |512 |111 |1.538 |

|0 |… |… |… |

|1 |… |… |… |

|0 |… |… |… |

|0 |… |… |… |

|0 |… |… |… |

|0 |… |… |… |

|0 |… |… |… |

|1 |… |… |… |

Table 1 : recorded during the experiment. Table 2 : from the video processing.

Figure VII.4 – Tables resulting from the processing

The accuracy of the synchronisation is thus limited by the highest frequency (25 Hz), that is, 0.04 seconds.
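Once the blinks have been matched, the interpolation itself can be sketched with interp1 (t_video and t_enc are timestamps reconstructed from the LED blinks; the names are mine) :

x_enc  = interp1(t_video, X, t_enc);       % video pose resampled onto the
y_enc  = interp1(t_video, Y, t_enc);       % encoder reading instants
th_enc = interp1(t_video, Theta, t_enc);   % (linear interpolation by default)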

2 A new representation of data (2)

These considerations about time and synchronisation lead to one more change in the representation of the data. Indeed, the order of magnitude of the values dL, dR, dU, dV, dθ depends on the acquisition rate, i.e. a slow rate automatically results in higher values. This means that if the neural network is trained with data from a slow machine, and later used on a faster machine, one may encounter the problem discussed in section VI.4.

To make the dataset independent of time, the previous mapping

(dL, dR) → (dU, dV, dθ)

has been changed into :

(dL/Δt, dR/Δt) → (dU/Δt, dV/Δt, dθ/Δt)

i.e. the values in the tables above are divided by the time elapsed between each row (that is why there is a “Clock” column). The neural net trained with this data can then be used at any rate. The only difference is that the output now has to be multiplied by the elapsed time to get dU, dV, dθ.
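At run time, the use of such a time-normalized network can be sketched as :

rates = sim(net, [dL; dR] / dt);   % inputs are per-second rates (dt = elapsed time)
duvth = rates * dt;                % multiply back by dt -> [dU; dV; dtheta]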

3 Results

The training set used for the following results comes from the video processing of 9 movies. These movies contain various trajectories, each lasting between 5 and 10 seconds. The resulting training set contains 800 rows, and is fairly uniformly distributed since a potential field navigation algorithm was used (see section VI.4). 7 neural networks were trained, and the one giving the lowest MSE during training was retained.

[pic]

Figure VII.5 – MSE versus epochs, using the early stopping method

The error curves for the training and test sets are shown in Figure VII.5. Early stopping was used, but the MSE decreased in the same way on the training set and the test set, suggesting that over-fitting is not a problem here. So the training was simply stopped when the MSE became small enough.

For testing, the potential field navigation algorithm is also used, and the poses given by classical odometry and by neural odometry are recorded and visually compared. The result is given in Figure VII.6.a. (The “correct” trajectory given by the video processing has not been used for comparison here. Indeed, these results were obtained at low speed on short trajectories, and classical odometry gives results similar to video processing in these conditions.)

[pic]

Figure VII.6.a - Results using the rough training set

Since the result was not so good, an attempt was made to smooth the training set (that is, to apply a low-pass filter), under the assumption that the noise from the video processing was responsible for the poor performance. But it did not change anything, as can be seen in Figure VII.6.b. It seems as if the neural network has the ability to smooth input data by itself (robustness to noise).

[pic]

Figure VII.6.b – Results using the smoothed training set

Different paths always gave the same error, and the error always occurred in the same manner : the curve given by neural odometry was always above and to the left of the curve from classical odometry. Generally, a constant bias in the error indicates that something is wrong in the training set. So it was decided to plot the input training set and the target training set together, in order to compare them. I plotted (dL - dR), which is proportional to dθ (see Appendix 1, equation (3.3)), together with the orientation data given by the video processing :

[pic]

Figure VII.7 – Comparison between input data and target data

A careful look indicates that there is a small delay of about 0.1 s between the curves. This delay actually comes from the system that switches the LED on and off : it uses a photoresistor that does not respond instantaneously. The datasheet of this component indicates that the response time is 0.1 s, which matches the delay observed in Figure VII.7.

In order to check whether this delay was involved in the loss of performance, the experiment was reproduced with a time shift introduced in the training set. The rows of the target set were shifted by 0.08 s (Figure VII.8.a) and 0.04 s (Figure VII.8.b). The result of this shifting is that the MSE at the end of training decreases : it becomes respectively 10% and 20% smaller, and the bias is also reduced.

[pic]

Figure VII.8.a – Results with a time shift of 0.08 s : slightly better than Figure VII.6

[pic]

Figure VII.8.b – Results with a time shift of 0.04 s : the bias has faded

4 Conclusions

In this chapter, neural networks have been used to model the odometry of the physical Khepera robot. The results presented here are not as good as those of chapter VI, which used simulated data. That is not surprising, since real-world conditions add noise to the data. But it shows that it is possible to model the forward kinematics equations without any detailed knowledge of the vehicle’s construction. By using supervised learning and neural networks, sampled data from the wheel encoders can be used to construct a useful approximation of the kinematics equations.

The loss of performance was most likely caused by a problem in the synchronisation (although the time delay was very small). This suggests that when applying neural odometry to larger systems, special attention should be paid to time synchronisation.

The conclusion at this point is that the method proposed here does not perform as well as classical odometry, so far. “So far”, because unlike classical odometry, this method can be improved at will by adding more sensory input to the neural network. The next chapter describes two attempts of this kind.

An additional input

The initial idea of this thesis was that an additional sensory input, independent of the wheels, should allow the system to keep performing odometry even when the wheels are slipping. For this purpose, it was initially decided to use the K213 Vision Turret, without success. Another approach, using a range finder, was successful. The experiments were conducted with the Khepera simulator.

1 K213 vision turret

This module is a small linear camera mounted on top of the Khepera (see Figure III.1). Under ideal conditions, it provides :

• 64 pixels, 256 grey levels

• 20 frames / s

• pictures of the following kind :

[pic] Figure VIII.1.a

But under real conditions, many problems occurred :

• The walls were not uniformly lit

• The Khepera’s hat cast shadows on the image

• The refresh rate of the camera depends on the ambient light (hard to synchronise)

• Only 16 grey levels could be used, to speed up transmission

• Sometimes no picture is returned at all

• Real-time data is hard to smooth

Under real conditions, the pictures typically look more like this :

[pic] Figure VIII.1.b

The initial idea with this camera was to compute the optical flow and use it as an additional input to the neural network. Indeed, the optical flow could have given interesting information about the movement of the Khepera : the left/right movement of the stripes is related to dθ, and the spreading/shrinking of the stripes indicates a backward/forward movement.

Unfortunately, many attempts at processing such pictures showed that good pictures were only obtained after careful and tedious tuning of the experimental setup. This means that such results would never be observed outside the robot's small arena, which is a serious limitation.

It was finally decided to use another sensor instead of the Vision Turret : the SHARP infrared telemeter GP2Y0A02, which in my experience gives better results in terms of frame rate and independence from ambient light.

[pic]

Figure VIII.2 - SHARP Telemeter GP2Y0A02

Since a physical implementation takes time, and having learned from past mistakes, it was decided to first test the feasibility of this idea with the Khepera simulator.

2 Can a range-finder make odometry ?

At first sight it may seem strange to use a range finder to estimate the robot's own position, and it is indeed impossible with only one sensor. At least two sensors are needed, together with a small amount of pre-processing of the data.

Let us consider the case of a single range finder, as shown in Figure VIII.3.

[pic] [pic]

(1) (2)

Figure VIII.3

If only one sensor is used, the change from R1 to R1 + ∆R1 gives no information about the movement of the robot, since the same variation ∆R1 can be the result of a rotation (1), a translation (2), or even a combination of both. This means that there is no one-to-one mapping :

(R1, ∆R1) → (dU, dV, dθ)

Let us now consider the case of two sensors, and see whether combining both values removes this ambiguity. Consider first the case of pure translation :

[pic]

(1) (2) (3)

Figure VIII.4

Suppose that the robot reads its range sensors, and gets (R1, R2) as result (1).

Further, suppose that the next reading gives the result : (R1 + ∆R1, R2 + ∆R2) (2).

At first it seems as if only one displacement can lead to the readings in (2), but infinitely many other positions actually match, if we consider all the possible translations of this position parallel to the wall (3). The same thing occurs in the case of a rotation, and obviously also in the general case (rotation and translation).

[pic]

(1) (2) (3)

Figure VIII.5

A closer look shows, however, that the situations in (3) cannot actually be performed by the Khepera, unless it slides laterally (like a crab).

Of course, this is not a proof that the mapping (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) is consistent, only an indication that a relationship exists. A way to prove it would be to derive the actual geometrical relationship.

Remark : This only works if a flat wall is in front of the Khepera (see next section).

3 Method

The first thing needed is a function that returns the position of the Khepera in the simulator, in order to emulate the range finders and to provide a supervisor for learning. This function is built in : kiks_siminfo_robotpos. Once the pose (x, y, θ) is known, it is easy to determine the intersections of the beams with the line equations of the arena borders :

[pic]

Figure VIII.6 – The virtual Range Finder

An angle of 11° between the beams was chosen : narrow enough for both beams to hit the same wall at the same time, yet wide enough for the differential information not to vanish.
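A sketch of how such a virtual range finder can be emulated from the simulator pose, assuming a rectangular, axis-aligned arena; the arena dimensions, the beam half-angle of 5.5° and the Python form are illustrative assumptions (the thesis obtains the pose with the simulator's built-in kiks_siminfo_robotpos).

import math

ARENA_W, ARENA_H = 800.0, 500.0   # assumed arena size in millimetres

def wall_distance(x, y, angle):
    """Distance from (x, y) along direction `angle` (radians) to the nearest
    border of a rectangular arena [0, ARENA_W] x [0, ARENA_H]."""
    dx, dy = math.cos(angle), math.sin(angle)
    hits = []
    if dx > 1e-9:
        hits.append((ARENA_W - x) / dx)
    elif dx < -1e-9:
        hits.append(-x / dx)
    if dy > 1e-9:
        hits.append((ARENA_H - y) / dy)
    elif dy < -1e-9:
        hits.append(-y / dy)
    return min(t for t in hits if t >= 0.0)

def virtual_range_finder(x, y, theta, half_angle=math.radians(5.5)):
    """Return (R1, R2): the ranges of two beams separated by about 11 degrees."""
    return (wall_distance(x, y, theta - half_angle),
            wall_distance(x, y, theta + half_angle))

# Hypothetical usage with the pose returned by the simulator
# (in the thesis this pose comes from the MATLAB call kiks_siminfo_robotpos):
# R1, R2 = virtual_range_finder(x, y, theta)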

A training set was built by making the robot move along several trajectories and recording the needed values (around 2500 samples). The result is a table of values (Figure VIII.7.a), which then has to be differentiated and processed to obtain the values (Figure VIII.7.b) that are relevant for learning the relationship :

(R1, R2, ∆R1, ∆R2) → (dU, dV, dθ)
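A sketch of this differentiation and processing step, assuming the raw log contains the range readings and the pose from the supervisor; expressing the pose change in the robot frame (dU forward, dV lateral, dθ rotation) is my reading of the notation and should be checked against the thesis' definitions.

import numpy as np

def build_training_rows(R1, R2, x, y, theta):
    """Differentiate consecutive samples of the raw log (Figure VIII.7.a)
    and express the pose change in the robot frame: dU (forward),
    dV (lateral) and dtheta (rotation). All arguments are 1-D arrays."""
    dR1, dR2 = np.diff(R1), np.diff(R2)
    dx, dy = np.diff(x), np.diff(y)
    # wrap the heading change to (-pi, pi] before using it as a target
    dtheta = (np.diff(theta) + np.pi) % (2 * np.pi) - np.pi
    # rotate the world-frame displacement into the robot frame at time t
    c, s = np.cos(theta[:-1]), np.sin(theta[:-1])
    dU = c * dx + s * dy
    dV = -s * dx + c * dy
    inputs = np.column_stack([R1[:-1], R2[:-1], dR1, dR2])
    targets = np.column_stack([dU, dV, dtheta])
    return inputs, targets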

| R1 | R2 | Ticks left | Ticks right | X | Y | θ |
| …  | …  | …          | …           | … | … | … |
| …  | …  | …          | …           | … | … | … |

Figure VIII.7.a

| R1 | R2 | ∆R1 | ∆R2 | dU | dV | dθ |
| …  | …  | …   | …   | …  | …  | …  |

Figure VIII.7.b

4 Results

Table VIII.8.a – non-sliding training set

| Mapping | Number of neurons | Average performance |
| (dL, dR) → (dU, dV, dθ) | 6 | 1×10^-4 |
| (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | 6 | 8×10^-3 |
| (dL, dR, R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | 8 | 4.2×10^-5 |

Those results are encouraging because :

• The mapping (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) converges, confirming that odometry information can be drawn from range finders alone.

• The mapping using (dL, dR, R1, R2, ∆R1, ∆R2) gives better results than (dL, dR) alone, suggesting that the neural network combines the different sources of sensory information and thus reaches higher accuracy.

Table VIII.8.b – sliding training set

| Mapping | Number of neurons | Average performance |
| (dL, dR) → (dU, dV, dθ) | 6 | 1.47 |
| (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | 6 | 0.19 |
| (dL, dR, R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | 8 | 2.4×10^-4 |

The results for the dataset with sliding are even more promising, even though the first two rows show really poor performance. In Table VIII.8.a, using the visual information improves the performance by a factor of 2; in Table VIII.8.b this factor becomes 10^4. This means that when sliding occurs, the different sensory inputs are not simply added together to improve the result : the neural network really combines them, relying on the most appropriate inputs depending on the sliding conditions.
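As an illustration of how such mappings can be compared, the sketch below trains one feed-forward network per mapping and reports a test error; scikit-learn's MLPRegressor is used here as a stand-in for the thesis' own training setup, and the hidden-layer sizes are simply taken from the tables above.

from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

def train_and_score(X_train, Y_train, X_test, Y_test, hidden=6, seed=0):
    """Train one feed-forward network on a given input/target mapping and
    return its MSE on the test set (a stand-in for 'average performance')."""
    net = MLPRegressor(hidden_layer_sizes=(hidden,), activation='tanh',
                       solver='lbfgs', max_iter=2000, random_state=seed)
    net.fit(X_train, Y_train)
    return mean_squared_error(Y_test, net.predict(X_test))

# The three mappings of Tables VIII.8.a/b differ only in which input columns
# are fed to the network (hypothetical column layout of the training arrays):
#   encoders only       : columns (dL, dR)            -> (dU, dV, dtheta)
#   range finders only  : columns (R1, R2, dR1, dR2)  -> (dU, dV, dtheta)
#   both combined       : all six input columns       -> (dU, dV, dtheta)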

Odometry results

This is, from my point of view, the most interesting part, because it illustrates the feasibility of the goal initially targeted, i.e. managing odometry in (simplified) slipping situations, or at least performing clearly better than classical odometry.

The graphs in Figures VIII.9.a and VIII.9.b compare the trajectories computed by classical odometry and by neural odometry, using for each mapping the neural network with the lowest MSE at the end of training. Figure VIII.9.a shows the results for the three best networks trained without sliding, and Figure VIII.9.b for the three best networks trained with sliding.

Legends :

Real trajectory (given by the simulator)

Trajectory given by classical odometry

Trajectory given by the neural network

In Figure VIII.9.a, B1 shows that when sliding occurs, both methods based on shaft encoders fail in the same way. C1 and D1 show that visual information alone is not adequate in any situation, sliding or not. E1 confirms that combining sensory information produces better results than classical odometry. B1, D1 and F1 show that in sliding situations, classical odometry always performs better than a neural odometry not trained for sliding.

Figure VIII.9.b shows that the neural network performs slightly worse in non-sliding situations when it has been trained for sliding situations, which reflects the loss of performance observed in Table VIII.8.b. The real advantage of the method appears in F2, where it outperforms classical odometry by exploiting the visual information. The result in F2 is confirmed for other trajectories as well (see Figures VIII.10.a and b).

| | Tested without slipping | Tested with 10 % slipping |
| (dL, dR) → (dU, dV, dθ) | A1 [pic] | B1 [pic] |
| (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | C1 [pic] | D1 [pic] |
| (dL, dR, R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | E1 [pic] | F1 [pic] |

Figure VIII.9.a - Networks trained without slipping

| | Tested without slipping | Tested with 10 % slipping |
| (dL, dR) → (dU, dV, dθ) | A2 [pic] | B2 [pic] |
| (R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | C2 [pic] | D2 [pic] |
| (dL, dR, R1, R2, ∆R1, ∆R2) → (dU, dV, dθ) | E2 [pic] | F2 [pic] |

Figure VIII.9.b - Networks trained with 10 % slipping

5 Experiments with varying slippage

The previous results use a slippage value of 10 %. One can wonder what happens with different values (for training and for test). Too much slippage during training reduces the performance of the neural net in situations without slipping; conversely, too little slippage reduces the performance in situations with sliding. A trade-off should therefore exist.
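This extract does not spell out how slippage is injected in the simulator; one simple way to emulate, say, 10 % slippage in a logged training set is to corrupt the encoder increments on a randomly chosen fraction of the samples while keeping the supervisor's true pose unchanged, as in the sketch below (both the mechanism and the numbers are assumptions).

import numpy as np

def add_slippage(dL, dR, slip_fraction=0.10, spin_factor=1.5, seed=0):
    """Corrupt encoder increments to emulate wheel slippage: on a randomly
    chosen fraction of the samples the wheels spin more than the robot
    actually moves, so the reported increments are inflated while the
    true pose (the training targets) is left untouched."""
    rng = np.random.default_rng(seed)
    slipping = rng.random(len(dL)) < slip_fraction
    dL_out = dL.astype(float).copy()
    dR_out = dR.astype(float).copy()
    dL_out[slipping] *= spin_factor   # arbitrary illustrative factor
    dR_out[slipping] *= spin_factor
    return dL_out, dR_out, slipping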

Tables VIII.5.a and b show the results for 6 different neural networks, trained with 0, 10, 20, 30, 40 and 50 % slippage (columns), and tested in situations where the slippage varies between 0 and 60 % (rows). Each neural network used is the best one from a population of 20 networks trained on the same training set.

Table VIII.5.a : Error (average distance to the real path, in mm)

(Rows : slippage during test, 0–60 % ; columns : slippage during training, 0–50 %.)

Table X.2.2 shows the four last training sets (F, G, H, I). They are created artificially, i.e. dL, dR and θ are generated by a program, and the classical odometry function is used to calculate dx, dy and dθ (a minimal sketch of such a generator is given after Table X.2.2).

Table X.2.2 : Artificial training sets

| F | [pic] | In this training set, (dL, dR, θ) are generated by a uniform noise function, within their respective ranges. |
| G |       | This training set contains all the possible moves at different speeds in a straight line towards the right (θ varying from –π/2 to π/2 in 32 steps). |
| H |       | Like G, but in all directions. |
| I |       | Like G, but towards the left (θ from π/2 to 3π/2). |
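A sketch of how a set such as F can be generated; the wheel base and the value ranges are assumptions, and the forward-kinematics step below uses the standard differential-drive equations rather than the exact form of Appendix 1.

import numpy as np

WHEEL_BASE = 53.0   # mm, approximate Khepera wheel base (assumption)

def classical_odometry(dL, dR, theta):
    """Classical differential-drive step: wheel displacements (mm) and the
    current heading give the world-frame pose increment (dx, dy, dtheta)."""
    d = 0.5 * (dL + dR)
    dtheta = (dR - dL) / WHEEL_BASE
    dx = d * np.cos(theta + 0.5 * dtheta)
    dy = d * np.sin(theta + 0.5 * dtheta)
    return dx, dy, dtheta

def make_training_set_F(n=2500, max_step=2.0, seed=0):
    """Training set F: (dL, dR, theta) drawn uniformly in assumed ranges."""
    rng = np.random.default_rng(seed)
    dL = rng.uniform(-max_step, max_step, n)
    dR = rng.uniform(-max_step, max_step, n)
    theta = rng.uniform(-np.pi, np.pi, n)
    dx, dy, dtheta = classical_odometry(dL, dR, theta)
    inputs = np.column_stack([dL, dR, theta])
    targets = np.column_stack([dx, dy, dtheta])
    return inputs, targets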

Results (Table X.2.3)

Each column contains the results for a particular training set. Each row corresponds to the training set used for testing (the training sets are used both for training and for testing). The numbers in the table are the distances (in mm), measured at the end of the trajectory, between the position estimated by classical odometry and the position estimated by neural odometry. The two best results for each row are in bold font (excluding the results obtained by auto-test).

The training set A (column 1) leads to better results because it contains curves in all possible directions (see Table X.2.1). The training sets B, C, D and E do not have this diversity and therefore perform really badly on uniform datasets such as H and I. Training sets D and E perform slightly better on G, because they contain similar orientations. The random training set F and the uniform set H perform quite well on average, because they contain almost all possible inputs. But A performs even better, because its training data comes from a real trajectory and therefore reflects the statistical occurrence of real situations.

Table X.2.3 : Results

Rows : dataset used for test ; columns : dataset used for training.

| Test \ Train | A | B | C | D | E | F | G | H | I |
| A | (0) | 426 | 221 | 304 | 377 | 108 | 412 | 160 | 315 |
| B | 5 | (0) | 31 | 23 | 168 | 140 | 86 | 149 | 395 |
| C | 36 | 219 | (5) | 153 | 226 | 115 | 229 | 156 | 344 |
| D | 6 | 8 | 18 | (0) | 146 | 126 | 156 | 149 | 361 |
| E | 66 | 84 | 129 | 105 | (47) | 80 | 377 | 113 | 375 |
| F | 235 | 593 | 361 | 569 | 487 | (62) | 464 | 165 | 631 |
| G | 436 | 530 | 792 | 389 | 357 | 832 | (85) | 739 | 2112 |
| H | 639 | 1779 | 1047 | 1363 | 1092 | 856 | 881 | (725) | 2883 |
| I | 544 | 2111 | 713 | 1731 | 1833 | 703 | 1249 | 759 | (332) |
| Average | 219 | 639 | 369 | 515 | 526 | 336 | 438 | 346 | 861 |

References

[Borenstein & Feng, 1994] Johann Borenstein & Liqiang Feng, 1994,

UMBmark A Method for Measuring, Comparing, and Correcting Dead-reckoning Errors in Mobile Robots

Technical Report, The University of Michigan UM-MEAM-94-22, December 1994

[Borenstein & Feng, 1996] Johann Borenstein & Liqiang Feng, 1996

Measurement and Correction of Systematic Odometry Errors in Mobile Robots

IEEE Transactions on Robotics and Automation, Vol 12, No 6, December 1996, pp. 869-880.

[Chong & Kleeman] Kok Seng Chong & Lindsay Kleeman

Accurate Odometry and Error Modelling for a Mobile Robot

Department of Electrical and Computer Systems Engineering

Monash University, Clayton Victoria 3168, Australia

[Colla et al.] V. Colla, M. Sgarbi, L.M. Reyneri & A.M. Sabatini

A Neural Approach to a Sensor Fusion Problem

[Cybenko, 1989] G. Cybenko

Approximation by Superpositions of a Sigmoidal Function

Mathematics of Control, Signal and Systems, 2 (1989), 303-314.

[Hakyoung et al.]

Sensor fusion for Mobile Robot Dead-reckoning With a Precision-calibrated Fiber Optic Gyroscope

2001 IEEE International Conference on Robotics and Automation, Seoul, Korea, May 21-26, pp. 3588-3593

[Hellström, 1998] Thomas Hellström

Neural Networks – A Tutorial

Department of Computing Science, Umeå University, Sweden

[Lundgren, 2003] Martin Lundgren

Path Tracking and Obstacle Avoidance for a Miniature Robot

Master thesis 2003 - Department of Computer Science, Umeå University

[McCulloch & Pitts, 1943] McCulloch W.S. & Pitts W.A.

A logical calculus of the ideas immanent in nervous activity.

Bulletin of Mathematical Biophysics, vol. 5, pages 115-133.

[Moshe et al.] Moshe Kam, Xiaoxun Zhu & Paul Kalata

Sensor Fusion for Mobile Robot Navigation

[Nagatani et al.] K. Nagatani, S. Tachibana, M. Sofue & Y. Tanaka

Improvement of Odometry for Omnidirectional Vehicle using Optical Flow Information

Systems Engineering Dept., Okayama University

[Nilsson, 2001] Theodor Nilsson

KiKS is a Khepera Simulator

Master thesis 2001, Department of Computing Science, Umeå University, Sweden

[Palamas et al.] George Palamas, George Papadourakis & Manolis Kavoussanos

Mobile Robot Position Estimation using unsupervised Neural Networks

Technological Educational Institute of Crete

[Personnaz & Rivals, 2003] Léon Personnaz & Isabelle Rivals

Réseaux de neurones artificiels pour modélisation, contrôle et classification

CNRS Editions, Paris, 2003

[Schroeter et al., 2003]

Christof Schroeter, Hans-Joachim Boehme, and Horst-Michael Gross

Extraction of orientation from floor structure for odometry correction in mobile robotics

25th Pattern Recognition Symposium, Magdeburg, Germany, September 2003

[Wolberg, 1990] George Wolberg.

Digital Image Warping.

IEEE Computer Society Press, 1990.
