
11.4: Graph-based SLAM


    Usually, a robot obtains an initial estimate of where it is using some onboard sensors (odometry, optical flow, etc.) and uses this estimate to localize features (walls, corners, graphical patterns) in the environment. As soon as a robot revisits the same feature twice, it can update the estimate of its location. This is because the variance of an estimate based on two independent measurements will always be smaller than either of the variances of the individual measurements. As consecutive observations are not independent, but rather closely correlated, the refined estimate can then be propagated along the robot’s path. This is formalized in EKF-based SLAM. A more intuitive understanding is provided by a spring-mass analogy: each possible pose (mass) is constrained to its neighboring pose by a spring. The higher the uncertainty of the relative transformation between two poses (e.g., obtained using odometry), the weaker the spring. Every time a robot gains confidence in a relative pose, the corresponding spring is stiffened. Eventually, all poses will be pulled into place. This approach is known as Graph-based SLAM.
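    The claim about two independent measurements can be made concrete with a standard result: fusing two independent Gaussian estimates \(x_1\) and \(x_2\) with variances \(\sigma_1^2\) and \(\sigma_2^2\) by inverse-variance weighting yields

    \[\hat{x}=\frac{\sigma_{2}^{2}x_{1}+\sigma_{1}^{2}x_{2}}{\sigma_{1}^{2}+\sigma_{2}^{2}},\qquad \hat{\sigma}^{2}=\frac{\sigma_{1}^{2}\sigma_{2}^{2}}{\sigma_{1}^{2}+\sigma_{2}^{2}}\leq \min(\sigma_{1}^{2},\sigma_{2}^{2})\]

    For example, two measurements with variance 1 fuse into an estimate with variance 0.5.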

    11.4.1. SLAM as a Maximum-Likelihood Estimation Problem

    The classical formulation of SLAM describes the problem as maximizing the posterior probability of all points on the robot’s trajectory given the odometry input and the observations. Formally,

    \[p(x_{1:T},m|z_{1:T},u_{1:T})\]

    where \(x_{1:T}\) are all discrete positions from time 1 to time T, \(z_{1:T}\) are the observations, and \(u_{1:T}\) are the odometry measurements. This formulation makes heavy use of the temporal structure of the problem. In practice, solving the SLAM problem requires two models (a minimal code sketch of both follows the list below):

    1. A motion update model, i.e., the probability \(p(x_t \mid x_{t-1}, u_t)\) to be at location \(x_t\) given an odometry measurement \(u_t\) and being at location \(x_{t-1}\).
    2. A sensor model, i.e., the probability \(p(z_t \mid x_t, m_t)\) to make observation \(z_t\) given that the robot is at location \(x_t\) and the map is \(m_t\).
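    As a minimal sketch, assuming 1-D poses, Gaussian noise, and a range-only sensor (the function names and noise parameters below are illustrative, not prescribed by the text), both models can be written as:

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

def motion_model(x_t, x_prev, u_t, sigma_odom=0.1):
    """p(x_t | x_{t-1}, u_t): likelihood of pose x_t after odometry input u_t."""
    return gaussian(x_t, x_prev + u_t, sigma_odom)

def sensor_model(z_t, x_t, m_t, sigma_sense=0.2):
    """p(z_t | x_t, m_t): likelihood of range measurement z_t to a feature at m_t."""
    return gaussian(z_t, m_t - x_t, sigma_sense)
```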

    A possible solution to this problem is provided by the Extended Kalman Filter, which maintains a probability density function for the robot pose as well as the positions of all features on the map. Being able to uniquely identify features in the environment is of utmost importance and is known as the data association problem. Like EKF-based SLAM, graph-based SLAM does not solve this problem and will fail if features are confused.

    In graph-based SLAM, a robot’s trajectory forms the nodes of a graph whose edges are transformations (translation and rotation), each with an associated variance. An alternative view is the spring-mass analogy mentioned above. Instead of having each spring wiggle a node into place, graph-based SLAM aims at finding those locations that maximize the joint likelihood of all observations. As such, graph-based SLAM is a maximum likelihood estimation problem.
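    A pose graph can be represented, for example, as a list of node poses plus a list of edges, each edge storing the measured transformation and its uncertainty. The field names below are illustrative, not taken from the text:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Edge:
    i: int                 # index of the first node
    j: int                 # index of the second node
    z_ij: np.ndarray       # measured transformation, e.g. [dx, dy, dtheta]
    omega_ij: np.ndarray   # information matrix (inverse covariance) of the measurement

@dataclass
class PoseGraph:
    poses: list = field(default_factory=list)   # one pose per node of the trajectory
    edges: list = field(default_factory=list)   # constraints between pairs of nodes
```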

    Let’s revisit the normal distribution:

    \[p(x)=\frac{1}{\sigma \sqrt{2\pi }}e^{-\frac{(x-\mu )^{2}}{2\sigma ^{2}}}\]

    It provides the probability for a measurement to have value \(x\), given that this measurement is normally distributed with mean \(\mu\) and variance \(\sigma^2\). We can now associate such a distribution with every node-to-node transformation, also known as a constraint; these can be pairs of distance and angle, for example. In the literature, the measurement of a transformation between node \(i\) and node \(j\) is denoted \(z_{ij}\). Its expected value is denoted \(\hat{z}_{ij}\); this value is expected, for example, based on a map of the environment that consists of previous observations.

    Formulating a normal distribution of measurements \(z_{ij}\) with mean \(\hat{z}_{ij}\) and a covariance matrix \(\Sigma_{ij}\) (containing the variances of the components of \(z_{ij}\) on its diagonal) is now straightforward. As graph-based SLAM is most often formulated as an information filter, usually the inverse of the covariance matrix (also known as the information matrix) is used, which we denote by \(\Omega_{ij}=\Sigma_{ij}^{-1}\).
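    As a small numerical illustration (the covariance values below are made up), the information matrix is obtained by inverting the covariance matrix of a constraint:

```python
import numpy as np

# Hypothetical covariance of a (distance, angle) constraint; values chosen arbitrarily.
sigma_ij = np.diag([0.05 ** 2, np.deg2rad(2.0) ** 2])

omega_ij = np.linalg.inv(sigma_ij)   # information matrix: Omega_ij = inverse of Sigma_ij
print(omega_ij)                      # large entries = high confidence in that component
```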

    As we are interested in maximizing the joint probability of all measurements \(\prod_{ij} p(z_{ij})\) over all edge pairings \(ij\), following the maximum likelihood estimation framework, it is customary to express the PDF using the log-likelihood. By taking the natural logarithm, the exponential function vanishes and the product turns into a sum: \(\ln \prod_{ij} p(z_{ij})\) becomes \(\sum_{ij} \ln p(z_{ij})\), or \(\sum_{ij} l_{ij}\), where \(l_{ij}\) is the log-likelihood for \(z_{ij}\).

    \[l_{ij}\propto (z_{ij}-\hat{z}_{ij}(x_{i},x_{j}))^{T}\Omega _{ij}(z_{ij}-\hat{z}_{ij}(x_{i},x_{j}))\]

    Again, the log-likelihood for observation \(z_{ij}\) is derived directly from the definition of the normal distribution, but using the information matrix instead of the covariance matrix, and is rid of the exponential function by taking the logarithm on both sides.

    The optimization problem can now be formulated as

    \[x^{*}=\arg\min_{x} \sum_{\langle i,j\rangle \in C}e_{ij}^{T}\Omega _{ij}e_{ij} \]

    with \(e_{ij}(x_i,x_j)=z_{ij}-\hat{z}_{ij}(x_i,x_j)\) the error between measurement and expected value, and \(C\) the set of all constraints. Note that the sum actually needs to be minimized, as the individual terms are technically the negative log-likelihood.
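    This objective can be evaluated directly from the graph. The sketch below is deliberately simplified: constraints are given as (i, j, z_ij, Omega_ij) tuples, poses are plain vectors, and the expected measurement \(\hat{z}_{ij}\) is modeled as a simple difference \(x_j - x_i\), ignoring rotations; a real implementation would compose full poses instead.

```python
import numpy as np

def objective(poses, edges):
    """Sum of e_ij^T Omega_ij e_ij over all constraints (the negative log-likelihood)."""
    total = 0.0
    for i, j, z_ij, omega_ij in edges:
        z_hat = poses[j] - poses[i]          # expected measurement (simplified: no rotation)
        e_ij = z_ij - z_hat                  # error between measurement and expectation
        total += float(e_ij @ omega_ij @ e_ij)
    return total

# Example: two 1-D poses believed to be 1.0 apart, but measured to be 1.2 apart.
poses = [np.array([0.0]), np.array([1.0])]
edges = [(0, 1, np.array([1.2]), np.array([[25.0]]))]   # Omega = 1/sigma^2 with sigma = 0.2
print(objective(poses, edges))               # 25 * 0.2^2 = 1.0
```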

    11.4.2. Numerical Techniques for Graph-based SLAM

    Solving the MLE problem is non-trivial, especially if the number of constraints, i.e., observations that relate one feature to another, is large. A classical approach is to linearize the problem at the current configuration, reducing it to a problem of the form \(Ax=b\). The intuition here is to calculate the impact of small changes in the positions of all nodes on all \(e_{ij}\). After moving the nodes accordingly, linearization and optimization can be repeated until convergence.
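    A minimal sketch of one such linearize-and-solve iteration is shown below for a 1-D pose graph, where the Jacobians of \(e_{ij}\) reduce to \(\pm 1\). Anchoring the first pose and the variable names are implementation choices, not part of the text.

```python
import numpy as np

def gauss_newton_step(poses, edges):
    """One linearization of a 1-D pose graph around the current estimate.

    poses: array of scalar poses x_0 .. x_{n-1}
    edges: list of (i, j, z_ij, omega_ij) with expected measurement x_j - x_i
    """
    n = len(poses)
    H = np.zeros((n, n))                    # the matrix A of the linear system
    b = np.zeros(n)                         # the right-hand side

    for i, j, z_ij, omega in edges:
        e = z_ij - (poses[j] - poses[i])    # error e_ij of this constraint
        # The Jacobian of e_ij w.r.t. (x_i, x_j) is (+1, -1) in the 1-D case.
        H[i, i] += omega
        H[j, j] += omega
        H[i, j] -= omega
        H[j, i] -= omega
        b[i] += omega * e
        b[j] -= omega * e

    H[0, 0] += 1e6                          # anchor the first pose (removes gauge freedom)
    dx = np.linalg.solve(H, -b)             # solve the linearized system
    return poses + dx

# Example: odometry says consecutive poses are 1.0 apart, but a loop closure
# says pose 2 is only 1.8 away from pose 0; the update spreads the disagreement.
poses = np.array([0.0, 1.0, 2.0])
edges = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.8, 1.0)]
print(gauss_newton_step(poses, edges))
```

    Repeating this step until the update becomes small corresponds to the repeated linearization and optimization described above.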

    Recently, more powerful numerical methods have been developed. Instead of solving the full MLE problem at once, one can employ a stochastic gradient descent algorithm. A gradient descent algorithm is an iterative approach to find the optimum of a function by moving along its gradient. Whereas a gradient descent algorithm would calculate the gradient on a fitness landscape from all available constraints, a stochastic gradient descent picks only a (not necessarily random) subset. An intuitive example is fitting a line to a set of n points, but taking only a subset of these points when calculating the next best guess. As gradient descent works iteratively, the hope is that the algorithm eventually takes a large part of the constraints into account. For solving Graph-based SLAM, a stochastic gradient descent algorithm would not take into account all constraints available to the robot, but iteratively work on one constraint after the other. Here, constraints are observations on the mutual pose of nodes i and j. Optimizing these constraints now requires moving both nodes i and j so that the error between where the robot thinks the nodes should be and what it actually sees gets reduced. As this is a trade-off between multiple, possibly conflicting observations, the result will approximate a maximum likelihood estimate.
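    The line-fitting example can be written out as follows; the data, step size, and number of iterations are made up for illustration. Each step uses only a small subset of the points to compute the gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, 100)   # noisy points around y = 2x + 1

m, c = 0.0, 0.0                                # line parameters to estimate
for _ in range(5000):
    idx = rng.choice(len(x), size=10)          # use only a subset of the constraints
    err = (m * x[idx] + c) - y[idx]
    m -= 0.01 * np.mean(err * x[idx])          # gradient step based on the subset only
    c -= 0.01 * np.mean(err)

print(m, c)                                    # drifts toward roughly (2, 1)
```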

    More specifically, with \(e_{ij}\) the error between an observation and what the robot expects to see based on its previous observations and its sensor model, one can distribute the error along the entire trajectory between the two features that are involved in the constraint. That is, if the constraint involves features i and j, not only the poses of i and j will be updated, but all poses in between will be moved a tiny bit as well.
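    A 1-D sketch of this idea (in the spirit of distributing a constraint's residual along the chain, not any particular published update rule): a fraction of the residual of a constraint between nodes i and j is spread over the poses between them, proportionally to their position along the chain.

```python
def distribute_error(poses, i, j, z_ij, learning_rate=0.1):
    """Spread part of the residual of constraint (i, j) over the poses between them."""
    residual = z_ij - (poses[j] - poses[i])   # how far off the constraint currently is
    n = j - i
    for k in range(i + 1, j + 1):
        # poses closer to j absorb a larger share of the correction
        poses[k] += learning_rate * residual * (k - i) / n
    return poses

# Example: five poses spaced 1.0 apart, but a loop closure says pose 4 is
# only 3.6 away from pose 0; part of the 0.4 discrepancy is distributed.
print(distribute_error([0.0, 1.0, 2.0, 3.0, 4.0], 0, 4, 3.6))
```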

    In Graph-based SLAM, edges encode the relative translation and rotation from one node to the other. Altering the relationship between two nodes will therefore automatically propagate to all nodes in the network. This is because the graph is essentially a chain of nodes whose edges consist of odometry measurements. This chain becomes a general graph whenever observations (using any sensor) introduce additional constraints. Whenever such a “loop closure” occurs, the resulting error will be distributed over the entire trajectory that connects the two nodes. This is not always necessary, however; consider a robot driving a figure-8 pattern: if a loop closure occurs in one half of the 8, the nodes in the other half are probably not involved.

    This can be addressed by constructing a minimum spanning tree (MST) of the constraint graph. The MST is constructed by performing a depth-first search (DFS) on the constraint graph, following odometry constraints. At a loop closure, i.e., an edge in the graph that imposes a constraint on a previously seen pose, the DFS backtracks to this node and continues from there to construct the spanning tree. Updating all poses affected by this new constraint still requires modifying all nodes along the path between the two features that are involved, but inserting additional constraints is greatly simplified: whenever a robot observes a new relationship between any two nodes, only the nodes on the path between the two features in the spanning tree need to be updated.
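    The spanning tree and the set of nodes affected by a new constraint can be computed with standard graph routines. A sketch, assuming the odometry edges are given as pairs of node indices; the function names are illustrative:

```python
def spanning_tree(n_nodes, odometry_edges):
    """Parent pointers of a DFS spanning tree rooted at node 0."""
    adj = {k: [] for k in range(n_nodes)}
    for i, j in odometry_edges:
        adj[i].append(j)
        adj[j].append(i)
    parent = {0: None}
    stack = [0]
    while stack:                        # iterative depth-first search
        node = stack.pop()
        for neighbor in adj[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                stack.append(neighbor)
    return parent

def tree_path(parent, i, j):
    """Nodes on the spanning-tree path between i and j: only these need updating."""
    ancestors_i = []
    node = i
    while node is not None:             # walk from i up to the root
        ancestors_i.append(node)
        node = parent[node]
    path_j = []
    node = j
    while node not in ancestors_i:      # climb from j until we meet i's ancestor chain
        path_j.append(node)
        node = parent[node]
    return ancestors_i[:ancestors_i.index(node) + 1] + path_j[::-1]

# Example: a chain of five poses; a loop closure between nodes 1 and 4
# only requires updating nodes 1, 2, 3, and 4.
parent = spanning_tree(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
print(tree_path(parent, 1, 4))          # [1, 2, 3, 4]
```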


    This page titled 11.4: Graph-based SLAM is shared under a CC BY-NC 4.0 license and was authored, remixed, and/or curated by Nikolaus Correll via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.