Kalman Filter in Hindsight

A 20/20 on commands found in the Kalman Filter Cookbook


Let’s say that you have a set of equations that model your system assuming ideal objects. That is, when you accelerate, the wheels don’t slip and the ground isn’t uneven and the engine doesn’t sputter. This doesn’t happen often in the real world, so you add sensors to measure your system’s state and attempt to get a better idea of what state the system truly is in. How do you do this? well one way is to use a Kalman Filter to incorporate your sensor readings into your system and to manage the inherit uncertainty of the system due to many factors. A Kalman filter can take several measurements with statistical noise and produce a more accurate estimation of what your system’s real state is than for a single measurement. It does this by estimating a joint probability distribution over all of the system’s state variables.

For robots, there is a time delay between issuing motor commands and receiving sensory feedback. In systems like these, there is a natural 2 step process of issuing the commands to change the system’s state and then another step of gathering sensor data to see how the system changed. Again by using a Kalman filter, you can get a more realistic model over time for making estimates of the actual position and state of the motor system and thus better at issuing updated commands.

The basic model

First we should note that if conditions were ideal, we’d have an equation \( y = f(x) \) that represents the signal in its most general form. The basic Kalman Filter deals with linear systems, so we can write our general equation as:

\( [y] = [F][x] \) where \( [y] \) is the output vector,

\( [F] \) is some matrix we apply to the input vector [x]. Since we’re dealing with discrete functions, we can rewrite this as a series of \( x(k) \) inputs e.g. \( x(0),x(1),…x(k-1),x(k),…. \) and we’ll assume that

\( [x(k)] = [F(k)][x(k-1)] \)

Note that we’ve parameterized our matrix \( F(k) \) with \( k \) to allow it to change with each iteration.

Now let’s just drop the [] and know that \( x(k) \) is a vector and \( F(k) \) is a matrix. So in the ideal world we’d have a series defined as:

\( x(k) = F(k) x(k-1) \)

That is to get the next value \( x(k) \), we’ll apply the matrix \( F(k) \) to the previous value \( x(k-1) \).

Unfortunately \( x(k) \) is only an estimate and can get out of date quickly. So Kalman filters model this as having a true value PLUS some noise.

\( x(k) = F(k) x(k-1) + w(k) \) where \( w(k) \) is noise .

Now assume that we sent some commands to a motor based on this calculation and then read some sensor that told us what \( x(k) \) really was. Let’s call this measurement \( z(k) \). We’d have a separate reading. In a perfect world, our measured value would be some function of \( x(k) \). Why aren’t they exactly the same? Keep in mind that this is a sensor reading and will likely be the voltage output measuring some aspect of the state. Ideally this voltage output is strictly linear across the operational range of x. At any rate, we’ll assume that we have a mapping between the reading z and the actual variable x.

i.e. \( z(k) = H(k)x(k) \) this means that we can determine what our sensor reading should be if we are at position \( x(k) \). It’s just some linear transformation on \( x(k) \). E.g. say \( x(k) \) is the angle of a wheel and \( z(k) \) is the reading from a sensor measuring that wheel angle. \( H(k) \) would be our ideal measurement function that tells us what the sensor reading would be given our current state \( x(k) \).

if our sensors are highly accurate and are calibrated to read out the actual state values, \( H(k) \) would just be the identity matrix

i.e \( z(k) = x(k) \)

But this is the real world and we have measurement noise as well. So we model this as being:

\( z(k) = H(k)x(k) + v(k) \) where \( v(k) \) is noise

Onward to the Kalman Filter

There are several formulations of the Kalman Filter but the most common formulation has 2 steps:

  1. Prediction step: estimates the current state variables and their uncertainties given the previous estimates, control-input measurements and a “process noise”
  2. Update step: once the new observations are obtained, the estimates are updated using a weighted average.

We have these equations:

  • \( x(k) = F(k)x(k-1) + B(k)u(k) + w(k) \) := prediction step aka time update
  • \( z(k) = H(k)x(k) + v(k) \) := update step – aka measurement update aka correction

WHERE the symbols mean:

  • \( x(k) \) := measurement at time k.
  • \( w(k) \) := process noise – multi-variant normal distribution with covariance \( Q(k) \)
  • \( v(k) \) := measurement noise – multi-variant normal distribution with covariance R(k)
  • \( z(k) \) := an observation or measurement of the true state of \( x(k) \). e.g. a GPS reading of the exact position.
  • \( p(k|k-1) \) := is the uncertainty – its the probability of the \( x(k) \) given \( x(k-1) \)

NOTE: Alternate symbols and formulations. Some papers use \( A(k) \) instead of \( F(k) \) so the equations become:

  • \( x(k) = A(k)x(k-1) + B(k)u(k) + w(k) \) := prediction step
  • \( z(k) = H(k)x(k) + v(k) \) := update step


  • \( F(k) \) := the state transition model that is applied to the previous state
  • \( x(k-1) \) when estimating the predicted current state. This is your ideal function with no noise
  • \( H(k) \) := the observation model. This tells you how your ideal sensor readings related to the measured values. Its probably the identity matrix.
  • \( B(k) \) := the control-input model.
  • \( Q(k) \) := covariance of the process noise
  • \( R(k) \) := covariance of the observation noise

  • Extended Kalman Filter – for non-linear systems – used by nav sys and GPS works by taking matrix of partial derivatives. The idea is to approximate our non-linear systems with a linear approximation that we calculate at each iteration. We use this matrix of partial derivatives called the Jacobian to calculate a linear map assuming the system is in a certain state. These Jacobian matrices can be used in the normal Kalman filter equations. For non-linear systems then, you assume that you have a smooth, meaning differentiable, multi-variate function f of the input \( x(k-1), u(k-1) \) and \( w(k-1) \) that describes the output. so we have:
\( x(k)=f(x(k-1),u(k-1),w(k-1)) \)

The Jacobian \( F \) (remember this is a matrix of partial derivatives of the non-linear multi-varate function f, will change at each step. Hence we will label it as \( F(k) \) where that means the Jacobian of \( f \) at time k. This allows us to approximate the result using the Jacobian.

  • unscented Kalman filter – non-linear