Quaternion-Based EKF for Attitude and Bias Estimation

This page describes a method to estimate orientation given gyroscope and accelerometer data. Extended Kalman filter (EKF) is used with quaternion and gyro bias as state vector. A working Python code is also provided.

Overview

Gyroscope

Gyroscope data are not treated as observations. Instead, they are used in the prediction step. This is done because modeling the system dynamics accurately is difficult [1]. The measured angular velocity has bias applied to the true angular velocity:

Accelerometer

Accelerometer data are treated as observations. The device acceleration is assumed to be small enough so that the accelerometer vector always points up relative to the ground.

Coordinates

Front-right-down for body frame (sensor frame). NED for world frame. For example, accelerometer measures

Notation

Not to be confused with some papers [2] that define the product in reverse order.

State, x

The first four parameters are components of a unit quaternion and represents the orientation of the device. It is normalized every time it is modified. The last three parameters are gyro bias vector in rad/sec.

State Prediction, f(x)

Gyro bias vector stays the same in the prediction step. Orientation (quaternion) is modified with gyroscope data as follows:

State Transition Matrix, F

State transition matrix F is needed to predict the covariance P. F is a 7x7 matrix given by:

Axis-angle representation for a rotation (θ, û) is equivalent to the unit quaternion representation:

Solving the jacobian of the above with respect to the state vector results in the state transition matrix shown at the beginning of this section.

Process Noise

Gyroscope noise appears as state noise and not observation noise. Process noise is a combination of gyro noise and gyro bias drift and is given by,

Derivation of W

Solving the jacobian of the above with respect to the angular velocity results in W matrix shown at the beginning of this section.

Observation, z

where g = 9.80665m/s^2 and \vec{a} is accelerometer measurement. Only the accelerometer measurement vector direction is used.

Observation model, h(x)

where R_q is the rotation matrix representing the orientation. The above converts the gravity vector in world frame to body frame. Quaternion to rotation matrix is given by

Not to be confused with some papers [2] that define the rotation matrix as the transpose of the above.

Observation model, H

Observation matrix is determined by taking the jacobian of the observation function:

Observation noise

Accelerometer noise appears as observation noise. Observation noise is given by,

Derivation of V

Python Implementation

import numpy as np


g = 9.80665


def normalize(v):
    return v / np.linalg.norm(v)


def quaternion_identity():
    return np.c_[[1, 0, 0, 0]]


def quaternion_to_matrix(q_unit):
    w, x, y, z = q_unit.flat
    w2 = w * w
    x2 = x * x
    y2 = y * y
    z2 = z * z
    r11 = w2 + x2 - y2 - z2
    r12 = 2 * (x * y - w * z)
    r13 = 2 * (w * y + x * z)
    r21 = 2 * (x * y + w * z)
    r22 = w2 - x2 + y2 - z2
    r23 = 2 * (y * z - w * x)
    r31 = 2 * (x * z - w * y)
    r32 = 2 * (y * z + w * x)
    r33 = w2 - x2 - y2 + z2
    return np.array([
        [r11, r12, r13],
        [r21, r22, r23],
        [r31, r32, r33]
    ])


def quaternion_multiply(p, q):
    pw, px, py, pz = p.flat
    qw, qx, qy, qz = q.flat
    rw = pw * qw - px * qx - py * qy - pz * qz
    rx = pw * qx + px * qw + py * qz - pz * qy
    ry = pw * qy - px * qz + py * qw + pz * qx
    rz = pw * qz + px * qy - py * qx + pz * qw
    return np.c_[[rw, rx, ry, rz]]


def quaternion_from_axis_angle(unit_axis, angle_rad):
    ux, uy, uz = unit_axis.flat
    sin_half = np.sin(angle_rad / 2)
    qw = np.cos(angle_rad / 2)
    qx = ux * sin_half
    qy = uy * sin_half
    qz = uz * sin_half
    return np.c_[[qw, qx, qy, qz]]


def quaternion_from_rotation_vector(v, eps=0):
    angle_rad = np.linalg.norm(v)

    # Guard against division by zero
    if angle_rad > eps:
        unit_axis = v / angle_rad
        q_unit = quaternion_from_axis_angle(unit_axis, angle_rad)
    else:
        q_unit = quaternion_identity()

    return q_unit


def get_F(x, w, dt):
    qw, qx, qy, qz, bx, by, bz = x.flat
    wx, wy, wz = w.flat
    return np.array([
        [             1, dt*(-wx + bx)/2, dt*(-wy + by)/2, dt*(-wz + bz)/2,  dt*qx/2,  dt*qy/2,  dt*qz/2],
        [dt*(wx - bx)/2,               1, dt*( wz - bz)/2, dt*(-wy + by)/2, -dt*qw/2,  dt*qz/2, -dt*qy/2],
        [dt*(wy - by)/2, dt*(-wz + bz)/2,               1, dt*( wx - bx)/2, -dt*qz/2, -dt*qw/2,  dt*qx/2],
        [dt*(wz - bz)/2, dt*( wy - by)/2, dt*(-wx + bx)/2,               1,  dt*qy/2, -dt*qx/2, -dt*qw/2],
        [             0,               0,               0,               0,        1,        0,        0],
        [             0,               0,               0,               0,        0,        1,        0],
        [             0,               0,               0,               0,        0,        0,        1]
    ])


def get_W(x, dt):
    qw, qx, qy, qz = x.flat[0:4]
    return dt/2 * np.array([
        [-qx, -qy, -qz],
        [ qw, -qz,  qy],
        [ qz,  qw, -qx],
        [-qy,  qx,  qw],
        [  0,   0,   0],
        [  0,   0,   0],
        [  0,   0,   0]
    ])


def get_H(x):
   qw, qx, qy, qz = x.flat[0:4]
   return 2 * g * np.array([
        [ qy, -qz,  qw, -qx, 0, 0, 0],
        [-qx, -qw, -qz, -qy, 0, 0, 0],
        [-qw,  qx,  qy, -qz, 0, 0, 0]
   ])


def f(x, w, dt):
    q = x[0:4]
    b = x[4:7]

    d_ang = (w - b) * dt
    dq = quaternion_from_rotation_vector(d_ang)
    q = quaternion_multiply(q, dq)
    q = normalize(q)

    return np.r_[q, b]


def h(x):
    q = x[0:4]
    R_from_body = quaternion_to_matrix(q)
    return R_from_body.T @ np.c_[[0, 0, -g]]


class EKF:
    def __init__(self, q0=[1,0,0,0], b0=[0,0,0], init_gyro_bias_err=0.1,
                 gyro_noise=0.015, gyro_bias_noise=0.002, accelerometer_noise=1.0):
        '''
        q0 -- initial orientation (unit quaternion) [qw, qx, qy, qz]
        b0 -- initial gyro bias [rad/sec] [bx, by, bz]
        init_gyro_bias_err -- initial gyro bias uncertainty (1 standard deviation) [rad/sec]
        gyro_noise -- Gyro noise (1 standard deviation) [rad/sec]
        gyro_bias_noise -- Gyro bias noise (1 standard deviation) [rad/sec]
        accelerometer_noise -- Accelerometer measurement noise (1 standard deviation) [m/s^2]
        '''
        self.x = np.c_[q0 + b0] # State
        self.P = np.zeros((7, 7))
        self.P[0:4, 0:4] = np.identity(4) * 0.01
        self.P[4:7, 4:7] = np.identity(3) * (init_gyro_bias_err ** 2)
        self.Q = np.identity(3) * (gyro_noise ** 2)
        self.Q_bias = np.zeros((7, 7))
        self.Q_bias[4:7, 4:7] = np.identity(3) * (gyro_bias_noise ** 2)
        self.R = np.identity(3) * (accelerometer_noise ** 2)
    
    def predict(self, w, dt):
        '''
        w -- gyro measurement in front-right-down coordinates [rad/sec]
        dt -- [sec]
        '''
        w = np.c_[w]
        F = get_F(self.x, w, dt)
        W = get_W(self.x, dt)
        self.x = f(self.x, w, dt)
        self.P = F @ self.P @ F.T + W @ self.Q @ W.T + self.Q_bias

    def update(self, a):
        '''
        a -- accelerometer measurement in front-right-down coordinates [m/s^2]
        '''
        a = np.c_[a]
        z = g * normalize(a)
        y = z - h(self.x)
        H = get_H(self.x)
        S = H @ self.P @ H.T + self.R
        K = (self.P @ H.T) @ np.linalg.inv(S)
        self.x = self.x + (K @ y)
        self.x[0:4] = normalize(self.x[0:4])
        self.P = (np.identity(7) - K @ H) @ self.P

Simulation Test

True vs estimated attitudes are compared using simulated sensor data. Euler angles are z-y-x intrinsic Tait-Bryan rotation (e.g. first rotate about Z-axis by angle yaw, next rotate about the newly created Y-axis by angle pitch, finally rotate about the newly created X-axis by angle roll).

Test 1: static case

IMU is held fixed (non-rotating) at roll 25 degrees, pitch 0 degrees, yaw 0 degrees with gryo bias in x-direction of 0.1 rad/sec. Gyro bias in x-direction converges to the true value after about 1 second. Orientation (roll) is correctly tracked even with gyro bias.

Test 2: constant angular rate

IMU is rotated around the x-axis at constant angular rate of 90 degrees/sec with gyro bias in x-direction of 0.1 rad/sec. Roll is estimated correctly even with gyro bias.

Test 3: rotating all axes

IMU is rotated all axes at the same time with gyro bias of (0.1, 0.2, -0.1) rad/sec. It can be seen that the estimated orientation closely matches the true value.

Source code for the above tests

import matplotlib.pyplot as plt
import numpy as np
from filter import EKF
from filter import quaternion_identity
from filter import quaternion_multiply
from filter import quaternion_from_axis_angle
from filter import normalize
from filter import g


random_generator = np.random.default_rng(0)


def quaternion_from_euler(roll, pitch, yaw):
    c_roll   = np.cos(roll / 2)
    s_roll   = np.sin(roll / 2)
    c_pitch = np.cos(pitch / 2)
    s_pitch = np.sin(pitch / 2)
    c_yaw   = np.cos(yaw / 2)
    s_yaw   = np.sin(yaw / 2)

    qw = c_roll * c_pitch * c_yaw + s_roll * s_pitch * s_yaw
    qx = s_roll * c_pitch * c_yaw - c_roll * s_pitch * s_yaw
    qy = c_roll * s_pitch * c_yaw + s_roll * c_pitch * s_yaw
    qz = c_roll * c_pitch * s_yaw - s_roll * s_pitch * c_yaw
    return np.c_[[qw, qx, qy, qz]]


def quaternion_to_euler(q):
    qw, qx, qy, qz = q.flat
    roll  = np.arctan2(2 * (qw * qx + qy * qz),
                       1 - 2 * (qx * qx + qy * qy))
    pitch = np.arcsin(2 * (qw * qy - qz * qx))
    yaw   = np.arctan2(2 * (qw * qz + qx * qy),
                       1 - 2 * (qy * qy + qz * qz))
    return roll, pitch, yaw


def quaternion_inv(q_unit):
    w, x, y, z = q_unit.flat
    return np.c_[[w, -x, -y, -z]]


def rotate_vector(q_unit, v):
    p = np.r_[[[0]], v]
    # q * p * q^-1
    r = quaternion_multiply(
        quaternion_multiply(q_unit, p),
        quaternion_inv(q_unit))
    return r[1:4]


def plot(ts, w, a, rpy, estimated_rpy, b, estimated_b, b_err):
    bx = np.array(estimated_b[0])
    by = np.array(estimated_b[1])
    bz = np.array(estimated_b[2])
    bx_std = np.array(b_err[0])
    by_std = np.array(b_err[1])
    bz_std = np.array(b_err[2])

    fig, axs = plt.subplots(8, 1)

    axs[0].plot(ts, w[0], label='x')
    axs[0].plot(ts, w[1], label='y')
    axs[0].plot(ts, w[2], label='z')
    axs[0].grid(True)
    axs[0].legend(loc='upper right')
    axs[0].set_ylabel('w\n(rad/s)')
    axs[0].set_xticklabels([])

    axs[1].plot(ts, a[0], label='x')
    axs[1].plot(ts, a[1], label='y')
    axs[1].plot(ts, a[2], label='z')
    axs[1].grid(True)
    axs[1].legend(loc='upper right')
    axs[1].set_ylabel('a\n(m/s^2)')
    axs[1].set_xticklabels([])

    axs[2].plot(ts, rpy[0], label='true')
    axs[2].plot(ts, estimated_rpy[0], '--', label='estimated')
    axs[2].grid(True)
    axs[2].legend(loc='upper right')
    axs[2].set_ylabel('roll\n(deg)')
    axs[2].set_xticklabels([])

    axs[3].plot(ts, rpy[1], label='true')
    axs[3].plot(ts, estimated_rpy[1], '--', label='estimated')
    axs[3].grid(True)
    axs[3].legend(loc='upper right')
    axs[3].set_ylabel('pitch\n(deg)')
    axs[3].set_xticklabels([])

    axs[4].plot(ts, rpy[2], label='true')
    axs[4].plot(ts, estimated_rpy[2], '--', label='estimated')
    axs[4].grid(True)
    axs[4].legend(loc='upper right')
    axs[4].set_ylabel('yaw\n(deg)')
    axs[4].set_xticklabels([])

    axs[5].plot(ts, b[0], label='true')
    axs[5].plot(ts, bx, '--', label='estimated')
    axs[5].fill_between(t, 
                        bx-bx_std,
                        bx+bx_std,
                        color='gray', alpha=0.5)
    axs[5].grid(True)
    axs[5].legend(loc='upper right')
    axs[5].set_ylabel('bias x\n(rad/sec)')
    axs[5].set_xticklabels([])

    axs[6].plot(ts, b[1], label='true')
    axs[6].plot(ts, by, '--', label='estimated')
    axs[6].fill_between(t, 
                        by-by_std,
                        by+by_std,
                        color='gray', alpha=0.5)
    axs[6].grid(True)
    axs[6].legend(loc='upper right')
    axs[6].set_ylabel('bias y\n(rad/sec)')
    axs[6].set_xticklabels([])

    axs[7].plot(ts, b[2], label='true')
    axs[7].plot(ts, bz, '--', label='estimated')
    axs[7].fill_between(t, 
                        bz-bz_std,
                        bz+bz_std,
                        color='gray', alpha=0.5)
    axs[7].grid(True)
    axs[7].legend(loc='upper right')
    axs[7].set_ylabel('bias z\n(rad/sec)')

    plt.show()


def get_gyro_data(q1, q2, bias, dt, noise):
    q1_w, q1_x, q1_y, q1_z = q1.flat
    q2_w, q2_x, q2_y, q2_z = q2.flat
    mean = np.array([
        2/dt * (q1_w*q2_x - q1_x*q2_w - q1_y*q2_z + q1_z*q2_y),
        2/dt * (q1_w*q2_y + q1_x*q2_z - q1_y*q2_w - q1_z*q2_x),
        2/dt * (q1_w*q2_z - q1_x*q2_y + q1_y*q2_x - q1_z*q2_w)
    ])
    w = random_generator.normal(mean + bias, noise)
    return w


def get_accelerometer_data(q, noise):
    q_to_body = quaternion_inv(q)
    up_world = np.c_[[0, 0, -g]]
    up_body = rotate_vector(q_to_body, up_world)
    a = random_generator.normal(up_body, noise)
    return a


def generate_accelerometer_data_array(q, acc_noise):
    N = len(q)
    ax, ay, az = [None] * N, [None] * N, [None] * N

    for i in range(N):
        a = get_accelerometer_data(q[i], acc_noise)
        ax[i], ay[i], az[i] = a.flat
    
    return ax, ay, az


def generate_gyro_data_array(t, q, b, gyro_noise):
    N = len(t)
    wx, wy, wz = [None] * N, [None] * N, [None] * N

    for i in range(1, N):
        dt = t[i] - t[i-1]
        bi = np.array([b[0][i], b[1][i], b[2][i]])
        w = get_gyro_data(q[i-1], q[i], bi, dt, gyro_noise)
        wx[i], wy[i], wz[i] = w.flat
    
    return wx, wy, wz


def estimate(t, w, a):
    N = len(t)
    q = [None] * N
    bx, by, bz = [None] * N, [None] * N, [None] * N
    bx_err, by_err, bz_err = [None] * N, [None] * N, [None] * N
    ekf = EKF()

    q[0] = ekf.x[0:4]
    bx[0], by[0], bz[0] = ekf.x.flat[4:7]
    bx_err[0] = np.sqrt(ekf.P[4, 4])
    by_err[0] = np.sqrt(ekf.P[5, 5])
    bz_err[0] = np.sqrt(ekf.P[6, 6])
    for i in range(1, N):
        dt = t[i] - t[i-1]
        wi = [w[0][i], w[1][i], w[2][i]]
        ai = [a[0][i], a[1][i], a[2][i]]
        ekf.predict(wi, dt)
        ekf.update(ai)
        
        q[i] = ekf.x[0:4]
        bx[i], by[i], bz[i] = ekf.x.flat[4:7]
        bx_err[i] = np.sqrt(ekf.P[4, 4])
        by_err[i] = np.sqrt(ekf.P[5, 5])
        bz_err[i] = np.sqrt(ekf.P[6, 6])

    return q, [bx, by, bz], [bx_err, by_err, bz_err]


def quaternion_to_euler_degrees(q):
    N = len(q)
    roll = [None] * N
    pitch = [None] * N
    yaw = [None] * N

    for i in range(N):
        if q[i] is not None:
            r, p, y = quaternion_to_euler(q[i])
            roll[i] = np.degrees(r)
            pitch[i] = np.degrees(p)
            yaw[i] = np.degrees(y)
    
    return roll, pitch, yaw


def euler_degrees_to_quaternion(rpy):
    roll = rpy[0]
    pitch = rpy[1]
    yaw = rpy[2]
    N = len(roll)
    q = [None] * N

    for i in range(N):
        if roll[i] is not None:
            q[i] = quaternion_from_euler(
                np.radians(roll[i]),
                np.radians(pitch[i]),
                np.radians(yaw[i]))
    
    return q


# Fixed (non-rotating) at roll 25 degrees, pitch 0 degrees, yaw 0 degrees
# Gyro bias in x-direction of 0.1 rad/sec
def generate_ground_truth_data1(t):
    N = len(t)
    roll = np.radians(25)
    pitch = np.radians(0)
    yaw = np.radians(0)
    q0 = quaternion_from_euler(roll, pitch, yaw)

    q = [q0] * N
    bx = [0.1] * N
    by = [0] * N
    bz = [0] * N
    
    return q, [bx, by, bz]


# Rotating at constant angular rate [90 degrees/sec, 0, 0]
# Gyro bias in x-direction of 0.1 rad/sec
def generate_ground_truth_data2(t):
    N = len(t)
    q = [None] * N
    bx = [0.1] * N
    by = [0] * N
    bz = [0] * N

    q[0] = quaternion_identity()
    
    rate = 90
    w = np.radians([rate, 0, 0])
    abs_w = np.linalg.norm(w)

    for i in range(1, N):
        dt = t[i] - t[i-1]
        
        dtheta = abs_w * dt
        unit_axis = w / abs_w
        dq = quaternion_from_axis_angle(unit_axis, dtheta)

        q[i] = quaternion_multiply(q[i-1], dq)
        q[i] = normalize(q[i])
    
    return q, [bx, by, bz]


# Rotating all axes at the same time
def generate_ground_truth_data3(t):
    N = len(t)
    q = [None] * N
    bx = [0.1] * N
    by = [0.2] * N
    bz = [-0.1] * N

    q[0] = quaternion_identity()

    wx = 500 * np.sin(2 * np.pi * 0.5 * t)
    wy = 200 * np.sin(2 * np.pi * 1.0 * t)
    wz = 300 * np.sin(2 * np.pi * 2.0 * t)

    for i in range(1, N):
        dt = t[i] - t[i-1]

        w = np.radians([wx[i], wy[i], wz[i]])
        abs_w = np.linalg.norm(w)

        if abs_w > 0:
            dtheta = abs_w * dt
            unit_axis = w / abs_w
            dq = quaternion_from_axis_angle(unit_axis, dtheta)
        else:
            dq = quaternion_identity()

        dq = quaternion_from_axis_angle(unit_axis, dtheta)
        q[i] = quaternion_multiply(q[i-1], dq)
        q[i] = normalize(q[i])

    return q, [bx, by, bz]


GYRO_NOISE = 0.015 # rad/sec
ACC_NOISE = 1.0 # m/s^2

# Generate time
t0 = 0
tf = 10
dt = 0.01
t = np.arange(t0, tf, dt)

# Generate ground truth orientation and bias
#q, b = generate_ground_truth_data1(t)
#q, b = generate_ground_truth_data2(t)
q, b = generate_ground_truth_data3(t)

# Generate sensor data
a = generate_accelerometer_data_array(q, ACC_NOISE)
w = generate_gyro_data_array(t, q, b, GYRO_NOISE)

# EKF estimation
estimated_q, estimated_b, b_err = estimate(t, w, a)

# Check results
rpy = quaternion_to_euler_degrees(q)
estimated_rpy = quaternion_to_euler_degrees(estimated_q)
plot(t, w, a, rpy, estimated_rpy, b, estimated_b, b_err)

1.	Prediction: state
2.	Prediction: covariance
3.	Innovation
4.	Innovation: covariance
5.	Kalman gain
6.	Update: state
7.	Update: state