ArduPilot EKF Lane Switch, Variance Failure, and Yaw Reset — Unified

LogHat Engineering Team · June 9, 2026 · 8 min read

Key Takeaway

ArduPilot’s EKF protects itself through three related events: Lane Switch (ERR Subsys=24, XKF4.PI flip), Variance Failure (ERR Subsys=16 ECode=2, sustained high test ratios), and Yaw Reset (a MSG line, with XKF3.IYAW climbing beforehand). One event in isolation is the safety mechanism working; a cascading sequence points to an underlying sensor cause: GPS multipath, magnetometer interference, severe vibration, or baro drift.

TL;DR: ArduPilot’s EKF protects itself through three closely related events you’ll see in logs and forum threads: EKF Lane Switch (ERR Subsys=24, the autopilot promoting a healthier filter core), EKF Variance Failure (ERR Subsys=16, the EKF check declaring measurement variance too high), and EKF Yaw Reset (an internal reset of the yaw estimate when compass and inertial integration disagree by too much). One of each in isolation is the safety net working; repeated occurrences mean an underlying sensor problem, usually GPS multipath, magnetometer interference, or vibration corrupting the IMU.

The three events at a glance

  • EKF Lane Switch: the autopilot runs multiple EKF cores in parallel, and the active primary’s error score climbed past that of another core, so the autopilot promoted the healthier core. Logged as ERR Subsys=24, with ECode giving the index of the newly promoted core (0 for the first core, 1 for the second); visible as XKF4.PI flipping between core indices.
  • EKF Variance Failure: the EKF check (running at 10 Hz) saw the filter’s test ratios stay above its threshold (FS_EKF_THRESH on Copter, 0.8 by default) for long enough that it lost confidence in the position estimate. Logged as ERR Subsys=16 ECode=2. If unresolved, it escalates to EKF failsafe (Subsys=17 ECode=1) and a mode change.
  • EKF Yaw Reset: the EKF detected that its yaw estimate had diverged from the compass and motion-derived heading and reset it. Logged as a MSG line rather than an ERR (for example "EKF3 IMU0 emergency yaw reset", or a yaw alignment message). XKF3.IYAW climbs before the reset; XKF3.IMX/IMY/IMZ may show coincident innovation spikes.
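As a rough illustration of the list above, here is a minimal Python sketch that labels already-parsed log records. The function names are hypothetical helpers, not part of ArduPilot or any log library. Any ECode under Subsys=24 is treated as a lane switch, the "variance cleared" code (ECode=0) is our assumption, and the MSG match is a loose substring test because the exact wording varies between firmware versions.

```python
# Subsystem codes as quoted in this post. ECode 0 ("variance cleared") is an
# assumption added for completeness.
EKF_CHECK = 16
EKF_FAILSAFE = 17
EKF_PRIMARY = 24


def classify_err(subsys: int, ecode: int) -> str:
    """Label one ERR record (hypothetical helper, not an ArduPilot API)."""
    if subsys == EKF_PRIMARY:
        return "lane_switch"  # any ECode: a primary change was logged
    if subsys == EKF_CHECK:
        if ecode == 2:
            return "variance_failure"
        if ecode == 0:
            return "variance_cleared"
    if subsys == EKF_FAILSAFE and ecode == 1:
        return "ekf_failsafe"
    return "other"


def looks_like_yaw_reset(msg_text: str) -> bool:
    """Loose text match on a MSG line; wording varies by firmware version."""
    text = msg_text.lower()
    return "yaw" in text and ("reset" in text or "align" in text)
```

Feeding every ERR and MSG record through these two functions gives a single event list that can be sorted by timestamp.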

How they relate to each other

The three events form a cascade. Most cascades start with one degraded sensor:

  1. A sensor (GPS, compass, or IMU) starts producing measurements the EKF’s model wasn’t expecting.
  2. Innovations climb: XKF3.IPN/IPE/IPD for position, XKF3.IMX/IMY/IMZ for compass, XKF3.IYAW for yaw.
  3. The EKF’s test ratios (XKF4.SV/SP/SH/SM) cross 1, meaning measurements are being rejected.
  4. The EKF check writes ERR Subsys=16 ECode=2 (variance failure).
  5. If a healthier core or a fresh yaw estimate can recover the solution, the autopilot performs a lane switch (Subsys=24) or a yaw reset.
  6. If nothing recovers, the autopilot fires EKF failsafe (Subsys=17 ECode=1) and changes mode.

The single isolated event is the safety mechanism working. The cascade is the failure of redundancy.
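To see which sensor family tripped first in a cascade, one option is to record the first time each XKF4 test ratio goes above 1. The sketch below assumes the XKF4 records have already been extracted into (time, ratios) pairs for a single core; the function name and data layout are ours, not a log-library API.

```python
def first_crossings(samples, threshold=1.0):
    """Return (field, time_s) pairs ordered by when each ratio first exceeded
    the threshold.

    samples: iterable of (time_s, {"SV": .., "SP": .., "SH": .., "SM": ..})
    in time order, for one EKF core (assumed layout).
    """
    first = {}
    for time_s, ratios in samples:
        for field, value in ratios.items():
            if value > threshold and field not in first:
                first[field] = time_s
    return sorted(first.items(), key=lambda item: item[1])


samples = [
    (10.0, {"SV": 0.3, "SP": 0.4, "SH": 0.2, "SM": 0.6}),
    (11.0, {"SV": 0.4, "SP": 0.7, "SH": 0.2, "SM": 1.2}),
    (12.0, {"SV": 0.5, "SP": 1.1, "SH": 0.2, "SM": 1.4}),
]
print(first_crossings(samples))  # [('SM', 11.0), ('SP', 12.0)]
```

In this made-up example the magnetometer ratio trips a second before position, which points at the compass rather than GPS as the place to start looking.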

Lane switch deep dive

For the full per-core diagnosis (how to read XKF4.PI transitions, which core is the “losing” one, and what the error score XKF3.ErSc is telling you), see our EKF primary change post. The short version: one lane switch in a long mission is fine; three in a minute, or one followed by a failsafe, means the autopilot is running out of options.
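The "three in a minute" rule of thumb can be checked mechanically. This is a minimal sketch, assuming XKF4.PI has been extracted as time-ordered (time, primary index) pairs; both helper names are hypothetical.

```python
def lane_switch_times(pi_samples):
    """Times at which the primary core index changed.

    pi_samples: (time_s, primary_index) pairs in time order (assumed layout).
    """
    times = []
    previous = None
    for time_s, primary in pi_samples:
        if previous is not None and primary != previous:
            times.append(time_s)
        previous = primary
    return times


def max_switches_in_window(times, window_s=60.0):
    """Largest number of switches that fall inside any window of window_s."""
    best = 0
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most window_s.
        while times[end] - times[start] > window_s:
            start += 1
        best = max(best, end - start + 1)
    return best


pi = [(0.0, 0), (10.0, 0), (20.0, 1), (40.0, 0), (70.0, 1), (300.0, 1)]
switches = lane_switch_times(pi)
print(switches)                          # [20.0, 40.0, 70.0]
print(max_switches_in_window(switches))  # 3
```

A result of 3 or more within 60 seconds matches the pattern described above as the autopilot running out of options.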

Variance deep dive

Variance failures show up as the EKF rejecting measurements at its innovation gates. For the per-measurement breakdown (SV for velocity, SP for position, SH for height, SM for magnetometer), see our EKF innovation post and our compass variance post. The short version: a ratio above 1 means the measurement was rejected on that update; a ratio sustained above 0.5 means the filter is heading toward failsafe.
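The two rules of thumb translate into a small check on a single test-ratio series. In this sketch the 5-second definition of "sustained" is our assumption (the rule of thumb does not fix a duration), and the helper names are hypothetical.

```python
def longest_run_above(samples, level):
    """Longest continuous duration (seconds) with ratio > level.

    samples: (time_s, ratio) pairs in time order (assumed layout).
    """
    longest = 0.0
    run_start = None
    for time_s, ratio in samples:
        if ratio > level:
            if run_start is None:
                run_start = time_s
            longest = max(longest, time_s - run_start)
        else:
            run_start = None
    return longest


def grade_ratio(samples, warn_level=0.5, reject_level=1.0, sustain_s=5.0):
    """Count rejected updates and flag a sustained run above warn_level.

    sustain_s is an assumed duration, not an ArduPilot constant.
    """
    rejected = sum(1 for _, ratio in samples if ratio > reject_level)
    sustained = longest_run_above(samples, warn_level) >= sustain_s
    return {"rejected_updates": rejected, "sustained_warning": sustained}
```

Run it once per field (SV, SP, SH, SM) and per core; a field with no rejected updates but a sustained warning is the one worth watching on the next flight.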

Yaw reset — the under-discussed one

EKF yaw resets are quieter than lane switches and variance failures: they write a MSG line, not an ERR line. They are still a strong signal that compass and inertial integration are disagreeing.

What triggers a yaw reset

  • Sustained XKF3.IYAW innovations above the autopilot’s yaw gate.
  • Sustained disagreement between compass-derived heading and the EKF’s internal estimate.
  • On takeoff, the EKF often performs an early yaw alignment which logs the same way; that’s normal and self-resolving.

What you should worry about

  • Multiple yaw resets during a single flight. The first might be takeoff alignment; the second or third means the compass and inertial sensors are disagreeing repeatedly.
  • Yaw reset followed by a heading swing in the next 5–10 seconds. This suggests the reset accepted a bad compass reading.
  • Yaw reset coincident with high VIBE.VibeZ. Vibration is corrupting the IMU’s yaw integration; fix the mechanical problem before chasing the magnetometer.

How to confirm a yaw reset is real and not just takeoff alignment

Plot XKF3.IYAW and MAG.MagX/Y/Z:

MAV> graph XKF3.IYAW
MAV> graph MAG.MagX MAG.MagY MAG.MagZ
MAV> messages MSG

Look at the MSG line to see when the reset was emitted. If it’s within 30 seconds of arming and IYAW went small immediately afterward, that’s alignment. If it’s mid-flight and the surrounding minutes show climbing IYAW, that’s a real disagreement.
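The same judgment can be roughed out in code. This sketch assumes you already have the reset time (from the MSG line), the arming time, and XKF3.IYAW as (time, value) pairs. The 60-second look-around window and the "mean after is lower than mean before" test for "went small" are our assumptions, and the function name is hypothetical.

```python
from statistics import mean


def classify_yaw_reset(reset_t, arm_t, iyaw_samples,
                       align_window_s=30.0, lookaround_s=60.0):
    """Return 'alignment', 'disagreement', or 'unclear' for one yaw reset.

    iyaw_samples: (time_s, iyaw) pairs in time order (assumed layout).
    lookaround_s is an assumed inspection window, not an ArduPilot value.
    """
    before = [abs(v) for t, v in iyaw_samples
              if reset_t - lookaround_s <= t < reset_t]
    after = [abs(v) for t, v in iyaw_samples
             if reset_t < t <= reset_t + lookaround_s]

    near_arming = 0.0 <= reset_t - arm_t <= align_window_s
    # "Went small": innovations after the reset are lower than before it.
    settled = bool(after) and (not before or mean(after) < mean(before))
    # "Climbing": the later half of the pre-reset window is larger than the
    # earlier half.
    half = len(before) // 2
    climbing = half >= 1 and mean(before[half:]) > mean(before[:half])

    if near_arming and settled:
        return "alignment"
    if not near_arming and climbing:
        return "disagreement"
    return "unclear"
```

Anything that comes back as "unclear" still deserves a look at the plots; the function only encodes the two clear-cut cases.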

Why the cascade happens — ranked by what we see

  1. GPS multipath or RFI. Position innovations climb; XKF4.SP crosses 1; eventually the EKF check fires variance failure. See our GPS glitch post.
  2. Magnetometer interference. Current-driven bias on the compass forces yaw innovations and ultimately a yaw reset. XKF4.SM climbs alongside. See our CompassMot post.
  3. Severe vibration. IMU integration degrades; the EKF’s prediction grows wrong; everything else looks like the sensors are lying. Look for VIBE.VibeZ above 30 m/s² in the affected window. See our VIBE post.
  4. Barometer drift. Sun on the unshielded baro or prop wash on a QuadPlane biases altitude. Height innovations climb, XKF4.SH rises.
  5. Wrong sensor source configuration. EK3_SRC1_POSXY = 0 or EK3_SRC1_YAW set to an unavailable source produces ongoing variance from the moment the EKF tries to use it.
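A first-pass triage of the log-visible causes above might look like the following sketch. It maps peak test ratios and peak VibeZ in the affected window to a suspect list. The mapping covers only the fields named in this list, vibration is put first because it can make every other sensor look wrong, and the function and table names are hypothetical.

```python
# Field-to-suspect mapping taken from the ranked list above.
RATIO_SUSPECTS = {
    "SP": "GPS (multipath or RFI)",
    "SM": "magnetometer interference",
    "SH": "barometer drift",
}


def rank_suspects(peak_ratios, peak_vibe_z, ratio_limit=1.0, vibe_limit=30.0):
    """Return suspect causes for one window, most likely first.

    peak_ratios: {"SP": .., "SM": .., "SH": ..} peak XKF4 ratios in the window.
    peak_vibe_z: peak VIBE.VibeZ in m/s^2 for the same window.
    """
    suspects = []
    if peak_vibe_z > vibe_limit:
        suspects.append("vibration")
    tripped = [(value, field) for field, value in peak_ratios.items()
               if field in RATIO_SUSPECTS and value > ratio_limit]
    # Highest ratio first among the fields that actually tripped.
    for _, field in sorted(tripped, reverse=True):
        suspects.append(RATIO_SUSPECTS[field])
    return suspects


print(rank_suspects({"SP": 1.4, "SM": 0.6, "SH": 1.1}, peak_vibe_z=12.0))
# ['GPS (multipath or RFI)', 'barometer drift']
```

Source misconfiguration does not show up in this triage; it is better caught by reading the EK3_SRC parameters directly.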

Preventing the cascade

The shared answer for all three event types: fix the underlying sensor cause. Raising the innovation gates (EK3_*_I_GATE) makes the autopilot tolerate noisier sensors but lets bad measurements through. Disabling cores or compasses removes the redundancy that was the only thing holding the system together. The durable fixes are mechanical and electromagnetic, not software:

  • Antenna elevation, multi-constellation, dual GPS.
  • Compass on a mast away from current-carrying wires; CompassMot calibration.
  • Mechanical vibration sources hunted down before flight.
  • Barometer foamed and out of prop wash.
  • Correct EK3_SRC*_* configuration for the available sensor suite.
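The configuration items on this list can be given a quick sanity pass from a parameter dump. This is a minimal sketch, assuming the parameters are already loaded into a name-to-value dictionary. It flags EK3_SRC1_POSXY = 0 and lists every innovation-gate parameter for manual review; it makes no claim about what the default gate values are, and the function name is hypothetical.

```python
def review_params(params):
    """Return human-readable notes for EKF3 parameters worth a second look.

    params: dict of parameter name -> numeric value (assumed layout).
    """
    notes = []
    if params.get("EK3_SRC1_POSXY") == 0:
        notes.append("EK3_SRC1_POSXY is 0: no horizontal position source")
    for name in sorted(params):
        if name.startswith("EK3_") and name.endswith("_I_GATE"):
            notes.append(
                f"{name} = {params[name]}: confirm this gate was not raised "
                "to mask a noisy sensor"
            )
    return notes
```

The gate notes are prompts rather than verdicts: compare each value against the defaults for your firmware version before deciding whether someone widened it.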

When LogHat helps — and when it doesn’t

LogHat surfaces all three event types in a unified timeline, annotates which sensor family triggered each, and shows the cascade order so you can see what tripped first. What we can’t do is identify which physical sensor on your build is failing; that needs a bench swap or hover-stand measurement. The log narrows the search to GPS, compass, IMU, or baro; the hardware identification is yours.

About the author

LogHat Engineering Team

The LogHat engineering team: drone-systems engineers who build and operate the LogHat flight analytics platform. Posts under this byline are written and reviewed by team members working on the parsers, analysis engine, and Vector AI that the post describes.

Tagged

ardupilot, ekf, ekf3, log-analysis, lane-switch, failsafe
