I fixed the repeated data problem on the flight computer disks!  Technical details below, but the short description is that we made some unwarranted assumptions about how fast different processes would execute relative to each other.  A small “wait until your partner is ready” flag was enough to fix it.  This is a problem I’ve known about since last September and spent a lot of time sweating over, so it feels great to have it resolved!

Technical:  Again, we have two buffers (bins, roughly), one of which fills with new data while the other is written to disk.  When the filling buffer is full, the two get swapped and the process repeats.  Given the speed of our disks and our data rates, the original author of this code and I both expected each disk write to finish well before the new data buffer filled.  We had code that reserved each buffer for only one of filling or writing at a given time, but nothing other than that protection kept the two tasks in sync.  The problem was that, due to mysteries of scheduling known only to the Linux kernel, the new data buffer was sometimes filling repeatedly before the write task would even start!  Data that had never been committed to disk was getting overwritten.  Then the write task would start, see that it had several buffers to commit to disk, and end up writing a single buffer several times in a row.  The fix was to require that the fill task wait to swap until the write task is ready for a new buffer.
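
For the curious, the "wait until your partner is ready" handshake can be sketched roughly as below.  This is a minimal Python illustration, not the actual flight code:  the names (DoubleBuffer, swap, take, done, run_demo) are invented for the example, the "disk" is just a list, and the short sleep in the writer is an assumption standing in for the write task being scheduled late.

```python
import threading
import time


class DoubleBuffer:
    """Two-buffer handoff in which the fill task waits for the write task."""

    def __init__(self):
        self.fill_buf = []        # buffer currently being filled
        self.write_buf = None     # buffer handed to the writer; None means "writer is ready"
        self.closed = False
        self.cond = threading.Condition()

    def swap(self):
        """Fill task: hand over the full buffer, but only once the writer is ready."""
        with self.cond:
            while self.write_buf is not None:   # the "partner ready" check
                self.cond.wait()
            self.write_buf = self.fill_buf
            self.fill_buf = []
            self.cond.notify_all()

    def close(self):
        """Fill task: signal that no more buffers are coming."""
        with self.cond:
            self.closed = True
            self.cond.notify_all()

    def take(self):
        """Write task: block until a full buffer is available (None once closed and drained)."""
        with self.cond:
            while self.write_buf is None and not self.closed:
                self.cond.wait()
            return self.write_buf

    def done(self):
        """Write task: report that the buffer is on disk and a new one is welcome."""
        with self.cond:
            self.write_buf = None
            self.cond.notify_all()


def run_demo(n_buffers=5, buf_size=4):
    db = DoubleBuffer()
    disk = []   # stand-in for the real disk

    def writer():
        time.sleep(0.05)   # assumption: simulate the write task starting late
        while True:
            buf = db.take()
            if buf is None:
                break
            disk.extend(buf)
            db.done()

    def filler():
        sample = 0
        for _ in range(n_buffers):
            for _ in range(buf_size):
                db.fill_buf.append(sample)
                sample += 1
            db.swap()   # blocks here instead of overwriting unwritten data
        db.close()

    threads = [threading.Thread(target=writer), threading.Thread(target=filler)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return disk


if __name__ == "__main__":
    print(run_demo())
```

Without the wait loop in swap, the filler in this sketch would race ahead during the writer's initial sleep and replace a buffer that was never written, which is the same kind of data loss described above.  The cost of the handshake is that the fill task can stall if the writer falls behind, so a real system would presumably still want the writer to be fast on average.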

I am greatly relieved to have this resolved!  The joy is overshadowed, though:  CSBF isn’t happy with our rotor, and wouldn’t certify it even to fly at the weight we had in 2005 (eliminating any chance we have of cutting down weight to meet requirements).  Sounds like we are going to have to fly the rotor back to Berkeley for examination by our engineers and/or a “pull test.”  Yikes!


One response to “YesYesYes!”

  1. Good work on the buffer fix! Was wondering what it would be…