What are techniques for allowing safe software upgrades in embedded systems [closed]

Upgrading software for embedded devices often has the possibility of "bricking" the device, e.g. if power should happen to fail while in the midst of writing software to FLASH. Two questions:

  1. What are some best practices for implementing the upgrade mechanism so as to minimize the probability that the device will be "bricked"?
  2. What are some best practices for making the upgrade process fail-safe, so that events like power failures while installing software to FLASH can be recovered from?
Lance Richardson asked May 18 '09 12:05



3 Answers

It all depends on how critical the application is. The two basic approaches (backup and bootloader) are also combined sometimes.

Many systems have a read-only bootloader (like RedBoot) and two banks of flash memory (most often on the same chip). The bootloader keeps a flag that selects which bank to boot from. That flag changes in response to events such as upgrades (failed or successful).

So, when upgrading, the running version copies the new load into the backup bank, checks the checksum, toggles the boot flag, and then reboots the device. The device reboots on the new bank, with the new load. After the reboot, the new load can copy itself into the backup bank.
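As a rough illustration of that sequence, here is a minimal sketch that simulates the two banks as RAM arrays. The names `flash_bank`, `boot_flag`, `upgrade` and `crc32_calc` are made up for this example, and a real device would use its flash driver instead of `memcpy` and would reboot after a successful call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BANK_SIZE 64u   /* tiny on purpose; real banks are far larger */

static uint8_t flash_bank[2][BANK_SIZE]; /* stand-in for two flash banks */
static uint8_t boot_flag = 0;            /* bank the bootloader boots from */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320): slow but table-free. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

/* Stage an image in the inactive bank and switch banks only if it verifies.
 * Returns 0 on success, -1 if the image is rejected (boot_flag untouched). */
static int upgrade(const uint8_t *image, size_t len, uint32_t expected_crc)
{
    uint8_t spare = (uint8_t)(boot_flag ^ 1u);

    if (len > BANK_SIZE)
        return -1;
    memcpy(flash_bank[spare], image, len);   /* running bank is never written */
    if (crc32_calc(flash_bank[spare], len) != expected_crc)
        return -1;                           /* bad copy: keep the old bank */
    boot_flag = spare;                       /* the only commit step */
    return 0;                                /* caller would now reboot */
}
```

The point of the ordering is that a power failure at any moment before the flag write leaves the device booting the old, intact bank.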

Often there is also a watchdog timer with a hardware reset. This way, if the firmware goes insane and fails to kick the watchdog, the hardware reset reboots the device and the bootloader looks for a sane load.
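How the bootloader decides that a load is not sane is left open here. One common option, shown purely as an assumption, is a boot-attempt counter kept in storage that survives resets: the bootloader counts boots, the firmware clears the count once it is running well, and too many uncleared boots trigger a switch to the other bank. `struct boot_state`, `MAX_BOOT_ATTEMPTS` and both function names are hypothetical.

```c
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u   /* assumed limit; tune per product */

/* State that must survive a reset (e.g. a flash sector or backup register). */
struct boot_state {
    uint8_t active_bank;   /* 0 or 1 */
    uint8_t attempts;      /* boots since the firmware last reported healthy */
};

/* Called by the bootloader on every reset, including watchdog resets.
 * Returns the bank to boot. */
static uint8_t bootloader_pick_bank(struct boot_state *st)
{
    if (st->attempts >= MAX_BOOT_ATTEMPTS) {
        st->active_bank ^= 1u;   /* current load keeps dying: fall back */
        st->attempts = 0;
    }
    st->attempts++;
    return st->active_bank;
}

/* Called by the firmware once it has run long enough to be trusted. */
static void firmware_mark_healthy(struct boot_state *st)
{
    st->attempts = 0;
}
```

If both banks are bad, this sketch just alternates between them, which is one reason a separate recovery path is still worth having.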

The Open Mesh project is a good example of this approach.

Sean Cavanagh answered Sep 23 '22 00:09



More specifically...

Download the replacement image to an area of memory without overwriting ANY of the current program space. Wait until the download is complete, THEN compute and compare CRCs.

If space is really a problem, you can do the 'default backup' AKA 'recovery mode' sort of thing, but it's much slicker to not do this destructively.

If you're -really- slick... you can do a single write update to FLASH to direct the device to boot from the new code location. This will ping/pong between two totally separate code sections. This is about the safest way you can do this:

  • ALWAYS have a non-updatable recovery bootloader (Nano-loader) which can be signaled to load new code somehow if everything goes wrong.
  • Two separate program spaces
  • Each program space has a "CRC" field, a "burn number" (higher than the other code page's number), and an "invalid" word (all Fs while the space is valid, so burning the "invalid" marker doesn't require an erase)
  • Once a download is complete, verify the CRC. If it's good, burn the 'invalid' marker on the old version's program space.
  • The Nano-loader checks the 'invalid' marker to know which space to boot. In the case that they're both valid, do a CRC check. If they're still both valid, then take the entry with the higher burn number.
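The selection rule in that list could look roughly like the sketch below. `struct slot_info` and `pick_slot` are names invented for the example, and `crc_ok` stands for the result of a CRC check done elsewhere. As a simplification, the sketch requires the CRC to pass even when only one space is unmarked, which is slightly stricter than the rule as written.

```c
#include <stdbool.h>
#include <stdint.h>

#define ERASED_WORD 0xFFFFFFFFu  /* erased flash reads all Fs: still valid */

struct slot_info {
    uint32_t invalid_word;  /* anything but all Fs means "burned invalid" */
    uint32_t burn_number;   /* higher means newer */
    bool     crc_ok;        /* result of checking the stored CRC */
};

static bool slot_usable(const struct slot_info *s)
{
    return s->invalid_word == ERASED_WORD && s->crc_ok;
}

/* Returns 0 or 1 for the slot to boot, or -1 to stay in recovery mode. */
static int pick_slot(const struct slot_info *a, const struct slot_info *b)
{
    bool a_ok = slot_usable(a);
    bool b_ok = slot_usable(b);

    if (a_ok && b_ok)
        return (b->burn_number > a->burn_number) ? 1 : 0;
    if (a_ok)
        return 0;
    if (b_ok)
        return 1;
    return -1;  /* nothing bootable: wait for a new image */
}
```

Because the marker only ever goes from all Fs to something else, invalidating the old space is a plain program operation and needs no erase cycle.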

Oh, and when people say checksum... don't 'check the sum'... Do a proper CRC.
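To make the difference concrete, here is a small sketch comparing a plain additive checksum with a CRC. The choice of CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) and the function names `sum8` and `crc16_ccitt` are assumptions for the example; any well-known CRC would show the same thing.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain additive checksum: the order of the bytes does not matter. */
static uint8_t sum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + data[i]);
    return sum;
}

/* Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
    }
    return crc;
}
```

Feeding both functions `{0x01, 0x02, 0x03, 0x04}` and the same bytes with the first two swapped gives identical sums but different CRCs, so a reordering error slips past the sum and is caught by the CRC.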

darron answered Sep 21 '22 00:09



Checksums are good, but they only save you from flashing corrupted data. What if you flash an image file that has a valid checksum but was built for a different product model? A read-only default bootloader that can be accessed in case of emergency corruption is the best thing I have seen.
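One way to guard against the wrong-model case, sketched here as an assumption rather than an established format, is to put a small header in front of the image and refuse anything that does not name this product. The layout of `struct image_header`, the `IMAGE_MAGIC` value and `image_matches_device` are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define IMAGE_MAGIC 0x46574D47u  /* arbitrary marker chosen for this sketch */

/* Hypothetical header placed in front of every firmware image. */
struct image_header {
    uint32_t magic;        /* identifies the file as a firmware image at all */
    uint16_t product_id;   /* model the image was built for */
    uint16_t hw_rev_min;   /* oldest hardware revision it supports */
    uint32_t payload_len;
    uint32_t payload_crc;  /* integrity is checked separately */
};

/* True only if the image targets this product and hardware revision. */
static bool image_matches_device(const struct image_header *h,
                                 uint16_t my_product_id, uint16_t my_hw_rev)
{
    return h->magic == IMAGE_MAGIC
        && h->product_id == my_product_id
        && my_hw_rev >= h->hw_rev_min;
}
```

This check would run before anything is written, alongside the integrity check rather than instead of it.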

ThePosey answered Sep 23 '22 00:09
