Installing OMAP3 images on a SD card

This article and screen-cast is a continuation of that last couple posts describing the BEC OE build template.  The purpose again for a build system is to automate tedious manual tasks, and in doing so, we end up documenting how the build system works.  Having a good build system is important during product development so that you have an easy, repeatable process for building and deploying images.  One of these tasks is setting up a SD card and installing images to the SD card that an OMAP system can boot from.  This screencast goes over the script and makefile targets that are used to automate this process.

In summary, the steps to set up a SD card and install an image using the build template are:

  1. cd <build template directory>
  2. determine the SD card device name (/dev/sdX)
  3. unmount any file systems mounted on the SD card
  4. sudo ./scripts/ /dev/sdX  (make sure you have the right device!!!)
  5. make omap-install-boot
  6. make omap-install-<image name>-rootfs

Previous articles in this series:


Gumstix Overo review

Based on the interest and number of embedded modules currently available, it appears that the OMAP3 CPU from TI will be very popular in the general purpose embedded Linux market.  One of the OMAP3 modules available is the Overo from Gumstix.  As the company name suggests, this module looks about like a stick of gum, but smaller, as shown in the photo below.  The Gumstix module provides a lot of functionality in a very small package.  It is reasonably priced, and is an excellent way to quickly create a product that has advanced functionality such as a high powered CPU (OMAP3), high speed USB, DSP, 3-D graphics acceleration, etc.  For an example of the OMAP3 performance compared to previous generations of ARM processors, see this article.

Overo size comparison to a US dime
Overo size comparison to a US dime

The module contains all the core components, and can then be attached to a custom baseboard using the two high density connectors shown in the above photo.  There are 70 signals on each of the two connectors.  BEC provides a spreadsheet of the Overo signals to assist in designing a custom baseboard.

Due to the high level of component integration, the Overo has very few components on the module.  As shown in the below photo, there are really only 3 major visible components.  The flash and RAM is stacked on top of the CPU.  The PMIC contains all the required power managment circuitry, as well as Audio and USB interfacing functionality.

Major components on the Gumstix Overo
Major components on the Gumstix Overo

The Overo module must be used with a baseboard.  Gumstix provides several development baseboards.  For production, a custom baseboard is typically created.  The Gumstix baseboards are reasonable cost, so they could be used for prototyping or low volume production if they have the required functionality.  Below is a photo of the Summit baseboard that provides DVI video out, audio, USB host and OTG, and an expansion connector with a number of signals.

Gumstix Summit baseboard (overo is not installed)
Gumstix Summit baseboard (overo is not installed)

Below is the Palo baseboard from Gumstix that can be used to interface with a LCD display.  The Palo board contains a resistive touch screen controller, and circuitry required to interface with a LCD.

Palo baseboard with Overo installed
Palo baseboard with Overo installed
Display connected to Palo baseboard.  Note the tape applied to the display to keep from shorting to components on baseboard.
Display connected to Palo baseboard. Note the tape applied to the display to keep from shorting to components on baseboard.

The Overo does not include an Ethernet controller, so you typically use a USB-Ethernet adapter for development.  Below shows a typical development setup.

Typical USB-Ethernet connection through a USB hub.
Typical USB-Ethernet connection through a USB hub.

One of the things about the OMAP3 that makes it very attractive for embedded development is the amount of software support, and the number of development and production solutions available.  The OMAP3 is very well supported by the OpenEmbedded project, and TI is very active in making contributions to various open source projects.  This greatly increases the quality and availability of advanced software functionality needed to support a complex system like the OMAP3.

Size comparison of several OMAP3 systems
Size comparison of several OMAP3 systems
WIFI and BT antennas connected to Overo module
WIFI and BT antennas connected to Overo module

There are many options for software development with the OMAP3 systems.  Below is a photo of a very simple Qt application.  GTK+ and Enlightenment are other popular GUI toolkits.  With 256MB of both flash and RAM, this is more than enough memory for most embedded applications, and provides plenty of headroom for adding features for future product revisions.  The Overo has the CPU processing capability to run advanced GUI applications with 3-D affects, as well as advanced web application frameworks like web2py that previous generations of ARM CPUs could not effectively run.

A simple Qt application that controls LEDs and reads user switch
A simple Qt application that controls LEDs and reads user switch

The Gumstix Overo provides an excellent value for developing advanced products.  Especially for low volume products, when you compare the effort and time required to develop a full custom OMAP3 solution versus a simple Overo baseboard, using a module can provide significant advantages in terms of time to market, and development cost.


OMAP3 Resume Timing

One of the most common power management modes for ARM processors is the suspend mode.  In this mode, peripherals are shut down when possible, the SDRAM is put into self-refresh, and the CPU is placed in a low power mode.  A useful bit of information is to know how soon the system can respond to a resume (wake) event.

It turns out the answer to this question depends on where you want to do the work.  In the first test, I created a simple application that continuously toggled a GPIO.  This is an easy way to tell determine with a scope when the application was running.  I then measured the time between the wake event, and when this application signal started toggling.  It was a consistent 180ms.

For the next test I simply toggled a gpio in the omap3_pm_suspend() first thing after resume.  This tells me roughly how fast I can execute code in the kernel after a wake event.  This turned out to be 250uS.

So the good news is we can run kernel code very quickly after a resume event (250uS), but it currently takes a long time until user space processes start running again (180mS).  The 180ms can likely be reduced with some work, but this at least gives us a baseline.  180ms is fairly quick from a human perspective (seems pretty much instant), but for industrial devices that are responding to real-world events, then 180mS can be a long time.


Linux PM: OMAP3 Suspend Support

This article provides an overview of the Linux kernel support for the suspend state in the TI OMAP3.  Power management has always been one of the more difficult parts of a system to get right.  The OMAP3 power management is quite extensive.  There are many levels of very granular control over the entire system.  Initially, we are going to focus on a simple power state where we want to put the CPU in a low power mode (basically not doing anything), but resume very quickly once an event occurs and continue running applications where they left off.  This is typically called “Suspend”.

Power States

We must first match up terminology between the Linux kernel and the OMAP documentation.  The Linux kernel supports several suspend power states:

  • PM_SUSPEND_ON: system is running
  • PM_SUSPEND_STANDBY: system is in a standby state.  This is typically implemented where the CPU is in a somewhat static state.  After all the peripherals have been shut down, and the memory put into self refresh, the CPU executes an instruction that puts the CPU into a low power state.  After the next wake-event occurs, the CPU resumes operating at the next instruction.  The CPU state is static in this case.
  • PM_SUSPEND_MEM: This goes one step beyond STANDBY in that the CPU is completely powered down and loses its state.  Any CPU state that needs to be saved is stored to SDRAM (which is in self refresh), or some other static memory.  When the system resumes, it has to boot through the reset vector.  The bootloader determines that the system was sleeping, and re-initializes the system, enables the SDRAM, restores whatever state is needed, and then jumps to a vector in the kernel to finish the resume sequence.  This mode is fairly difficult to implement as it involves the bootloader, and a lot of complex initialization code that is difficult to debug.
  • PM_SUSPEND_MAX: From browsing the kernel source code, it appears this state is used for “Off” or perhaps suspend to disk.

The OMAP3 documentation uses the following terms:

  • SLM: Static Leakage Management.  Includes “suspend” functionality.
  • Standby:  OMAP3x retains internal memory and logic.  (uses 7mW)
  • Device Off: System state is saved to external memory(0.590mW).  This could theoretically map to the PM_SUSPEND_MEM state.
  • WFI: Wait for Interrupt.  This is an ARM instruction that puts the CPU in the standby state.

Based on the source code in the Linux kernel, there is currently only support for the Standby state:

static int omap3_pm_enter(suspend_state_t unused)
	int ret = 0;

	switch (suspend_state) {
		ret = omap3_pm_suspend();
		ret = -EINVAL;

	return ret;

What kernel branches/versions support PM

Kevin Hilman has been maintaining an OMAP3 power management branch at  Bits are being merged upstream.  Some of the things that don’t seem to be merged yet are SmartReflex support, and a number of other details.  But, it seems that basic suspend/resume support is available in the upcoming 2.6.32 kernel.

More Details on the Standby Mechanism

Due to the granularity and complexity of the the OMAP3 power management, relevant pieces of documentation are scattered throughout the OMAP35x Technical Reference Manual.  The following were helpful:

  • Table 3-17: details the MPU subsystem operation power modes
  • Table 3-18: Power Mode Allowable Transitions.  Standby
    mode can enter from active mode only by executing the wait for interruption (WFI) instruction.
  • SDRC, “Self Refresh Management”.  The OMAP3 can automatically put the SDRAM in self refresh mode when the CPU enters the idle state.
  • MPU Power Mode Transistions: The ARM core initiates entering into standby via software only (CP15 – WFI).

The omap34xx_cpu_suspend function is called to put the CPU into standby mode.  The actual instruction that stops the CPU is the ARM WFI (wait for interrupt) instruction.  This is a little different than previous generations of ARM CPUs that used coprocessor registers.


With the 2.6.32-rcX kernels, suspend/resume appears to work at least from the console:

root@overo:~# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.00 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.00 seconds) done.
Suspending console(s) (use no_console_suspend to debug)

<after key press on serial console ...>

omapfb omapfb: timeout waiting for FRAME DONE
CLIFF: going into suspend
Wake up daisy chain activation failed.
CLIFF: returning from suspend
Successfully put all powerdomains to target state
Restarting tasks ...

The CLIFF messages are debug statements placed in the low-level suspend code to make sure we were getting to that point.  The console is disabled early in the suspend process, so we don’t get the debug messages until after resume.  The no_console_suspend kernel option can be used to leave the console enabled during suspend, but that feature does not seem to work.

Summary/Future work

This covers the basics of Linux Suspend/Resume support for the OMAP3 CPU.  There are many more options and details such as configuring wake sources, dynamic power management while running, etc.


Gumstix Overo Pinout Spreadsheet updated

The Gumstix Overo Pinout Spreadsheet has been updated with the Palo board connector pinouts, and a few mistakes have been fixed.




Embedded Linux versus Windows CE

Occasionally I am asked how Embedded Linux compares with Windows CE.  I have spent the past 5 years doing mostly embedded Linux development, and the previous 5 years doing mostly WinCE development with a few exceptions, so my thoughts are no doubt a little biased toward what I understand best.  So take this with a grain of salt 🙂  In my experience, the choice is often made largely on perception and culture, rather than concrete data.  And, making a choice based on concrete data is difficult when you consider the complexity of a modern OS, all the issues associated with porting it to custom hardware, and unknown future requirements.  Even from an application perspective, things change over the life of a project.  Requirements come and go.  You find yourself doing things you never thought you would, especially if they are possible.  The ubiquitous USB and network ports open a lot of possibilities — for example adding Cell modem support or printer support. Flash based storage makes in-field software updates the standard mode of operation.  And in the end, each solution has its strengths and weaknesses — there is no magic bullet that is the best in all cases.

When considering Embedded Linux development, I often use the iceberg analogy; what you see going into a project is the part above the water.  These are the pieces your application interacts with, drivers you need to customize, the part you understand.  The other 90% is under water, and herein lies a great deal of variability.  Quality issues with drivers or not being able to find a driver for something you may want to support in the future can easily swamp known parts of the project.  There are very few people who have a lot of experience with both WinCE and Linux solutions, hence the tendency to go with what is comfortable (or what managers are comfortable with), or what we have experience with.  Below are thoughts on a number of aspects to consider:


Questions in this realm include CPU support, driver quality, in field software updates, filesystem support, driver availability, etc.  One of the changes that has happened in the past two years, is CPU vendors are now porting Linux to their new chips as the first OS.  Before, the OS porting was typically done by Linux software companies such as MontaVista, or community efforts.  As a result, the Linux kernel now supports most mainstream embedded cpus with few additional patches.  This is radically different than the situation 5 years ago.  Because many people are using the same source code, issues get fixed, and often are contributed back to the mainstream source.  With WinCE, the BSP/driver support tends to be more of a reference implementation, and then OEM/users take it, fix any issues, and that is where the fixes tend to stay.

From a system perspective, it is very important to consider flexibility for future needs.  Just because it is not a requirement now does not mean it will not be a requirement in the future.  Obtaining driver support for a peripheral may be nearly impossible, or be too large an effort to make it practical.

Most people give very little thought to the build system, or never look much beyond the thought that “if there is a nice gui wrapped around the tool, it must be easy”.  OpenEmbedded is very popular way to build embedded Linux products, and has recently been endorsed as the technology base of MontaVista’s Linux 6 product, and is generally considered “hard to use” by new users.  While WinCE build tools look simpler on the surface (the 10% above water), you still have the problem of what happens when I need to customize something, implement complex features such as software updates, etc.  To build a production system with production grade features, you still need someone on your team who understands the OS and can work at the detail level of both the operating system, and the build system.  With either WinCE or Embedded Linux, this generally means companies either need to have experienced developers in house, or hire experts to do portions of the system software development.  System software development is not the same as application development, and is generally not something you want to take on with no experience unless you have a lot of time.  It is quite common for companies to hire expert help for the first couple projects, and then do follow-on projects in-house.  Another feature to consider is parallel build support.  With quad core workstations becoming the standard, is it a big deal that a full build can be done in 1.2 hours versus 8?  How flexible is the build system at pulling and building source code from various sources such as diverse revision control systems, etc.

Embedded processors are becoming increasingly complex.  It is no longer good enough to just have the cpu running.  If you consider the OMAP3 cpu family from TI, then you have to ask the following questions: are there libraries available for the 3D acceleration engine, and can I even get them without committing to millions of units per year?  Is there support for the DSP bridge?  What is the cost of all this?  On a recent project I was involved in, a basic WinCE BSP for the Atmel AT91SAM9260 cost $7000.  In terms of developer time, this is not much, but you have to also consider the on-going costs of maintenance, upgrading to new versions of the operating system, etc.


Both Embedded Linux and WinCE support a range of application libraries and programming languages.  C and C++ are well supported.  Most business type applications are moving to C# in the WinCE world.  Linux has Mono, which provides extensive support for .NET technologies and runs very well in embedded Linux systems.  There are numerous Java development environments available for Embedded Linux.  One area where you do run into differences is graphics libraries.  Generally the Microsoft graphical APIs are not well supported on Linux, so if you have a large application team that are die-hard windows GUI programmers, then perhaps WinCE makes sense.  However, there are many options for GUI toolkits that run on both Windows PCs and Embedded Linux devices.  Some examples include GTK+, Qt, wxWidgets, etc.  The Gimp is an example of a GTK+ application that runs on windows, plus there are many others.  The are C# bindings to GTK+ and Qt.  Another feature that seems to be coming on strong in the WinCE space is the Windows Communication Foundation (WCF).  But again, there are projects to bring WCF to Mono, depending what portions you need.  Embedded Linux support for scripting languages like Python is very good, and Python runs very well on 200MHz ARM processors.

There is often the perception that WinCE is realtime, and Linux is not.  Linux realtime support is decent in the stock kernels with the CONFIG_PREEMPT option, and real-time support is excellent with the addition of a relatively small real-time patch.  You can easily attain sub millisecond timing with Linux.  This is something that has changed in the past couple years with the merging of real-time functionality into the stock kernel.


In a productive environment, most advanced embedded applications are developed and debugged on a PC, not the target hardware.  Even in setups where remote debugging on a target system works well, debugging an application on a workstation works better.  So the fact that one solution has nice on-target debugging, where the other does not is not really relevant.  For data centric systems, it is common to have simulation modes where the application can be tested without connection to real I/O.  With both Linux and WinCE applications, application programing for an embedded device is similar to programming for a PC.  Embedded Linux takes this a step further.  Because embedded Linux technology is the same as desktop, and server Linux technology, almost everything developed for desktop/server (including system software) is available for embedded for free.  This means very complete driver support (see USB cell modem and printer examples above), robust file system support, memory management, etc.  The breadth of options for Linux is astounding, but some may consider this a negative point, and would prefer a more integrated solution like Windows CE where everything comes from one place.  There is a loss of flexibility, but in some cases, the tradeoff might be worth it.  For an example of the number of packages that can be build for Embedded Linux systems using Openembedded, see


It is important to consider trends for embedded devices with small displays being driven by Cell Phones (iPhone, Palm Pre, etc).  Standard GUI widgets that are common in desktop systems (dialog boxes, check boxes, pull down lists, etc) do not cut it for modern embedded systems.  So, it will be important to consider support for 3D effects, and widget libraries designed to be used by touch screen devices.  The Clutter library is an example of this.


Going back to the issue of debugging tools, most people stop at the scenario where the device is setting next to a workstation in the lab.  But what about when you need to troubleshoot a device that is being beta-tested half-way around the world?  That is where a command-line debugger like Gdb is an advantage, and not a disadvantage.  And how do you connect to the device if you don’t have support for cell modems in New Zealand, or an efficient connection mechanism like ssh for shell access and transferring files?


Selecting any advanced technology is not a simple task, and is fairly difficult to do even with experience.  So it is important to be asking the right questions, and looking at the decision from many angles.  Hopefully this article can help in that.  For additional assistance, please do not hesitate to contact BEC Systems — we’re here to help.


Memory Performance on various Embedded Systems

Marcin just published an interesting article about memory performance on various embedded systems using the hdparm -T as a simple benchmarq.  This test gives a pretty good indicator of memory performance in the system.  From the hdparm man page:

Perform timings of cache reads for benchmark and comparison purposes.  For meaningful results, this operation should be repeated 2-3  times on an otherwise inactive system (no other active processes) with at least a couple of megabytes of free memory.  This displays the speed of reading directly from the Linux buffer cache without disk access.  This measurement is essentially an indication of the throughput  of  the processor, cache, and memory of the system under test.

A few results I find interesting:

  • modern desktop systems have an order of magnitude more memory bandwidth than ARM systems.
  • the i.MX31 is the highest performing ARM device tested
  • the i.MX31 performs better than the OMAP3 in this test — why is this?  As the ratio is 2, I’m guessing the bus is twice as wide?

Intel Atom vs TI OMAP3

As we look at new projects, both the Intel Atom and the TI OMAP3 processors generate considerable interest.  As we have already shown, the OMAP3 does offer a considerable performance improvement over earlier generations of ARM CPUs.  The following video I found on YouTube shows a similar comparison of a OMAP3 and Atom systems rendering web pages:

As one would expect, the Atom does perform better (about 14%), but considering the power differences, the OMAP does surprisingly well.  It is also unknown in this demo if the screen size would make a significant difference in the results.  Like most things, the choice depends on the application, and no two applications are the same, and each solution has advantages.  Some things to think about:

  1. Power: OMAP3 platform consumes on the order of <1-2W while the Atom is more in the range of 2-5W.
  2. High Speed I/O Interfaces: Atom supports PCI and PCI expansion interfaces where OMAP3 is limited to more special purpose user interfaces such as SD, Camera, Asynchronous bus, etc.  Both Atom and OMAP support High Speed USB.
  3. Packaging: OMAP3 packaging is very aggressive with the stacked Package-on-Package.  To get an idea how much space an OMAP3 solution takes, check out the module from Gumstix.  There are basically only two chips in the system: the OMAP3+stacked RAM/Flash and a power management+I/O chip.  This is very high integration!
  4. Module availability: for many embedded systems with volumes in the 1000’s of units per year, a module solution is very attractive compared to a full custom design.  This drastically reduces the engineering effort and time to market.  A sampling of the modules available include:
  5. Software support:  TI and the open source community have done a remarkable job of supporting the OMAP3 with the BeagleBoard effort.  Gumstix maintains open source software for their devices, and has a very active development community.  Intel also has invested significantly in software with their Moblin project.  Other factors to consider is the boot software (bootloader vs BIOS, is it open?), are there 3D graphics libraries available, etc.
  6. Multimedia processing:  The OMAP3 is available with an on-chip DSP.  Intel has traditionally offered extensions for multimedia processing such as SIMD.



GTK performance on PXA270 vs. OMAP3

Several of my customers have built applications using the GTK+ tookit.  While GTK+ works fairly well for what we have done, I have been wondering how the performance compares on the new Omap3 processors from TI. As we are evaluating the OMAP3 for several projects, I did a simple comparison with an existing application.  Below is a video that shows a fairly complex application running on both a PXA270, and a OMAP3530.  While the PXA270 gets the job done, the result on the OMAP3 is much more pleasing.  With that advent of a OMAP3 module available for $117 in volume, it seems like the OMAP3 will be a popular solution for upcoming Embedded Linux projects.