Are you considering implementing a web UI in your Zephyr RTOS application and perplexed by the endless number of web technologies available today? This article traces my journey through trying several options and presents some of the tradeoffs.

Modern microcontrollers (MCUs) have many connectivity options (Ethernet, Wi-Fi, Bluetooth, Cellular, etc.), and when coupled with the Zephyr OS, implementing a web application on these devices often makes sense. Once you start doing this, you soon realize that this is not Ruby-on-Rails. Rather, Zephyr is still a very constrained environment (see my article on the differences between MPUs and MCUs). While we can still leverage modern web technologies, we need to be selective because of the constraints of MCU systems. An example of a web UI in the Zephyr Simple IoT project is shown below:

Primary concerns in an MCU system

Size

The memory resources on MCU and MPU systems are vastly different — a difference of over 1,000x.

Memory	MCU (Internal)	MCU (External)	MPU (Server)
Flash	0.5–2 MB	16 MB	500 GB
RAM	0.25–1 MB	16 MB	16 GB

While it is possible to use larger memory devices with MCUs, like an SD card or SDRAM, the above table is for the more common scenarios.

If we want to serve the web application entirely from a Zephyr device, then the web assets must fit in flash. The RAM size also limits the number of connections we can handle and the size of the data payloads we can transmit and receive.
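As a rough illustration of the flash constraint, the sketch below checks whether a set of web assets fits in the flash reserved for them. The asset names, their sizes, and the 256 KB partition are hypothetical numbers chosen for the example, not measurements from a real project.

```typescript
// Hypothetical asset sizes in bytes (illustrative only, not measured).
type Asset = { name: string; bytes: number };

const KB = 1024;

// Sum the asset sizes and compare them with the flash reserved for web content.
function fitsInFlash(
  assets: Asset[],
  budgetBytes: number
): { total: number; fits: boolean; remaining: number } {
  const total = assets.reduce((sum, a) => sum + a.bytes, 0);
  return { total, fits: total <= budgetBytes, remaining: budgetBytes - total };
}

// Assumption: 256 KB of a 1 MB internal flash is set aside for web assets.
const budget = 256 * KB;

// Assumption: a small compiled frontend, served compressed.
const compiledSpa: Asset[] = [
  { name: "index.html", bytes: 2 * KB },
  { name: "app.js.gz", bytes: 60 * KB },
  { name: "style.css.gz", bytes: 8 * KB },
];

// Assumption: a framework bundle of about a megabyte.
const largeFramework: Asset[] = [
  { name: "index.html", bytes: 2 * KB },
  { name: "bundle.js", bytes: 1024 * KB },
];

const small = fitsInFlash(compiledSpa, budget);
const large = fitsInFlash(largeFramework, budget);

console.log(small.fits, small.remaining / KB); // true 186
console.log(large.fits); // false
```

A check like this could run as part of the frontend build so that an oversized bundle is caught before it reaches the firmware image.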

Stability (includes maintainability and security)

The constraints of an MCU drive us toward doing more in the browser (frontend) because resources there are free and plentiful. This typically ends up being a Single-Page Application (SPA) architecture, where the web application is a free-running application that fetches data from and sends data to the MCU (backend). All rendering happens in the browser instead of on the MCU. Because of this, the web application needs to be stable. Extra care must be taken so that the application does not crash, freeze, or pop up annoying messages telling the user to reload the page, which can easily happen with JavaScript.
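One way to keep a free-running frontend from crashing on unexpected data is to validate everything that arrives from the device before it reaches the rendering code. The sketch below is a minimal example of that idea. The `Point` shape and the `decodePoints` and `isPoint` names are hypothetical and not taken from any particular project.

```typescript
// Hypothetical shape of a value reported by the device.
type Point = { type: string; value: number };

// Either decoded data or an error message; decoding never throws.
type Decoded = { ok: true; points: Point[] } | { ok: false; error: string };

// Runtime check that an unknown value really has the Point shape.
function isPoint(x: unknown): x is Point {
  if (typeof x !== "object" || x === null) return false;
  const rec = x as Record<string, unknown>;
  const t = rec["type"];
  const v = rec["value"];
  return typeof t === "string" && typeof v === "number" && isFinite(v);
}

// Parse a response body. Malformed input becomes an error value that the UI
// can display, instead of an exception in the middle of rendering.
function decodePoints(body: string): Decoded {
  let raw: unknown;
  try {
    raw = JSON.parse(body);
  } catch {
    return { ok: false, error: "response is not valid JSON" };
  }
  if (!Array.isArray(raw)) {
    return { ok: false, error: "expected an array of points" };
  }
  const points: Point[] = [];
  for (const item of raw) {
    if (!isPoint(item)) {
      return { ok: false, error: "malformed point in response" };
    }
    points.push(item);
  }
  return { ok: true, points };
}
```

With this approach, a truncated or unexpected response shows up as a visible error state rather than a frozen page.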

We also need to consider how the application is hosted. A typical web application is hosted in the cloud in one place. If there is a problem, we have complete control over updating and fixing it as needed. Embedded applications are a completely different matter — there may be thousands of devices. If there are problems, it is not as simple as updating a single server. We may have a firmware update mechanism in the device, but it is often up to the user to decide when to update. Users expect embedded devices to be stable. Many of them are used in critical applications. An instability in the web UI of the device does not give a positive impression, even though it may only be a web frontend crash, and the device is still functioning fine.

It seems simple to write an initial prototype, but maintaining a web application over time is a different matter. Security can also be a concern as many web frameworks pull in mountains of JavaScript dependencies via NPM, which are difficult to audit for security problems.

State Management

One of the hardest parts of a web application is maintaining state, because user interfaces can be deeply nested and complex. The naive approach is to distribute the state across the UI components; the difficulty is then keeping everything in sync. The continual stream of new frontend state management libraries (Redux, MobX, Vuex, React Context, XState, react-easy-state, etc.) shows that this is not an easy problem.
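One common response to the synchronization problem is to keep all state in a single value and change it only through one update function, so every component renders from the same source of truth. Elm and Redux both follow this pattern. The sketch below is a minimal illustration; the `Model` fields and the message names are hypothetical.

```typescript
// All application state lives in one value.
type Model = { bitRate: number; connected: boolean; errors: string[] };

// Every possible change is described by a message.
type Msg =
  | { kind: "setBitRate"; value: number }
  | { kind: "connectionChanged"; connected: boolean }
  | { kind: "errorReported"; text: string };

const initialModel: Model = { bitRate: 250000, connected: false, errors: [] };

// The only place state changes: a pure function from (message, old state)
// to new state. The old state is never mutated.
function update(msg: Msg, model: Model): Model {
  switch (msg.kind) {
    case "setBitRate":
      return { ...model, bitRate: msg.value };
    case "connectionChanged":
      return { ...model, connected: msg.connected };
    case "errorReported":
      return { ...model, errors: model.errors.concat([msg.text]) };
  }
}

// Rendering is derived from the state, so it cannot drift out of sync with it.
function view(model: Model): string {
  const link = model.connected ? "online" : "offline";
  return `${link} @ ${model.bitRate} bit/s, ${model.errors.length} error(s)`;
}

const msgs: Msg[] = [
  { kind: "connectionChanged", connected: true },
  { kind: "setBitRate", value: 500000 },
  { kind: "errorReported", text: "timeout" },
];

const finalModel = msgs.reduce((m, msg) => update(msg, m), initialModel);
console.log(view(finalModel)); // online @ 500000 bit/s, 1 error(s)
```

Because `update` is the single entry point for changes, there is one place to look when the displayed state is wrong.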

Options

Serve site/assets from external server

One way to work around the asset size constraint in MCUs is to serve large assets (like a Bootstrap CSS file) from an external server.

There are several potential issues with this:

The external asset must always be available, and if it goes away or changes, then the web application on the device must be updated to point to the new location.

If the entire site is served from an external server, then you must host all versions of the web application or be very careful to maintain backward compatibility to all previous versions of firmware running on the Zephyr device.

You may need the Zephyr UI to function if the Internet is down or unavailable (example: air-gapped secure setups or during device setup). In many systems, device state is mirrored to an upstream server, so the only time you would use the local UI is during setup or when the Internet is down.

The most reliable way to host a Zephyr web application is on the device itself. You certainly have many more options if you host it externally, but there are tradeoffs that must be considered.

No Build, just HTML, JS, CSS

DHH and others are advocating No-Build web solutions. However, they have Rails for a backend, and we have Zephyr. I investigated this approach, but manually mutating HTML elements in JavaScript in a complex app is messy and scattered — you’ll eventually create a framework anyway, so you may as well start with something close to meeting your needs.

HTML/JS/CSS development is powerful and flexible, but is tailored to design, not programming. In its raw state, it is like developing a program where everything is a global variable. If you have a dynamic web application responding to real-world events, this quickly becomes a programming problem that is difficult to manage with 200 global variables. You need an abstraction on top of this. Again, a powerful templating engine on the backend helps a lot, which we do not have in Zephyr applications.

React, Vue, Angular (non-compiled frameworks)

Solutions like React are too large for many MCU applications as they typically require a megabyte or more of web assets to be transferred from the host. Additionally, you have the issue of JavaScript crashes and the continual bloat and churn of the NPM package ecosystem. Updating to new versions of packages can be a lot of work as APIs sometimes change. While this solution can work for large-scale cloud-hosted apps, it is not ideal for embedded web UIs where you often don’t have control over the device and can’t easily patch the system if there is a problem.

I’ve been writing and maintaining a 50K SLOC Meteor/React app for the past 10 years. It works and has been very successful. However, we’ve had a handful of frontend crashes. Being a cloud app, we were able to deploy fixes quickly, but the crashes were still an inconvenience for users. In another project developed by a very experienced JavaScript developer, we also had a number of JavaScript crashes in the frontend. There are likely ways to reduce or avoid these stability problems, but stability is not inherent in the system/language; it requires a high degree of care and discipline. It is not easy to refactor JavaScript and know that you have fixed all the issues caused by your code change.

There are slimmer solutions like Preact, which may solve the size issue, but you likely lose some of the integration with the libraries written for React.

htmx

htmx is the cool kid on the block right now — supposedly simple and easy to use. I tried this for the first iteration of a Zephyr web application and got a prototype running, but concluded it works best if you have a powerful backend that can do most of your rendering, and the frontend just handles a little bit of reactivity. Additionally, with htmx, you typically open up a network connection for every widget to get updates, and this requires a lot of connection resources on the backend, which are expensive in a Zephyr system.
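For comparison, a free-running frontend can fetch one combined update and distribute it to every widget, so the device serves one request per refresh rather than one per widget. The sketch below shows only the distribution step, with the network call left out. `PointHub` and the point names are hypothetical.

```typescript
type Point = { type: string; value: string };
type Listener = (value: string) => void;

class PointHub {
  // Listeners grouped by point type. Object.create(null) avoids accidental
  // matches on inherited keys such as "toString".
  private listeners: Record<string, Listener[]> = Object.create(null);

  // A widget registers interest in one point type.
  subscribe(type: string, listener: Listener): void {
    (this.listeners[type] = this.listeners[type] || []).push(listener);
  }

  // Deliver one batch (the body of a single response) to all interested
  // widgets. Returns the number of listener calls made.
  dispatch(points: Point[]): number {
    let delivered = 0;
    for (const p of points) {
      for (const listener of this.listeners[p.type] || []) {
        listener(p.value);
        delivered++;
      }
    }
    return delivered;
  }
}

const hub = new PointHub();
const rendered: string[] = [];

// Three widgets, two of them showing the same point.
hub.subscribe("temperature", (v) => { rendered.push("temp=" + v); });
hub.subscribe("bitRate", (v) => { rendered.push("rate=" + v); });
hub.subscribe("temperature", (v) => { rendered.push("gauge=" + v); });

// One response body updates every widget; points nobody asked for are ignored.
const calls = hub.dispatch([
  { type: "temperature", value: "21.5" },
  { type: "bitRate", value: "250000" },
  { type: "uptime", value: "42" },
]);

console.log(calls, rendered);
```

The number of widgets can then grow without increasing the number of connections the device has to hold open.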

Svelte

Svelte is a compiled language that generates small assets, so it works well from a size perspective. It also appears to be a well-maintained project with good tooling. However, there are two downsides: 1) It is still JavaScript, so you have potential stability issues when maintaining applications long-term. 2) It puts the “JS” in the “HTML” instead of the “HTML” in the “JS” (like React or Elm). This is largely a matter of preference, but as a programmer, I prefer the latter. The success of React indicates that many others do, too.

Functional Languages

Functional languages are becoming more common in frontend development because they solve many reliability and refactoring challenges. Some options include ClojureScript, Elm (discussed below), PureScript, ReasonML, F#, Scala.js. Some of these options likely have smaller asset sizes than non-compiled options.

Elm

Elm is a compiled language that produces small assets. It is like React JSX in that it puts the “HTML” in the “Elm”. Elm is a functional language that solves many of the challenges with web programming, including stability and managing state. Runtime errors in an Elm application are very rare. I wrote the Simple IoT UI in Elm, which is now 18,610 lines of code, and have maintained it since 2018. It has never crashed, and it is easy to add new features. Code density/reuse in Elm is very high. If it compiles, it generally works, so very little time is spent in the browser console debugging. The elm-ui package offers a powerful style and layout system that eliminates most of the pain with web programming and allows someone (like me) who is not a dedicated web developer to maintain a pretty nice web application without learning the intricacies of CSS.

As you maintain an Elm project, you’ll find that the language pushes you to a cleaner design. It nudges you in the right direction. The opposite generally happens in JavaScript projects, where a lot more discipline is required to keep things from becoming a mess.

Elm allows for creating powerful functions like the following for a text input form:

textInput Point.typeBitRate "Bit rate" "250000"

This line of code renders a form element and automatically posts a new point of type BitRate to the backend when the content is changed. Nothing else in the system needs to change for this to happen, and no “handlers” need to be attached to the element. Elm also has some of the best tooling in the industry (elm-review, elm-land, etc.). This tooling is largely made possible by the language.

Elm is a small project and community, but it is a stable/friendly community that has existed for many years, and many companies are successfully using Elm. The Elm package system is fairly comprehensive and has most of what you might need to implement web apps. Since Elm packages are pure Elm, the entire ecosystem is stable and relatively secure. There is some concern about the future of Elm as the compiler does not get frequent updates. Some have produced new versions of the compiler to meet special needs. The packages and tooling around Elm are where most innovation is happening right now, and it is phenomenal. Elm got it mostly right, so there is not much that needs immediate improvement in the core.

Summary

There is no silver bullet in technology. With any choice, there are tradeoffs. Most people stick with mainstream JavaScript for frontend programming as the “safe” approach. However, in Zephyr/MCU-based systems, there are constraints that we don’t face in server environments.

There are significant challenges in the HTML/JS/CSS world. While it is nice to have options, the continual churn of new frameworks (jQuery -> Backbone -> Ember -> Angular -> Meteor -> React -> Angular2 -> Vue -> Svelte -> Next.js, and many others) indicates there is some fundamental problem with the JavaScript approach that has not been solved in a general way. The number of NPM package dependencies pulled into a typical project is staggering and nearly impossible for small teams to audit for security and longevity. NPM installs routinely break for various reasons. NPM suffers from some of the same problems as Python, as many packages compile C code in non-standard ways during the package install process, which can be fragile.

There is also the issue of what might be viewed as an impedance mismatch — what the technology mega-companies produce and use is not always the best fit for small companies/teams with limited resources. Facebook faces different problems on a different scale than a small embedded team. While React is right for them, we can’t assume it is right for everyone. Most of us are not facing Google-scale problems that require a system as complex as Kubernetes. Ansible and a few servers work just fine. Facebook has near-infinite resources. They have ways to deal with the inherent stability problems of JavaScript. A small company/team does not have these resources, so different technologies like Elm may be a better fit.

When faced with the need for a Zephyr web application, I tried htmx first, then looked at “no build”. Neither of these worked very well for me. Having worked on decent-sized React and Elm projects, Elm looked like a better fit for the next try due to asset size, reliability, and maintainability. Additionally, I had already solved the problem of how to effectively deal with data in a distributed IoT system, and had frontend code already written to interface with this dataflow in Simple IoT. It was easy to port the Simple IoT frontend to a smaller version tailored for MCUs, and thus far it seems to be working well. Check out the Simple IoT Zephyr Networking Example to learn more.

How to implement Zephyr web applications