New Project Questionnaire

When starting a new Embedded Linux or IoT project, below are some questions you might consider. Although there is some commonality, these systems differ from MCU, Web, phone, or desktop applications in that you are responsible for an end-to-end system where many pieces need to work together. Not only is the application a concern, but also the system software, connectivity, error handling/recovery, updates, cloud services, etc. Many of the edge instances are remote. You are using many components (both hardware and software) that you did not create, write, or build. In one sense, you are standing on the shoulders of giants, but you are also dependent on these giants. It only takes one piece of the system to fail for the entire product to fail. These questions are designed to encourage comprehensive and long-term thinking.

Below are definitions of common acronyms/terms:

  • MPU: microprocessor unit (runs Linux)
  • MCU: microcontroller (runs RTOS or bare metal program)
  • RTOS: real-time operating system
  • SBC: single board computer
  • SOM: system on module
  • OSS: open source software
  • LAN: local area network (Ethernet, WiFi)
  • WAN: wide area network (cellular, internet)
  • SBOM: software BOM — lists all the software components and licenses used in the product
  • BLE: Bluetooth low energy
  • PLC: programmable logic controller
  • UI: user interface
  • edge: computers located at the application site (embedded in machines, gateways, etc). Often these are located all over the world and we don’t have direct physical access to them.
  • cloud: Internet accessible compute, database, and other resources (AWS, Google Cloud, Azure, Digital Ocean, Linode, etc)

Questions

  1. Overall product requirements
    1. What is the hardware budget target/volume?
    2. What is the product lifecycle (1, 5, 10 years)?
    3. What is the data lifecycle? What and how much data is collected, where and how is it presented, what and how much is stored?
  2. Environment/Packaging/IO
    1. How will the edge MPU be implemented? Options include PLC, SBC, SOM + custom baseboard, or full custom design.
    2. What is the power budget? Is a battery or solar power required?
    3. Is the power stable and what is the effect if power is removed from the system while writing data to local storage? Is a bridge battery or super capacitor required?
    4. What is the operating temperature range?
    5. Is the enclosure vented or sealed? Are fans for cooling acceptable?
    6. Do you need expansion/remote IO? If so, what communication mechanisms will be used? (Ethernet, WiFi, USB, I2C, SPI, RS485, CAN, 1-wire, BLE, Zigbee, LoRa, etc) What are the tradeoffs?
    7. What does your IO (input/output/sensors/actuators) look like?
  3. Performance/Security/Safety/Regulatory
    1. What regulatory approvals are required? (FCC, CE, PTCRB, UL, FDA, etc.)
    2. What is the control loop timing (seconds, ms, etc.)?
    3. Is real-time response required? If so, what latency can be tolerated?
    4. How much downtime can this application tolerate — both at the edge and in the cloud? Be careful what you ask for here — 100% uptime guarantees can be costly to implement or purchase, and even large cloud providers with redundant infrastructure like Amazon occasionally have outages.
    5. Is a split MPU/MCU architecture appropriate?
    6. Does the system need to play video or advanced 3D graphics? If so, what resolution?
    7. Is graphics acceleration required?
    8. Is edge machine learning required?
    9. Are there safety concerns? Could anything bad happen (heaters causing fire, etc) if the system locks up and leaves IO in a bad state?
    10. Do edge instances need to store configuration, state, or historical data? What happens if the system crashes or loses power while writing this data? How much edge storage is required?
    11. How many different tasks will your applications be doing at once? Do these tasks need to share information? Concurrent systems can quickly become complex and a challenge to implement reliably. Have modern technologies and methods for developing highly concurrent reliable systems been considered?
    12. What are the security concerns and how will they be managed?
    13. Is the system designed with enough storage and processing headroom for future expansion over the product lifecycle?
  4. User interface/Cloud/Configuration/Connectivity
    1. What network connectivity is required? (Ethernet, WiFi, cellular)
    2. If cellular is required, how will connectivity be implemented? (modem, carrier, etc.)
    3. For headless edge systems, do you need BLE functionality for connecting locally to a phone? In this setup, a phone can act as a local display.
    4. Do you need remote access? If so, is the remote user on the same LAN (local area network), or anywhere in the world?
    5. Do you need a cloud component (for remote access, history, graphs, dashboards, logging, etc.)?
    6. If there are both local and cloud user interfaces, how are you going to merge configuration changes made in both places?
    7. If the edge system is disconnected from the cloud for some time, do you need to buffer data at the edge and send everything when connectivity is restored? How will you ensure that config/state changes made when disconnected are propagated?
    8. Is there a local user interface? If so, what resolution display is needed? What input methods will be used: touchscreen, keys, etc? Is a capacitive or resistive touchscreen more appropriate for this application?
    9. What UI technology will be used? (Qt, LVGL, GTK, wxWidgets, HTML5, Flutter, Phone App, Android, proprietary)
    10. What latency in events do you need in user interfaces? If real-time response is desired (these systems are the nicest to use), how are you going to get real-time data from sensors/actuators<->edge device<->cloud<->browser? Does real-time also include making changes in a remote browser and having them show up on the edge device instantly?
    11. Most IoT systems are optimized for getting sensor data to the cloud (one direction). Most real-world systems also need to get configuration data from the cloud to the edge system, and often in real-time. Does your IoT architecture support data going in both directions?
  5. Architecture/Development/Testing/Deployment/Maintenance/Support
    1. Do your hardware suppliers (especially SBC/SOM) or integrators offer the support you need and are they committed to providing software updates over your entire product lifecycle?
    2. How will the team learn the skills needed (Linux sys admin, Yocto/Buildroot, new programming languages, Web development, Cloud infrastructure, CI/Deployment pipelines, high-speed PCB design, etc)?
    3. What edge programming languages will you use? Which have you used in the past? How have they worked out? How will you ensure applications (especially edge apps you don’t have direct access to) are reliable, and when they crash/fail, log/debug information can be collected? Do you understand the tradeoffs between single-threaded interpreted languages like NodeJS/Python, traditional compiled languages like C++, managed languages like Java/C#, and modern compiled languages like Go/Rust? Do you understand the benefits of compilers and type systems?
    4. Are the capabilities of the web/browser platform understood in comparison to native phone and desktop application technologies?
    5. For Web applications (both edge and cloud), what backend and frontend languages/frameworks will be used?
    6. For edge systems, does it make sense to implement the core logic in a different language than the UI application?
    7. Is an event bus appropriate for this application? (NATS, MQTT, etc.)
    8. What databases will be used?
    9. What is the backup strategy?
    10. Is your system architecture easy to expand and adapt to new feature requirements? For example, if you add another sensor at the edge, do you need to touch dozens of different software modules, communication protocols, data structures, etc. to make this happen, or is it as simple as tweaking the software at the edge to read the new sensor, and possibly the UI display software?
    11. What is your plan for deploying software updates to all processing units (both MCU and MPU) in the system (edge and cloud)? Can you easily and reliably update all of the system software as well as the applications on edge devices remotely?
    12. What development skills do you have access to (HW/SW)? Are your developers good at figuring out systems/code they did not write themselves?
    13. Do you use the Git version control system?
    14. How will you maintain your product over its lifecycle and keep up-to-date with the latest OSS developments that will add value and security updates to your product? (Linux kernel, Linux libraries, application packages, tools, run-times, etc)
    15. How will the system be tested, both during development and in manufacturing?
    16. Do you have repeatable, reliable, and automated build/deployment/CI (continuous integration) systems? Can your software be built reliably with a single command by any developer, or does it require special golden machines, know-how, undocumented manual steps, etc?
    17. Is your development/deployment process documented and automated to the point that if a key developer disappears, other developers can pick up the project?
    18. What field diagnostic mechanisms are required? (remote shell access, local serial terminal access, logs, stack traces, user runnable diagnostics, metrics, etc.)
    19. In the event of file-system corruption, what are the consequences, and what is the recovery path?
    20. What are your OSS compliance requirements? Do you need SBOMs (software BOMs) or reproducible builds? What are the licenses of all the software components used and are they compatible with your business?
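
Several of the questions above (2.3, 3.10, and 5.19) ask what happens if power is lost while data is being written to local storage. One common mitigation is to write to a temporary file, sync it, and then rename it over the old file. The following is a minimal Python sketch of that pattern, assuming a Linux edge device with a POSIX filesystem. The function names `atomic_write_json` and `load_json` are hypothetical and chosen only for illustration. This does not replace hardware measures such as a bridge battery or super capacitor, and how well it holds up depends on the filesystem and storage medium in use.

```python
import json
import os
import tempfile


def atomic_write_json(path, data):
    """Write data as JSON so a reader sees either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # push the file contents to storage
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is recorded on storage.
    # Assumption: Linux, where a directory can be opened and fsync'ed.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def load_json(path, default):
    """Read JSON from path, falling back to default if missing or unreadable."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

A caller might save state with `atomic_write_json("config.json", {"setpoint": 21.5})` and read it back at boot with `load_json("config.json", {})`. The fallback in `load_json` is one possible recovery path; a real product may prefer to keep a second known-good copy instead of reverting to defaults.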

This may seem like a daunting list, but with some planning and judicious selection of technologies, this can all be managed, even with a small team. With most Embedded Linux/IoT systems that have a lifecycle of more than a couple of years, the cost of maintaining the system swamps the initial development cost. Because these are capable, expandable systems, features will continually be added and the system improved as customers require new features. Connected systems may require security updates. Bugs will need to be fixed. Thus, it is critical that the selected technologies and implementation are maintainable over the long term. If this is not planned for, complex systems accrue technical debt to the point where all you are doing is fighting fires in order to keep the system running rather than adding new features. Additionally, we’ve collected many of the best practices we’ve learned over many years into two general open source projects: the Yoe Distribution and Simple IoT.

These projects are building blocks that address many of these issues. The vision we are working toward is that a small team can implement custom end-to-end IoT systems. There are many excellent technologies available today that provide good building blocks: SBC/SOMs, Linux, cellular modem/service providers, cloud services like Linode, SQLite, InfluxDB, Grafana, modern programming languages like Go, etc. What is needed is a simple, yet comprehensive framework to glue all this together into a system and this is what the Yoe Distribution and Simple IoT aim to provide.

When faced with a task like this, many decide to build on someone else’s hosted IoT infrastructure. This may be the best or even the only practical option, but you are still ultimately responsible for all of the above points, so these still need to be investigated and verified with any vendor or supplier under consideration. Do your due diligence now and avoid surprises later.