Mixing C and Java programming in embedded, IoT designs
Although the Java language is the number one programming language in the world [1, 2], one may think that its adoption is lagging in "traditional" embedded systems because of its "fat and slow" reputation.
Actually, Java technologies changed the game in one particular type of embedded system, which is the cell phone. Cell phones have always had their specific hardware platforms and operating systems (for example, the Nokia Symbian and BlackBerry OS), but the advent of smartphones contributed to the emergence of app ecosystems such as Google's Android. Android apps are programmed in the Java language, so Java, in fact, has already won a significant percentage of embedded systems development – yes, smartphones have become "big" devices with powerful processors and plenty of memory/storage, but they're still embedded devices.
Today, the Java language is winning more and more designs in traditional non-mobile embedded systems and, in conjunction with real-time operating systems (RTOSs) and traditional C programming, is poised to become the solution of choice for IoT developers. To understand why, let's explore in more detail.
Taking embedded development to the next level
The Internet of Things is the next level for embedded systems, as IoT can be seen as "embedded" on a much larger scale:
• Programmability: Billions of IoT devices cannot be programmed with the limited number (in the hundred thousand range) of embedded/C/RTOS experts in the world. Industry needs to leverage larger communities (millions) of programmers from mobile/PC/server to meet the massive demands of the Internet of Things.
• Connectivity: IoT involves multiple wired and wireless physical layers, IP-based protocols such as UDP, TCP, TLS, and HTTP (often exposed through REST-style APIs), as well as newer protocols and frameworks like CoAP, MQTT, and LWM2M.
• Complexity: IoT devices embed larger software content with more features and the capability to add new features dynamically (in the field) to address evolving technical or market needs.
• User experience: Consumers expect to interact with IoT devices as they do with their smartphones and tablets.
• Security: IoT devices need security at all levels – code execution, communications, identification/authentication, data storage, etc.
Java platforms provide a good solution to these challenges, as:
• Java is the number one language in the world, and it is taught in most university software engineering programs.
• Java platforms offer generic implementation and APIs for IP-based networking, IoT protocols, and most non-IP protocols.
• The Java language and object-oriented programming (OOP) are well known for minimizing complexity, improving productivity, and reducing bugs.
• Java platforms enable dynamic downloading of code.
• Java platforms provide built-in security.
The success of Java platform implementations in embedded systems depends on tight integration with the underlying world of C, and on leveraging it to the fullest extent. Java programming is not meant to replace C programming, as the C language and RTOSs are very good at providing a base runtime on top of embedded microprocessors (MPUs) and microcontrollers (MCUs), and at solving challenges associated with hardware-dependent software. However, Java programming is better at dealing with (developing, debugging, and maintaining) larger software packages and complexity, and at addressing hardware-independent application code.
Just like Android's virtual machine sits on top of Linux, an embedded Java platform can sit on top of an embedded RTOS and C runtime. The embedded Java platform has to be open and integrated as an independent piece of software by the C developer responsible for software bring-up on the embedded hardware, but this combined approach allows embedded projects to benefit from the best of both worlds: C for hardware interfacing and performance and Java for portability and scalability. Projects can also solve device programmability and software productivity issues as a few low-level C developers can enable dozens of higher level Java developers to build Java platforms on top of their C runtime.
Four key ingredients for Java integration
Java source code is compiled into a specific format called bytecode stored in class (.class) files. Class files are usually packaged into Java archive (.jar) files, which are in fact zip files that first require inflating before their bytecode can be executed. Standard Java platforms on PCs dynamically interpret bytecode with a Java virtual machine and compile it to machine code on the fly for performance improvement using a just-in-time (JIT) compiler. Unfortunately, this process cannot be transposed to MCU-based systems because it requires a lot of memory and fast processors (for storage, the inflating program, and running the JIT compiler) that are beyond the capabilities of that class of device.
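To make the packaging step concrete, here is a minimal sketch in standard desktop Java showing that a .jar really is a zip archive whose entries must be inflated before use. The class name JarIsZipDemo, the entry name Hello.class, and the zero-filled payload are placeholders invented for this illustration; the payload is not real bytecode.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class JarIsZipDemo {

    /** Builds a small .jar in memory containing a single entry. */
    static byte[] buildJar(String entryName, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(out)) {
            jar.putNextEntry(new JarEntry(entryName)); // entries are deflated by default
            jar.write(payload);
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Reads the archive back with the plain zip reader, inflating each entry. */
    static List<String> listEntries(byte[] jarBytes) {
        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                byte[] inflated = zip.readAllBytes(); // reads to the end of this entry
                result.add(entry.getName() + ":" + inflated.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] placeholder = new byte[1024]; // stand-in for a class file, not real bytecode
        byte[] jar = buildJar("Hello.class", placeholder);
        System.out.println("archive size in bytes: " + jar.length);
        System.out.println("entries (name:inflated size): " + listEntries(jar));
    }
}
```

The inflating code and the buffer holding the inflated bytes are exactly the kind of runtime cost that a small MCU would have to pay if archives were shipped to the device as-is.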
But four key ingredients exist that make Java platforms suitable for integration with an embedded C-based environment with minimal memory footprint overhead (tens of kilobytes) and equivalent performance (yes, Java code can run as fast as C code). Let's review them:
1. A single, standards-based binary code format
The Executable and Linkable Format (ELF) [3] has become the de facto industry standard binary format for compiled code on MCUs. It is supported by the open source GNU GCC toolchain and by other commercial toolchains. ARM, the industry-leading MCU architecture, defines its application binary interface (ABI) and relocations based on ELF.
ELF should be used as the unique and final binary code format for all programming languages used in an embedded software project.
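As a small illustration of what "a single binary format" means in practice, the sketch below inspects the first bytes of an ELF header: the magic number, the byte-order flag, and the target machine field. The class name ElfHeaderPeek and the hand-built sample header are assumptions made for this example; a real tool would read the header from an object file produced by the toolchain.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ElfHeaderPeek {

    static final int EM_ARM = 40; // e_machine value assigned to ARM

    /** True if the bytes start with the ELF magic number: 0x7F 'E' 'L' 'F'. */
    static boolean isElf(byte[] h) {
        return h.length >= 20 && h[0] == 0x7F && h[1] == 'E' && h[2] == 'L' && h[3] == 'F';
    }

    /** Reads e_machine (offset 18), honoring the byte order declared at offset 5. */
    static int machine(byte[] h) {
        if (!isElf(h)) {
            throw new IllegalArgumentException("not an ELF header");
        }
        ByteOrder order = (h[5] == 2) ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        return ByteBuffer.wrap(h).order(order).getShort(18) & 0xFFFF;
    }

    /** Hand-built stand-in for the start of a 32-bit little-endian ARM object file. */
    static byte[] sampleArmHeader() {
        byte[] h = new byte[20];
        h[0] = 0x7F; h[1] = 'E'; h[2] = 'L'; h[3] = 'F';
        h[4] = 1;       // EI_CLASS: 32-bit
        h[5] = 1;       // EI_DATA: little-endian
        h[6] = 1;       // EI_VERSION: current
        h[16] = 1;      // e_type: relocatable object (low byte)
        h[18] = EM_ARM; // e_machine: ARM (low byte)
        return h;
    }

    public static void main(String[] args) {
        byte[] header = sampleArmHeader();
        System.out.println("is ELF: " + isElf(header));
        System.out.println("targets ARM: " + (machine(header) == EM_ARM));
    }
}
```

Whether an object file came from C or from compiled Java bytecode, the linker sees the same header layout, which is what lets one toolchain handle both.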
2. Minimal onboard runtime linking
The bytecode format should not be considered as an embedded binary format, but rather as an intermediate format between the source code and the binary (machine-specific) code that is compiled and linked off-board (cross-compilation process). Off-board bytecode compilation, or ahead-of-time (AOT) compilation and linking, allows one to leverage desktop compiler optimization techniques and take advantage of the underlying instruction set and its characteristics to produce efficient code.
The Java code has to be programmed and linked into flash memory at the same time as the C code. With such an implementation, no special Java linking program is required in on-board flash: the embedded virtual machine library is just a small runtime engine that can cost only a few tens of kilobytes. All code can be directly executed in place to ensure short boot time.
3. A single, standards-based native linker
The main idea behind successful integration of a Java platform on MCU-based systems is to simply see the Java language as another programming language in addition to the C language, without having to change production toolchains used today by C developers. This involves converting Java bytecode into ELF that can be mixed with ELF coming from compiled C code using off-the-shelf linker tools:
• The bytecode is compiled into a regular object file by a dedicated off-board compiler. Java functions are compiled to regular ELF sections, each identified by an ELF symbol whose name follows a convention that lets standard ELF linkers resolve Java symbols.
• The virtual machine is just a new ELF library added to the global project.
• The virtual machine APIs are described using regular C header files. APIs must be as generic as possible to enable porting the virtual machine to any underlying C runtime and associated RTOS, drivers, board support package (BSP), and C libraries. In extreme cases, only a timer is required when the virtual machine integrates its own internal scheduler, thus no RTOS is required.
• All of the (mixed) object files are statically linked with an off-the-shelf ELF linker. C developers can still use their favorite toolchain and integrated development environment (IDE).
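The naming convention mentioned in the steps above is not spelled out here, so the following is only a sketch of what such a convention could look like. It is loosely modeled on the name mangling that standard JNI uses for native methods ("Java_" prefix, "_" as separator, "_1" for a literal underscore); the class SymbolNames and its methods are hypothetical and do not describe any particular embedded Java product.

```java
final class SymbolNames {

    /** Escapes one name so that it only contains characters valid in a C symbol. */
    static String escape(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');                           // package separator
            } else if (c == '_') {
                sb.append("_1");                          // literal underscore
            } else if (c < 128 && Character.isLetterOrDigit(c)) {
                sb.append(c);                             // plain ASCII stays as-is
            } else {
                sb.append(String.format("_0%04x", (int) c)); // everything else as hex
            }
        }
        return sb.toString();
    }

    /** Builds a linker-visible symbol name for a method of a given class. */
    static String symbolFor(String className, String methodName) {
        return "Java_" + escape(className) + "_" + escape(methodName);
    }

    public static void main(String[] args) {
        // prints "Java_com_example_Sensor_read_1value"
        System.out.println(symbolFor("com.example.Sensor", "read_value"));
    }
}
```

Because the result is an ordinary C-compatible identifier, an off-the-shelf ELF linker can resolve it like any other symbol, without knowing that it originated from Java code.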
Figure 1 shows the full mixed C and Java code compilation and linking steps.
4. Optimized Java-to-C code programming bridges
The embedded Java programming environment must offer access to embedded-specific capabilities that are readily available in C:
• Immutable data (read-only data) for managing persistent (const) data stored in flash
• Bridges between Java and C programs linked as standard function calls, so that any routine can be turned into C/assembly code if needed with zero runtime linking cost (linking Java code to C code is done by the off-the-shelf ELF linker)
• Fixed-size buffer sharing without any copy
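The Java side of such a bridge can be sketched with standard Java constructs: a method declared native, and a fixed-size direct ByteBuffer whose memory lives outside the Java heap so that C code can access it without copying. The names SharedBufferDemo, adcRead, and fakeDriverFill are hypothetical. In this demo the native method is only declared and never called, and a second Java view of the same memory stands in for the C driver; the bridge mechanism of an actual embedded Java platform may differ from desktop JNI.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SharedBufferDemo {

    static final int CAPACITY = 64; // fixed size, decided once at startup

    /** Direct buffer: allocated outside the Java heap, so native code can address it. */
    static final ByteBuffer SAMPLES =
            ByteBuffer.allocateDirect(CAPACITY).order(ByteOrder.LITTLE_ENDIAN);

    /** Hypothetical C routine that would fill the buffer; declared only, never called here. */
    static native int adcRead(ByteBuffer samples, int count);

    /** Stand-in for the C side: writes 16-bit samples through a view of the same memory. */
    static void fakeDriverFill(ByteBuffer view, int count) {
        for (int i = 0; i < count; i++) {
            view.putShort(i * 2, (short) (100 + i));
        }
    }

    /** Java-side consumer: reads the samples in place, without copying them. */
    static int sum(ByteBuffer buf, int count) {
        int total = 0;
        for (int i = 0; i < count; i++) {
            total += buf.getShort(i * 2);
        }
        return total;
    }

    public static void main(String[] args) {
        // duplicate() shares the memory but resets the byte order, so set it again
        ByteBuffer driverView = SAMPLES.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        fakeDriverFill(driverView, 4);
        System.out.println("direct buffer: " + SAMPLES.isDirect());
        System.out.println("sum of 4 samples: " + sum(SAMPLES, 4)); // 100+101+102+103 = 406
    }
}
```

The point of the design is that producer and consumer agree on one block of memory up front, so sample data never has to be copied across the language boundary.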
The embedded Java runtime environment has to be implemented in an optimized way on top of the C runtime in order to:
• Provide an autonomous scheduler with built-in threads to ensure predictable scheduling adapted to embedded constraints ("green thread" integration to the RTOS: all Java threads run inside a single RTOS thread)
• Support object-oriented specifics (e.g., late binding in order to manage polymorphism)
• Manage memory (e.g., garbage collection adapted to embedded constraints, optimized array copy based on the C memcpy)
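The first two points can be illustrated with a deliberately simplified sketch: several logical tasks share a single underlying thread through a round-robin loop, and the loop invokes each task through an interface, so the method that actually runs is chosen at run time (late binding). The names GreenThreadDemo, Task, and Counter are invented for this example. A real green-thread scheduler saves and restores thread contexts and can preempt; this cooperative version only shows the scheduling idea.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class GreenThreadDemo {

    /** One cooperative task: step() runs a slice of work, returns true if more remains. */
    interface Task {
        boolean step();
    }

    /** Example task that logs its name and a counter until it reaches a limit. */
    static final class Counter implements Task {
        private final String name;
        private final int limit;
        private final List<String> log;
        private int n;

        Counter(String name, int limit, List<String> log) {
            this.name = name;
            this.limit = limit;
            this.log = log;
        }

        @Override
        public boolean step() {
            log.add(name + n);
            n++;
            return n < limit;
        }
    }

    /** Round-robin loop: every task runs inside the caller's single thread. */
    static void runAll(Deque<Task> ready) {
        while (!ready.isEmpty()) {
            Task t = ready.pollFirst();
            if (t.step()) {          // late binding: dispatches to the task's own step()
                ready.addLast(t);    // not finished, goes to the back of the queue
            }
        }
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        Deque<Task> ready = new ArrayDeque<>();
        ready.add(new Counter("A", 2, log));
        ready.add(new Counter("B", 3, log));
        runAll(ready);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [A0, B0, A1, B1, B2]
    }
}
```

Since the order of execution is decided entirely by the loop, the interleaving is deterministic, which is the property that makes this style of scheduling attractive under embedded constraints.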
This enables easy reuse of legacy C code and integration of that code into the global Java application code.
It is common practice in software engineering to link object files with the linker provided by the same toolchain that compiled them. This avoids the issues that arise when linking objects with different binary formats. The rule remains true for embedded C and Java programming on MCU-based systems.
The four ingredients detailed previously ensure that using the Java language for programming MCU-based embedded systems does not result in a large footprint overhead. Furthermore, developers can benefit from the compactness of the Java bytecode. Developers can use widespread Java APIs (e.g., for networking, file systems) that make software truly portable. They don't need to port their source code to heterogeneous C APIs, stacks, and compilers, or work around an unequal level of support for standards like POSIX across MCU/RTOS/compilers. Off-the-shelf binary components can be created and reused across multiple MCU architectures and associated C runtimes without porting or even re-compiling source code. Binary components can be configured at link time (using link-time constants), avoiding source-level configurations with C #define statements and interdependent source files.
1. The PYPL PopularitY of Programming Language Index: http://pypl.github.io/PYPL.html
2. The TIOBE Programming Community Index: http://www.tiobe.com/tiobe_index?page=index
3. The industry-accepted ELF specification document is chapter "Object File Format" of the "Sun Solaris Linkers and Libraries Guide" (https://docs.oracle.com/cd/E26502_01/pdf/E26507.pdf)