numx: The Math Library Embedded Systems Always Deserved.

numx is a portable, scientific-grade numerical computing library written in pure C99. No heap. No operating system. No external dependencies.
Complete modules covering linear algebra, signal processing, automatic differentiation, ODE solvers, compressed sensing, and more, all designed to run directly on the processors where every byte and every clock cycle carries real weight.
It targets ESP32, ARM Cortex-M, RISC-V, and any other platform with a C99 compiler.

Where It Started

It was 2020. COVID had put the world on pause, and two engineers found themselves with something rare: uninterrupted time to actually build. Working out of the same house for weeks on end, deep in a project called DENU, the kind of focused collaboration that normally takes months was happening in days.

Both had backgrounds in electronics, robotics, and AI. Both had spent years building systems that had to perform under real hardware constraints. And both kept running into the same wall: every time the work touched serious numerical computation, the ecosystem had the same answer. Offload it. Add a runtime. Get a bigger machine. Do not try this on a microcontroller.

That answer stopped being acceptable. The repo was created, the first functions were written, and numx began as something quiet and honest: a tool built to solve a problem that everyone else had decided was not worth solving.

What It Became

What started as a handful of utility functions grew steadily and honestly, driven by use cases rather than ambition. Every new engineering challenge revealed another gap in what embedded developers had available to them. Every gap became a module. The library expanded not because of a roadmap, but because the problems kept arriving and the existing tools kept falling short.

By the time numx had its first complete shape, it covered a scope that had never existed before in a single, dependency-free C99 library: linear algebra, statistics, numerical integration and differentiation, interpolation, polynomial arithmetic, ordinary differential equation solvers, signal processing, FFT, automatic differentiation, compressed sensing, and randomized matrix algorithms. Thirteen modules. All of them allocation-free. All of them reentrant. All of them designed to compile without modification on every major embedded toolchain.

The library did not try to be everything to everyone. It tried to be exactly what embedded engineers actually need, written the way embedded engineers actually work.

Proven in Production

When NIKX Technologies was founded in June 2024, numx had been in development since 2020. It was not a prototype or a proof of concept. It was a mature, tested library that had been shaped by real engineering problems and hardened by real constraints.

Through 2024 and 2025, NIKX developed the TERRA platform, a production IoT system running on ESP32 microcontrollers. TERRA did not create numx. But it put numx through exactly the kind of sustained, real-world use that separates libraries that look good on paper from libraries that hold up in production. It ran signal processing pipelines, numerical solvers, and on-device mathematical operations that, according to the conventional wisdom of the embedded world, had no business running on a microcontroller. They ran. They ran reliably, at speed, within the memory constraints of the hardware, without sending a single computation to the cloud.

TERRA validated every architectural decision that had gone into numx since 2020. Zero dynamic allocation was not just a design preference. It was the difference between a library that works in production and one that fails unpredictably under load.

What Is Inside

numx covers the full spectrum of numerical computing that embedded engineers actually encounter, from the fundamentals of linear algebra and statistics all the way to advanced algorithms that have never before been available in a bare-metal C library. Each module is self-contained, allocation-free, and designed to integrate cleanly into any embedded project regardless of the platform or toolchain.

Module Grid:

linalg: Dot product, vector norms, cross product, matrix multiply, transpose, determinant, and LU decomposition

stats: Mean, variance, standard deviation, median, and percentile calculations

roots: Root finding via bisection, Newton-Raphson, and Brent's method

integrate: Numerical integration using trapezoidal, Simpson's rule, and Gaussian quadrature methods

differentiate: Forward difference, central difference, and Richardson extrapolation differentiation

interpolate: Linear interpolation, cubic spline, and Chebyshev polynomial interpolation

poly: Polynomial evaluation via Horner's method, root finding via Newton with deflation

ode: Ordinary differential equation solvers: fixed-step RK4 and adaptive RK45 with error control

signal: FIR and IIR filters, convolution, correlation, windowing functions, exponential moving average, and peak detection

fft: Cooley-Tukey FFT in both float32 and Q15 fixed-point, IFFT, and magnitude spectrum

autodiff: Automatic differentiation in both forward mode using dual numbers and reverse mode using a static tape

compressed sensing: Sparse signal recovery via Orthogonal Matching Pursuit (OMP) and the Iterative Shrinkage-Thresholding Algorithm (ISTA)

sketch: Randomized SVD using the Halko-Martinsson-Tropp algorithm for large-scale matrix approximation

ntt: Number Theoretic Transform with constant-time implementation and Kyber and Dilithium parameters for post-quantum cryptography
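
To give a feel for the kind of routine the grid describes, here is a minimal sketch of Horner's method, the evaluation scheme named in the poly entry. It is not taken from numx: the function name `demo_horner_eval` and the lowest-degree-first coefficient order are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical helper, NOT a numx function.
 * Evaluates c[0] + c[1]*x + ... + c[n-1]*x^(n-1) with Horner's method:
 * one multiply and one add per coefficient, no scratch memory.
 * Coefficient order (lowest degree first) is an assumption of this sketch. */
static float demo_horner_eval(const float *c, size_t n, float x)
{
    float acc = 0.0f;
    for (size_t i = n; i > 0; --i) {
        acc = acc * x + c[i - 1];
    }
    return acc;
}
```

The loop walks from the highest-degree coefficient down, so the polynomial is never expanded into explicit powers of x.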

The Engineering Principles

Every decision in numx was made in the context of the hardware it runs on. That context shaped the architecture at every level, and it explains choices that might look unusual when compared to general-purpose numerical libraries.

There is no dynamic memory allocation anywhere in numx. Not a single call to malloc, calloc, or realloc in the entire codebase. On embedded systems, heap fragmentation is not a theoretical concern. It is a production failure mode. Every buffer, every intermediate result, every output lives either on the stack or in caller-provided memory. The library never surprises you with a memory footprint larger than what you gave it.
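
The caller-provided-memory pattern described above might look roughly like the sketch below. The function name `demo_convolve_into`, its parameters, and its integer return codes are all hypothetical and are not numx's real signature. The point is only that the caller owns the output buffer and states its capacity.

```c
#include <stddef.h>

/* Hypothetical sketch, NOT numx's actual API.
 * Full linear convolution of x (nx samples) with h (nh taps).
 * The caller owns `out` and passes its capacity; nothing is allocated here.
 * Returns 0 on success, -1 on invalid arguments, -2 if `out` is too small. */
static int demo_convolve_into(const float *x, size_t nx,
                              const float *h, size_t nh,
                              float *out, size_t out_cap)
{
    if (x == NULL || h == NULL || out == NULL || nx == 0 || nh == 0) {
        return -1;
    }
    size_t n_out = nx + nh - 1;
    if (out_cap < n_out) {
        return -2;
    }
    for (size_t i = 0; i < n_out; ++i) {
        out[i] = 0.0f;
    }
    for (size_t i = 0; i < nx; ++i) {
        for (size_t j = 0; j < nh; ++j) {
            out[i + j] += x[i] * h[j];
        }
    }
    return 0;
}
```

Because the required output length (nx + nh - 1) is checked against the stated capacity, the worst-case memory footprint is known before the call is made.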

Every function in numx is reentrant. It holds no global state, no static mutable buffers, no hidden dependencies between calls. This matters in real embedded systems where the same library functions may be called from interrupt handlers, from RTOS tasks running concurrently, or from multiple execution contexts on multicore processors. Reentrancy is not a feature. It is a requirement for production use.
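
One common way to get this property is to keep all filter state in a struct the caller owns. The sketch below shows an exponential moving average written that way. The type `demo_ema_state` and both function names are invented for this illustration and should not be read as numx's interface.

```c
/* Hypothetical types and names, NOT numx's API. */
typedef struct {
    float alpha;   /* smoothing factor, expected in (0, 1] */
    float value;   /* current filtered value */
    int   primed;  /* 0 until the first sample has been seen */
} demo_ema_state;

static void demo_ema_init(demo_ema_state *s, float alpha)
{
    s->alpha = alpha;
    s->value = 0.0f;
    s->primed = 0;
}

/* All state lives in *s: no globals, no static locals. Two contexts that
 * each hold their own demo_ema_state can call this without interfering. */
static float demo_ema_update(demo_ema_state *s, float sample)
{
    if (!s->primed) {
        s->value = sample;   /* seed the filter with the first sample */
        s->primed = 1;
    } else {
        s->value += s->alpha * (sample - s->value);
    }
    return s->value;
}
```

Note that reentrancy covers independent state objects. If two contexts share the same struct, the caller still has to provide its own locking or interrupt masking.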

Every function returns a typed status code. There is no silent failure in numx. If a solver does not converge, if a matrix is singular, if the input data falls outside the valid range for an algorithm, the caller knows exactly what happened and why. This is the only honest way to build software for systems where a silent failure can mean a sensor reading the wrong value, a control loop running on corrupted data, or a device that behaves incorrectly in the field.
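
A status-code convention of this kind could be sketched as follows, using bisection as the example solver. The enum `demo_status`, its members, and `demo_bisect` are hypothetical names chosen for this illustration. numx's actual codes and signatures may differ.

```c
#include <stddef.h>

/* Hypothetical status codes, NOT numx's real enum. */
typedef enum {
    DEMO_OK = 0,
    DEMO_ERR_INVALID_ARG,
    DEMO_ERR_NO_BRACKET,      /* f(a) and f(b) have the same sign */
    DEMO_ERR_NO_CONVERGENCE   /* iteration budget exhausted */
} demo_status;

typedef double (*demo_fn)(double x);

/* Bisection on [a, b]. The result goes through *root; the return value
 * only ever reports what happened. */
static demo_status demo_bisect(demo_fn f, double a, double b,
                               double tol, int max_iter, double *root)
{
    if (f == NULL || root == NULL || !(a < b) || !(tol > 0.0) || max_iter <= 0) {
        return DEMO_ERR_INVALID_ARG;
    }
    double fa = f(a);
    double fb = f(b);
    if (fa == 0.0) { *root = a; return DEMO_OK; }
    if (fb == 0.0) { *root = b; return DEMO_OK; }
    if ((fa < 0.0) == (fb < 0.0)) {
        return DEMO_ERR_NO_BRACKET;
    }
    for (int i = 0; i < max_iter; ++i) {
        double half = 0.5 * (b - a);
        double mid = a + half;
        double fm = f(mid);
        if (fm == 0.0 || half < tol) { *root = mid; return DEMO_OK; }
        if ((fm < 0.0) == (fa < 0.0)) { a = mid; fa = fm; }
        else                          { b = mid; }
    }
    return DEMO_ERR_NO_CONVERGENCE;
}

/* Example target: f(x) = x^2 - 2, whose positive root is sqrt(2). */
static double demo_parabola(double x) { return x * x - 2.0; }
```

Keeping the numerical result in an output parameter and the outcome in the return value means a caller cannot read a root without first having a status to check.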

The entire precision of the library, float32 or float64, switches with a single compile-time flag. On hardware with a double-precision floating-point unit, float64 arithmetic runs in hardware rather than in software emulation. On hardware with a single-precision FPU or no FPU at all, where memory and speed are tighter, float32 keeps the footprint minimal. One library, one codebase, the right precision for every target.
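
A compile-time precision switch is usually built from a typedef and a literal macro. The sketch below shows one way to do it. The flag `DEMO_USE_FLOAT64`, the type `demo_real`, and the macro `DEMO_REAL_C` are assumptions for illustration and are not the names numx uses.

```c
#include <stddef.h>

/* Hypothetical flag and names, NOT numx's actual configuration macros.
 * Build with -DDEMO_USE_FLOAT64 to select double; the default is float. */
#ifdef DEMO_USE_FLOAT64
typedef double demo_real;
#define DEMO_REAL_C(x) (x)        /* literal stays a double */
#else
typedef float demo_real;
#define DEMO_REAL_C(x) (x##f)     /* literal gets an 'f' suffix */
#endif

/* Written once against demo_real; the flag decides the precision. */
static demo_real demo_dot(const demo_real *a, const demo_real *b, size_t n)
{
    demo_real acc = DEMO_REAL_C(0.0);
    for (size_t i = 0; i < n; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}
```

The literal macro matters on single-precision targets: an unsuffixed constant such as 0.5 is a double in C, and mixing it into float arithmetic can silently pull in software double-precision routines.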

Who It Is For

The embedded world is full of capable hardware running software that was never built to take advantage of it. ESP32 microcontrollers run at up to 240 MHz with single-precision hardware floating-point support. ARM Cortex-M4F processors handle single-precision arithmetic in silicon, and Cortex-M7 parts are available with double-precision FPUs. RISC-V cores are shipping in industrial sensors, medical devices, and wearables at a scale that was unimaginable five years ago.

The assumption that serious numerical computing requires a Linux machine, a Python runtime, and a gigabyte of available memory is not a technical truth. It is a legacy of a time when the tools did not exist to do anything better. numx exists to replace that assumption with something real.

If you are building an IoT platform and need to run signal processing pipelines directly on the device, numx gives you production-grade FIR and IIR filters, FFT, convolution, and peak detection without touching the heap. If you are building a control system with physics-based models, the ODE solvers give you both fixed-step RK4 for deterministic real-time loops and adaptive RK45 with error control for accuracy-critical applications. If you are working on edge AI and need compressed sensing for sparse signal recovery, that is in there too, running entirely on-device without any cloud dependency.
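
For readers who have not met it, the fixed-step RK4 scheme mentioned above fits in a few lines. The following is a scalar sketch under assumed names (`demo_rk4_step`, `demo_rhs`, `demo_decay`). It does not reproduce numx's ODE interface, which is not shown in this document.

```c
/* Hypothetical names, NOT numx's ODE API.
 * Right-hand side of a scalar ODE: dy/dt = f(t, y). */
typedef float (*demo_rhs)(float t, float y);

/* One classical fourth-order Runge-Kutta step of size h.
 * Four evaluations of f, no allocation, fixed cost per step. */
static float demo_rk4_step(demo_rhs f, float t, float y, float h)
{
    float k1 = f(t, y);
    float k2 = f(t + 0.5f * h, y + 0.5f * h * k1);
    float k3 = f(t + 0.5f * h, y + 0.5f * h * k2);
    float k4 = f(t + h, y + h * k3);
    return y + (h / 6.0f) * (k1 + 2.0f * k2 + 2.0f * k3 + k4);
}

/* Example system: exponential decay, dy/dt = -y, exact solution exp(-t). */
static float demo_decay(float t, float y)
{
    (void)t;   /* this right-hand side does not depend on time */
    return -y;
}
```

Every step costs exactly four evaluations of the right-hand side, which is why a fixed-step method suits a control loop with a hard timing budget. An adaptive method trades that predictability for accuracy control.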

numx is also one of the few C libraries that bring automatic differentiation to bare-metal embedded systems. Forward mode with dual numbers and reverse mode with a static tape are both available without any runtime overhead beyond what the algorithm itself requires. If you are implementing a neural network, a physical model, or any system where you need exact gradients rather than finite difference approximations, this is the tool that makes it possible on the hardware you are already using.
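
Forward-mode differentiation with dual numbers can be illustrated with a very small sketch. Each value carries its derivative alongside it, and every arithmetic operation updates both. The type `demo_dual` and the helper functions are hypothetical names for this example, not numx's autodiff interface.

```c
/* Hypothetical types and names, NOT numx's autodiff API.
 * A dual number holds a value and its derivative with respect to the input. */
typedef struct {
    float val;
    float der;
} demo_dual;

/* The independent variable: d(x)/dx = 1. */
static demo_dual demo_dual_var(float x)
{
    demo_dual d = { x, 1.0f };
    return d;
}

/* A constant: d(c)/dx = 0. */
static demo_dual demo_dual_const(float c)
{
    demo_dual d = { c, 0.0f };
    return d;
}

/* Sum rule. */
static demo_dual demo_dual_add(demo_dual a, demo_dual b)
{
    demo_dual d = { a.val + b.val, a.der + b.der };
    return d;
}

/* Product rule. */
static demo_dual demo_dual_mul(demo_dual a, demo_dual b)
{
    demo_dual d = { a.val * b.val, a.der * b.val + a.val * b.der };
    return d;
}

/* Example function: f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2. */
static demo_dual demo_f(demo_dual x)
{
    demo_dual x2 = demo_dual_mul(x, x);
    demo_dual x3 = demo_dual_mul(x2, x);
    return demo_dual_add(x3, demo_dual_mul(demo_dual_const(2.0f), x));
}
```

Evaluating `demo_f(demo_dual_var(3.0f))` yields the function value in `val` and the derivative in `der` from a single pass, with no step size to tune as there would be with finite differences.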

Open Source, Openly Available

numx is released under the MIT License. Use it in personal projects, commercial products, academic research, or anything else you are building. No royalties, no attribution requirements beyond keeping the license notice, no restrictions on what you build with it.

The library compiles without modification on GCC, Clang, IAR, and every other major embedded toolchain. There is no platform-specific code, no architecture-specific intrinsics, no dependencies on any header or library outside the C99 standard library. If your toolchain compiles C99, numx compiles on your toolchain.

An academic paper is currently in preparation that documents the algorithms, the design decisions, and the performance characteristics of the library across multiple embedded platforms. Once published, numx will be formally citable for research and academic work.

Support for the Arduino and PlatformIO package registries is on the roadmap, which will make adding numx to any project as straightforward as adding any other library dependency.

Our Commitment

numx was not built to be a product. It was built to solve a problem that the embedded community had been living with for too long. Every algorithm is documented. Every function is tested. Every design decision is visible in the code and open to scrutiny, criticism, and improvement from anyone who works with it.

The embedded engineering community has produced extraordinary hardware, extraordinary toolchains, and extraordinary projects with tools that were never quite good enough for the mathematics the hardware could actually support. numx is an attempt to close that gap, to give the embedded world a numerical computing library that is as serious about quality and correctness as the engineers who will use it.

It is open source because that is the only way it makes sense. Good tools should be available to everyone building on constrained hardware, regardless of the size of their team or the budget of their project.

Made with ❤️ in Amsterdam