What we mean by embedded AI
Most AI recruitment stops at the cloud boundary. The job spec says “ML Engineer,” the recruiter searches for PyTorch experience, and the shortlist lands on engineers who have trained models and deployed them to an API endpoint. That works when the model ships as a SaaS feature. It fails completely when the model has to run on a Jetson Orin in a drone, an STM32 in a sensor module, or a Hailo-8 in a factory inspection rig.
Embedded AI — sometimes called Physical AI — is the discipline of deploying machine learning on resource-constrained hardware that operates in the real world. The engineers who do this work fluently across two worlds: they understand model architectures and memory budgets, training dynamics and interrupt service routines, Python notebooks and C++ on bare metal.
The stack is fundamentally different from cloud ML. Quantisation (INT8, INT4) and pruning replace scaling laws: the question is not how large a model you can train but how small a model you can ship. On-target profiling replaces cloud autoscaling. Sensor drivers, real-time scheduling, and over-the-air update pipelines replace REST APIs and feature stores. The failure mode isn’t a 500 error; it’s a drone that drops out of the sky.
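To make the quantisation trade-off concrete, here is a minimal sketch of symmetric per-tensor INT8 post-training quantisation in pure NumPy. The function names (quantize_int8, dequantize) are illustrative, not from any toolchain; real deployments lean on converter support such as TFLite’s or TensorRT’s calibration flows.

```python
import numpy as np

def quantize_int8(weights: np.ndarray) -> tuple[np.ndarray, float]:
    """Map float32 weights to INT8 with a single per-tensor scale.
    Illustrative sketch: symmetric quantisation, no zero-point."""
    scale = np.abs(weights).max() / 127.0  # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values for accuracy checks."""
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)

print(w.nbytes, "->", q.nbytes)  # 262144 -> 65536: a 4x memory saving
# Rounding error is bounded by half a quantisation step:
print(np.abs(dequantize(q, scale) - w).max() <= scale / 2)  # True
```

The 4x size reduction is exactly why INT8 is the default on microcontroller and NPU targets; the engineering work is verifying that the bounded per-weight error does not compound into an unacceptable accuracy drop on the deployed model.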
The embedded AI stack
Model deployment: TFLite Micro, TensorRT, ONNX Runtime, Core ML, STM32Cube.AI, OpenVINO
Hardware targets: NVIDIA Jetson (Orin, AGX), Hailo-8, Google Coral Edge TPU, Qualcomm AI Engine, STM32, NXP
Languages & OS: C/C++ (14/17/20), Python, Rust (emerging); Linux (Yocto/Buildroot), FreeRTOS, Zephyr, QNX
Robotics & sensors: ROS 2, Foxglove, Isaac Sim, Gazebo; LiDAR, radar, IMU, EO/IR, depth cameras, CAN bus
