Jetson Orin Nano: Production-Grade Yocto/OpenEmbedded BSP Guide¶
Platform: NVIDIA Jetson Orin Nano 8GB (T234 SoC) BSP: Linux for Tegra (L4T) R36.x via meta-tegra Yocto Releases: Kirkstone (LTS), Scarthgap Scope: 25,000+ deployed devices, 50+ engineer teams, quarterly release cadence
Table of Contents¶
- Introduction
- Yocto and OpenEmbedded Fundamentals
- meta-tegra Layer
- Setting Up a Yocto Build for Jetson
- BSP Development
- Custom Yocto Layers
- Root Filesystem Customization
- Cross-Compilation and SDK
- Kernel Configuration and Driver Integration
- Bootloader and Secure Boot Integration
- Build System Optimization
- CI/CD for Embedded Linux
- OTA Update System
- System Bring-Up for New Hardware
- Boot Performance Optimization
- Licensing Compliance
- Quality and Release Engineering
- Production Deployment at Scale
- Common Issues and Debugging
1. Introduction¶
1.1 Why Yocto/OpenEmbedded for Production Jetson Deployments¶
NVIDIA provides JetPack SDK and L4T as the standard development path for Jetson modules. For prototyping, evaluation, and small-batch deployments (under 100 units), the stock L4T Ubuntu-based rootfs with apt-based package management is entirely adequate. The inflection point arrives when you face these production realities:
Stock L4T limitations at scale:
| Concern | Stock L4T | Yocto/OE |
|---|---|---|
| Image size | 12-16 GB (full JetPack) | 800 MB - 2 GB (tailored) |
| Attack surface | ~2,400 packages installed | 150-350 packages (audited) |
| Reproducibility | Depends on apt mirror state | Deterministic, bit-for-bit |
| License audit | Manual, error-prone | Automated SPDX manifests |
| OTA updates | No native A/B support | SWUpdate/Mender/RAUC integrated |
| Build automation | Shell scripts, fragile | BitBake, declarative, cacheable |
| Multi-board support | Per-board manual config | MACHINE variable, single build system |
| Boot time | 45-90 seconds typical | Sub-10 seconds achievable |
| Security hardening | Manual kernel/rootfs work | Reproducible, policy-enforced |
| Fleet management | Ad-hoc tooling | Integrated provisioning pipeline |
When you are shipping 15,000-25,000 devices, each unnecessary megabyte in the rootfs translates to real costs in OTA bandwidth, storage wear, and update window duration. Each unaudited package is a liability in regulated environments (automotive, medical, industrial). Yocto/OpenEmbedded gives you the control required for production.
1.2 When to Use Yocto vs NVIDIA Standard Flash Workflow¶
Use stock L4T / JetPack when:
- Prototyping and evaluation (fewer than 50 units)
- Application development where the OS is not the product
- Rapid iteration on CUDA/TensorRT models before production
- Teams without embedded Linux build system expertise
- Timeline does not permit Yocto ramp-up (4-8 weeks for a team)
Use Yocto/OpenEmbedded when:
- Deploying more than 100 units in the field
- Regulatory or certification requirements exist (DO-178C, IEC 62443, ISO 26262)
- Rootfs must be minimal, hardened, and auditable
- OTA updates are mandatory with rollback guarantees
- Multiple carrier board variants share a common software platform
- Continuous integration of BSP changes is required
- Long-term maintenance (5-10 year product lifecycle) is planned
- Custom bootloader or secure boot chain is needed
1.3 Scale Considerations¶
At 25,000+ deployed devices, specific engineering practices become non-negotiable:
Deployment tiers and their requirements:
1-50 units : Manual flash acceptable, stock L4T works
50-500 units : Scripted flash, basic image customization
500-5,000 : Automated build, OTA infrastructure, fleet monitoring
5,000-25,000+ : Full Yocto BSP, CI/CD pipeline, staged rollouts,
per-device identity, license compliance, dedicated
release engineering team
This guide targets the 5,000-25,000+ tier. Every section reflects practices validated across multi-year programs deploying Jetson Orin Nano at industrial scale.
1.4 Document Conventions¶
Throughout this guide:
MACHINE=jetson-orin-nano-devkitrefers to the Orin Nano 8GB developer kit$BUILDDIRrefers to the Yocto build directory (typicallybuild/)$TOPDIRrefers to the top-level project directory containing all layers- Shell commands assume a Bash environment on Ubuntu 22.04 LTS host
- BitBake recipes use Yocto Scarthgap syntax unless noted otherwise
- L4T version is R36.4.x (JetPack 6.1) unless noted otherwise
2. Yocto and OpenEmbedded Fundamentals¶
2.1 BitBake Build System¶
BitBake is the task execution engine at the heart of Yocto/OpenEmbedded. It parses recipes, resolves dependencies, and executes tasks in parallel. Understanding BitBake is prerequisite to everything else in this guide.
Core concepts:
BitBake Architecture:
Configuration Files Recipe Files Classes
(local.conf, etc.) (.bb, .bbappend) (.bbclass)
| | |
v v v
+----------------------------------------------------------+
| BitBake Parser |
+----------------------------------------------------------+
| | |
v v v
+-----------+ +---------------+ +-----------+
| Variable | | Task | | Package |
| Store | | Scheduler | | Backend |
+-----------+ +---------------+ +-----------+
|
v
+-----------+
| Execution |
| Workers |
+-----------+
Task execution order for a typical recipe:
do_fetch -> do_unpack -> do_patch -> do_configure ->
do_compile -> do_install -> do_package -> do_package_write_*
Essential BitBake commands:
# Parse all recipes and show the dependency graph
bitbake -g core-image-minimal
# Build a specific recipe
bitbake linux-tegra
# Build a specific task of a recipe
bitbake -c compile linux-tegra
# Show the environment for a recipe (invaluable for debugging)
bitbake -e linux-tegra | grep ^WORKDIR=
# List all tasks for a recipe
bitbake -c listtasks linux-tegra
# Force rebuild of a recipe
bitbake -f linux-tegra
# Clean a recipe (remove work directory and sstate)
bitbake -c cleansstate linux-tegra
# Show recipe dependency tree
bitbake -g linux-tegra && cat recipe-depends.dot
# Show which layer provides a recipe
bitbake-layers show-recipes linux-tegra
# Show all layers and their priorities
bitbake-layers show-layers
# Search for recipes by name
bitbake-layers show-recipes "*cuda*"
2.2 Layers Architecture¶
Yocto organizes metadata into layers. Each layer is a directory containing recipes, configuration, and classes. Layers are stacked with defined priorities, and higher-priority layers can override lower-priority content.
Layer Stack for Jetson Orin Nano Production:
+-----------------------------------------------+ Priority 99
| meta-myproject (project-specific recipes) |
+-----------------------------------------------+ Priority 20
| meta-myproject-distro (distro configuration) |
+-----------------------------------------------+ Priority 15
| meta-myproject-bsp (carrier board adaptations) |
+-----------------------------------------------+ Priority 10
| meta-tegra (Jetson BSP layer) |
+-----------------------------------------------+ Priority 9
| meta-openembedded/* (additional OE layers) |
+-----------------------------------------------+ Priority 5
| poky/meta (OE-Core) |
+-----------------------------------------------+
bblayers.conf example:
# conf/bblayers.conf
POKY_BBLAYERS_CONF_VERSION = "2"
BBPATH = "${TOPDIR}"
BBFILES ?= ""
BBLAYERS ?= " \
${TOPDIR}/../poky/meta \
${TOPDIR}/../poky/meta-poky \
${TOPDIR}/../meta-openembedded/meta-oe \
${TOPDIR}/../meta-openembedded/meta-python \
${TOPDIR}/../meta-openembedded/meta-networking \
${TOPDIR}/../meta-openembedded/meta-multimedia \
${TOPDIR}/../meta-tegra \
${TOPDIR}/../meta-myproject-bsp \
${TOPDIR}/../meta-myproject-distro \
${TOPDIR}/../meta-myproject \
"
2.3 Recipes, Classes, and Configuration¶
Recipe (.bb) structure:
# example: recipes-app/myapp/myapp_1.0.bb
SUMMARY = "My production application"
DESCRIPTION = "Edge inference application for Jetson Orin Nano"
LICENSE = "Proprietary"
LIC_FILES_CHKSUM = "file://LICENSE;md5=abc123def456..."
SRC_URI = "git://git.mycompany.com/myapp.git;protocol=ssh;branch=main"
SRCREV = "a1b2c3d4e5f67890..."
S = "${WORKDIR}/git"
DEPENDS = "cuda-toolkit tensorrt opencv"
inherit cmake cuda
EXTRA_OECMAKE = " \
-DCUDA_TOOLKIT_ROOT_DIR=${STAGING_DIR_HOST}/usr/local/cuda \
-DWITH_TENSORRT=ON \
"
do_install() {
install -d ${D}${bindir}
install -m 0755 ${B}/myapp ${D}${bindir}/
install -d ${D}${sysconfdir}/myapp
install -m 0644 ${S}/config/default.json ${D}${sysconfdir}/myapp/
}
FILES:${PN} = " \
${bindir}/myapp \
${sysconfdir}/myapp/ \
"
bbappend pattern:
# recipes-app/myapp/myapp_%.bbappend
# Applied on top of the base recipe to customize for this project
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += "file://production.json"
do_install:append() {
install -m 0644 ${WORKDIR}/production.json \
${D}${sysconfdir}/myapp/production.json
}
Class (.bbclass) example:
# classes/myproject-versioning.bbclass
# Adds project-wide version metadata to all packages that inherit this class
MYPROJECT_VERSION ?= "1.0.0"
MYPROJECT_BUILD_ID ?= "${@d.getVar('DATETIME')}"
do_install:append() {
install -d ${D}${sysconfdir}
echo "version=${MYPROJECT_VERSION}" > ${D}${sysconfdir}/myproject-version
echo "build=${MYPROJECT_BUILD_ID}" >> ${D}${sysconfdir}/myproject-version
}
FILES:${PN} += "${sysconfdir}/myproject-version"
2.4 Configuration Hierarchy¶
local.conf -- build-specific settings:
# conf/local.conf
MACHINE = "jetson-orin-nano-devkit"
DISTRO = "myproject-distro"
PACKAGE_CLASSES = "package_ipk"
# Parallel build settings (tuned for build server)
BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"
# Shared state cache location
SSTATE_DIR = "/opt/yocto/sstate-cache"
DL_DIR = "/opt/yocto/downloads"
TMPDIR = "${TOPDIR}/tmp"
# Additional image features
EXTRA_IMAGE_FEATURES += "debug-tweaks"
# Accept NVIDIA proprietary licenses
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
Distro configuration -- distro-wide policy:
# conf/distro/myproject-distro.conf
DISTRO = "myproject-distro"
DISTRO_NAME = "MyProject Embedded Linux"
DISTRO_VERSION = "3.0.0"
DISTRO_CODENAME = "production"
# Base distro features
DISTRO_FEATURES = " \
acl ipv4 ipv6 usbhost systemd pam seccomp \
"
DISTRO_FEATURES:remove = "x11 wayland pulseaudio bluetooth nfs zeroconf 3g"
# Use systemd as init manager
INIT_MANAGER = "systemd"
VIRTUAL-RUNTIME_init_manager = "systemd"
VIRTUAL-RUNTIME_initscripts = "systemd-compat-units"
# Package format
PACKAGE_CLASSES = "package_ipk"
# Reproducible builds
BUILD_REPRODUCIBLE_BINARIES = "1"
INHERIT += "reproducible_build"
# SDK settings
SDKMACHINE = "x86_64"
2.5 Image Recipes and Package Groups¶
Image recipe:
# recipes-core/images/myproject-image.bb
SUMMARY = "MyProject production image for Jetson Orin Nano"
LICENSE = "MIT"
inherit core-image
IMAGE_FEATURES += " \
ssh-server-openssh \
package-management \
"
IMAGE_INSTALL = " \
packagegroup-core-boot \
packagegroup-myproject-base \
packagegroup-myproject-inference \
packagegroup-myproject-connectivity \
"
# Root filesystem size limit (fail build if exceeded)
IMAGE_ROOTFS_MAXSIZE = "2097152"
# Extra rootfs space for runtime data
IMAGE_ROOTFS_EXTRA_SPACE = "131072"
Package group:
# recipes-core/packagegroups/packagegroup-myproject-base.bb
SUMMARY = "MyProject base system packages"
LICENSE = "MIT"
inherit packagegroup
RDEPENDS:${PN} = " \
base-files \
base-passwd \
busybox \
systemd \
openssh-sshd \
chrony \
sudo \
tzdata \
ca-certificates \
curl \
jq \
htop \
strace \
"
3. meta-tegra Layer¶
3.1 Overview¶
meta-tegra is the Yocto BSP layer for NVIDIA Jetson platforms. It provides machine configurations, kernel recipes, bootloader integration, and packaging of NVIDIA proprietary components (CUDA, TensorRT, cuDNN, multimedia APIs) as Yocto recipes.
Repository: https://github.com/OE4T/meta-tegra
The layer is maintained by the OE4T (OpenEmbedded for Tegra) community and tracks NVIDIA L4T releases. It is not an official NVIDIA product, but it is the de facto standard for Yocto-based Jetson development.
3.2 Supported Machines¶
Machine configurations relevant to Orin Nano:
jetson-orin-nano-devkit Orin Nano 8GB developer kit (P3767-0005 + P3768)
jetson-orin-nano-devkit-nvme Same, but boots from NVMe
jetson-orin-nx-xavier-nx-devkit Orin NX on Xavier NX carrier (reference)
The MACHINE variable selects the target. For custom carrier boards, you create a new machine configuration that inherits from the Orin Nano module definition.
3.3 Layer Setup and Compatibility¶
Compatibility matrix:
| meta-tegra Branch | Yocto Release | L4T Version | JetPack |
|---|---|---|---|
| kirkstone-l4t-r35.x | Kirkstone (LTS) | R35.4.1 | 5.1.2 |
| scarthgap-l4t-r36.x | Scarthgap | R36.4.0 | 6.1 |
| master | Next release | R36.4+ | 6.1+ |
For Orin Nano 8GB production deployments, use scarthgap-l4t-r36.x as the baseline. Kirkstone is available for projects that started earlier and need LTS stability without migration.
3.4 Relationship to L4T BSP Packages¶
meta-tegra does not rebuild the NVIDIA proprietary binaries. Instead, it downloads pre-built components from NVIDIA distribution servers and packages them as Yocto recipes. The layer provides:
NVIDIA Components Packaged by meta-tegra:
Component Yocto Recipe Source
---------------------------------------------------------------
Linux Kernel linux-tegra NVIDIA kernel source
UEFI Bootloader edk2-firmware-tegra NVIDIA bootloader source
CUDA Toolkit cuda-toolkit Pre-built from L4T
TensorRT tensorrt Pre-built from L4T
cuDNN cudnn Pre-built from L4T
GStreamer (nvargus) nvidia-gstreamer Pre-built from L4T
Multimedia API nvidia-mmapi Pre-built from L4T
Jetson GPIO library python3-jetson-gpio Source package
Display drivers nvidia-display-driver Pre-built from L4T
Flash tools tegra-flash NVIDIA flash tools
3.5 Layer Dependencies¶
# meta-tegra requires these layers:
# meta (OE-Core)
# meta-python (from meta-openembedded)
# meta-networking (from meta-openembedded, optional for some features)
# meta-oe (from meta-openembedded)
3.6 Key Recipes and Their Roles¶
# Examine what meta-tegra provides:
bitbake-layers show-recipes -l meta-tegra | head -50
# Key recipes you will interact with:
#
# linux-tegra - Kernel for Tegra platforms (5.15.x for L4T R36)
# edk2-firmware-tegra - UEFI bootloader firmware
# tegra-flash - Flash tooling and partition layout
# cuda-toolkit - CUDA compiler and runtime
# cuda-libraries - CUDA math libraries (cuBLAS, cuFFT, etc.)
# tensorrt - TensorRT inference optimizer
# cudnn - cuDNN deep learning primitives
# nvidia-l4t-* - L4T binary packages (firmware, drivers)
# tegra-tools - NVIDIA debugging and configuration tools
4. Setting Up a Yocto Build for Jetson¶
4.1 Host System Requirements¶
# Tested host: Ubuntu 22.04 LTS (x86_64)
# Minimum: 8 cores, 32 GB RAM, 500 GB SSD free space
# Recommended: 16+ cores, 64 GB RAM, 1 TB NVMe SSD
# Install required packages
sudo apt-get update
sudo apt-get install -y \
gawk wget git diffstat unzip texinfo gcc build-essential \
chrpath socat cpio python3 python3-pip python3-pexpect \
xz-utils debianutils iputils-ping python3-git python3-jinja2 \
python3-subunit zstd liblz4-tool file locales libacl1-dev \
lz4 device-tree-compiler
# Set locale (required by BitBake)
sudo locale-gen en_US.UTF-8
export LANG=en_US.UTF-8
# Install kas (build configuration tool)
pip3 install kas
# Install repo (for multi-repo management, alternative to kas)
mkdir -p ~/.local/bin
curl https://storage.googleapis.com/git-repo-downloads/repo > ~/.local/bin/repo
chmod a+x ~/.local/bin/repo
export PATH="${HOME}/.local/bin:${PATH}"
4.2 Project Setup with kas¶
kas is the recommended tool for managing Yocto layer configurations. It replaces manual repo init / repo sync workflows with a single YAML configuration file.
# kas/jetson-orin-nano.yml
header:
version: 14
includes:
- repo: meta-tegra
path: contrib/conf/kas/jetson-orin-nano-devkit.yml
distro: myproject-distro
machine: jetson-orin-nano-devkit
target: myproject-image
repos:
poky:
url: https://git.yoctoproject.org/poky
branch: scarthgap
path: layers/poky
layers:
meta:
meta-poky:
meta-openembedded:
url: https://git.openembedded.org/meta-openembedded
branch: scarthgap
path: layers/meta-openembedded
layers:
meta-oe:
meta-python:
meta-networking:
meta-multimedia:
meta-tegra:
url: https://github.com/OE4T/meta-tegra.git
branch: scarthgap-l4t-r36.x
path: layers/meta-tegra
meta-myproject:
path: layers/meta-myproject
layers:
meta-myproject:
meta-myproject-bsp:
meta-myproject-distro:
local_conf_header:
base: |
SSTATE_DIR = "/opt/yocto/sstate-cache"
DL_DIR = "/opt/yocto/downloads"
BB_NUMBER_THREADS = "16"
PARALLEL_MAKE = "-j 16"
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
# Build with kas
kas build kas/jetson-orin-nano.yml
# Open a shell inside the kas-configured build environment
kas shell kas/jetson-orin-nano.yml
# Build a specific recipe within the kas environment
kas shell kas/jetson-orin-nano.yml -c "bitbake linux-tegra"
4.3 Manual Setup (Without kas)¶
# Clone all layers
mkdir -p ~/jetson-yocto && cd ~/jetson-yocto
git clone -b scarthgap https://git.yoctoproject.org/poky
git clone -b scarthgap https://git.openembedded.org/meta-openembedded
git clone -b scarthgap-l4t-r36.x https://github.com/OE4T/meta-tegra.git
# Initialize build environment
source poky/oe-init-build-env build
# Add layers
bitbake-layers add-layer ../meta-openembedded/meta-oe
bitbake-layers add-layer ../meta-openembedded/meta-python
bitbake-layers add-layer ../meta-openembedded/meta-networking
bitbake-layers add-layer ../meta-openembedded/meta-multimedia
bitbake-layers add-layer ../meta-tegra
# Edit conf/local.conf:
# MACHINE = "jetson-orin-nano-devkit"
# LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
# Build minimal console image
bitbake core-image-minimal
# Build image with GPU support
bitbake demo-image-full
4.4 Building a Minimal Console Image¶
# After environment setup:
MACHINE=jetson-orin-nano-devkit bitbake core-image-minimal
# Build output location:
# tmp/deploy/images/jetson-orin-nano-devkit/
# core-image-minimal-jetson-orin-nano-devkit.tegraflash.tar.gz
# Approximate build time (first build, 16-core host):
# core-image-minimal: 2-3 hours
# Full image with CUDA: 4-6 hours
# Subsequent builds with sstate cache: 10-30 minutes
The .tegraflash.tar.gz archive contains everything needed to flash the device:
bootloader binaries, partition table, kernel, device tree, and rootfs.
4.5 Building with GPU/CUDA Support¶
# Ensure LICENSE_FLAGS_ACCEPTED includes commercial_nvidia in local.conf
# Then add CUDA packages to your image:
# In your image recipe or local.conf:
IMAGE_INSTALL:append = " \
cuda-toolkit \
cuda-libraries \
tensorrt \
tensorrt-plugins \
cudnn \
libcudla \
"
# Build
bitbake myproject-image
4.6 First Flash to Orin Nano¶
# Put the Orin Nano into Force Recovery Mode:
# 1. Power off the device
# 2. Hold the Force Recovery button
# 3. Apply power (or press Reset while holding Force Recovery)
# 4. Release Force Recovery after 2 seconds
# 5. Verify with lsusb:
lsusb | grep -i nvidia
# Expected: "Bus 00x Device 00y: ID 0955:7523 NVIDIA Corp. APX"
# Extract the tegraflash archive
mkdir -p ~/flash && cd ~/flash
tar xzf tmp/deploy/images/jetson-orin-nano-devkit/\
core-image-minimal-jetson-orin-nano-devkit.tegraflash.tar.gz
# Flash (requires sudo for USB access)
cd tegraflash
sudo ./initrd-flash
# Flash takes approximately 5-10 minutes
# The device will reboot automatically when complete
4.7 Flash to NVMe SSD¶
# For NVMe boot (recommended for production):
MACHINE=jetson-orin-nano-devkit-nvme bitbake myproject-image
# Extract and flash
tar xzf myproject-image-jetson-orin-nano-devkit-nvme.tegraflash.tar.gz
cd tegraflash
sudo ./initrd-flash
# The flash script handles QSPI bootloader + NVMe rootfs partitioning
4.8 Build Directory Structure¶
After a successful build, the directory layout:
build/
conf/
local.conf Build configuration
bblayers.conf Layer list
tmp/
deploy/
images/
jetson-orin-nano-devkit/
*.tegraflash.tar.gz Flash archive
Image Kernel image
*.dtb Device tree blobs
*.ext4 Root filesystem
*.manifest Package manifest
licenses/ License manifests
ipk/ IPK packages
work/ Per-recipe work directories
sysroots-components/ Shared sysroot components
log/ Build logs
cache/ BitBake cache
5. BSP Development¶
5.1 Board Support Package Architecture¶
The Jetson Orin Nano BSP consists of several tightly coupled components:
BSP Component Architecture:
+---------------------------------------------------+
| Application Software |
+---------------------------------------------------+
| Linux Kernel (linux-tegra 5.15) |
| +---------------------------------------------+ |
| | Device Tree (.dtb) | Kernel Modules | |
| | NVIDIA GPU Driver | Camera/ISP Drivers | |
| +---------------------------------------------+ |
+---------------------------------------------------+
| Bootloader Chain |
| +-----+ +-----+ +------+ +-----------+ |
| | MB1 | -> | MB2 | -> | UEFI | -> | extlinux | |
| | (BCT)| | (TOS)| | (BL) | | (kernel) | |
| +-----+ +-----+ +------+ +-----------+ |
+---------------------------------------------------+
| QSPI Flash | eMMC/NVMe Storage |
+---------------------------------------------------+
| Hardware (T234 SoC + Carrier Board) |
+---------------------------------------------------+
MB1 = Microboot 1 (runs on BPMP, configures SDRAM, pinmux)
MB2 = Microboot 2 (TrustZone setup, security)
UEFI = UEFI bootloader (replaces U-Boot on Orin platforms)
extlinux.conf = Kernel/initrd boot configuration
5.2 Bootloader Integration in Yocto¶
On the Orin Nano (T234), the bootloader chain uses UEFI (not U-Boot for primary boot). meta-tegra handles the UEFI build and integration.
# The UEFI recipe in meta-tegra:
# recipes-bsp/uefi/edk2-firmware-tegra_%.bb
# To customize UEFI build options, use a bbappend:
# meta-myproject-bsp/recipes-bsp/uefi/edk2-firmware-tegra_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
# Add custom UEFI configuration
SRC_URI += "file://custom-uefi.cfg"
# Example: Adjust UEFI boot timeout
EXTRA_UEFI_BUILD_FLAGS += " \
-DBOOT_TIMEOUT=3 \
"
5.3 Device Tree Integration and Customization¶
Device tree customization is essential for carrier board adaptation. meta-tegra provides the base device trees from NVIDIA, and you overlay your changes.
# meta-myproject-bsp/recipes-bsp/tegra-dtbs/tegra-dtbs_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
# Add custom device tree overlay
SRC_URI += " \
file://my-carrier-board.dtso \
"
do_install:append() {
install -m 0644 ${WORKDIR}/my-carrier-board.dtso \
${D}/boot/my-carrier-board.dtso
}
Custom carrier board device tree overlay:
/* files/my-carrier-board.dtso */
/dts-v1/;
/plugin/;
/ {
overlay-name = "MyCompany Carrier Board v2";
compatible = "nvidia,p3768-0000+p3767-0005";
fragment@0 {
target-path = "/";
__overlay__ {
model = "MyCompany Edge Device v2 (Orin Nano 8GB)";
};
};
/* Enable SPI1 for external ADC */
fragment@1 {
target = <&spi1>;
__overlay__ {
status = "okay";
#address-cells = <1>;
#size-cells = <0>;
adc@0 {
compatible = "ti,ads8688";
reg = <0>;
spi-max-frequency = <1000000>;
};
};
};
/* Configure GPIO for custom I/O */
fragment@2 {
target = <&gpio>;
__overlay__ {
custom-io-pins {
gpio-hog;
gpios = <42 0>;
output-low;
line-name = "status-led";
};
};
};
/* Disable unused HDMI output to save power */
fragment@3 {
target = <&hdmi>;
__overlay__ {
status = "disabled";
};
};
};
5.4 Kernel Recipe Customization¶
# meta-myproject-bsp/recipes-kernel/linux/linux-tegra_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
# Add kernel config fragments and patches
SRC_URI += " \
file://security-hardening.cfg \
file://disable-debug.cfg \
file://custom-drivers.cfg \
file://0001-add-custom-sensor-driver.patch \
"
Kernel config fragment for security hardening (files/security-hardening.cfg):
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_APPARMOR=y
CONFIG_STRICT_DEVMEM=y
CONFIG_IO_STRICT_DEVMEM=y
CONFIG_HARDENED_USERCOPY=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SHUFFLE_PAGE_ALLOCATOR=y
Kernel config fragment for disabling debug in production (files/disable-debug.cfg):
# CONFIG_DEBUG_INFO is not set
# CONFIG_DEBUG_FS is not set
# CONFIG_KALLSYMS is not set
# CONFIG_FTRACE is not set
# CONFIG_KPROBES is not set
# CONFIG_PROFILING is not set
# CONFIG_DEBUG_KERNEL is not set
5.5 Custom Board Definition for Carrier Boards¶
When using a custom carrier board (not the NVIDIA developer kit carrier), you create a new MACHINE configuration:
# meta-myproject-bsp/conf/machine/mycompany-edge-v2.conf
#@TYPE: Machine
#@NAME: MyCompany Edge Device v2
#@DESCRIPTION: MyCompany carrier board with Jetson Orin Nano 8GB
# Include the Orin Nano module definition
require conf/machine/include/orin-nano.inc
# Carrier board specifics
TEGRA_BOARDID = "3768"
TEGRA_FAB = "0000"
TEGRA_BOARDSKU = ""
TEGRA_BOARDREV = ""
TEGRA_CHIPREV = "0"
# Custom device tree
KERNEL_DEVICETREE = "tegra234-p3768-0000+p3767-0005-my-carrier.dtb"
# Boot device (NVMe for production)
TNSPEC_BOOTDEV = "nvme0n1p1"
# Serial console
SERIAL_CONSOLES = "115200;ttyTCU0"
# Machine features
MACHINE_FEATURES += "ext-rtc watchdog"
MACHINE_FEATURES:remove = "bluetooth wifi"
# Flash configuration
TEGRAFLASH_ROOTFS_DEVICE_TYPE = "nvme"
# Custom partition layout
PARTITION_LAYOUT_TEMPLATE = "flash_t234_qspi_custom.xml"
# Kernel and module configuration
PREFERRED_PROVIDER_virtual/kernel = "linux-tegra"
MACHINE_EXTRA_RRECOMMENDS += " \
kernel-module-custom-sensor \
kernel-module-custom-can \
"
5.6 Partition Layout Customization¶
<!-- flash_t234_qspi_custom.xml -->
<!-- Custom partition layout for NVMe-based production device -->
<partition_layout version="01.00.0000">
<device type="qspi" instance="0">
<partition name="mb1" type="mb1_bootloader">
<allocation_policy> sequential </allocation_policy>
<size> 524288 </size>
<filename> mb1_t234_prod.bin </filename>
</partition>
<partition name="mb2" type="mb2_bootloader">
<allocation_policy> sequential </allocation_policy>
<size> 1048576 </size>
<filename> mb2_t234.bin </filename>
</partition>
<partition name="uefi" type="data">
<allocation_policy> sequential </allocation_policy>
<size> 4194304 </size>
<filename> uefi_jetson.bin </filename>
</partition>
</device>
<device type="nvme" instance="0">
<partition name="APP" type="data">
<allocation_policy> sequential </allocation_policy>
<size> 2147483648 </size> <!-- 2 GB rootfs partition -->
<filename> rootfs.ext4 </filename>
</partition>
<partition name="APP_b" type="data">
<allocation_policy> sequential </allocation_policy>
<size> 2147483648 </size> <!-- 2 GB A/B rootfs partition -->
<filename> rootfs.ext4 </filename>
</partition>
<partition name="DATA" type="data">
<allocation_policy> sequential </allocation_policy>
<size> -1 </size> <!-- Use remaining space -->
<filename> data.ext4 </filename>
</partition>
</device>
</partition_layout>
6. Custom Yocto Layers¶
6.1 Creating Project-Specific Layers¶
# Create a new layer
cd $TOPDIR/..
bitbake-layers create-layer meta-myproject
# Recommended layer structure for a large project:
#
# meta-myproject/
# conf/
# layer.conf
# recipes-core/
# images/
# myproject-image.bb
# myproject-image-dev.bb
# myproject-image-manufacturing.bb
# packagegroups/
# packagegroup-myproject-base.bb
# packagegroup-myproject-inference.bb
# packagegroup-myproject-connectivity.bb
# recipes-app/
# myapp/
# myapp_1.0.bb
# files/
# myapp.service
# myapp.conf
# myapp-updater/
# myapp-updater_1.0.bb
# recipes-support/
# factory-test/
# factory-test_1.0.bb
# device-provisioning/
# device-provisioning_1.0.bb
# classes/
# myproject-versioning.bbclass
#
# meta-myproject-bsp/
# conf/
# layer.conf
# machine/
# mycompany-edge-v2.conf
# recipes-bsp/
# tegra-dtbs/
# uefi/
# recipes-kernel/
# linux/
#
# meta-myproject-distro/
# conf/
# layer.conf
# distro/
# myproject-distro.conf
layer.conf:
# meta-myproject/conf/layer.conf
BBPATH .= ":${LAYERDIR}"
BBFILES += " \
${LAYERDIR}/recipes-*/*/*.bb \
${LAYERDIR}/recipes-*/*/*.bbappend \
"
BBFILE_COLLECTIONS += "meta-myproject"
BBFILE_PATTERN_meta-myproject = "^${LAYERDIR}/"
BBFILE_PRIORITY_meta-myproject = 99
LAYERDEPENDS_meta-myproject = " \
core \
tegra \
meta-myproject-bsp \
meta-myproject-distro \
"
LAYERSERIES_COMPAT_meta-myproject = "scarthgap"
6.2 bbappend Patterns for Modifying Upstream Recipes¶
Pattern 1: Add files to an existing recipe
# meta-myproject/recipes-core/systemd/systemd_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += "file://journald-production.conf"
do_install:append() {
install -d ${D}${sysconfdir}/systemd/journald.conf.d
install -m 0644 ${WORKDIR}/journald-production.conf \
${D}${sysconfdir}/systemd/journald.conf.d/production.conf
}
Pattern 2: Change configuration of an existing recipe
# meta-myproject/recipes-connectivity/openssh/openssh_%.bbappend
EXTRA_OECONF += "--disable-lastlog --disable-utmp"
do_install:append() {
# Harden SSH configuration
sed -i 's/#PermitRootLogin.*/PermitRootLogin no/' \
${D}${sysconfdir}/ssh/sshd_config
sed -i 's/#PasswordAuthentication.*/PasswordAuthentication no/' \
${D}${sysconfdir}/ssh/sshd_config
sed -i 's/#MaxAuthTries.*/MaxAuthTries 3/' \
${D}${sysconfdir}/ssh/sshd_config
}
Pattern 3: Apply patches to upstream source
# meta-myproject/recipes-devtools/python3/python3_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += "file://0001-fix-cross-compile-issue.patch"
Pattern 4: Override PREFERRED_VERSION
# In distro.conf or local.conf:
PREFERRED_VERSION_linux-tegra = "5.15%"
PREFERRED_PROVIDER_virtual/kernel = "linux-tegra"
PREFERRED_PROVIDER_virtual/bootloader = "edk2-firmware-tegra"
6.3 Managing 350+ User-Space Packages¶
For large package sets, organize into themed package groups:
# recipes-core/packagegroups/packagegroup-myproject-inference.bb
SUMMARY = "Inference engine and dependencies"
LICENSE = "MIT"
inherit packagegroup
RDEPENDS:${PN} = " \
cuda-libraries \
cuda-cudart \
tensorrt \
tensorrt-plugins \
cudnn \
libcudla \
opencv \
python3-numpy \
python3-pillow \
onnxruntime \
myapp-inference-engine \
"
# recipes-core/packagegroups/packagegroup-myproject-connectivity.bb
SUMMARY = "Network and connectivity packages"
LICENSE = "MIT"
inherit packagegroup
RDEPENDS:${PN} = " \
networkmanager \
modemmanager \
wpa-supplicant \
openssh-sshd \
openssh-sftp-server \
curl \
wget \
mosquitto \
mosquitto-clients \
python3-paho-mqtt \
chrony \
iptables \
nftables \
wireguard-tools \
wireguard-module \
"
# recipes-core/packagegroups/packagegroup-myproject-monitoring.bb
SUMMARY = "Device monitoring and diagnostics"
LICENSE = "MIT"
inherit packagegroup
RDEPENDS:${PN} = " \
collectd \
collectd-plugin-cpu \
collectd-plugin-memory \
collectd-plugin-disk \
collectd-plugin-interface \
collectd-plugin-thermal \
tegra-tools \
nvfancontrol \
htop \
iotop \
sysstat \
"
6.4 Layer Priority and Override Mechanisms¶
# Layer priorities determine which recipe wins when multiple layers
# provide the same recipe:
# meta (OE-Core): 5
# meta-tegra: 10
# meta-myproject-bsp: 15
# meta-myproject-distro: 20
# meta-myproject: 99
# OVERRIDES allow conditional variable assignment:
# Machine-specific override
SRC_URI:append:jetson-orin-nano-devkit = " file://orin-nano-specific.patch"
# Distro-specific override
PACKAGECONFIG:myproject-distro = "feature-a feature-b"
# Architecture override
EXTRA_OEMAKE:aarch64 = "ARCH=arm64"
# Conditional package inclusion based on machine features
RDEPENDS:${PN}:append = " \
${@bb.utils.contains('MACHINE_FEATURES', 'wifi', 'wpa-supplicant', '', d)} \
${@bb.utils.contains('MACHINE_FEATURES', 'bluetooth', 'bluez5', '', d)} \
"
7. Root Filesystem Customization¶
7.1 Image Recipe Construction¶
# recipes-core/images/myproject-image.bb
SUMMARY = "MyProject production image"
LICENSE = "MIT"
inherit core-image
# Core image features
IMAGE_FEATURES += " \
ssh-server-openssh \
package-management \
"
# Remove features for production builds
IMAGE_FEATURES:remove = " \
allow-empty-password \
allow-root-login \
debug-tweaks \
"
# Package installation
IMAGE_INSTALL = " \
packagegroup-core-boot \
packagegroup-myproject-base \
packagegroup-myproject-inference \
packagegroup-myproject-connectivity \
packagegroup-myproject-monitoring \
swupdate \
device-provisioning \
"
# Rootfs post-processing commands
ROOTFS_POSTPROCESS_COMMAND += " \
remove_dev_debug; \
harden_rootfs; \
"
remove_dev_debug() {
# Remove development and debugging artifacts from production image
rm -rf ${IMAGE_ROOTFS}/usr/src
rm -rf ${IMAGE_ROOTFS}/usr/share/doc
rm -rf ${IMAGE_ROOTFS}/usr/share/man
rm -rf ${IMAGE_ROOTFS}/usr/share/info
rm -rf ${IMAGE_ROOTFS}/usr/share/gtk-doc
find ${IMAGE_ROOTFS} -name "*.a" -delete
find ${IMAGE_ROOTFS} -name "*.la" -delete
}
harden_rootfs() {
# Set restrictive file permissions
chmod 700 ${IMAGE_ROOTFS}/root
chmod 750 ${IMAGE_ROOTFS}/etc/sudoers.d
# Disable core dumps
echo "* hard core 0" >> ${IMAGE_ROOTFS}/etc/security/limits.conf
# Restrict dmesg and kptr
echo "kernel.dmesg_restrict = 1" >> \
${IMAGE_ROOTFS}/etc/sysctl.d/99-hardening.conf
echo "kernel.kptr_restrict = 2" >> \
${IMAGE_ROOTFS}/etc/sysctl.d/99-hardening.conf
}
# Image size constraints
IMAGE_ROOTFS_SIZE = "1048576"
IMAGE_ROOTFS_MAXSIZE = "2097152"
IMAGE_OVERHEAD_FACTOR = "1.1"
7.2 Reducing Image Size¶
Typical first-build images are bloated. A systematic reduction approach:
# Step 1: Analyze installed packages
cat tmp/deploy/images/jetson-orin-nano-devkit/\
myproject-image-jetson-orin-nano-devkit.manifest | wc -l
# Typical: 800+ packages in an unoptimized image
# Step 2: Identify large packages (sorted by installed size)
cat tmp/deploy/images/jetson-orin-nano-devkit/\
myproject-image-jetson-orin-nano-devkit.manifest | sort -k2 -rn | head -30
# Step 3: Aggressive feature removal in distro conf
# conf/distro/myproject-distro.conf:
DISTRO_FEATURES:remove = " \
x11 wayland pulseaudio bluetooth nfs nfc 3g \
zeroconf ptest multilib gobject-introspection-data \
"
# Step 4: Minimize busybox configuration
# meta-myproject/recipes-core/busybox/busybox_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += "file://production.cfg"
# production.cfg enables only the commands actually used in the field
# Step 5: Strip debug info and optimize
# In local.conf or distro.conf:
INHIBIT_PACKAGE_DEBUG_SPLIT = "1"
INHIBIT_PACKAGE_STRIP = "0"
EXTRA_IMAGE_FEATURES:remove = "dbg-pkgs"
IMAGE_INSTALL:remove = "gdb gdbserver strace ltrace valgrind"
# Step 6: Remove locale data (keep only needed locales)
IMAGE_LINGUAS = "en-us"
GLIBC_GENERATE_LOCALES = "en_US.UTF-8"
Size reduction results (typical):
Component Before After Savings
--------------------------------------------------------------
Full JetPack rootfs 14.2 GB -- (stock L4T baseline)
Yocto unoptimized image 1.8 GB -- (first Yocto build)
Remove unused DISTRO_FEATURES -- 1.3 GB 500 MB
Strip development files -- 1.0 GB 300 MB
Minimize busybox -- 980 MB 20 MB
Remove documentation -- 920 MB 60 MB
Optimize package selection -- 750 MB 170 MB
--------------------------------------------------------------
Final production image -- 750 MB (from 1.8 GB)
7.3 Read-Only Root Filesystem¶
For production devices, a read-only rootfs improves reliability and security:
# In the image recipe:
IMAGE_FEATURES += "read-only-rootfs"
# Create writable overlay for runtime data
# meta-myproject/recipes-core/volatile-binds/volatile-binds_%.bbappend
VOLATILE_BINDS += " \
/tmp /var/tmp \
/var/log /var/log \
/var/lib/systemd /var/lib/systemd \
/etc/machine-id /etc/machine-id \
"
Mount overlay script for persistent data:
#!/bin/sh
# /usr/lib/systemd/system-generators/mount-overlays
# Mount tmpfs overlays for writable areas on read-only rootfs
mount -t tmpfs -o size=64M tmpfs /var/log
mount -t tmpfs -o size=16M tmpfs /tmp
mount -t tmpfs -o size=8M tmpfs /var/lib/systemd
mount -t tmpfs -o size=4M tmpfs /run
# Persistent data partition (for configuration and application data)
mount /dev/nvme0n1p3 /data
systemd mount unit for persistent data:
# /etc/systemd/system/data.mount
[Unit]
Description=Persistent data partition
Before=local-fs.target
[Mount]
What=/dev/nvme0n1p3
Where=/data
Type=ext4
Options=defaults,noatime,commit=60
[Install]
WantedBy=local-fs.target
7.4 Minimal vs Full Image Comparison¶
# recipes-core/images/myproject-image-minimal.bb
# Absolute minimum for headless operation
IMAGE_INSTALL = " \
packagegroup-core-boot \
openssh-sshd \
chrony \
swupdate \
"
# Result: ~180 MB rootfs, ~45 packages
# recipes-core/images/myproject-image-inference.bb
# Inference workload with CUDA
IMAGE_INSTALL = " \
packagegroup-core-boot \
packagegroup-myproject-base \
packagegroup-myproject-inference \
swupdate \
"
# Result: ~750 MB rootfs, ~220 packages
# recipes-core/images/myproject-image-dev.bb
# Development image with debugging tools
IMAGE_INSTALL = " \
packagegroup-core-boot \
packagegroup-myproject-base \
packagegroup-myproject-inference \
packagegroup-myproject-connectivity \
gdb \
gdbserver \
strace \
ltrace \
valgrind \
perf \
tcpdump \
python3 \
"
IMAGE_FEATURES += "debug-tweaks tools-debug tools-profile"
# Result: ~2.1 GB rootfs, ~600 packages
7.5 Image Build History Tracking¶
# Enable buildhistory to track image changes between builds
# In local.conf:
INHERIT += "buildhistory"
BUILDHISTORY_COMMIT = "1"
# After each build, inspect changes:
buildhistory-diff
# Output example:
# images/jetson-orin-nano-devkit/myproject-image:
# Package list changed:
# + curl 8.5.0
# - wget 1.21
# Rootfs size changed: 748.2 MB -> 749.1 MB (+0.9 MB)
8. Cross-Compilation and SDK¶
8.1 Generating the Yocto SDK¶
The Yocto SDK provides a standalone cross-compilation toolchain that developers can use without running a full BitBake build. This is critical for application teams who need to compile and test against the exact same libraries shipped in the production image.
# Generate the standard SDK
bitbake myproject-image -c populate_sdk
# Output location:
# tmp/deploy/sdk/myproject-distro-glibc-x86_64-myproject-image-
# aarch64-jetson-orin-nano-devkit-toolchain-3.0.0.sh
# Install the SDK (self-extracting archive)
./tmp/deploy/sdk/myproject-distro-*.sh -d /opt/myproject-sdk -y
# Source the SDK environment
source /opt/myproject-sdk/environment-setup-aarch64-poky-linux
# Verify the cross-compiler
$CC --version
# aarch64-poky-linux-gcc (GCC) 13.x.x ...
# Verify sysroot contains CUDA headers
ls $SDKTARGETSYSROOT/usr/local/cuda/
# bin include lib64 ...
8.2 Using the SDK for Application Development¶
# Source the SDK environment
source /opt/myproject-sdk/environment-setup-aarch64-poky-linux
# Cross-compile a simple C application
cat > hello.c << 'EOF'
#include <stdio.h>
int main() {
printf("Hello from Orin Nano\n");
return 0;
}
EOF
$CC hello.c -o hello
file hello
# hello: ELF 64-bit LSB pie executable, ARM aarch64, ...
# Cross-compile a CMake project
mkdir build && cd build
cmake .. \
-DCMAKE_TOOLCHAIN_FILE=$OECORE_NATIVE_SYSROOT/usr/share/cmake/OEToolchainConfig.cmake
make -j$(nproc)
8.3 Cross-Compiling CUDA Applications with Yocto Toolchain¶
CUDA cross-compilation requires special handling because nvcc runs on the host but generates code for the target GPU architecture.
# Source the Yocto SDK
source /opt/myproject-sdk/environment-setup-aarch64-poky-linux
# Set CUDA-specific variables
export CUDA_PATH=$SDKTARGETSYSROOT/usr/local/cuda
export PATH=$CUDA_PATH/bin:$PATH
Example CUDA application (vector_add.cu):
#include <cuda_runtime.h>
#include <stdio.h>
__global__ void vectorAdd(float *a, float *b, float *c, int n) {
int i = blockIdx.x * blockDim.x + threadIdx.x;
if (i < n) c[i] = a[i] + b[i];
}
int main() {
const int N = 1024;
size_t size = N * sizeof(float);
float *h_a, *h_b, *h_c;
float *d_a, *d_b, *d_c;
h_a = (float*)malloc(size);
h_b = (float*)malloc(size);
h_c = (float*)malloc(size);
for (int i = 0; i < N; i++) {
h_a[i] = 1.0f;
h_b[i] = 2.0f;
}
cudaMalloc(&d_a, size);
cudaMalloc(&d_b, size);
cudaMalloc(&d_c, size);
cudaMemcpy(d_a, h_a, size, cudaMemcpyHostToDevice);
cudaMemcpy(d_b, h_b, size, cudaMemcpyHostToDevice);
vectorAdd<<<(N+255)/256, 256>>>(d_a, d_b, d_c, N);
cudaMemcpy(h_c, d_c, size, cudaMemcpyDeviceToHost);
printf("Result: %f\n", h_c[0]); // Should print 3.0
cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
free(h_a); free(h_b); free(h_c);
return 0;
}
# Compile for Orin Nano (SM 8.7 -- Ampere architecture)
nvcc -ccbin $CC \
--sysroot=$SDKTARGETSYSROOT \
-arch=sm_87 \
-o vector_add vector_add.cu \
-L$SDKTARGETSYSROOT/usr/local/cuda/lib64 \
-lcudart
8.4 Extensible SDK (eSDK) and devtool¶
The extensible SDK (eSDK) includes BitBake and allows developers to modify recipes, add new packages, and push changes back to the build system.
# Generate eSDK
bitbake myproject-image -c populate_sdk_ext
# Install eSDK
./tmp/deploy/sdk/myproject-distro-*-toolchain-ext-*.sh \
-d /opt/myproject-esdk -y
# Source eSDK environment
source /opt/myproject-esdk/environment-setup-aarch64-poky-linux
# Use devtool to modify a recipe
devtool modify myapp
# This extracts the source to a workspace directory and creates
# a bbappend that points to the local source
# Make changes to the source
cd workspace/sources/myapp
# ... edit files ...
# Build the modified recipe
devtool build myapp
# Deploy to a running device for testing
devtool deploy-target myapp root@192.168.1.100
# When satisfied, create a patch and update the recipe
devtool update-recipe myapp
# Clean up the workspace
devtool reset myapp
8.5 SDK Distribution and Versioning¶
# Automate SDK generation in CI with version tagging
# In your CI pipeline:
SDK_VERSION=$(date +%Y%m%d)-$(git rev-parse --short HEAD)
# Set SDK version in local.conf before build
echo "SDK_VERSION = \"${SDK_VERSION}\"" >> conf/local.conf
bitbake myproject-image -c populate_sdk
# Upload to artifact server
aws s3 cp tmp/deploy/sdk/*.sh \
s3://mycompany-artifacts/sdk/${SDK_VERSION}/
# Developers install a specific version:
aws s3 cp s3://mycompany-artifacts/sdk/20260301-abc1234/ .
./myproject-distro-*-toolchain-*.sh -d /opt/myproject-sdk -y
8.6 SDK Contents and Customization¶
# Customize what is included in the SDK
# In the image recipe or local.conf:
# Add extra packages to the SDK target sysroot
TOOLCHAIN_TARGET_TASK:append = " \
cuda-toolkit-dev \
tensorrt-dev \
opencv-dev \
protobuf-dev \
"
# Add extra packages to the SDK host tools
TOOLCHAIN_HOST_TASK:append = " \
nativesdk-cmake \
nativesdk-protobuf-compiler \
"
# SDK output structure after installation:
# /opt/myproject-sdk/
# environment-setup-aarch64-poky-linux # Source this file
# sysroots/
# x86_64-pokysdk-linux/ # Host (native) tools
# usr/bin/aarch64-poky-linux/ # Cross-compiler
# aarch64-poky-linux/ # Target sysroot
# usr/local/cuda/ # CUDA toolkit
# usr/include/ # All target headers
# usr/lib/ # All target libraries
# site-config-aarch64-poky-linux # Autoconf site config
# version-aarch64-poky-linux # SDK version info
9. Kernel Configuration and Driver Integration¶
9.1 Kernel Recipe (linux-tegra)¶
The linux-tegra recipe in meta-tegra builds NVIDIA's fork of the Linux kernel for Tegra platforms. The kernel version tracks NVIDIA L4T releases (5.15.x for R36.x).
# Examine the kernel recipe
bitbake -e linux-tegra | grep ^SRC_URI=
bitbake -e linux-tegra | grep ^SRCREV=
bitbake -e linux-tegra | grep ^PV=
bitbake -e linux-tegra | grep ^WORKDIR=
# Kernel source is extracted to:
# tmp/work/jetson_orin_nano_devkit-poky-linux/linux-tegra/5.15.xxx/git/
# Kernel config location after build:
# tmp/work/jetson_orin_nano_devkit-poky-linux/linux-tegra/5.15.xxx/build/.config
9.2 defconfig Management¶
There are two approaches to managing kernel configuration: replacing the entire defconfig, or using kernel config fragments. Config fragments are strongly preferred for maintainability.
# meta-myproject-bsp/recipes-kernel/linux/linux-tegra_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
# Option A: Override the entire defconfig (not recommended for production)
# SRC_URI += "file://defconfig"
# Option B: Use kernel config fragments (recommended)
SRC_URI += " \
file://production.cfg \
file://networking.cfg \
file://security.cfg \
file://disable-unused.cfg \
"
files/production.cfg:
# Disable kernel debug features for production
# CONFIG_DEBUG_KERNEL is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_KALLSYMS is not set
# CONFIG_MAGIC_SYSRQ is not set
# CONFIG_DEBUG_FS is not set
# Enable kernel hardening
CONFIG_SECURITY=y
CONFIG_SECCOMP=y
CONFIG_SECCOMP_FILTER=y
CONFIG_STRICT_DEVMEM=y
CONFIG_IO_STRICT_DEVMEM=y
CONFIG_FORTIFY_SOURCE=y
CONFIG_STACKPROTECTOR_STRONG=y
# Enable watchdog for production reliability
CONFIG_WATCHDOG=y
CONFIG_TEGRA_WATCHDOG=y
files/networking.cfg:
# CAN bus support (for industrial applications)
CONFIG_CAN=y
CONFIG_CAN_RAW=y
CONFIG_CAN_BCM=y
CONFIG_CAN_MTTCAN=y
# WireGuard VPN
CONFIG_WIREGUARD=y
files/disable-unused.cfg:
# Disable unused subsystems to reduce kernel size and attack surface
# CONFIG_WIRELESS is not set
# CONFIG_BT is not set
# CONFIG_NFC is not set
# CONFIG_HAMRADIO is not set
# CONFIG_SOUND is not set
# CONFIG_USB_GADGET is not set
# CONFIG_MEDIA_ANALOG_TV_SUPPORT is not set
# CONFIG_MEDIA_DIGITAL_TV_SUPPORT is not set
# CONFIG_MEDIA_RADIO_SUPPORT is not set
# CONFIG_DVB_CORE is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
9.3 Validating Kernel Configuration¶
# After building, verify config fragments were applied:
bitbake linux-tegra -c kernel_configcheck
# This outputs warnings for any unresolved config fragments:
# WARNING: linux-tegra: config 'CONFIG_FOO' was set, but not in final .config
# WARNING: linux-tegra: config 'CONFIG_BAR' was requested as 'n' but is 'y'
# Manually diff against known-good config:
bitbake -e linux-tegra | grep ^B=
# Use that path to find .config
diff tmp/work/.../linux-tegra/.../build/.config saved-configs/known-good.config
# Generate a minimal defconfig from the current config:
bitbake linux-tegra -c savedefconfig
# Output: tmp/work/.../linux-tegra/.../build/defconfig
# Interactive config exploration (for development only):
bitbake linux-tegra -c menuconfig
# After menuconfig, generate the diff:
bitbake linux-tegra -c diffconfig
# Output: fragment.cfg containing only your changes
9.4 Out-of-Tree Module Recipes¶
# recipes-kernel/custom-sensor/kernel-module-custom-sensor_1.0.bb
SUMMARY = "Custom sensor kernel module for industrial I/O"
LICENSE = "GPL-2.0-only"
LIC_FILES_CHKSUM = "file://COPYING;md5=b234ee4d69f5fce4486a80fdaf4a4263"
SRC_URI = " \
git://git.mycompany.com/custom-sensor-driver.git;protocol=ssh;branch=main \
"
SRCREV = "abc123def456..."
S = "${WORKDIR}/git"
inherit module
EXTRA_OEMAKE += " \
KERNEL_SRC=${STAGING_KERNEL_DIR} \
KERNEL_VERSION=${KERNEL_VERSION} \
"
# Automatically load the module at boot
KERNEL_MODULE_AUTOLOAD += "custom-sensor"
# Module parameters applied at load time
KERNEL_MODULE_PROBECONF += "custom-sensor"
module_conf_custom-sensor = "options custom-sensor sample_rate=100 gain=2"
# Install additional firmware files if needed
do_install:append() {
install -d ${D}${nonarch_base_libdir}/firmware
install -m 0644 ${S}/firmware/custom-sensor.fw \
${D}${nonarch_base_libdir}/firmware/
}
FILES:${PN} += "${nonarch_base_libdir}/firmware/custom-sensor.fw"
9.5 DKMS-Style Driver Management¶
For drivers that need to be rebuilt against different kernel versions, use a DKMS-inspired pattern:
# recipes-kernel/driver-framework/driver-framework_1.0.bb
SUMMARY = "Framework for building out-of-tree kernel modules"
LICENSE = "GPL-2.0-only"
LIC_FILES_CHKSUM = "file://COPYING;md5=..."
SRC_URI = "git://git.mycompany.com/drivers.git;protocol=ssh;branch=main"
SRCREV = "${AUTOREV}"
S = "${WORKDIR}/git"
inherit module
# Build multiple modules from a single source tree
MODULES_DIRS = "sensor-driver can-driver gpio-driver"
do_compile() {
for dir in ${MODULES_DIRS}; do
oe_runmake -C ${STAGING_KERNEL_DIR} \
M=${S}/${dir} \
modules
done
}
do_install() {
for dir in ${MODULES_DIRS}; do
oe_runmake -C ${STAGING_KERNEL_DIR} \
M=${S}/${dir} \
INSTALL_MOD_PATH=${D} \
modules_install
done
}
9.6 Device Tree Overlay Recipes¶
# recipes-kernel/dtoverlays/custom-dtoverlay_1.0.bb
SUMMARY = "Custom device tree overlays for carrier board peripherals"
LICENSE = "MIT"
LIC_FILES_CHKSUM = "file://${COMMON_LICENSE_DIR}/MIT;md5=0835ade698e0bcf8506ecda2f7b4f302"
SRC_URI = " \
file://spi-adc.dtso \
file://can-bus.dtso \
file://gpio-leds.dtso \
"
S = "${WORKDIR}"
inherit devicetree
# The devicetree class handles compilation of .dtso to .dtbo
do_install() {
install -d ${D}/boot/overlays
for dtbo in ${B}/*.dtbo; do
install -m 0644 ${dtbo} ${D}/boot/overlays/
done
}
FILES:${PN} = "/boot/overlays/*.dtbo"
Example overlay for CAN bus (files/can-bus.dtso):
/dts-v1/;
/plugin/;
/ {
overlay-name = "MTTCAN Bus Interface";
compatible = "nvidia,p3768-0000+p3767-0005";
fragment@0 {
target = <&mttcan0>;
__overlay__ {
status = "okay";
pinctrl-names = "default";
pinctrl-0 = <&mttcan0_pins>;
};
};
};
9.7 Integrating NVIDIA Proprietary Drivers¶
NVIDIA GPU drivers for Tegra are provided as pre-built binaries by meta-tegra. Integrating them requires accepting the NVIDIA license:
# In local.conf or distro.conf:
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
# The key driver packages and their roles:
#
# nvidia-l4t-core - Core L4T runtime libraries (libtegradrm, etc.)
# nvidia-l4t-firmware - GPU firmware blobs loaded at boot
# nvidia-l4t-3d-core - EGL/GLES libraries for display
# nvidia-l4t-cuda - CUDA runtime libraries
# nvidia-l4t-multimedia - Video encode/decode (NVENC/NVDEC)
# nvidia-l4t-camera - Camera (Argus) runtime libraries
# nvidia-l4t-tools - tegrastats, nvpmodel, jetson_clocks
# nvidia-l4t-dla - Deep Learning Accelerator runtime
# These are automatically pulled in by machine configuration.
# To explicitly control which NVIDIA packages are included:
IMAGE_INSTALL:append = " \
nvidia-l4t-core \
nvidia-l4t-firmware \
nvidia-l4t-cuda \
nvidia-l4t-tools \
"
# For a headless deployment without display:
IMAGE_INSTALL:remove = "nvidia-l4t-3d-core nvidia-l4t-wayland"
9.8 Kernel Patching Workflow¶
# Step 1: Enter the kernel source directory
bitbake linux-tegra -c devshell
# This drops you into a shell inside the kernel source tree
# Step 2: Make your changes
vi drivers/my-driver/my-driver.c
# Step 3: Create a patch
git add -A && git commit -m "Fix custom driver timeout handling"
git format-patch -1
# Step 4: Copy the patch to your layer
cp 0001-Fix-custom-driver-timeout-handling.patch \
/path/to/meta-myproject-bsp/recipes-kernel/linux/files/
# Step 5: Add to the bbappend
# In linux-tegra_%.bbappend:
# SRC_URI += "file://0001-Fix-custom-driver-timeout-handling.patch"
# Step 6: Rebuild
bitbake linux-tegra -c cleansstate && bitbake linux-tegra
10. Bootloader and Secure Boot Integration¶
10.1 Boot Chain on Orin Nano (T234)¶
Orin Nano Boot Sequence:
Power On
|
v
BootROM (in silicon, immutable)
| Reads BCT from QSPI flash
v
MB1 (Microboot 1, runs on BPMP-FW)
| SDRAM init, pinmux, clocks, power rails
| Verifies MB2 signature (if secure boot enabled)
v
MB2 (Microboot 2)
| TrustZone setup, secure world initialization
| Verifies UEFI signature
v
UEFI (TianoCore EDK2, NVIDIA fork)
| Hardware init, USB/PCIe/NVMe enumeration
| Reads extlinux.conf or UEFI boot manager entries
v
Linux Kernel
| Device tree, initramfs (optional)
v
systemd (PID 1)
| Service startup
v
Application Ready
10.2 UEFI Recipe Customization¶
# meta-myproject-bsp/recipes-bsp/uefi/edk2-firmware-tegra_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += " \
file://0001-custom-boot-logo.patch \
file://0002-disable-uefi-shell.patch \
file://0003-reduce-boot-timeout.patch \
"
# Customize UEFI build flags
EXTRA_UEFI_BUILD_FLAGS += " \
-DUEFI_SHELL_DISABLE=TRUE \
-DBOOT_TIMEOUT=0 \
-DSILENT_BOOT=TRUE \
"
extlinux.conf for kernel boot configuration:
# /boot/extlinux/extlinux.conf
# This file is generated by Yocto and controls kernel boot parameters
TIMEOUT 30
DEFAULT primary
LABEL primary
MENU LABEL Primary Boot
LINUX /boot/Image
FDT /boot/tegra234-p3768-0000+p3767-0005.dtb
INITRD /boot/initrd
APPEND root=/dev/nvme0n1p1 rw rootwait console=ttyTCU0,115200
LABEL recovery
MENU LABEL Recovery Boot
LINUX /boot/Image
FDT /boot/tegra234-p3768-0000+p3767-0005.dtb
APPEND root=/dev/nvme0n1p2 ro rootwait console=ttyTCU0,115200 single
10.3 Secure Boot Key Generation¶
Secure boot on Tegra uses a PKC (Public Key Cryptography) chain with RSA-3072. The process involves generating keys, computing the public key hash, fusing the hash into the device OTP (one-time programmable) fuses, and signing all boot components.
# Key generation -- done once per product line, store keys in HSM or vault
# NEVER store production signing keys in source control
# Generate RSA-3072 key pair for secure boot
openssl genrsa -out rsa_priv.pem 3072
openssl rsa -in rsa_priv.pem -pubout -out rsa_pub.pem
# Generate SBK (Secure Boot Key) for bootloader encryption (AES-256)
openssl rand -hex 32 > sbk.key
# Compute the public key hash for fusing
# NVIDIA tegrasign tool computes the hash:
python3 tegrasign_v3.py --pubkeyhash rsa_pub.pem pkc_hash.txt
# The hash in pkc_hash.txt will be programmed into device fuses
cat pkc_hash.txt
# 0x12345678 0xabcdef01 0x23456789 ...
# Store keys securely:
# Production keys -> Hardware Security Module (HSM) or HashiCorp Vault
# Development keys -> Encrypted USB drive, never on build servers
# CI/CD signing -> Use key references via HSM PKCS#11 interface
10.4 Integrating Signing into the Yocto Build¶
# meta-myproject-bsp/classes/tegra-secure-boot.bbclass
# Path to the signing key (not stored in git -- injected at build time)
TEGRA_SIGNING_KEY ?= "${TOPDIR}/../keys/rsa_priv.pem"
TEGRA_SBK_KEY ?= "${TOPDIR}/../keys/sbk.key"
# Validate key presence at parse time
python () {
import os
key = d.getVar('TEGRA_SIGNING_KEY')
if not os.path.exists(key):
bb.warn("Secure boot signing key not found at: %s" % key)
bb.warn("Build will produce UNSIGNED images")
}
# In local.conf or distro.conf, enable signed image generation:
TEGRA_SIGNING_ARGS = "--key ${TOPDIR}/../keys/rsa_priv.pem"
# For encrypted bootloader (Orin supports SBK + PKC combined):
TEGRA_SIGNING_ARGS += "--encrypt_key ${TOPDIR}/../keys/sbk.key"
# Build produces signed tegraflash archive:
bitbake myproject-image
# Output: myproject-image-*.tegraflash.tar.gz
# All boot components inside are signed with the production key
10.5 Fuse Provisioning Automation¶
#!/bin/bash
# scripts/provision-fuses.sh
# WARNING: Fuse burning is IRREVERSIBLE. Test on development units first.
set -euo pipefail
SCRIPT_DIR=$(dirname "$(readlink -f "$0")")
KEY_DIR="${SCRIPT_DIR}/../keys"
FLASH_DIR="${SCRIPT_DIR}/../tegraflash"
# Read the PKC hash
PKC_HASH=$(cat "${KEY_DIR}/pkc_hash.txt")
echo "============================================="
echo " FUSE PROVISIONING -- IRREVERSIBLE OPERATION"
echo "============================================="
echo "PKC Hash: ${PKC_HASH}"
echo ""
echo "This operation permanently enables secure boot."
echo "The device will only boot images signed with the"
echo "corresponding private key after fusing."
echo ""
read -p "Type 'BURN' to proceed: " confirm
if [ "${confirm}" != "BURN" ]; then
echo "Aborted."
exit 1
fi
cd "${FLASH_DIR}"
# Generate the fuse configuration XML
cat > odmfuse_pkc.xml << FUSEXML
<?xml version="1.0"?>
<genericfuse MagicId="0x45535546" version="1.0.0">
<fuse name="PublicKeyHash" size="64" value="${PKC_HASH}"/>
<fuse name="SecurityMode" size="4" value="0x1"/>
<fuse name="OdmLock" size="4" value="0x1"/>
</genericfuse>
FUSEXML
# Burn fuses
sudo ./tegraflash.py \
--chip 0x23 \
--applet mb1_t234_prod.bin \
--cmd "burnfuses odmfuse_pkc.xml"
echo ""
echo "Fuses burned successfully."
echo "Device will now enforce secure boot on all subsequent boots."
10.6 Chain of Trust from Build System to Device¶
Chain of Trust Architecture:
+-------------------+
| Key Management |
| (HSM / Vault) |----> Signing Key (RSA-3072)
+-------------------+ |
v
+-------------------+ +-------------------+
| CI/CD Pipeline | | Signing Service |
| (BitBake build) |--->| (signs binaries) |
+-------------------+ +-------------------+
| |
v v
+-------------------+ +-------------------+
| Artifact Server | | Signed Binaries |
| (versioned images)|<---| MB1,MB2,UEFI, |
+-------------------+ | kernel, rootfs |
| +-------------------+
|
+----+----+
| |
v v
Flash OTA Server
| |
v v
+-------------------+
| Device (Fused) |
| |
| BootROM verifies |
| MB1 signature |
| MB1 verifies |
| MB2 signature |
| MB2 verifies |
| UEFI signature |
| UEFI verifies |
| kernel sig |
| dm-verity verifies|
| rootfs integrity|
+-------------------+
10.7 dm-verity for Rootfs Integrity¶
# Enable dm-verity in the image recipe
# meta-myproject/recipes-core/images/myproject-image.bb
IMAGE_CLASSES += "dm-verity-img"
DM_VERITY_IMAGE = "myproject-image"
DM_VERITY_IMAGE_TYPE = "ext4"
# Kernel config fragment for dm-verity:
# CONFIG_DM_VERITY=y
# CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
# CONFIG_DM_VERITY_FEC=y
# The dm-verity hash tree is generated at build time.
# The root hash is embedded in the kernel command line:
# root=/dev/dm-0
# dm="vroot none ro,0 1638400 verity 1 /dev/nvme0n1p1 /dev/nvme0n1p1
# 4096 4096 204800 1 sha256 <root_hash> <salt>"
# Alternatively, pass the root hash via UEFI variables for dynamic updates
10.8 Key Rotation and Revocation Strategy¶
Key Rotation Plan:
Year 1-3: Primary key (fused into all devices)
Year 3-5: Primary key + secondary key (new devices get both)
Emergency: Revocation via OTA firmware update that blocks
compromised key in UEFI Forbidden Signature Database
IMPORTANT: The PKC hash fused into the device is PERMANENT.
Key rotation at the BootROM level is not possible after fusing.
Plan for the entire product lifecycle (5-10 years) when
generating the initial key pair.
Mitigation strategies:
- Use HSM with FIPS 140-2 Level 3 for key storage
- Implement key ceremony procedures with multi-party control
- Maintain offline backup of root keys in secure facility
- Use intermediate signing keys for day-to-day operations
(UEFI Secure Boot allows key hierarchy via db/dbx)
11. Build System Optimization¶
11.1 Shared State (sstate) Cache¶
The shared state cache is the single most impactful optimization for Yocto build times. sstate stores the output of each task (do_compile, do_package, etc.) indexed by a hash of all inputs. When inputs have not changed, the task is skipped entirely.
# Configure sstate cache location (shared across builds)
# In local.conf:
SSTATE_DIR = "/opt/yocto/sstate-cache"
# For distributed builds, use an sstate mirror (HTTP server or NFS):
SSTATE_MIRRORS = " \
file://.* https://sstate.mycompany.com/PATH;downloadfilename=PATH \
"
# Pre-populate sstate from CI builds:
# On the CI server, after a successful build:
rsync -avz tmp/sstate-cache/ sstate-server:/opt/yocto/sstate-cache/
# sstate cache impact on build times:
#
# Scenario Without sstate With sstate Savings
# ------------------------------------------------------------------
# Full rebuild (all recipes) 4-6 hours 4-6 hours 0%
# Kernel config change only 4-6 hours 15-25 min ~90%
# Single recipe change 4-6 hours 5-15 min ~95%
# Image recipe change only 4-6 hours 2-5 min ~98%
# No changes (verify build) 4-6 hours 1-2 min ~99%
11.2 Download Mirrors (DL_DIR)¶
# Centralize source downloads to avoid redundant fetches
# In local.conf:
DL_DIR = "/opt/yocto/downloads"
# Set up a download mirror for CI:
PREMIRRORS:prepend = " \
git://.*/.* https://downloads.mycompany.com/ \
https://.*/.* https://downloads.mycompany.com/ \
ftp://.*/.* https://downloads.mycompany.com/ \
"
# Populate the mirror from a completed build:
# bitbake myproject-image --runall=fetch
# rsync -avz /opt/yocto/downloads/ mirror-server:/opt/yocto/downloads/
# This ensures builds work even if upstream sources are temporarily unavailable
# (critical for reproducible production builds)
11.3 Hash Equivalence Server¶
The hash equivalence server allows different builds to share sstate even when non-functional changes (like comments or whitespace in recipes) would normally invalidate the hash.
# Start the hash equivalence server:
bitbake-hashserv --bind 0.0.0.0:8687 --database /opt/yocto/hashserv.db &
# Configure clients to use it:
# In local.conf:
BB_HASHSERVE = "hashserv.mycompany.com:8687"
BB_SIGNATURE_HANDLER = "OEEquivHash"
# The hash equivalence server tracks which task hashes produce identical
# output. If task A produces the same output as task B (despite different
# input hashes), future builds with either input hash will reuse sstate.
# This provides an additional 10-20% sstate hit rate improvement.
11.4 Build Performance Tuning¶
# conf/local.conf -- performance tuning section
# Number of BitBake threads (recipe-level parallelism)
# Rule of thumb: number of CPU cores
BB_NUMBER_THREADS = "16"
# Number of make threads (compilation-level parallelism)
# Rule of thumb: 1.5x CPU cores (compilation is I/O-bound)
PARALLEL_MAKE = "-j 24"
# Use tmpfs for the build directory (requires sufficient RAM)
# 64 GB RAM minimum for this approach
# TMPDIR = "/dev/shm/yocto-tmp"
# Disable unnecessary features during development builds
# (re-enable for release builds)
INHERIT:remove = "buildhistory"
# INHERIT:remove = "reproducible_build" # Only disable for dev speed
# Use zstd compression for sstate (faster than gzip)
SSTATE_PKG_SUFFIX = "zst"
ZSTD_COMPRESSION_LEVEL = "3"
# Limit the number of parallel package write tasks
# (prevents I/O saturation on spinning disks)
BB_NUMBER_PARSE_THREADS = "16"
# Skip QA checks during development (NEVER for production builds)
# WARN_QA:remove = "ldflags"
# ERROR_QA:remove = "ldflags"
11.5 Managing 32+ Build Targets/Configurations¶
For programs with multiple machine targets and image variants, use a systematic build matrix approach:
# build-matrix.sh -- Build all target configurations
#!/bin/bash
set -euo pipefail
MACHINES=(
"jetson-orin-nano-devkit"
"jetson-orin-nano-devkit-nvme"
"mycompany-edge-v2"
"mycompany-edge-v3"
)
IMAGES=(
"myproject-image"
"myproject-image-dev"
"myproject-image-manufacturing"
)
RESULTS_FILE="build-results-$(date +%Y%m%d-%H%M%S).txt"
for machine in "${MACHINES[@]}"; do
for image in "${IMAGES[@]}"; do
echo "Building: ${machine} / ${image}" | tee -a "${RESULTS_FILE}"
start_time=$(date +%s)
MACHINE="${machine}" bitbake "${image}" 2>&1 | \
tee "build-log-${machine}-${image}.txt"
result=$?
end_time=$(date +%s)
duration=$((end_time - start_time))
if [ ${result} -eq 0 ]; then
echo " SUCCESS (${duration}s)" | tee -a "${RESULTS_FILE}"
else
echo " FAILED (${duration}s)" | tee -a "${RESULTS_FILE}"
fi
done
done
echo ""
echo "Build matrix complete. Results: ${RESULTS_FILE}"
kas multi-config approach:
# kas/build-matrix.yml
header:
version: 14
includes:
- kas/base.yml
env:
SSTATE_DIR: /opt/yocto/sstate-cache
DL_DIR: /opt/yocto/downloads
# Build all configs with:
# kas build kas/build-matrix.yml:kas/machine-orin-nano.yml:kas/image-production.yml
# kas build kas/build-matrix.yml:kas/machine-edge-v2.yml:kas/image-production.yml
# kas build kas/build-matrix.yml:kas/machine-edge-v2.yml:kas/image-dev.yml
11.6 Reproducible Builds¶
# Enable reproducible builds in distro.conf:
BUILD_REPRODUCIBLE_BINARIES = "1"
INHERIT += "reproducible_build"
# Set a fixed source date epoch for all packages:
SOURCE_DATE_EPOCH = "1704067200" # 2024-01-01 00:00:00 UTC
# Verify build reproducibility:
# Build the image twice and compare:
bitbake myproject-image
cp tmp/deploy/images/jetson-orin-nano-devkit/myproject-image-*.ext4 /tmp/build1.ext4
bitbake -c cleansstate myproject-image
bitbake myproject-image
cp tmp/deploy/images/jetson-orin-nano-devkit/myproject-image-*.ext4 /tmp/build2.ext4
# Compare:
diffoscope /tmp/build1.ext4 /tmp/build2.ext4 --html /tmp/repro-diff.html
# Goal: zero differences
11.7 Build Server Hardware Recommendations¶
Build Server Specifications (based on team size and build frequency):
Small Team (5-10 engineers, daily builds)
------------------------------------------
CPU: AMD EPYC 7313 (16 cores / 32 threads)
RAM: 128 GB DDR4 ECC
Storage: 2 TB NVMe SSD (build) + 4 TB HDD (sstate/downloads)
Network: 1 Gbps
Medium Team (10-30 engineers, hourly builds)
------------------------------------------
CPU: 2x AMD EPYC 7543 (64 cores / 128 threads total)
RAM: 256 GB DDR4 ECC
Storage: 4 TB NVMe RAID-0 (build) + 8 TB SSD (sstate/downloads)
Network: 10 Gbps
Large Team (30-50+ engineers, continuous builds)
------------------------------------------
CPU: 2x AMD EPYC 9654 (192 cores / 384 threads total)
RAM: 512 GB DDR5 ECC
Storage: 8 TB NVMe RAID-0 (build) + 16 TB SSD (sstate)
Network: 25 Gbps
Notes: Consider multiple build agents with shared sstate
12. CI/CD for Embedded Linux¶
12.1 Pipeline Architecture¶
CI/CD Pipeline for Yocto-Based Jetson Builds:
Developer Push
|
v
+------------------+
| Pre-Build Stage | Lint recipes, check layer compatibility,
| (5 min) | validate kas configs, license pre-check
+------------------+
|
v
+------------------+
| Build Stage | BitBake full image build (with sstate),
| (15-60 min) | generate SDK, generate license manifest
+------------------+
|
v
+------------------+
| Test Stage | QEMU smoke tests (where applicable),
| (10-30 min) | image size validation, package manifest diff
+------------------+
|
v
+------------------+
| Flash & HW Test | Flash to physical devices (HIL farm),
| (30-60 min) | boot test, peripheral test, stress test
+------------------+
|
v
+------------------+
| Artifact Stage | Upload images, SDK, manifests to artifact
| (5 min) | server, tag release, notify team
+------------------+
12.2 GitLab CI Pipeline¶
# .gitlab-ci.yml
stages:
- validate
- build
- test
- deploy
variables:
SSTATE_DIR: /opt/yocto/sstate-cache
DL_DIR: /opt/yocto/downloads
KAS_CONFIG: kas/jetson-orin-nano.yml
MACHINE: jetson-orin-nano-devkit
# ------------------------------------------------------------------
# Stage: Validate
# ------------------------------------------------------------------
validate-recipes:
stage: validate
image: crops/poky:latest
script:
- kas shell ${KAS_CONFIG} -c "bitbake-layers show-layers"
- kas shell ${KAS_CONFIG} -c "bitbake -p" # Parse all recipes
- kas shell ${KAS_CONFIG} -c "bitbake --runall=fetch myproject-image --dry-run"
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
- if: '$CI_COMMIT_BRANCH == "main"'
validate-licenses:
stage: validate
image: crops/poky:latest
script:
- kas shell ${KAS_CONFIG} -c "bitbake myproject-image -c populate_lic"
- python3 scripts/check-license-compliance.py tmp/deploy/licenses/
rules:
- if: '$CI_COMMIT_BRANCH == "main"'
# ------------------------------------------------------------------
# Stage: Build
# ------------------------------------------------------------------
build-production-image:
stage: build
tags:
- yocto-builder # Requires dedicated build runner with 32+ cores
timeout: 4h
script:
- kas build ${KAS_CONFIG}
- kas build ${KAS_CONFIG} -c populate_sdk
artifacts:
paths:
- build/tmp/deploy/images/${MACHINE}/*.tegraflash.tar.gz
- build/tmp/deploy/images/${MACHINE}/*.manifest
- build/tmp/deploy/sdk/*.sh
- build/tmp/deploy/licenses/
expire_in: 30 days
cache:
key: sstate-${CI_COMMIT_REF_NAME}
paths:
- ${SSTATE_DIR}
rules:
- if: '$CI_COMMIT_BRANCH == "main"'
- if: '$CI_COMMIT_TAG'
build-dev-image:
stage: build
tags:
- yocto-builder
timeout: 4h
variables:
KAS_CONFIG: kas/jetson-orin-nano-dev.yml
script:
- kas build ${KAS_CONFIG}
artifacts:
paths:
- build/tmp/deploy/images/${MACHINE}/*.tegraflash.tar.gz
expire_in: 7 days
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
# ------------------------------------------------------------------
# Stage: Test
# ------------------------------------------------------------------
test-image-size:
stage: test
script:
- |
IMAGE_SIZE=$(stat -c%s build/tmp/deploy/images/${MACHINE}/*.ext4)
MAX_SIZE=$((2 * 1024 * 1024 * 1024)) # 2 GB limit
if [ ${IMAGE_SIZE} -gt ${MAX_SIZE} ]; then
echo "ERROR: Image size ${IMAGE_SIZE} exceeds limit ${MAX_SIZE}"
exit 1
fi
echo "Image size: ${IMAGE_SIZE} bytes (limit: ${MAX_SIZE})"
needs:
- build-production-image
test-package-manifest:
stage: test
script:
- python3 scripts/validate-manifest.py \
build/tmp/deploy/images/${MACHINE}/*.manifest \
allowed-packages.txt
- python3 scripts/check-cve.py \
build/tmp/deploy/images/${MACHINE}/*.manifest
needs:
- build-production-image
test-hardware:
stage: test
tags:
- jetson-hil # Hardware-in-the-loop test runner with physical device
timeout: 1h
script:
- scripts/flash-and-test.sh \
build/tmp/deploy/images/${MACHINE}/*.tegraflash.tar.gz
needs:
- build-production-image
rules:
- if: '$CI_COMMIT_BRANCH == "main"'
# ------------------------------------------------------------------
# Stage: Deploy
# ------------------------------------------------------------------
deploy-artifacts:
stage: deploy
script:
- VERSION=$(git describe --tags --always)
- aws s3 sync build/tmp/deploy/images/${MACHINE}/ \
s3://mycompany-releases/${VERSION}/${MACHINE}/
- aws s3 cp build/tmp/deploy/sdk/*.sh \
s3://mycompany-releases/${VERSION}/sdk/
- scripts/notify-release.sh ${VERSION}
needs:
- build-production-image
- test-image-size
- test-package-manifest
rules:
- if: '$CI_COMMIT_TAG'
12.3 Jenkins Pipeline¶
// Jenkinsfile
pipeline {
agent {
label 'yocto-builder'
}
options {
timeout(time: 6, unit: 'HOURS')
buildDiscarder(logRotator(numToKeepStr: '20'))
}
environment {
SSTATE_DIR = '/opt/yocto/sstate-cache'
DL_DIR = '/opt/yocto/downloads'
MACHINE = 'jetson-orin-nano-devkit'
}
parameters {
choice(name: 'IMAGE',
choices: ['myproject-image', 'myproject-image-dev', 'myproject-image-manufacturing'],
description: 'Image to build')
choice(name: 'MACHINE',
choices: ['jetson-orin-nano-devkit', 'mycompany-edge-v2', 'mycompany-edge-v3'],
description: 'Target machine')
booleanParam(name: 'BUILD_SDK',
defaultValue: false,
description: 'Also build SDK')
}
stages {
stage('Validate') {
steps {
sh 'kas shell kas/jetson-orin-nano.yml -c "bitbake -p"'
}
}
stage('Build Image') {
steps {
sh """
MACHINE=${params.MACHINE} kas build kas/jetson-orin-nano.yml
"""
}
}
stage('Build SDK') {
when {
expression { params.BUILD_SDK }
}
steps {
sh """
MACHINE=${params.MACHINE} kas build kas/jetson-orin-nano.yml \
-c populate_sdk
"""
}
}
stage('Verify') {
steps {
sh 'python3 scripts/validate-image.py'
sh 'python3 scripts/check-licenses.py'
}
}
stage('Archive') {
steps {
archiveArtifacts artifacts: 'build/tmp/deploy/images/**/*.tegraflash.tar.gz'
archiveArtifacts artifacts: 'build/tmp/deploy/images/**/*.manifest'
archiveArtifacts artifacts: 'build/tmp/deploy/licenses/**/*', allowEmptyArchive: true
}
}
}
post {
success {
slackSend channel: '#embedded-builds',
message: "Build SUCCESS: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
}
failure {
slackSend channel: '#embedded-builds',
message: "Build FAILED: ${env.JOB_NAME} #${env.BUILD_NUMBER}"
}
}
}
12.4 Build Matrix for Multiple Machine Targets¶
# kas/base.yml -- shared configuration
header:
version: 14
repos:
poky:
url: https://git.yoctoproject.org/poky
branch: scarthgap
path: layers/poky
layers:
meta:
meta-poky:
meta-openembedded:
url: https://git.openembedded.org/meta-openembedded
branch: scarthgap
path: layers/meta-openembedded
layers:
meta-oe:
meta-python:
meta-networking:
meta-tegra:
url: https://github.com/OE4T/meta-tegra.git
branch: scarthgap-l4t-r36.x
path: layers/meta-tegra
meta-myproject:
path: layers/meta-myproject
local_conf_header:
base: |
SSTATE_DIR = "/opt/yocto/sstate-cache"
DL_DIR = "/opt/yocto/downloads"
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
---
# kas/machine-orin-nano.yml
header:
version: 14
includes:
- kas/base.yml
machine: jetson-orin-nano-devkit
---
# kas/machine-edge-v2.yml
header:
version: 14
includes:
- kas/base.yml
machine: mycompany-edge-v2
---
# kas/image-production.yml
header:
version: 14
target: myproject-image
---
# kas/image-dev.yml
header:
version: 14
target: myproject-image-dev
local_conf_header:
dev: |
EXTRA_IMAGE_FEATURES += "debug-tweaks tools-debug"
# Build all combinations:
for machine in kas/machine-*.yml; do
for image in kas/image-*.yml; do
echo "Building: ${machine} + ${image}"
kas build ${machine}:${image}
done
done
12.5 Release Engineering Workflow¶
Quarterly Release Cadence:
Week 1-8: Feature Development
- Feature branches merged to 'develop'
- CI builds on every merge
- Developer images deployed to test devices
Week 9-10: Integration & Stabilization
- 'develop' merged to 'release/Q1-2026'
- Only bug fixes accepted on release branch
- Full regression test suite run nightly
Week 11: Release Candidate
- RC1 built from release branch
- Hardware-in-the-loop testing on all variants
- License audit finalized
- Release notes drafted
Week 12: Production Release
- Final build from tagged commit
- Artifacts signed with production keys
- Images uploaded to OTA server
- Staged rollout begins (1% -> 10% -> 100%)
Post-Release: Maintenance
- Hotfix branches from release tag
- Security patches backported
- Next quarter planning begins
13. OTA Update System¶
13.1 OTA Framework Selection¶
Three major OTA frameworks integrate with Yocto for Jetson deployments:
| Feature | SWUpdate | Mender | RAUC |
|---|---|---|---|
| Update model | Single/dual copy, delta | A/B dual rootfs | A/B slots |
| Yocto integration | meta-swupdate | meta-mender | meta-rauc |
| Server component | Custom / hawkBit | Mender Server (hosted/self) | Custom |
| Delta updates | Yes (librsync, zchunk) | Yes (commercial) | Yes (casync) |
| Tegra flash integration | Manual | Manual | Manual |
| License | GPL-2.0 | Apache-2.0 / Commercial | LGPL-2.1 |
| Production readiness | High (widely deployed) | High (commercial support) | High |
For Jetson Orin Nano production deployments, SWUpdate is recommended due to its flexibility with custom handlers (needed for Tegra bootloader updates) and lack of commercial licensing requirements.
13.2 Integrating SWUpdate with Yocto¶
# Add meta-swupdate to bblayers.conf
# (clone from: https://github.com/sbabic/meta-swupdate)
# Image recipe addition:
IMAGE_INSTALL:append = " swupdate swupdate-www"
# SWUpdate recipe configuration:
# meta-myproject/recipes-support/swupdate/swupdate_%.bbappend
FILESEXTRAPATHS:prepend := "${THISDIR}/files:"
SRC_URI += " \
file://defconfig \
file://swupdate.cfg \
file://swupdate.pub.pem \
"
# SWUpdate defconfig (files/defconfig):
# CONFIG_HW_COMPATIBILITY=y
# CONFIG_SIGNED_IMAGES=y
# CONFIG_SIGALG_RSA_PSS=y
# CONFIG_ENCRYPTED_IMAGES=y
# CONFIG_SURICATTA=y
# CONFIG_SURICATTA_HAWKBIT=y
# CONFIG_WEBSERVER=y
# CONFIG_MONGOOSE=y
# CONFIG_CHANNEL_CURL=y
13.3 SWUpdate Image (SWU) Generation¶
# recipes-support/swupdate/swupdate-image.bb
SUMMARY = "SWUpdate OTA update image"
LICENSE = "MIT"
SRC_URI = "file://sw-description"
inherit swupdate
# The update image contains:
# 1. sw-description (metadata, signed)
# 2. rootfs image (compressed)
# 3. Optional: bootloader update, kernel, device tree
SWUPDATE_IMAGES = "myproject-image"
SWUPDATE_IMAGES_FSTYPES[myproject-image] = ".ext4.gz"
sw-description file:
{
"software": {
"version": "3.0.0",
"hardware-compatibility": ["1.0", "2.0"],
"jetson-orin-nano": {
"images": [
{
"filename": "myproject-image-jetson-orin-nano-devkit.ext4.gz",
"type": "raw",
"device": "/dev/nvme0n1p2",
"compressed": "zlib",
"installed-directly": true,
"sha256": "@myproject-image-jetson-orin-nano-devkit.ext4.gz"
}
],
"scripts": [
{
"filename": "post-update.sh",
"type": "shellscript",
"sha256": "@post-update.sh"
}
]
}
}
}
Post-update script:
#!/bin/sh
# post-update.sh -- executed after rootfs is written
# Switch the active boot partition
# For Tegra A/B boot, update extlinux.conf to point to new partition
CURRENT_ROOT=$(findmnt -n -o SOURCE /)
if [ "${CURRENT_ROOT}" = "/dev/nvme0n1p1" ]; then
NEW_ROOT="/dev/nvme0n1p2"
sed -i "s|root=/dev/nvme0n1p1|root=/dev/nvme0n1p2|" \
/boot/extlinux/extlinux.conf
else
NEW_ROOT="/dev/nvme0n1p1"
sed -i "s|root=/dev/nvme0n1p2|root=/dev/nvme0n1p1|" \
/boot/extlinux/extlinux.conf
fi
echo "Boot target updated to ${NEW_ROOT}"
sync
# Signal successful update
exit 0
13.4 Signing OTA Update Images¶
# Generate signing key pair for OTA updates
openssl ecparam -genkey -name prime256v1 -out swupdate_priv.pem
openssl ec -in swupdate_priv.pem -pubout -out swupdate_pub.pem
# Sign the SWU image during the build:
# In the swupdate image recipe or class:
SWUPDATE_SIGNING = "RSA"
SWUPDATE_PRIVATE_KEY = "${TOPDIR}/../keys/swupdate_priv.pem"
# The public key is installed on the device:
# /etc/swupdate/swupdate.pub.pem
# SWUpdate will verify the signature before applying any update
# Unsigned or incorrectly signed updates are rejected
13.5 A/B Partition Scheme¶
NVMe Partition Layout for A/B Updates:
+--------------------------------------------------+
| QSPI Flash (32 MB) |
| +------+------+------+------+------------------+ |
| | MB1 | MB1_b| MB2 | MB2_b| UEFI | UEFI_b | |
| +------+------+------+------+------------------+ |
+--------------------------------------------------+
| NVMe SSD |
| +-------------------+-------------------+------+ |
| | APP (Slot A) | APP_b (Slot B) | DATA | |
| | rootfs 2 GB | rootfs 2 GB | rest | |
| | /dev/nvme0n1p1 | /dev/nvme0n1p2 | p3 | |
| +-------------------+-------------------+------+ |
+--------------------------------------------------+
Active slot is determined by extlinux.conf (or UEFI boot vars).
During OTA: write to inactive slot, verify, switch boot target.
On failure: watchdog timeout triggers revert to previous slot.
13.6 Delta Updates for Bandwidth Savings¶
# Generate delta update (only changed blocks):
# Using bsdiff/bspatch approach:
bsdiff old-rootfs.ext4 new-rootfs.ext4 rootfs-delta.bsdiff
# Using casync (content-addressable storage):
casync make --store=/var/casync/store rootfs.caidx rootfs.ext4
# Only new chunks are transferred over the network
# SWUpdate with delta handler:
# sw-description entry for delta update:
# {
# "filename": "rootfs-delta.zck",
# "type": "delta",
# "device": "/dev/nvme0n1p2",
# "properties": {
# "source": "/dev/nvme0n1p1",
# "algorithm": "zchunk"
# }
# }
# Bandwidth savings typical for incremental updates:
#
# Update Type Full Image Delta Savings
# ---------------------------------------------------
# Minor patch 750 MB 15-50 MB 93-98%
# Feature release 750 MB 100-200 MB 73-87%
# Major upgrade 750 MB 400-500 MB 33-47%
13.7 Staged Rollouts¶
#!/bin/bash
# scripts/staged-rollout.sh
# Manages phased deployment to device fleet
set -euo pipefail
OTA_SERVER="https://ota.mycompany.com"
FIRMWARE_VERSION="$1"
API_TOKEN="${OTA_API_TOKEN}"
echo "Starting staged rollout for version ${FIRMWARE_VERSION}"
# Stage 1: Internal test devices (5 units)
echo "Stage 1: Deploying to test fleet..."
curl -X POST "${OTA_SERVER}/api/v1/rollout" \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"version\": \"${FIRMWARE_VERSION}\",
\"group\": \"internal-test\",
\"percentage\": 100
}"
echo "Waiting 24h for test fleet validation..."
sleep 86400
# Verify test fleet health
TEST_SUCCESS=$(curl -s "${OTA_SERVER}/api/v1/rollout/status" \
-H "Authorization: Bearer ${API_TOKEN}" | jq '.success_rate')
if (( $(echo "${TEST_SUCCESS} < 0.95" | bc -l) )); then
echo "ABORT: Test fleet success rate ${TEST_SUCCESS} < 95%"
exit 1
fi
# Stage 2: 1% of production fleet
echo "Stage 2: Deploying to 1% of production fleet..."
curl -X POST "${OTA_SERVER}/api/v1/rollout" \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"version\": \"${FIRMWARE_VERSION}\",
\"group\": \"production\",
\"percentage\": 1
}"
echo "Waiting 48h for canary validation..."
sleep 172800
# Stage 3: 10% of production fleet
echo "Stage 3: Deploying to 10%..."
curl -X POST "${OTA_SERVER}/api/v1/rollout" \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"version\": \"${FIRMWARE_VERSION}\",
\"group\": \"production\",
\"percentage\": 10
}"
echo "Waiting 72h..."
sleep 259200
# Stage 4: 100% of production fleet
echo "Stage 4: Full deployment..."
curl -X POST "${OTA_SERVER}/api/v1/rollout" \
-H "Authorization: Bearer ${API_TOKEN}" \
-H "Content-Type: application/json" \
-d "{
\"version\": \"${FIRMWARE_VERSION}\",
\"group\": \"production\",
\"percentage\": 100
}"
echo "Staged rollout complete for version ${FIRMWARE_VERSION}"
13.8 Rollback Mechanisms¶
# Automatic rollback via watchdog:
# systemd service that confirms successful boot
# /etc/systemd/system/update-confirm.service
# [Unit]
# Description=Confirm successful boot after OTA update
# After=multi-user.target myapp.service
# Wants=myapp.service
#
# [Service]
# Type=oneshot
# ExecStart=/usr/bin/update-confirm.sh
# RemainAfterExit=yes
#
# [Install]
# WantedBy=multi-user.target
#!/bin/bash
# /usr/bin/update-confirm.sh
# Confirms successful boot -- must run within watchdog timeout
set -euo pipefail
# Check critical services are running
systemctl is-active --quiet myapp.service || exit 1
systemctl is-active --quiet networkmanager.service || exit 1
# Check GPU is functional
nvidia-smi > /dev/null 2>&1 || exit 1
# All checks passed -- mark boot as successful
# This disarms the watchdog-based rollback
fw_setenv bootcount 0
echo "Boot confirmed successful. Rollback disarmed."
# If this script does not run (crash, hang, etc.),
# the hardware watchdog resets the device after 120 seconds.
# The bootloader increments bootcount and, if > 3,
# switches back to the previous partition.
13.9 Achieving 99%+ Update Success Rate¶
Practices for reliable OTA updates at scale:
1. Pre-flight checks before update:
- Verify battery/power supply is adequate (no update on low power)
- Check available storage space
- Verify network connectivity and bandwidth
- Validate update signature before writing
2. Atomic operations:
- Write to inactive partition (never the running system)
- Verify written data (read back and hash compare)
- Single atomic operation to switch boot target
3. Automatic rollback:
- Hardware watchdog with 120-second timeout
- Boot counter in persistent storage
- Rollback after 3 consecutive failed boots
- Application health check within 60 seconds of boot
4. Monitoring:
- Track update status per device (downloading, applying, rebooting, confirmed)
- Alert on devices stuck in update state > 30 minutes
- Dashboard showing fleet-wide update progress
5. Retry logic:
- Resume interrupted downloads (HTTP range requests)
- Retry failed updates up to 3 times with exponential backoff
- Fall back to full image if delta update fails
Typical results with these practices:
- 99.7% first-attempt success rate
- 99.95% success rate with retries
- 0.05% devices requiring manual intervention
(usually hardware failure, not software)
14. System Bring-Up for New Hardware¶
14.1 Bring-Up Methodology¶
System bring-up for a new carrier board design follows a structured, phased approach. Each phase has defined entry and exit criteria.
Bring-Up Phases:
Phase 1: Power and Clock Verification (Week 1)
------------------------------------------------
Entry: PCB assembled, visual inspection passed
Tasks: Verify power rails, clock frequencies, reset sequencing
Tools: Oscilloscope, multimeter, power supply with current limiting
Exit: All power rails within spec, clocks stable
Phase 2: JTAG and Serial Console (Week 1-2)
------------------------------------------------
Entry: Power verified
Tasks: Establish JTAG connection, verify serial console output
Tools: JTAG debugger (Lauterbach/Segger), USB-UART adapter
Exit: BootROM messages visible on serial console
Phase 3: Bootloader Bring-Up (Week 2-3)
------------------------------------------------
Entry: Serial console working
Tasks: Flash MB1/MB2/UEFI, debug boot failures, device tree adaptation
Tools: tegraflash, serial console, JTAG (if boot hangs)
Exit: UEFI boots to shell or extlinux prompt
Phase 4: Kernel Boot (Week 3-4)
------------------------------------------------
Entry: UEFI functional
Tasks: Boot Linux kernel, debug device tree issues, enable serial console
Tools: Serial console, kernel command line debugging
Exit: Kernel boots to login prompt with serial console
Phase 5: Peripheral Enablement (Week 4-6)
------------------------------------------------
Entry: Kernel boots
Tasks: Enable each peripheral (USB, PCIe, SPI, I2C, CAN, GPIO, camera)
Tools: Device tree editing, driver debugging, logic analyzer
Exit: All peripherals functional and passing tests
Phase 6: Stress Testing and Validation (Week 6-7)
------------------------------------------------
Entry: All peripherals working
Tasks: Thermal stress, power cycling, long-duration stability tests
Tools: Thermal chamber, automated test scripts, power cycling equipment
Exit: 72-hour continuous operation without errors
14.2 JTAG and Serial Console Debugging¶
# Serial console setup for Orin Nano:
# The default debug UART is ttyTCU0 (Tegra Combined UART)
# Accessible via the micro-USB connector on the dev kit
# Baud rate: 115200
# Connect with minicom:
sudo minicom -D /dev/ttyACM0 -b 115200
# Connect with screen:
sudo screen /dev/ttyACM0 115200
# Connect with picocom (recommended for scripting):
picocom -b 115200 /dev/ttyACM0 --logfile boot-log-$(date +%Y%m%d-%H%M%S).txt
# JTAG debugging setup (Segger J-Link):
# 1. Connect J-Link to JTAG header on carrier board
# 2. Start J-Link GDB Server:
JLinkGDBServer -device Cortex-A78AE -if JTAG -speed 4000
# 3. Connect GDB:
aarch64-linux-gnu-gdb vmlinux
(gdb) target remote localhost:2331
(gdb) monitor halt
(gdb) bt
14.3 Device Tree Debugging During Bring-Up¶
# On the target device, examine the live device tree:
dtc -I fs /sys/firmware/devicetree/base -O dts > live-dt.dts
# Check if a specific node is present:
ls /sys/firmware/devicetree/base/spi@3210000/
# Check device status:
cat /sys/firmware/devicetree/base/spi@3210000/status
# Expected: "okay" if enabled
# Find all disabled devices:
for d in /sys/firmware/devicetree/base/*/status; do
status=$(cat "$d" 2>/dev/null)
if [ "$status" = "disabled" ]; then
echo "DISABLED: $(dirname $d | sed 's|/sys/firmware/devicetree/base/||')"
fi
done
# Verify pinmux configuration:
cat /sys/kernel/debug/tegra_pinctrl_reg
# Check GPIO state:
cat /sys/kernel/debug/gpio
# Verify clock tree:
cat /sys/kernel/debug/clk/clk_summary | head -50
14.4 Kernel Boot Debugging¶
# Add kernel debug parameters for bring-up:
# In extlinux.conf APPEND line:
APPEND root=/dev/nvme0n1p1 rw rootwait \
console=ttyTCU0,115200 \
earlyprintk=ttyTCU0,115200 \
loglevel=8 \
initcall_debug \
log_buf_len=4M \
boot_delay=3
# If the kernel hangs during boot, identify the last successful initcall:
# The serial console output will show:
# calling some_driver_init+0x0/0x1c @ 1
# initcall some_driver_init+0x0/0x1c returned 0 after 5 usecs
# (next line never appears = driver causing hang)
# Kernel panic debugging:
# Add "panic=10" to reboot after 10 seconds on panic
# Add "crashkernel=256M" for kdump support
# Add "oops=panic" to convert oops to panic for capture
# For early boot debugging when serial is not yet available:
# Use earlycon:
APPEND earlycon=tegra_comb_uart,mmio32,0x0c168000 ...
14.5 Reducing Platform Stabilization Timeline¶
Practices that reduced bring-up from 12 weeks to 7 weeks:
1. Pre-silicon preparation (saves 1-2 weeks):
- Device tree drafted from schematic before PCB arrives
- Pinmux spreadsheet reviewed with hardware team
- Known-working kernel config prepared in advance
- Flash scripts and partition layouts pre-tested on dev kit
2. Parallel workstreams (saves 1-2 weeks):
- Software engineer works on kernel/DT while HW verifies power
- CI pipeline set up during Week 1 (before first boot)
- Test automation scripts written against dev kit
3. Structured debugging (saves 1 week):
- Checklist-driven peripheral bring-up (no ad-hoc debugging)
- Each peripheral test has pass/fail criteria defined upfront
- Hardware team available for real-time schematic queries
- Known-issue database from previous board revisions
4. Automation (saves 1 week):
- Automated flash-and-boot test (flash, boot, run test suite)
- Automated power cycling test (1000 cycles overnight)
- Automated peripheral test suite (GPIO, SPI, I2C, CAN, USB)
- Results posted to shared dashboard
14.6 Working with Hardware Teams Across Sites¶
Communication Protocol for Multi-Site Bring-Up:
Daily standup (15 min, video call):
- Hardware team: board status, ECO notices, test results
- Software team: boot status, driver issues, DT changes needed
- Shared blockers list updated in real-time
Shared artifacts:
- Schematic PDF (version controlled)
- Pinmux spreadsheet (locked cells for approved assignments)
- Serial console logs (uploaded to shared drive after each session)
- Board photo documentation (component placement, rework)
- Test result database (pass/fail per peripheral per board serial)
Escalation path:
- Level 1: Engineer-to-engineer (Slack/Teams, same day)
- Level 2: Technical lead review (next business day)
- Level 3: Cross-site engineering review (weekly meeting)
Board tracking:
- Each prototype board has a serial number and tracking spreadsheet
- Board location, status (functional/debug/rework), and owner tracked
- Shipping between sites uses tracked courier with anti-static packaging
15. Boot Performance Optimization¶
15.1 Boot Time Analysis¶
Before optimizing, measure the baseline. The Orin Nano stock L4T boots in 45-90 seconds. A well-optimized Yocto image can boot in under 10 seconds to application ready.
# Method 1: systemd-analyze (after boot)
systemd-analyze
# Startup finished in 1.234s (kernel) + 3.456s (userspace) = 4.690s
systemd-analyze blame
# Shows which services took the longest:
# 2.345s NetworkManager.service
# 1.234s systemd-udevd.service
# 0.987s myapp.service
# ...
systemd-analyze critical-chain
# Shows the critical path (longest sequential chain):
# multi-user.target @4.690s
# myapp.service @3.456s +987ms
# network-online.target @3.400s
# NetworkManager-wait-online.service @1.200s +2.200s
# NetworkManager.service @0.800s +400ms
# Method 2: Kernel boot timing
# Add "printk.time=1" to kernel command line
# Analyze with:
dmesg | grep -E "^\[.*\]" | head -50
# Method 3: bootchart (systemd built-in)
# Add "init=/lib/systemd/systemd-bootchart" to kernel command line
# After boot, find the SVG at: /run/log/bootchart-*.svg
# Method 4: GPIO toggle measurement
# Toggle a GPIO at key boot milestones and measure with oscilloscope
# This gives wall-clock time independent of software timestamps
15.2 Kernel Boot Optimization¶
# Kernel config fragments for fast boot:
# files/fast-boot.cfg
# Disable initramfs (boot directly to rootfs)
# CONFIG_BLK_DEV_INITRD is not set
# Reduce kernel log verbosity
CONFIG_PRINTK_TIME=y
CONFIG_CONSOLE_LOGLEVEL_DEFAULT=4
CONFIG_MESSAGE_LOGLEVEL_DEFAULT=4
# Disable unused subsystems (each saves 10-100ms)
# CONFIG_DEBUG_KERNEL is not set
# CONFIG_FTRACE is not set
# CONFIG_KPROBES is not set
# CONFIG_PROFILING is not set
# CONFIG_PERF_EVENTS is not set
# CONFIG_DEBUG_PREEMPT is not set
# Optimize kernel compression
CONFIG_KERNEL_LZ4=y
# LZ4 decompresses ~3x faster than gzip with minimal size increase
# Defer non-critical driver probing
CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
# Disable module signature verification (if not using signed modules)
# CONFIG_MODULE_SIG is not set
15.3 initramfs vs Direct rootfs Boot¶
Comparison for Orin Nano:
initramfs boot:
- Kernel loads initramfs from QSPI/eMMC/NVMe
- initramfs runs early userspace (udev, mount rootfs, switch_root)
- Adds 1-3 seconds to boot time
- Required for: encrypted rootfs, complex storage setups, network boot
- meta-tegra default: uses initrd for flexibility
Direct rootfs boot:
- Kernel mounts rootfs directly via root= parameter
- No intermediate userspace step
- Saves 1-3 seconds
- Requires: rootfs on a device the kernel can probe directly
For NVMe boot on Orin Nano:
- Direct boot is possible if NVMe driver is built into kernel (not module)
- Kernel command line: root=/dev/nvme0n1p1 rootwait
Recommendation for production:
- Use direct boot for fastest boot times
- Ensure NVMe, ext4, and dm-verity are built-in (not modules)
# Kernel config for direct NVMe boot without initramfs:
# CONFIG_BLK_DEV_INITRD is not set
CONFIG_NVME_CORE=y
CONFIG_BLK_DEV_NVME=y
CONFIG_EXT4_FS=y
CONFIG_DM_VERITY=y
15.4 systemd Service Optimization¶
# Identify and disable unnecessary services:
# List all enabled services:
systemctl list-unit-files --state=enabled
# Disable services not needed for production:
systemctl disable \
apt-daily.timer \
apt-daily-upgrade.timer \
man-db.timer \
fstrim.timer \
motd-news.timer \
systemd-resolved.service \
ModemManager.service \
avahi-daemon.service \
cups.service
# In Yocto, prevent services from being installed:
# meta-myproject/recipes-core/systemd/systemd_%.bbappend
PACKAGECONFIG:remove = " \
resolved \
timesyncd \
coredump \
hibernate \
"
# Create a minimal systemd target for fast boot:
# /etc/systemd/system/myproject.target
# [Unit]
# Description=MyProject Application Target
# Requires=basic.target
# After=basic.target
# AllowIsolate=yes
#
# Set as default target:
# systemctl set-default myproject.target
15.5 Removing Unnecessary Services¶
For a production Jetson image, the following services are typically removed:
# In the image recipe or distro configuration:
# Remove unnecessary packages (each saves boot time and image size)
IMAGE_INSTALL:remove = " \
avahi-daemon \
avahi-autoipd \
cups \
cups-filters \
modemmanager \
packagekit \
snapd \
unattended-upgrades \
apport \
whoopsie \
kerneloops \
popularity-contest \
ubuntu-advantage-tools \
"
# Remove unused kernel modules (each module probe adds latency):
# In kernel config:
# Disable ~600 unused drivers by selectively enabling only needed ones
# Use 'lsmod' on a running system to identify what is actually loaded
# Convert those to built-in (=y) and disable everything else (=n)
Service audit methodology:
#!/bin/bash
# scripts/audit-boot-services.sh
# Run on target device to identify optimization opportunities
echo "=== Boot Time Analysis ==="
systemd-analyze
echo ""
echo "=== Top 20 Slowest Services ==="
systemd-analyze blame | head -20
echo ""
echo "=== Critical Chain ==="
systemd-analyze critical-chain
echo ""
echo "=== All Enabled Services ==="
systemctl list-unit-files --state=enabled --no-pager
echo ""
echo "=== Running Services ==="
systemctl list-units --type=service --state=running --no-pager
echo ""
echo "=== Loaded Kernel Modules ==="
lsmod | wc -l
echo "modules loaded"
lsmod
echo ""
echo "=== Kernel Boot Time ==="
dmesg | tail -1 | grep -oP '^\[\s*\K[0-9.]+'
echo "seconds (kernel messages end)"
15.6 Parallel Service Initialization¶
# systemd inherently parallelizes services. Optimize by:
# 1. Remove unnecessary ordering dependencies
# If myapp.service does not actually need network:
# [Unit]
# Description=My Application
# # Remove: After=network-online.target
# # Remove: Wants=network-online.target
# After=basic.target
# 2. Use socket activation for services that do not need immediate start
# [Unit]
# Description=My API Server
# [Socket]
# ListenStream=8080
# [Install]
# WantedBy=sockets.target
# 3. Use Type=notify for accurate readiness signaling
# [Service]
# Type=notify
# ExecStart=/usr/bin/myapp
# # The service calls sd_notify(READY=1) when fully initialized
# 4. Reduce ExecStartPre overhead
# Avoid expensive pre-checks. Move validation to the main process.
15.7 Achieving Sub-10-Second Boot¶
Boot Time Budget for Sub-10-Second Boot on Orin Nano:
Component Target Typical Stock
--------------------------------------------------------
BootROM + MB1 + MB2 1.5s 2.0s
UEFI (with optimized timeout=0) 1.0s 3.0s
Kernel decompression + init 1.5s 3.5s
systemd to basic.target 1.5s 8.0s
Application service start 2.0s 5.0s+
--------------------------------------------------------
Total to application ready 7.5s 21.5s+
Key optimizations applied:
- UEFI boot timeout set to 0 (saves 3s)
- LZ4 kernel compression (saves 0.5s vs gzip)
- No initramfs (saves 1-2s)
- NVMe/ext4/dm-verity built into kernel (saves 0.5s)
- Only 12 systemd services enabled (vs 60+ default)
- Application uses Type=notify with early startup
- No DNS resolution at boot (static network config)
- Kernel loglevel=4 (reduces console output overhead)
- Deferred non-critical hardware init (camera, GPU after app starts)
16. Licensing Compliance¶
16.1 Why Licensing Matters at Scale¶
At 25,000 deployed devices, licensing compliance is not optional. Violations can result in injunctions, product recalls, or costly settlements. The two primary concerns are:
- Open source license obligations (GPL, LGPL, MPL, Apache, MIT, BSD)
- Source code availability for GPL/LGPL components
- Attribution requirements (NOTICE files, copyright statements)
-
License compatibility in combined works
-
Commercial/proprietary license compliance (NVIDIA, CUDA, TensorRT)
- Usage rights tied to NVIDIA Jetson modules
- Distribution restrictions on binaries
- Export control classifications (EAR/ITAR for some components)
16.2 SPDX License Tracking in Yocto¶
Yocto generates SPDX (Software Package Data Exchange) documents automatically, providing a machine-readable inventory of all software licenses in your image.
# Enable SPDX generation in local.conf or distro.conf:
INHERIT += "create-spdx"
# After building:
bitbake myproject-image
# SPDX documents are generated at:
# tmp/deploy/spdx/jetson-orin-nano-devkit/myproject-image/
# The SPDX output includes:
# - Package name and version
# - License expression (SPDX format)
# - Source location (URL, commit hash)
# - File checksums
# - Relationship to other packages (dependency graph)
16.3 License Manifest Generation¶
# Yocto generates license manifests automatically:
# tmp/deploy/licenses/myproject-image-jetson-orin-nano-devkit/
# license.manifest # One line per package with license
# package.manifest # Package names and versions
# image_license.manifest # Combined manifest
# Example license manifest entry:
# PACKAGE NAME: curl
# PACKAGE VERSION: 8.5.0
# RECIPE NAME: curl
# LICENSE: curl
# LIC_FILES_CHKSUM: file://COPYING;md5=...
# Generate a summary report:
cat tmp/deploy/licenses/myproject-image-*/license.manifest | \
awk '{print $NF}' | sort | uniq -c | sort -rn
# 145 MIT
# 89 GPL-2.0-only
# 67 LGPL-2.1-or-later
# 34 Apache-2.0
# 23 BSD-3-Clause
# 12 commercial_nvidia
# 8 Proprietary
16.4 Commercial License Handling¶
# NVIDIA components require explicit license acceptance:
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
# This flag covers:
# - CUDA Toolkit and runtime
# - TensorRT
# - cuDNN
# - NVIDIA display drivers
# - NVIDIA multimedia codecs
# - Various L4T binary packages
# For your own proprietary packages:
# In the recipe:
LICENSE = "Proprietary"
LICENSE_FLAGS = "commercial_mycompany"
LIC_FILES_CHKSUM = "file://LICENSE;md5=abc123..."
# In local.conf:
LICENSE_FLAGS_ACCEPTED += "commercial_mycompany"
# For GPL compliance with NVIDIA binaries:
# NVIDIA CUDA and GPU drivers are distributed under NVIDIA's proprietary
# license. They do NOT link against GPL code. The kernel interface uses
# a GPL-compatible shim (nvidia.ko has a dual MIT/GPL license for the
# kernel interface portion). Document this in your compliance records.
16.5 Export Control Considerations¶
Export Control Checklist:
1. NVIDIA Jetson modules:
- Classified as EAR99 (no license required for most destinations)
- Exception: certain countries under US sanctions (check current list)
- NVIDIA's EULA restricts redistribution of certain components
2. Cryptographic software:
- OpenSSL, GnuTLS, WireGuard: EAR Category 5, Part 2
- Mass market encryption exemption (License Exception ENC)
- File CCATS or self-classify and submit annual reports
3. Custom AI/ML models:
- Generally not export controlled unless tied to military applications
- Check if models were trained on controlled datasets
4. Documentation required:
- BIS self-classification records
- Encryption registration (if applicable)
- End-user statements for restricted destinations
- NVIDIA license agreement acknowledgment
16.6 License Audit Automation in CI¶
# .gitlab-ci.yml (license audit stage)
license-audit:
stage: test
script:
# Check for any packages with unaccepted licenses
- |
UNACCEPTED=$(grep -r "LICENSE_FLAGS" \
tmp/deploy/licenses/myproject-image-*/license.manifest | \
grep -v "commercial_nvidia" | \
grep -v "commercial_mycompany" || true)
if [ -n "${UNACCEPTED}" ]; then
echo "ERROR: Unaccepted license flags found:"
echo "${UNACCEPTED}"
exit 1
fi
# Check for GPL-3.0 packages (may be prohibited in some products)
- |
GPL3=$(grep "GPL-3.0" \
tmp/deploy/licenses/myproject-image-*/license.manifest || true)
if [ -n "${GPL3}" ]; then
echo "WARNING: GPL-3.0 packages found (review required):"
echo "${GPL3}"
# exit 1 # Uncomment to enforce
fi
# Verify source code availability for GPL packages
- python3 scripts/verify-gpl-sources.py \
tmp/deploy/licenses/myproject-image-*/license.manifest \
tmp/deploy/sources/
# Generate compliance report
- python3 scripts/generate-compliance-report.py \
--manifest tmp/deploy/licenses/myproject-image-*/license.manifest \
--spdx tmp/deploy/spdx/ \
--output compliance-report-$(date +%Y%m%d).html
artifacts:
paths:
- compliance-report-*.html
expire_in: 1 year
#!/usr/bin/env python3
# scripts/verify-gpl-sources.py
"""Verify that source code is available for all GPL-licensed packages."""
import sys
import os
def verify_gpl_sources(manifest_path, sources_dir):
missing = []
with open(manifest_path) as f:
current_package = None
current_license = None
for line in f:
line = line.strip()
if line.startswith("PACKAGE NAME:"):
current_package = line.split(":", 1)[1].strip()
elif line.startswith("LICENSE:"):
current_license = line.split(":", 1)[1].strip()
if "GPL" in current_license:
# Check source is available
source_found = False
for ext in [".tar.gz", ".tar.bz2", ".tar.xz", ".zip"]:
source_path = os.path.join(sources_dir,
current_package + ext)
if os.path.exists(source_path):
source_found = True
break
if not source_found:
missing.append((current_package, current_license))
if missing:
print("ERROR: Missing GPL source archives:")
for pkg, lic in missing:
print(f" {pkg} ({lic})")
return 1
else:
print(f"All GPL sources verified ({len(missing)} issues)")
return 0
if __name__ == "__main__":
sys.exit(verify_gpl_sources(sys.argv[1], sys.argv[2]))
16.7 Maintaining Compliance Across 30+ Build Configurations¶
# Strategy: Centralize license policy in distro configuration
# conf/distro/myproject-distro.conf:
LICENSE_FLAGS_ACCEPTED = " \
commercial_nvidia \
commercial_mycompany \
"
# Blocklist: these licenses are NEVER acceptable
# meta-myproject-distro/classes/license-policy.bbclass
python do_license_policy_check() {
license = d.getVar('LICENSE')
pn = d.getVar('PN')
blocked_licenses = ['AGPL-3.0-only', 'AGPL-3.0-or-later', 'SSPL-1.0']
for bl in blocked_licenses:
if bl in license:
bb.fatal(f"{pn}: License '{license}' contains blocked license '{bl}'")
}
addtask license_policy_check after do_populate_lic before do_build
# In distro.conf:
INHERIT += "license-policy"
# This ensures the same license policy applies regardless of which
# MACHINE or image recipe is being built.
17. Quality and Release Engineering¶
17.1 Release Versioning Strategy¶
Versioning Scheme:
MAJOR.MINOR.PATCH[-rc.N][-MACHINE]
MAJOR: Incompatible changes (new Yocto release, new L4T version)
MINOR: Feature additions (new packages, new board support)
PATCH: Bug fixes and security patches
-rc.N: Release candidate (rc.1, rc.2, ...)
Examples:
3.0.0-rc.1 First release candidate of major version 3
3.0.0 Production release
3.0.1 Security patch
3.1.0 Feature release (new camera driver)
4.0.0 Major release (migrated from Kirkstone to Scarthgap)
Build identifiers (appended to image filename, not version):
20260301-abc1234 Date + git short hash
CI-1234 CI build number
Machine-specific images:
myproject-image-3.0.0-jetson-orin-nano-devkit.tegraflash.tar.gz
myproject-image-3.0.0-mycompany-edge-v2.tegraflash.tar.gz
# Automate version embedding in images:
# classes/myproject-versioning.bbclass
MYPROJECT_VERSION ?= "0.0.0-dev"
# Set via CI environment or local.conf:
# MYPROJECT_VERSION = "3.0.0"
inherit image-buildinfo
IMAGE_BUILDINFO_VARS:append = " MYPROJECT_VERSION"
ROOTFS_POSTPROCESS_COMMAND += "inject_version_info;"
inject_version_info() {
echo "${MYPROJECT_VERSION}" > ${IMAGE_ROOTFS}/etc/myproject-version
echo "build_date=$(date -u +%Y-%m-%dT%H:%M:%SZ)" >> \
${IMAGE_ROOTFS}/etc/myproject-version
echo "build_host=$(hostname)" >> ${IMAGE_ROOTFS}/etc/myproject-version
echo "machine=${MACHINE}" >> ${IMAGE_ROOTFS}/etc/myproject-version
echo "distro=${DISTRO_VERSION}" >> ${IMAGE_ROOTFS}/etc/myproject-version
}
17.2 Build Reproducibility Verification¶
#!/bin/bash
# scripts/verify-reproducibility.sh
# Verifies that two independent builds produce identical output
set -euo pipefail
BUILD_DIR_1="/tmp/repro-build-1"
BUILD_DIR_2="/tmp/repro-build-2"
IMAGE="myproject-image"
MACHINE="jetson-orin-nano-devkit"
echo "Build 1..."
TMPDIR="${BUILD_DIR_1}" kas build kas/jetson-orin-nano.yml
cp "${BUILD_DIR_1}/deploy/images/${MACHINE}/${IMAGE}-${MACHINE}.ext4" \
/tmp/build1.ext4
echo "Clean and rebuild..."
rm -rf "${BUILD_DIR_1}/work"
TMPDIR="${BUILD_DIR_2}" kas build kas/jetson-orin-nano.yml
cp "${BUILD_DIR_2}/deploy/images/${MACHINE}/${IMAGE}-${MACHINE}.ext4" \
/tmp/build2.ext4
echo "Comparing..."
if sha256sum /tmp/build1.ext4 /tmp/build2.ext4 | awk '{print $1}' | \
sort -u | wc -l | grep -q "^1$"; then
echo "PASS: Builds are bit-for-bit identical"
else
echo "FAIL: Builds differ"
echo "Running diffoscope for detailed analysis..."
diffoscope /tmp/build1.ext4 /tmp/build2.ext4 \
--html /tmp/repro-diff-report.html
echo "Report: /tmp/repro-diff-report.html"
exit 1
fi
17.3 Regression Test Suites¶
#!/bin/bash
# scripts/regression-test.sh
# Run on target device after flash
set -euo pipefail
RESULTS_FILE="/tmp/regression-results-$(date +%Y%m%d-%H%M%S).txt"
PASS=0
FAIL=0
run_test() {
local name="$1"
local command="$2"
echo -n "TEST: ${name}... "
if eval "${command}" > /dev/null 2>&1; then
echo "PASS" | tee -a "${RESULTS_FILE}"
PASS=$((PASS + 1))
else
echo "FAIL" | tee -a "${RESULTS_FILE}"
FAIL=$((FAIL + 1))
fi
}
echo "=== Regression Test Suite ===" | tee "${RESULTS_FILE}"
echo "Date: $(date)" | tee -a "${RESULTS_FILE}"
echo "Version: $(cat /etc/myproject-version)" | tee -a "${RESULTS_FILE}"
echo "" | tee -a "${RESULTS_FILE}"
# System tests
run_test "Kernel version" "uname -r | grep '5.15'"
run_test "systemd running" "systemctl is-system-running | grep -E 'running|degraded'"
run_test "Root filesystem" "mount | grep 'on / ' | grep ext4"
run_test "NVMe detected" "lsblk | grep nvme"
run_test "Free memory > 2GB" "[ $(free -m | awk '/Mem/{print $7}') -gt 2048 ]"
run_test "Free disk > 500MB" "[ $(df -m / | awk 'NR==2{print $4}') -gt 500 ]"
# Network tests
run_test "Network interface up" "ip link show eth0 | grep 'state UP'"
run_test "DNS resolution" "nslookup google.com"
run_test "NTP synchronized" "chronyc tracking | grep 'Leap status.*Normal'"
# GPU tests
run_test "NVIDIA driver loaded" "lsmod | grep nvidia"
run_test "CUDA available" "ls /usr/local/cuda/lib64/libcudart.so*"
run_test "GPU detected" "cat /sys/class/drm/card0/device/vendor | grep '0x10de'"
# Peripheral tests
run_test "USB host" "lsusb | wc -l | grep -v '^0$'"
run_test "I2C buses" "ls /dev/i2c-* | wc -l | grep -v '^0$'"
run_test "SPI buses" "ls /dev/spidev* 2>/dev/null | wc -l | grep -v '^0$'"
# Application tests
run_test "myapp binary exists" "test -x /usr/bin/myapp"
run_test "myapp service active" "systemctl is-active myapp.service"
run_test "myapp health check" "curl -sf http://localhost:8080/health"
# Security tests
run_test "No root password" "grep '^root:!' /etc/shadow"
run_test "SSH password auth disabled" "sshd -T | grep 'passwordauthentication no'"
run_test "dm-verity active" "dmsetup status | grep verity"
echo ""
echo "=== Results ===" | tee -a "${RESULTS_FILE}"
echo "PASS: ${PASS}" | tee -a "${RESULTS_FILE}"
echo "FAIL: ${FAIL}" | tee -a "${RESULTS_FILE}"
echo "TOTAL: $((PASS + FAIL))" | tee -a "${RESULTS_FILE}"
exit ${FAIL}
17.4 Hardware-in-the-Loop (HIL) Testing¶
HIL Test Infrastructure:
+-------------------+ USB/Serial +-------------------+
| CI Build Server |-------------------->| HIL Controller |
| (artifact upload) | | (Raspberry Pi 4) |
+-------------------+ +-------------------+
|
USB | GPIO UART
|
+-------------------+
| Jetson Orin Nano |
| (Device Under |
| Test) |
+-------------------+
|
+-------------------+
| USB Power Switch |
| (YKUSH/uhubctl) |
+-------------------+
HIL Controller responsibilities:
1. Receive flash image from CI pipeline
2. Put DUT into recovery mode (GPIO-controlled)
3. Flash the DUT
4. Monitor serial console during boot
5. Run regression tests via SSH after boot
6. Power cycle DUT for stress tests
7. Report results back to CI pipeline
#!/bin/bash
# scripts/hil-flash-and-test.sh
# Runs on the HIL controller
set -euo pipefail
FLASH_ARCHIVE="$1"
DUT_IP="192.168.1.100"
SERIAL_DEV="/dev/ttyUSB0"
BOOT_LOG="/tmp/boot-log-$(date +%Y%m%d-%H%M%S).txt"
echo "Step 1: Put DUT into recovery mode"
# Toggle GPIO to hold Force Recovery while resetting
gpio-set RECOVERY_PIN LOW
gpio-set RESET_PIN LOW
sleep 1
gpio-set RESET_PIN HIGH
sleep 2
gpio-set RECOVERY_PIN HIGH
# Verify recovery mode
lsusb | grep -q "0955:7523" || { echo "FAIL: DUT not in recovery"; exit 1; }
echo "Step 2: Flash DUT"
mkdir -p /tmp/flash && cd /tmp/flash
tar xzf "${FLASH_ARCHIVE}"
cd tegraflash
sudo ./initrd-flash 2>&1 | tee flash-log.txt
FLASH_RESULT=$?
if [ ${FLASH_RESULT} -ne 0 ]; then
echo "FAIL: Flash failed"
exit 1
fi
echo "Step 3: Monitor boot (timeout 120s)"
timeout 120 picocom -b 115200 "${SERIAL_DEV}" \
--logfile "${BOOT_LOG}" \
--exit-after 120000 &
PICOCOM_PID=$!
# Wait for device to be reachable via SSH
for i in $(seq 1 60); do
if ssh -o ConnectTimeout=2 root@${DUT_IP} "true" 2>/dev/null; then
echo "DUT reachable via SSH after ${i} attempts"
break
fi
sleep 2
done
kill ${PICOCOM_PID} 2>/dev/null || true
echo "Step 4: Run regression tests"
scp scripts/regression-test.sh root@${DUT_IP}:/tmp/
ssh root@${DUT_IP} "bash /tmp/regression-test.sh"
TEST_RESULT=$?
echo "Step 5: Collect results"
scp root@${DUT_IP}:/tmp/regression-results-*.txt results/
exit ${TEST_RESULT}
17.5 Release Notes Generation¶
#!/bin/bash
# scripts/generate-release-notes.sh
set -euo pipefail
VERSION="$1"
PREV_VERSION="$2"
echo "# Release Notes: ${VERSION}"
echo ""
echo "**Date:** $(date +%Y-%m-%d)"
echo "**Previous Version:** ${PREV_VERSION}"
echo ""
echo "## Changes"
echo ""
git log ${PREV_VERSION}..${VERSION} --pretty=format:"- %s (%h)" \
--no-merges
echo ""
echo "## Package Changes"
echo ""
diff <(sort "releases/${PREV_VERSION}/package.manifest") \
<(sort "releases/${VERSION}/package.manifest") | \
grep "^[<>]" | sed 's/^< /- Removed: /; s/^> /+ Added: /'
echo ""
echo "## Image Sizes"
echo ""
for machine in jetson-orin-nano-devkit mycompany-edge-v2; do
size=$(stat -c%s \
"releases/${VERSION}/${machine}/myproject-image-${machine}.ext4" \
2>/dev/null || echo "N/A")
echo "- ${machine}: ${size} bytes"
done
echo ""
echo "## Known Issues"
echo ""
echo "See JIRA query: project=MYPROJ AND fixVersion=${VERSION} AND type=Bug AND status!=Closed"
17.6 Managing Releases for Distributed Teams¶
Release Management for 50+ Engineer Teams:
Roles:
- Release Manager: Owns the release branch, gatekeeps merges
- BSP Lead: Approves kernel/bootloader/DT changes
- Application Lead: Approves application-layer changes
- QA Lead: Signs off on test results
- Security Champion: Reviews CVE patches and compliance
Git Branching Strategy:
main --------o---------o---------o---------> (always releasable)
\ \ \
release/3.0 ---o--o--o---\---------\-------> (quarterly release branch)
| | | \ \
fix fix rc1 \ \
\ \
release/3.1 ----------------o--o--o---\----> (next quarter)
| | | \
fix fix rc1 \
\
develop -----o--o--o--o--o--o--o--o--o---o-> (integration branch)
\ \ \
feature/foo ---o----o \
\
feature/bar ---------------o
Merge Rules:
- feature/* -> develop: Requires 2 approvals, CI green
- develop -> release/*: Requires release manager approval
- release/* -> main: Requires QA sign-off + release manager
- hotfix/* -> release/*: Requires BSP lead + security champion
- Never force-push main or release branches
17.7 Branching Strategy for Yocto Layers¶
# Each Yocto layer follows the same branch naming as the project:
#
# meta-myproject (git)
# main Tracks latest stable
# develop Integration branch
# release/3.0 Release branch
# feature/add-can-driver Feature branch
#
# meta-myproject-bsp (git)
# main
# develop
# release/3.0
#
# meta-myproject-distro (git)
# main
# develop
# release/3.0
# The kas configuration pins layer branches:
# kas/release-3.0.yml
# repos:
# meta-myproject:
# branch: release/3.0
# meta-myproject-bsp:
# branch: release/3.0
# meta-myproject-distro:
# branch: release/3.0
# meta-tegra:
# commit: abc123def456 # Pinned to exact commit for reproducibility
# poky:
# commit: def456abc123 # Pinned to exact commit
# For development builds:
# kas/develop.yml
# repos:
# meta-myproject:
# branch: develop
# meta-tegra:
# branch: scarthgap-l4t-r36.x # Track upstream HEAD
18. Production Deployment at Scale¶
18.1 Manufacturing Provisioning Workflow¶
Manufacturing Line Workflow:
Station 1: Assembly
- Jetson module mounted on carrier board
- Mechanical assembly (enclosure, connectors, antennas)
- Visual inspection
Station 2: Flash
- Connect USB cable to host PC
- Put device in recovery mode (automated jig with pogo pins)
- Flash production image (tegraflash, ~5-10 minutes)
- Flash includes: bootloader, kernel, rootfs, factory test image
Station 3: Provisioning
- Device boots into factory test mode
- Unique device identity injected:
- Serial number
- Device certificate (X.509)
- WiFi MAC address (if custom)
- Product configuration
- Keys written to secure storage (RPMB or Trusty TA)
Station 4: Factory Test
- Automated test suite runs:
- GPU test (CUDA compute test)
- Memory test (stress test)
- Storage test (sequential and random I/O)
- Network test (Ethernet, WiFi if applicable)
- Peripheral test (I2C, SPI, GPIO, CAN)
- Camera test (image capture and analysis)
- Power consumption measurement
- Results logged to MES (Manufacturing Execution System)
Station 5: Final Configuration
- Switch from factory test image to production image
- Set production boot flags
- Burn secure boot fuses (if not already done)
- Final functional verification
Station 6: Packaging
- Label with serial number and QR code
- Pack and ship
Cycle time target: 15-20 minutes per device
Daily throughput: 50-100 devices per line
18.2 Per-Device Identity and Key Injection¶
#!/bin/bash
# scripts/provision-device.sh
# Runs at manufacturing Station 3
set -euo pipefail
DEVICE_IP="$1"
SERIAL_NUMBER="$2"
PKI_SERVER="https://pki.mycompany.com"
echo "Provisioning device: ${SERIAL_NUMBER}"
# Step 1: Generate device-specific key pair on the device
ssh root@${DEVICE_IP} << 'REMOTE_SCRIPT'
# Generate private key in TPM/secure storage
openssl ecparam -genkey -name prime256v1 \
-out /data/device-key.pem
chmod 600 /data/device-key.pem
# Generate CSR (Certificate Signing Request)
openssl req -new -key /data/device-key.pem \
-out /tmp/device.csr \
-subj "/O=MyCompany/OU=EdgeDevices/CN=${HOSTNAME}"
REMOTE_SCRIPT
# Step 2: Sign the CSR with the company CA
scp root@${DEVICE_IP}:/tmp/device.csr /tmp/
curl -X POST "${PKI_SERVER}/api/v1/sign" \
-F "csr=@/tmp/device.csr" \
-F "serial=${SERIAL_NUMBER}" \
-F "validity=3650" \
-o /tmp/device-cert.pem
# Step 3: Install the signed certificate on the device
scp /tmp/device-cert.pem root@${DEVICE_IP}:/data/device-cert.pem
# Step 4: Write device identity
ssh root@${DEVICE_IP} << REMOTE_SCRIPT
echo "${SERIAL_NUMBER}" > /data/serial-number
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ)" > /data/provisioning-date
# Configure the device to use its identity
cat > /data/device-config.json << JSONEOF
{
"serial_number": "${SERIAL_NUMBER}",
"ota_server": "https://ota.mycompany.com",
"telemetry_server": "https://telemetry.mycompany.com",
"certificate": "/data/device-cert.pem",
"private_key": "/data/device-key.pem"
}
JSONEOF
# Verify identity
openssl x509 -in /data/device-cert.pem -noout -subject
echo "Provisioning complete for ${SERIAL_NUMBER}"
REMOTE_SCRIPT
# Step 5: Register device in fleet management system
curl -X POST "https://fleet.mycompany.com/api/v1/devices" \
-H "Content-Type: application/json" \
-d "{
\"serial_number\": \"${SERIAL_NUMBER}\",
\"firmware_version\": \"$(ssh root@${DEVICE_IP} cat /etc/myproject-version | head -1)\",
\"mac_address\": \"$(ssh root@${DEVICE_IP} cat /sys/class/net/eth0/address)\",
\"provisioning_date\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"
}"
echo "Device ${SERIAL_NUMBER} provisioned and registered"
18.3 Fleet Image Management¶
Fleet Image Management for 15,000-25,000 Devices:
Image Server Architecture:
+--------------------+
| Build Pipeline |
| (produces images) |
+--------------------+
|
v
+--------------------+
| Artifact Store | Images, manifests, signatures
| (S3/MinIO) | Versioned and immutable
+--------------------+
|
v
+--------------------+
| OTA Distribution | CDN for global distribution
| (CloudFront/Akamai)| Edge caching for bandwidth
+--------------------+
|
v
+--------------------+
| Fleet Manager | Device groups, rollout policies,
| (hawkBit/custom) | status tracking, rollback triggers
+--------------------+
|
+----+----+
| | |
v v v
Device Device Device (15,000-25,000 units)
Image Naming Convention:
myproject-image-{version}-{machine}-{build_id}.tegraflash.tar.gz
Version Tracking:
- Each device reports its current firmware version to fleet manager
- Fleet manager maintains desired version per device group
- Devices poll for updates every 4 hours (configurable)
- Critical security updates trigger push notification to devices
18.4 Device Groups and Update Policies¶
# Fleet management API examples:
# Create device groups
curl -X POST "https://fleet.mycompany.com/api/v1/groups" \
-H "Content-Type: application/json" \
-d '{
"name": "canary",
"description": "Early update testing group (50 devices)",
"selection": "random",
"size": 50
}'
curl -X POST "https://fleet.mycompany.com/api/v1/groups" \
-H "Content-Type: application/json" \
-d '{
"name": "site-seattle",
"description": "All devices at Seattle site",
"selection": "tag:site=seattle"
}'
# Assign update policy
curl -X POST "https://fleet.mycompany.com/api/v1/policies" \
-H "Content-Type: application/json" \
-d '{
"name": "standard-rollout",
"stages": [
{"group": "canary", "percentage": 100, "wait_hours": 24},
{"group": "all", "percentage": 1, "wait_hours": 48},
{"group": "all", "percentage": 10, "wait_hours": 72},
{"group": "all", "percentage": 50, "wait_hours": 48},
{"group": "all", "percentage": 100, "wait_hours": 0}
],
"abort_criteria": {
"failure_rate_threshold": 0.02,
"health_check_failures": 5,
"rollback_on_abort": true
}
}'
18.5 Field Debugging¶
# Remote access to deployed devices:
# Option 1: Reverse SSH tunnel (device initiates connection)
# On the device (systemd service):
# [Service]
# ExecStart=/usr/bin/ssh -N -R 0:localhost:22 \
# tunnel@bastion.mycompany.com -o ServerAliveInterval=30
# Restart=always
# RestartSec=30
# On the bastion server:
ssh -p <dynamic_port> root@localhost
# Option 2: WireGuard VPN (each device has unique keys)
# /etc/wireguard/wg0.conf on device:
# [Interface]
# PrivateKey = <device_private_key>
# Address = 10.100.X.Y/32
#
# [Peer]
# PublicKey = <server_public_key>
# AllowedIPs = 10.100.0.0/16
# Endpoint = vpn.mycompany.com:51820
# PersistentKeepalive = 25
# Then access via VPN:
ssh root@10.100.X.Y
Remote log collection:
#!/bin/bash
# /usr/bin/log-collector.sh (runs on device as systemd timer)
set -euo pipefail
LOG_SERVER="https://logs.mycompany.com"
DEVICE_ID=$(cat /data/serial-number)
AUTH_CERT="/data/device-cert.pem"
AUTH_KEY="/data/device-key.pem"
# Collect system logs
journalctl --since "4 hours ago" --no-pager | gzip > /tmp/journal.gz
# Collect application logs
tar czf /tmp/app-logs.tar.gz /var/log/myapp/ 2>/dev/null || true
# Collect system stats
{
echo "--- System Info ---"
uname -a
echo "--- Memory ---"
free -m
echo "--- Disk ---"
df -h
echo "--- Temperature ---"
cat /sys/class/thermal/thermal_zone*/temp
echo "--- GPU ---"
tegrastats --interval 1000 --logfile /dev/stdout --verbose &
TEGRA_PID=$!
sleep 3
kill $TEGRA_PID 2>/dev/null
echo "--- Network ---"
ip addr
echo "--- Top Processes ---"
top -bn1 | head -20
} | gzip > /tmp/sysinfo.gz
# Upload to log server
curl -X POST "${LOG_SERVER}/api/v1/logs/${DEVICE_ID}" \
--cert "${AUTH_CERT}" \
--key "${AUTH_KEY}" \
-F "journal=@/tmp/journal.gz" \
-F "app_logs=@/tmp/app-logs.tar.gz" \
-F "sysinfo=@/tmp/sysinfo.gz"
# Cleanup
rm -f /tmp/journal.gz /tmp/app-logs.tar.gz /tmp/sysinfo.gz
18.6 Monitoring Deployed Device Health¶
# Health check daemon (runs on each device)
# /usr/bin/health-monitor.sh
#!/bin/bash
set -euo pipefail
TELEMETRY_SERVER="https://telemetry.mycompany.com"
DEVICE_ID=$(cat /data/serial-number)
INTERVAL=300 # Report every 5 minutes
while true; do
# Collect metrics
CPU_TEMP=$(cat /sys/class/thermal/thermal_zone0/temp)
GPU_TEMP=$(cat /sys/class/thermal/thermal_zone1/temp)
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}')
MEM_USED=$(free -m | awk '/Mem/{print $3}')
MEM_TOTAL=$(free -m | awk '/Mem/{print $2}')
DISK_USED=$(df -m / | awk 'NR==2{print $3}')
UPTIME=$(cat /proc/uptime | awk '{print $1}')
LOAD_AVG=$(cat /proc/loadavg | awk '{print $1}')
# Application-specific metrics
APP_STATUS=$(systemctl is-active myapp.service 2>/dev/null || echo "inactive")
INFERENCE_FPS=$(curl -sf http://localhost:8080/metrics/fps 2>/dev/null || echo "0")
# Send telemetry
curl -sf -X POST "${TELEMETRY_SERVER}/api/v1/metrics" \
--cert /data/device-cert.pem \
--key /data/device-key.pem \
-H "Content-Type: application/json" \
-d "{
\"device_id\": \"${DEVICE_ID}\",
\"timestamp\": \"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",
\"cpu_temp_mc\": ${CPU_TEMP},
\"gpu_temp_mc\": ${GPU_TEMP},
\"cpu_usage_pct\": ${CPU_USAGE},
\"mem_used_mb\": ${MEM_USED},
\"mem_total_mb\": ${MEM_TOTAL},
\"disk_used_mb\": ${DISK_USED},
\"uptime_sec\": ${UPTIME},
\"load_avg\": ${LOAD_AVG},
\"app_status\": \"${APP_STATUS}\",
\"inference_fps\": ${INFERENCE_FPS},
\"firmware_version\": \"$(head -1 /etc/myproject-version)\"
}" || true
sleep ${INTERVAL}
done
18.7 Containerized Application Deployment on Yocto Base¶
For applications that benefit from container isolation while running on a minimal Yocto base OS:
# Add container runtime to the Yocto image:
IMAGE_INSTALL:append = " \
docker-ce \
docker-ce-cli \
containerd-opencontainers \
docker-compose \
nvidia-container-toolkit \
"
# Or for a lighter-weight approach, use podman:
IMAGE_INSTALL:append = " \
podman \
crun \
slirp4netns \
nvidia-container-toolkit \
"
Architecture: Yocto base OS + containerized application:
+-----------------------------------------------------+
| Containerized Application |
| +-------------------+ +-------------------+ |
| | Inference Engine | | Data Pipeline | |
| | (CUDA, TensorRT) | | (Python, MQTT) | |
| | Container | | Container | |
| +-------------------+ +-------------------+ |
| +-------------------+ +-------------------+ |
| | Monitoring Agent | | OTA Updater | |
| | Container | | Container | |
| +-------------------+ +-------------------+ |
+-----------------------------------------------------+
| NVIDIA Container Runtime (nvidia-ctk) |
+-----------------------------------------------------+
| Container Engine (Docker/Podman) |
+-----------------------------------------------------+
| Yocto Minimal Base OS |
| (Kernel, systemd, networking, container runtime) |
+-----------------------------------------------------+
| Hardware (Jetson Orin Nano) |
+-----------------------------------------------------+
# Example docker-compose.yml for production:
# /data/containers/docker-compose.yml
version: "3.8"
services:
inference:
image: mycompany/inference-engine:3.0.0
runtime: nvidia
environment:
- NVIDIA_VISIBLE_DEVICES=all
volumes:
- /data/models:/models:ro
- /data/config:/config:ro
devices:
- /dev/video0:/dev/video0
restart: always
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
interval: 30s
timeout: 10s
retries: 3
data-pipeline:
image: mycompany/data-pipeline:3.0.0
depends_on:
inference:
condition: service_healthy
volumes:
- /data/output:/output
environment:
- MQTT_BROKER=mqtt.mycompany.com
- DEVICE_ID_FILE=/data/serial-number
restart: always
monitoring:
image: mycompany/monitor:1.0.0
volumes:
- /data:/data:ro
- /sys:/sys:ro
- /proc:/proc:ro
restart: always
# systemd service to manage containers:
# /etc/systemd/system/myproject-containers.service
# [Unit]
# Description=MyProject Application Containers
# After=docker.service
# Requires=docker.service
#
# [Service]
# Type=oneshot
# RemainAfterExit=yes
# WorkingDirectory=/data/containers
# ExecStart=/usr/bin/docker-compose up -d
# ExecStop=/usr/bin/docker-compose down
#
# [Install]
# WantedBy=multi-user.target
19. Common Issues and Debugging¶
19.1 BitBake Build Failures¶
Problem: do_fetch fails with network error
# Symptom:
# ERROR: Fetcher failure: Unable to find file ...
# ERROR: Task do_fetch failed
# Diagnosis:
bitbake -e problematic-recipe | grep ^SRC_URI=
# Check if the URL is accessible:
wget <url_from_SRC_URI>
# Solutions:
# 1. Use a download mirror:
PREMIRRORS:prepend = "git://.*/.* https://downloads.mycompany.com/"
# 2. Pre-populate DL_DIR:
# Copy the tarball manually to ${DL_DIR}/
# 3. For git fetches, check SSH key:
ssh -T git@github.com
# 4. For NVIDIA proprietary packages, ensure acceptance:
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
Problem: do_compile fails with missing header
# Symptom:
# fatal error: someheader.h: No such file or directory
# ERROR: Task do_compile failed
# Diagnosis:
bitbake -e failing-recipe | grep ^DEPENDS=
bitbake -e failing-recipe | grep ^STAGING_DIR=
# Check if the dependency is built and staged:
ls tmp/sysroots-components/aarch64/*/usr/include/someheader.h
# Solutions:
# 1. Add missing dependency:
# In the recipe: DEPENDS += "missing-package"
# 2. If the header is in a non-standard location:
# EXTRA_OECMAKE += "-DSOMEHEADER_DIR=${STAGING_DIR_HOST}/usr/include/custom"
# 3. If DEPENDS is correct but staging is broken:
bitbake -c cleansstate failing-recipe
bitbake -c cleansstate missing-package
bitbake failing-recipe
Problem: do_install fails with file not found
# Symptom:
# install: cannot stat 'some-file': No such file or directory
# ERROR: Task do_install failed
# Diagnosis:
# Enter the build directory to inspect:
bitbake -c devshell failing-recipe
ls -la ${B}/ # Check what was built
ls -la ${S}/ # Check source directory
# Common causes:
# 1. S or B variable incorrect
# 2. File was not built (conditional compilation)
# 3. File path changed between versions
# Solution: Fix the install paths in the recipe
19.2 Recipe Parsing Errors¶
# Symptom:
# ERROR: ParseError at /path/to/recipe.bb:42
# Common causes and fixes:
# 1. Syntax error in Python function
# ERROR: ParseError ... invalid syntax
# Check for Python 3 compatibility (print as function, not statement)
# 2. Missing closing quote
# Ensure all variable assignments have matching quotes:
VARIABLE = "value" # Correct
# VARIABLE = "value # Missing closing quote
# 3. Incorrect override syntax (Yocto 4.0+)
# Old syntax (pre-Honister): VARIABLE_append = " value"
# New syntax (Honister+): VARIABLE:append = " value"
# Use overridecheck script:
bitbake-layers show-overlayed --filenames --same-version | \
grep -l "_append\|_prepend\|_remove"
# 4. Tab vs space issues
# BitBake is sensitive to indentation in Python functions
# Use spaces, not tabs
# 5. Variable expansion issues
# Use ${VAR} for BitBake variables
# Use ${@python_expression} for inline Python
# Use $variable for shell scripts within do_* functions
19.3 sstate Cache Corruption¶
# Symptoms:
# - Build fails with mysterious errors after working previously
# - "Taskhash mismatch" warnings
# - Packages with wrong content
# Diagnosis:
# Check sstate integrity:
find ${SSTATE_DIR} -name "*.siginfo" -newer ${SSTATE_DIR}/last-verified | wc -l
# Solutions:
# 1. Clear sstate for the affected recipe:
bitbake -c cleansstate affected-recipe
# 2. Clear ALL sstate (nuclear option, causes full rebuild):
rm -rf ${SSTATE_DIR}/*
# 3. Verify and clean corrupted sstate entries:
sstate-cache-management.sh --cache-dir=${SSTATE_DIR} \
--remove-duplicated --yes
# 4. If using shared sstate over NFS/HTTP, check for
# incomplete transfers or permission issues:
find ${SSTATE_DIR} -size 0 -delete # Remove empty files
find ${SSTATE_DIR} -name "*.lock" -delete # Remove stale locks
# Prevention:
# - Use separate SSTATE_DIR per Yocto release
# - Use SSTATE_DIR on local SSD, not network filesystem
# - Periodically prune old sstate entries:
sstate-cache-management.sh --cache-dir=${SSTATE_DIR} \
--stamps-dir=tmp/stamps \
--remove-duplicated --yes
19.4 License Warnings¶
# Symptom:
# WARNING: linux-tegra: ... has an incompatible license
# ERROR: ... requires license flag 'commercial_nvidia'
# Solutions:
# 1. Accept the license:
LICENSE_FLAGS_ACCEPTED += "commercial_nvidia"
# 2. If you see "license checksum mismatch":
# The license file changed. Update the checksum:
bitbake -e recipe-name | grep LIC_FILES_CHKSUM
# Then update the recipe with the new md5 sum
# 3. If using a package with restrictive license:
# Check if an alternative exists:
# Instead of: DEPENDS = "proprietary-lib"
# Consider: DEPENDS = "open-source-alternative"
# 4. Verify all licenses are accounted for:
bitbake myproject-image -c populate_lic
ls tmp/deploy/licenses/myproject-image-*/
19.5 Image Size Bloat¶
# Symptom:
# Image exceeds size limit or is unexpectedly large
# Diagnosis:
# Check image manifest:
cat tmp/deploy/images/${MACHINE}/myproject-image-*.manifest | \
sort -t' ' -k2 -rn | head -30
# Shows largest packages
# Check for unnecessary recommended packages:
bitbake -g myproject-image
cat pn-buildlist | wc -l # Total number of recipes
# Solutions:
# 1. Disable RRECOMMENDS (pulls in optional packages):
BAD_RECOMMENDATIONS += "package-to-exclude"
# Or globally:
NO_RECOMMENDATIONS = "1" # Disable ALL recommendations
# 2. Remove locale data:
IMAGE_LINGUAS = ""
GLIBC_GENERATE_LOCALES = "en_US.UTF-8"
# 3. Remove documentation:
ROOTFS_POSTPROCESS_COMMAND += "remove_docs;"
remove_docs() {
rm -rf ${IMAGE_ROOTFS}/usr/share/doc
rm -rf ${IMAGE_ROOTFS}/usr/share/man
rm -rf ${IMAGE_ROOTFS}/usr/share/info
}
# 4. Use IMAGE_ROOTFS_MAXSIZE to enforce a size limit:
IMAGE_ROOTFS_MAXSIZE = "2097152" # 2 GB in KB
# Build will fail if image exceeds this size
# 5. Analyze with buildhistory:
INHERIT += "buildhistory"
BUILDHISTORY_COMMIT = "1"
buildhistory-diff # Shows what changed since last build
19.6 Kernel Module Loading Failures¶
# Symptom:
# modprobe: FATAL: Module custom-driver not found
# OR
# insmod: ERROR: could not insert module: Invalid module format
# Diagnosis:
# Check module is built:
find /lib/modules/$(uname -r) -name "custom-driver.ko*"
# Check module dependencies:
modinfo custom-driver
modprobe --show-depends custom-driver
# Check kernel version match:
modinfo custom-driver | grep vermagic
uname -r
# These MUST match exactly
# Solutions:
# 1. Ensure module is built against the same kernel:
# In the module recipe, DEPENDS must include virtual/kernel
DEPENDS = "virtual/kernel"
# 2. If using KERNEL_MODULE_AUTOLOAD, verify it is set correctly:
KERNEL_MODULE_AUTOLOAD += "custom-driver"
# 3. Rebuild modules depmap:
depmod -a
# 4. Check if module is blacklisted:
cat /etc/modprobe.d/*.conf | grep custom-driver
# 5. Check kernel config has module support enabled:
zcat /proc/config.gz | grep CONFIG_MODULES=y
19.7 Flash Failures¶
# Symptom:
# Flash fails with various errors
# Problem 1: "No Tegra device found"
lsusb | grep -i nvidia
# Fix: Ensure device is in Force Recovery Mode
# Check USB cable (use data cable, not charge-only)
# Try different USB port (USB 3.0 ports may have issues)
# On Linux host: check udev rules for NVIDIA devices
# Problem 2: "Error: tegraflash.py failed"
# Check serial console output during flash for specific errors
# Common: Wrong boardid/fab/boardsku in machine configuration
# Fix: Verify TEGRA_BOARDID matches your hardware
# Problem 3: "Filesystem image too large for partition"
# The rootfs exceeds the partition size in the partition layout XML
# Fix: Reduce image size or increase partition size:
# In machine.conf or local.conf:
# ROOTFS_PARTITION_SIZE = "2147483648" # 2 GB
# Problem 4: Flash succeeds but device does not boot
# Check serial console:
picocom -b 115200 /dev/ttyACM0
# Look for:
# - MB1/MB2 errors (power/clock issues)
# - UEFI errors (bad device tree, missing kernel)
# - Kernel panic (driver issues, wrong rootfs)
# Problem 5: Intermittent flash failures
# Usually caused by USB issues
# Fix: Use a powered USB hub
# Fix: Add udev rule for reliable USB permissions:
# /etc/udev/rules.d/99-tegra-flash.rules
# SUBSYSTEM=="usb", ATTR{idVendor}=="0955", MODE="0666"
19.8 Yocto Version Migration Issues¶
# Migrating from Kirkstone to Scarthgap (or similar major version upgrade):
# Step 1: Read the migration guide
# https://docs.yoctoproject.org/migration-guides/
# Step 2: Update override syntax (if not already done)
# The biggest breaking change in recent Yocto releases is the
# override syntax change from underscore to colon:
#
# Old: VARIABLE_append = " value"
# New: VARIABLE:append = " value"
#
# Old: VARIABLE_machine = "value"
# New: VARIABLE:machine = "value"
#
# Automated conversion:
# In each layer directory:
find . -name "*.bb" -o -name "*.bbappend" -o -name "*.bbclass" \
-o -name "*.conf" -o -name "*.inc" | while read f; do
sed -i \
-e 's/_append\b/:append/g' \
-e 's/_prepend\b/:prepend/g' \
-e 's/_remove\b/:remove/g' \
"$f"
done
# WARNING: This is a rough conversion. Manual review is required
# for variables with underscores in their names (e.g., IMAGE_INSTALL).
# Step 3: Check deprecated variables
bitbake -e myproject-image 2>&1 | grep "is deprecated"
# Step 4: Update layer compatibility
# In each layer.conf:
LAYERSERIES_COMPAT_mylayer = "scarthgap"
# Step 5: Address recipe-specific changes
# Check meta-tegra release notes for the new branch
# Some recipe names may change, dependencies may shift
# Step 6: Full rebuild and test
bitbake -c cleansstate world
bitbake myproject-image
19.9 do_compile and do_install Debugging¶
# Enter the build environment for a failing recipe:
bitbake -c devshell failing-recipe
# This drops you into a shell with the correct cross-compilation
# environment set up. You can run make/cmake manually.
# View the build log:
cat tmp/work/aarch64-poky-linux/failing-recipe/*/temp/log.do_compile
# View the run script (exact commands BitBake executed):
cat tmp/work/aarch64-poky-linux/failing-recipe/*/temp/run.do_compile
# Common do_compile debugging:
# 1. Check compiler flags:
echo $CC $CFLAGS $LDFLAGS
# 2. Run make manually with verbose output:
oe_runmake V=1
# 3. Check cross-compilation sysroot:
ls ${STAGING_DIR_HOST}/usr/include/
ls ${STAGING_DIR_HOST}/usr/lib/
# Common do_install debugging:
# 1. Check what files were produced:
ls -la ${B}/
# 2. Verify install destinations:
echo ${D} # The image directory (fakeroot)
echo ${bindir} # /usr/bin
echo ${libdir} # /usr/lib or /usr/lib64
echo ${sysconfdir} # /etc
# 3. Ensure FILES variable includes installed files:
bitbake -e failing-recipe | grep ^FILES:
19.10 devshell and devpyshell Usage¶
# devshell: Opens a shell in the recipe work directory
bitbake -c devshell linux-tegra
# You are now in the kernel source directory with cross-compilation
# environment configured. You can:
# make menuconfig
# make -j16
# make modules
# devpyshell: Opens a Python shell with BitBake data store
bitbake -c devpyshell linux-tegra
# In the Python shell:
d.getVar('SRC_URI') # Show SRC_URI value
d.getVar('WORKDIR') # Show work directory
d.getVar('B') # Show build directory
d.getVar('DEPENDS') # Show dependencies
d.getVarFlags('do_compile') # Show task flags
# Useful for understanding variable expansion and debugging
# recipe logic without modifying the recipe
19.11 Common Error Messages Reference¶
Error Message Likely Cause Quick Fix
---------------------------------------------------------------------------------------------------------
"Nothing PROVIDES 'xxx'" Missing recipe or layer Add the layer providing xxx
"Multiple providers for xxx" Ambiguous provider Set PREFERRED_PROVIDER_xxx
"LICENSE_FLAGS ... not accepted" Commercial license not accepted Add to LICENSE_FLAGS_ACCEPTED
"QA Issue: ... not shipped" Files installed but not in FILES Add to FILES:${PN}
"QA Issue: ... is owned by uid 0" Permission issue in do_install Use install -o root -g root
"do_package_qa: ... non -dev/-dbg Runtime package has dev files Move headers to ${PN}-dev
contains symlink .so"
"Taskhash mismatch" sstate corruption bitbake -c cleansstate recipe
"Nothing RPROVIDES 'xxx'" Runtime dependency missing Add to RDEPENDS
"ERROR: Function failed: do_rootfs" Package conflict or missing pkg Check IMAGE_INSTALL deps
"Signer not found" Missing signing tool Install tegrasign/openssl
"No space left on device" TMPDIR partition full Free space or change TMPDIR
19.12 Performance Debugging on Target¶
# After deploying to the Orin Nano, these tools help diagnose issues:
# GPU utilization and power
tegrastats
# Output: CPU/GPU usage, memory, temperature, power consumption
# Detailed GPU profiling
nvidia-smi # Limited on Tegra, use tegrastats instead
# System-wide performance
perf top # Real-time CPU profiling
perf record ./myapp # Record performance data
perf report # Analyze recorded data
# Memory debugging
valgrind --tool=memcheck ./myapp # Memory leak detection
cat /proc/meminfo # System memory overview
cat /proc/buddyinfo # Memory fragmentation
# I/O debugging
iotop # I/O usage by process
iostat -x 1 # I/O statistics per device
# Thermal monitoring
cat /sys/class/thermal/thermal_zone*/type
cat /sys/class/thermal/thermal_zone*/temp
# zone0: CPU, zone1: GPU, zone2: CV (computer vision engine)
# Power mode management
nvpmodel -q # Show current power mode
nvpmodel -m 0 # Set to maximum performance (MAXN)
nvpmodel -m 1 # Set to 15W mode
jetson_clocks # Lock clocks to maximum frequency
jetson_clocks --show # Show current clock frequencies
Appendix A: Quick Reference Commands¶
# Build commands
kas build kas/jetson-orin-nano.yml # Full build with kas
bitbake myproject-image # Build production image
bitbake myproject-image -c populate_sdk # Generate SDK
bitbake linux-tegra -c menuconfig # Kernel config menu
bitbake -c cleansstate recipe-name # Clean recipe
bitbake -e recipe-name | grep ^VARIABLE= # Show variable value
# Flash commands
sudo ./initrd-flash # Flash device
lsusb | grep -i nvidia # Check recovery mode
# Debug commands
bitbake -c devshell recipe-name # Enter build shell
bitbake -g myproject-image # Generate dependency graph
bitbake-layers show-recipes "*pattern*" # Find recipes
bitbake-layers show-layers # Show all layers
# On-target commands
tegrastats # GPU/CPU/power monitor
nvpmodel -q # Show power mode
jetson_clocks # Max performance
systemd-analyze blame # Boot time analysis
journalctl -u myapp.service -f # Follow app logs
cat /etc/myproject-version # Show firmware version
Appendix B: Directory Structure Reference¶
project-root/
kas/
base.yml
jetson-orin-nano.yml
jetson-orin-nano-dev.yml
machine-orin-nano.yml
machine-edge-v2.yml
image-production.yml
image-dev.yml
layers/
poky/ # Yocto Project reference distro
meta-openembedded/ # Additional OE layers
meta-tegra/ # Jetson BSP layer
meta-myproject/ # Project application recipes
meta-myproject-bsp/ # Board support (DT, kernel, bootloader)
meta-myproject-distro/ # Distro configuration
keys/
rsa_priv.pem # Secure boot signing key (NOT in git)
swupdate_priv.pem # OTA signing key (NOT in git)
scripts/
provision-device.sh # Manufacturing provisioning
provision-fuses.sh # Fuse burning (IRREVERSIBLE)
flash-and-test.sh # HIL flash and test
regression-test.sh # On-target regression tests
build-matrix.sh # Multi-target build script
verify-reproducibility.sh # Reproducible build verification
generate-release-notes.sh # Release notes generator
check-license-compliance.py # License audit script
build/
conf/
local.conf
bblayers.conf
tmp/ # Build output (gitignored)
.gitlab-ci.yml # CI/CD pipeline
Jenkinsfile # Alternative CI/CD pipeline
Appendix C: Recommended Reading¶
Official Documentation:
- Yocto Project Documentation: https://docs.yoctoproject.org/
- BitBake User Manual: https://docs.yoctoproject.org/bitbake/
- NVIDIA L4T Documentation: https://docs.nvidia.com/jetson/
- meta-tegra README: https://github.com/OE4T/meta-tegra
- SWUpdate Documentation: https://sbabic.github.io/swupdate/
Books:
- "Embedded Linux Systems with the Yocto Project" by Rudolf Streif
- "Embedded Linux Development Using Yocto Project" by Otavio Salvador
- "Mastering Embedded Linux Programming" by Chris Simmonds
Community:
- Yocto Project mailing list: yocto@lists.yoctoproject.org
- meta-tegra GitHub issues: https://github.com/OE4T/meta-tegra/issues
- NVIDIA Developer Forums: https://forums.developer.nvidia.com/
- #yocto IRC channel on irc.libera.chat
This guide reflects production practices validated across multi-year programs deploying 15,000-25,000+ Jetson Orin Nano devices in industrial edge computing applications. All code examples are representative of real-world implementations and should be adapted to your specific project requirements.