KernelSbom¶
Introduction¶
KernelSbom is a Python script scripts/sbom/sbom.py that can be
executed after a successful kernel build. When invoked, KernelSbom
analyzes all files involved in the build and generates Software Bill of
Materials (SBOM) documents in SPDX 3.0.1 format.
The generated SBOM documents capture:
Final output artifacts, typically the kernel image and modules
All source files that contributed to the build with metadata and licensing information
Details of the build process, including intermediate artifacts and the build commands linking source files to the final output artifacts
KernelSbom is originally developed in the KernelSbom repository.
Requirements¶
Python 3.10 or later. No libraries or other dependencies are required.
Basic Usage¶
Run the make sbom target.
For example:
$ make defconfig O=kernel_build
$ make sbom O=kernel_build -j$(nproc)
This will trigger a kernel build. After all build outputs have been generated, KernelSbom produces three SPDX documents in the root directory of the object tree:
sbom-source.spdx.jsonDescribes all source files involved in the build and associates each file with its corresponding license expression.sbom-output.spdx.jsonCaptures all final build outputs (kernel image and.komodule files) and includes build metadata such as environment variables and a hash of the.configfile used for the build.sbom-build.spdx.jsonImports files from the source and output documents and describes every intermediate build artifact. For each artifact, it records the exact build command used and establishes the relationship between input files and generated outputs.
When invoking the sbom target, it is recommended to perform
out-of-tree builds using O=<objtree>. KernelSbom classifies files as
source files when they are located in the source tree and not in the
object tree. For in-tree builds, where the source and object trees are
the same directory, this distinction can no longer be made reliably.
In that case, KernelSbom does not generate a dedicated source SBOM.
Instead, source files are included in the build SBOM.
Standalone Usage¶
KernelSbom can also be used as a standalone script to generate
SPDX documents for specific build outputs. For example, after a
successful x86 kernel build, KernelSbom can generate SPDX documents
for the bzImage kernel image:
$ SRCARCH=x86 python3 scripts/sbom/sbom.py \
--src-tree . \
--obj-tree ./kernel_build \
--roots arch/x86/boot/bzImage \
--generate-spdx \
--generate-used-files \
--prettify-json \
--debug
Note that when KernelSbom is invoked outside of the make process,
the environment variables used during compilation are not available and
therefore cannot be included in the generated SPDX documents. It is
recommended to set at least the SRCARCH environment variable to the
architecture for which the build was performed.
For a full list of command-line options, run:
$ python3 scripts/sbom/sbom.py --help
Output Format¶
KernelSbom generates documents conforming to the SPDX 3.0.1 specification serialized as JSON-LD.
To reduce file size, the output documents use the JSON-LD @context
to define custom prefixes for spdxId values. While this is compliant
with the SPDX specification, only a limited number of tools in the
current SPDX ecosystem support custom JSON-LD contexts. To use such
tools with the generated documents, the custom JSON-LD context must
be expanded before providing the documents.
See https://lists.spdx.org/g/Spdx-tech/message/6064 for more information.
How it Works¶
KernelSbom operates in two major phases:
Generate the cmd graph, an acyclic directed dependency graph.
Generate SPDX documents based on the cmd graph.
KernelSbom begins from the root artifacts specified by the user, e.g.,
arch/x86/boot/bzImage. For each root artifact, it collects all
dependencies required to build that artifact. The dependencies come
from multiple sources:
.cmd files: The primary source is the
.cmdfile of the generated artifact, e.g.,arch/x86/boot/.bzImage.cmd. These files contain the exact command used to build the artifact and often include an explicit list of input dependencies. By parsing the.cmdfile, the full list of dependencies can be obtained..incbin statements: The second source are include binary
.incbinstatements in.Sassembly files.Hardcoded dependencies: Unfortunately, not all build dependencies can be found via
.cmdfiles and.incbinstatements. Some build dependencies are directly defined in Makefiles or Kbuild files. Parsing these files is considered too complex for the scope of this project. Instead, the remaining gaps of the graph are filled using a list of manually defined dependencies, seescripts/sbom/sbom/cmd_graph/hardcoded_dependencies.py. This list is known to be incomplete. However, analysis of the cmd graph indicates a ~99% completeness. For more information about the completeness analysis, see KernelSbom #95.
Given the list of dependency files, KernelSbom recursively processes each file, expanding the dependency chain all the way to the version controlled source files. The result is a complete dependency graph where nodes represent files, and edges represent “file A was used to build file B” relationships.
Using the cmd graph, KernelSbom produces three SPDX documents. For every file in the graph, KernelSbom:
Parses
SPDX-License-Identifierheaders,Computes file hashes,
Estimates the file type based on extension and path,
Records build relationships between files.
Each root output file is additionally associated with an SPDX Package element that captures version information, license data, and copyright.
Advanced Usage¶
Including Kernel Modules¶
The list of all .ko kernel modules produced during a build can be
extracted from the modules.order file within the object tree.
For example:
$ echo "arch/x86/boot/bzImage" > sbom-roots.txt
$ sed 's/\.o$/.ko/' ./kernel_build/modules.order >> sbom-roots.txt
Then use the generated roots file:
$ SRCARCH=x86 python3 scripts/sbom/sbom.py \
--src-tree . \
--obj-tree ./kernel_build \
--roots-file sbom-roots.txt \
--generate-spdx
Equal Source and Object Trees¶
When the source tree and object tree are identical (for example, when
building in-tree), source files can no longer be reliably distinguished
from generated files.
In this scenario, KernelSbom does not produce a dedicated
sbom-source.spdx.json document. Instead, both source files and build
artifacts are included together in sbom-build.spdx.json, and
sbom.used-files.txt lists all files referenced in the build document.
Unknown Build Commands¶
Because the kernel supports a wide range of configurations and versions,
KernelSbom may encounter build commands in .cmd files that it does
not yet support. By default, KernelSbom will fail if an unknown build
command is encountered.
If you still wish to generate SPDX documents despite unsupported
commands, you can use the --do-not-fail-on-unknown-build-command
option. KernelSbom will continue and produce the documents, although
the resulting SBOM will be incomplete.
This option should only be used when the missing portion of the dependency graph is small and an incomplete SBOM is acceptable for your use case.