SBOM Standard Formats

The Software Bill of Materials (SBOM) is a list of all the components, libraries, and other dependencies used in a software application. Standard formats for SBOMs include SPDX, CycloneDX, and SWID (Software Identification) tags. These formats provide a structured way to represent the components and dependencies in a software application, making it easier to understand and manage the security risks associated with those components.

In this article, we explain in detail the various SBOM formats and standards, what an SBOM should include, and why organizations need to use one.

What is an SBOM Standard?

The complexity and dynamic nature of modern software systems’ supply chains pose a significant challenge to transparency. This lack of transparency contributes to cybersecurity risks and increases the costs associated with development, procurement, and maintenance. The consequences of this are far-reaching, affecting not only businesses but also collective matters such as public safety and national security in our interconnected world.

Increased transparency in software supply chains can lead to a reduction in cybersecurity risks and costs by:

  • Improving the identification of vulnerable systems and identifying the root cause of incidents
  • Decreasing unplanned and unproductive work
  • Allowing for more informed market differentiation and component selection
  • Standardizing formats across multiple sectors, leading to a reduction in duplication of effort
  • Detecting suspicious or counterfeit software components

Collecting and sharing this information in a clear and consistent format can help lower costs, improve reliability, and enhance trust in our digital infrastructure.

For that purpose, the NTIA Software Transparency Working Group on Standards and Formats was established in 2018 to evaluate current formats for Software Bills of Materials and identify potential future uses. The group examined existing standards and initiatives related to identifying external components and shared libraries used in software products and communicating this information in a machine-readable format. The group did not consider proprietary formats. The original survey was released in late 2019 and was updated in 2021, with a focus on highlighting the benefits of the SBOM tooling ecosystem and the importance of coordination and harmonization in the technical SBOM world. The key takeaway is that SBOM data can be conveyed in various formats and the ecosystem should support interoperability between them.

The working group found that three formats are commonly used: 

  1. Software Package Data Exchange (SPDX®), an open-source, machine-readable format developed by the Linux Foundation and now an ISO/IEC standard
  2. CycloneDX (CDX), an open-source, machine-readable format developed by the OWASP community
  3. Software Identification (SWID), an ISO/IEC industry standard used by various commercial software publishers

It’s worth noting that these three formats share some common information. However, they have traditionally been used at different stages of the software development process and are intended for different audiences. We will discuss each of these formats in great detail in this article.

What should an SBOM include?

The NTIA’s minimum components of an SBOM, referred to as elements, fall into three broad and interrelated areas. Together they allow for a flexible approach to software transparency that addresses both the technology and the way it is operated. These are the minimum elements at present: organizations may require more, and further detail may be added as the practice of software supply chain transparency evolves.

These minimum required elements for SBOM are typically grouped into three categories:

  1. Data fields: An SBOM should include important data about software components, such as the component name, supplier name, version, and unique identifiers. It should also include information about dependencies between components, allowing for accurate identification and tracking of all software components throughout the supply chain.
  2. Practices and processes: The SBOM documentation should also outline standard practices and procedures for creating and updating the SBOM, distributing and accessing it, as well as handling errors.
  3. Automation support: The Software Bill of Materials should be machine-readable and able to be generated automatically, so that the data can be tracked continuously. It is typically expressed in a standard format such as SPDX, CycloneDX, or SWID tags, which can also be read by humans.
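
As a rough illustration of the data fields item above, the following Python sketch checks whether component records carry the minimum data. The key names (supplier, name, version, identifier) and the sample records are assumptions made for this example; the NTIA elements describe the data to capture, not the keys to use.

```python
from __future__ import annotations

# Hypothetical key names chosen for this sketch; each SBOM format
# spells these fields differently.
REQUIRED_FIELDS = ("supplier", "name", "version", "identifier")


def missing_fields(component: dict) -> list[str]:
    """Return the required fields that are absent or empty in one record."""
    return [field for field in REQUIRED_FIELDS if not component.get(field)]


# Made-up sample records: one complete, one incomplete.
components = [
    {
        "supplier": "Example Corp",
        "name": "libexample",
        "version": "1.2.0",
        "identifier": "pkg:generic/libexample@1.2.0",
    },
    {"name": "leftover-lib", "version": ""},
]

for component in components:
    gaps = missing_fields(component)
    status = "ok" if not gaps else "missing " + ", ".join(gaps)
    print(f"{component.get('name', '<unnamed>')}: {status}")
```

A check like this could run in a build pipeline so that incomplete records are caught before an SBOM is published.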

SPDX SBOM Standard Format

The SPDX® (Software Package Data Exchange) specification is an ISO/IEC standard for sharing information about software components, licenses, copyrights, and security details in multiple file formats. This project has developed and continues to improve a set of data exchange standards that allow businesses and organizations to share software metadata in a format that can be understood by both humans and machines, simplifying software supply chain processes.

SPDX information can be linked to specific software products, components or sets of components, individual files, or even small code snippets. The SPDX project is focused on creating and refining a language to describe the data that can be exchanged as part of an SBOM, and the ability to present this data in multiple file formats (RDF/XML, XLSX, tag-value, JSON, YAML, and XML) to make it easy to collect and share information about software packages and related content, resulting in time and accuracy improvements.

The SPDX specification outlines the fields and sections necessary for a valid document, but not all sections are mandatory: only the creation information section is required. The document creator chooses which sections and fields to include in order to describe the software and metadata they plan to share.

SPDX can effectively capture Software Bill of Material data by representing all the components found in software development and deployment. It is used to document distro .iso images, containers, software packages, binary files, source files, patches, and even small code snippets embedded in other files. SPDX offers a comprehensive set of relationships to connect software elements within documents and across SBOM documents. An SPDX SBOM document can also reference external sources such as the National Vulnerability Database and other packaging system metadata.

There are several components that make up an SPDX document: Creation information, package information, file information, snippet information, other licensing information, relationships, and annotations.
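
To make these sections concrete, here is a minimal sketch in Python that assembles a small SPDX 2.2-style document as JSON, with creation information, one package, and one relationship. The function name, the sample package, and the namespace URL are hypothetical, and the output has not been validated against the official SPDX schema, so treat it as an illustration rather than a conformant generator.

```python
import json
from datetime import datetime, timezone


def build_spdx_document(doc_name, namespace, package_name, package_version):
    """Assemble a small SPDX 2.2-style document as a plain dictionary."""
    created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "spdxVersion": "SPDX-2.2",
        "dataLicense": "CC0-1.0",
        "SPDXID": "SPDXRef-DOCUMENT",
        "name": doc_name,
        "documentNamespace": namespace,
        # Creation information: who or what produced the document, and when.
        "creationInfo": {
            "created": created,
            "creators": ["Tool: example-sbom-sketch"],  # hypothetical tool name
        },
        # Package information: NOASSERTION marks data we do not claim to know.
        "packages": [
            {
                "SPDXID": "SPDXRef-Package-1",
                "name": package_name,
                "versionInfo": package_version,
                "downloadLocation": "NOASSERTION",
                "filesAnalyzed": False,
                "licenseConcluded": "NOASSERTION",
                "licenseDeclared": "NOASSERTION",
                "copyrightText": "NOASSERTION",
            }
        ],
        # Relationships: the document describes the package.
        "relationships": [
            {
                "spdxElementId": "SPDXRef-DOCUMENT",
                "relationshipType": "DESCRIBES",
                "relatedSpdxElement": "SPDXRef-Package-1",
            }
        ],
    }


document = build_spdx_document(
    "example-app-sbom",
    "https://example.com/spdxdocs/example-app-1.0",  # made-up namespace
    "example-app",
    "1.0.0",
)
print(json.dumps(document, indent=2))
```

File, snippet, licensing, and annotation sections are omitted here to keep the example short.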

Each SPDX document can be represented by a complete data model implementation and identifier syntax, allowing for exchange between different data output formats (RDF/XML, tag-value, XLSX) and formal validation of the document’s accuracy. The SPDX specification’s version 2.2 release includes additional output formats such as JSON, YAML, and XML, and also addresses “known unknowns” as identified in the original SBOM document. More information about the underlying data model of SPDX can be found in Appendix III of the SPDX Specification and on the SPDX website.

CycloneDX SBOM Standard Format

The CycloneDX project was established in 2017 with the aim of developing a fully automated, security-focused SBOM standard. The core working group releases immutable and backward-compatible versions annually, using a risk-based standards process. CycloneDX incorporates existing specifications such as Package URL, CPE, SWID, and SPDX license IDs and expressions. The SBOMs can be represented in different formats including XML, JSON, and Protocol Buffers (protobuf).

CycloneDX is a lightweight SBOM specification that is intended for use in supply chain component analysis and software security. It enables the communication of software components inventory, external services, and the relationships between them. It is an open-source standard developed by the OWASP (Open Web Application Security Project).

CycloneDX can capture the dynamic nature of open-source components whose source code is accessible, modifiable, and redistributable. The specification can represent the pedigree of a component, including its ancestors, descendants, and variants, describing the component’s lineage from any perspective, as well as the commits, patches, and diffs that make it unique.

The CycloneDX project maintains a list of known open-source and proprietary tools that support or are compatible with the standard, which is supported by the community.

The CycloneDX specification lays out a detailed object model that ensures consistency across all implementations. It can be validated using XML Schema and JSON Schema, or by using the CycloneDX command-line interface. Media types for XML and JSON are also provided for automated delivery and consumption of supported formats.

CycloneDX SBOMs may contain the following information: BOM metadata, components, services, dependencies, compositions, and extensions.
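
A minimal sketch of how the metadata, components, and dependencies parts fit together in the JSON format is shown below, written in Python. The helper name, the sample application, and the spec version string are assumptions for this example, and the result is not validated against the CycloneDX JSON Schema.

```python
import json


def build_cyclonedx_bom(app_name, app_version, libraries):
    """Build a small CycloneDX-style BOM; libraries is a list of (name, version, purl)."""
    app_ref = f"{app_name}@{app_version}"
    components = [
        {
            "type": "library",
            "bom-ref": purl,  # the Package URL doubles as the reference id here
            "name": name,
            "version": version,
            "purl": purl,
        }
        for name, version, purl in libraries
    ]
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",  # assumed spec version for this sketch
        "version": 1,
        # BOM metadata: the application that the BOM describes.
        "metadata": {
            "component": {
                "type": "application",
                "bom-ref": app_ref,
                "name": app_name,
                "version": app_version,
            }
        },
        "components": components,
        # Dependencies: the application depends directly on every listed library.
        "dependencies": [
            {"ref": app_ref, "dependsOn": [c["bom-ref"] for c in components]}
        ],
    }


bom = build_cyclonedx_bom(
    "example-app",
    "1.0.0",
    [
        ("requests", "2.31.0", "pkg:pypi/requests@2.31.0"),
        ("urllib3", "2.0.7", "pkg:pypi/urllib3@2.0.7"),
    ],
)
print(json.dumps(bom, indent=2))
```

Services, compositions, and extensions are left out to keep the example short.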

CycloneDX is a comprehensive SBOM standard that can characterize various types of software, including applications, components, services, firmware, and devices. It is widely used across industries to describe software packages, libraries, frameworks, applications, and container images. The project is compatible with major development ecosystems and offers implementations for software factories like GitHub Actions, enabling organizations to fully automate SBOM creation.

SWID Tag

SWID Tags, or Software Identification Tags, were created to enable organizations to track software installed on their managed devices in a transparent manner. The standard was established by ISO in 2012 and revised as ISO/IEC 19770-2:2015 in 2015. These tags contain detailed information about a specific release of a software product.

The SWID standard outlines a lifecycle for tracking software: a SWID Tag is added to an endpoint during the installation of a software product, and removed by the product’s uninstall process. The existence of a specific SWID Tag corresponds directly to the presence of the software it describes. Multiple standardization organizations, such as the Trusted Computing Group (TCG) and the Internet Engineering Task Force (IETF), incorporate SWID Tags in their standards.

To track the lifecycle of a software component, the SWID specification has four types of tags: primary, patch, corpus, and supplemental. Corpus, primary, and patch tags serve similar purposes in that they describe the existence and presence of different types of software, such as software installers, software installations, and software patches, and the possible states of software products. On the other hand, supplemental tags provide additional details not found in corpus, primary, or patch tags.
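
The four tag types can be told apart mechanically. The Python sketch below assumes the ISO/IEC 19770-2:2015 XML representation, in which the root SoftwareIdentity element carries optional boolean attributes named corpus, patch, and supplemental, and a tag with none of them set to true is a primary tag. The function name and the sample tags are made up for illustration.

```python
import xml.etree.ElementTree as ET

SWID_NS = "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"


def swid_tag_type(tag_xml: str) -> str:
    """Classify a SWID tag as corpus, patch, supplemental, or primary."""
    root = ET.fromstring(tag_xml)
    for kind in ("corpus", "patch", "supplemental"):
        # The attributes are unqualified, so no namespace prefix is needed.
        if root.get(kind, "false").strip().lower() == "true":
            return kind
    return "primary"


# Hypothetical sample tags for an installed product and one of its patches.
installed_tag = (
    f'<SoftwareIdentity xmlns="{SWID_NS}" name="Example App" '
    'tagId="example.com-app-1.0" version="1.0"/>'
)
patch_tag = (
    f'<SoftwareIdentity xmlns="{SWID_NS}" name="Example App Patch" '
    'tagId="example.com-app-1.0-p1" version="1.0.1" patch="true"/>'
)

print(swid_tag_type(installed_tag))
print(swid_tag_type(patch_tag))
```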

Supplemental tags can be linked to any other tag to provide extra metadata that may be useful. Together, SWID tags can perform a variety of functions, such as software discovery, configuration management, and vulnerability management.

SWID tags can function as an SBOM, as they supply identifying information for a software component, a list of files and cryptographic hashes for the component’s artifacts, and provenance information about the SBOM (tag) creator and software component creator. The tags can also link to other tags, allowing for a representation of a dependency tree.

SWID tags can be generated during the build and packaging process, enabling the automatic creation of a SWID tag-based SBOM when the corresponding software component is packaged.
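
As a sketch of what such a build step might look like, the Python snippet below writes a bare-bones primary SWID tag with the standard library's XML module. The function name, product name, tag identifier, and regid are hypothetical values, and a real tag would usually carry more detail, such as file lists and hashes.

```python
import xml.etree.ElementTree as ET

SWID_NS = "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"


def build_swid_tag(name, version, tag_id, creator_name, creator_regid):
    """Return a minimal primary SWID tag as an XML string."""
    # Serialize the SWID namespace as the default namespace (no prefix).
    ET.register_namespace("", SWID_NS)
    root = ET.Element(
        f"{{{SWID_NS}}}SoftwareIdentity",
        {"name": name, "tagId": tag_id, "version": version},
    )
    # One entity acting as both the tag creator and the software creator.
    ET.SubElement(
        root,
        f"{{{SWID_NS}}}Entity",
        {
            "name": creator_name,
            "regid": creator_regid,
            "role": "tagCreator softwareCreator",
        },
    )
    return ET.tostring(root, encoding="unicode")


tag_xml = build_swid_tag(
    name="Example App",
    version="1.0.0",
    tag_id="example.com-example-app-1.0.0",  # made-up identifier
    creator_name="Example Corp",
    creator_regid="example.com",
)
print(tag_xml)
```

A packaging script could call a function like this once per release and ship the resulting file alongside the installer.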

Why are Software Bills of Materials Important?

Software Bills of Materials (SBOMs) are becoming increasingly important for organizations as they aim to manage and secure the software they use. An SBOM provides a comprehensive list of all the components and dependencies that make up a software package, including information such as version numbers, authors, and license information. This information is critical for security and compliance, as well as for tracking the provenance of software components.

Many organizations, including those in regulated industries, are using SBOMs to support compliance with regulations such as the General Data Protection Regulation (GDPR) and the Payment Card Industry Data Security Standard (PCI DSS). SBOMs can also assist in identifying and managing vulnerabilities in software, and in managing software licenses so that organizations use software in compliance with the terms of those licenses.

SBOMs can also be used to track the use of open-source software, which is becoming increasingly common in software development. By providing detailed information about the open-source components used in a software package, SBOMs can assist organizations in ensuring compliance with open-source licenses.

Furthermore, SBOMs can be used to support software development and maintenance. By providing detailed information about the components used in a software package, SBOMs can assist developers in understanding the dependencies of a software package, which can help them to identify potential compatibility issues and to make informed decisions about the use of new components.
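
For example, once dependency data is available, finding everything a package pulls in, directly or indirectly, is a short graph walk. The Python sketch below assumes dependency entries shaped like the CycloneDX "dependencies" array (a "ref" plus a "dependsOn" list); the function name and the sample graph are made up for illustration.

```python
from collections import deque


def transitive_dependencies(dependencies, root_ref):
    """Return every reference reachable from root_ref, excluding root_ref itself."""
    graph = {entry["ref"]: entry.get("dependsOn", []) for entry in dependencies}
    seen = set()
    queue = deque(graph.get(root_ref, []))
    while queue:
        ref = queue.popleft()
        if ref in seen or ref == root_ref:
            continue  # already visited, or a cycle leading back to the root
        seen.add(ref)
        queue.extend(graph.get(ref, []))
    return seen


# Hypothetical dependency data: app -> web-lib -> http-lib, app -> log-lib.
sample_dependencies = [
    {"ref": "app", "dependsOn": ["web-lib", "log-lib"]},
    {"ref": "web-lib", "dependsOn": ["http-lib"]},
    {"ref": "http-lib", "dependsOn": []},
    {"ref": "log-lib"},
]

print(sorted(transitive_dependencies(sample_dependencies, "app")))
```

The same walk can answer the reverse question, such as which applications are affected by a vulnerable library, if the graph is inverted first.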