The Linux Foundation and Harvard’s Lab for Innovation Science Release Census of Most Widely Used Open Source Application Libraries
The Linux Foundation | 02 March 2022
Census II identifies more than one thousand of the most widely deployed applications libraries that are most critical to operations and security
SAN FRANCISCO – March 2, 2022 — The Linux Foundation, the nonprofit organization enabling mass innovation through open source, today announced the final release of “Census II of Free and Open Source Software – Application Libraries.” This follows the preliminary release of Census II, “Vulnerabilities in the Core,’ a Preliminary Report and Census II of Open Source Software” and identifies more than one thousand of the most widely deployed open source application libraries found from scans of commercial and enterprise applications. This study informs what open source packages, components and projects warrant proactive operations and security support.
The original Census Project (“Census I”) was conducted in 2015 to identify which software packages in the Debian Linux distribution were the most critical to a Linux server’s operation and security. The goal of the current study (Census II) is to pick up where Census I left off and to identify and measure which open source software is most widely deployed within applications developed by private and public organizations. This Census II allows for a more complete picture of free and open source software (FOSS) adoption by analyzing anonymized usage data provided by partner Software Composition Analysis (SCA) companies Snyk, the Synopsys Cybersecurity Research Center (CyRC), and FOSSA and is based on their scans of codebases at thousands of companies.
“Understanding what FOSS packages are the most widely used in society allows us to proactively engage the critical projects that warrant operations and security support,” said Brian Behlendorf, executive director at Linux Foundation’s Open Source Security Foundation (OpenSSF). “Open source software is the foundation upon which our day-to-day lives run, from our banking institutions to our schools and workplaces. Census II provides the foundational detail we need to support the world’s most critical and valuable infrastructure.”
Census II includes eight rankings of the 500 most used FOSS packages among those reported in the private usage data contributed by SCA partners. These include different slices of the data based on versions, structure, and packaging system. For example, this research enables identification of the top 10 version-agnostic packages available on the npm package manager that were called directly in applications:
To review all of the Top 500 lists in their entirety, please visit Data.World.
The study also surfaces these five overall findings that are detailed in the report:
1) The need for a standardized naming schema for software components so that application libraries can be uniquely identified
2) The complexities associated with package versioning – SBOM guidance will need to reflect versioning information that is consistent with the public “main” repository for that package, rather than private repositories
3) Much of the most widely used FOSS is developed by only a handful of contributors – results in one dataset show that 136 developers were responsible for more than 80% of the lines of code added to the top 50 packages
4) The increasing importance of individual developer account security – the OpenSSF encourages the use of MFA tokens or organizational accounts to achieve greater account security
5) The persistence of legacy software in the open source space
Census II is authored by Frank Nagle, Harvard Business School; James Dana, Harvard Business School; Jennifer Hoffman, Laboratory for Innovation Science at Harvard; Steven Randazzo, Laboratory for Innovation Science at Harvard; and Yanuo Zhou, Harvard Business School.
“Our goal is to not only identify the most widely used FOSS but also provide an example of how the distributed nature of FOSS requires a multi-party effort to fully understand the value and security of the FOSS ecosystem. Only through data-sharing, coordination, and investment will the value of this critical component of the digital economy be preserved for generations to come,” said Frank Nagle, Assistant Professor, Harvard Business School.
“Open source software plays a foundational role in enabling global economic growth. Of course, the ubiquitous nature of OSS means that severe vulnerabilities — such as Log4Shell — can have a devastating and widespread impact. Mounting a comprehensive defense against supply chain threats starts with establishing strong visibility into software — and we at FOSSA are thrilled to be able to contribute our market-leading SBOM capabilities and experience helping thousands of organizations successfully manage their open source dependencies to improve transparency and trust in the software supply chain.” – Kevin Wang, Founder & CEO, FOSSA
“The Linux Foundation’s latest multi-party Census effort is further evidence that OSS is at the very heart of not only today’s modern application development process, but also plays an increasingly vital behind the scenes role throughout all of society,” said Guy Podjarny, Founder, Snyk. “We’re honored to have made significant contributions to this latest comprehensive assessment and welcome all future efforts that help to empower the developers building our future with the right information to also effectively secure it.”
“With businesses increasingly dependent upon open source technologies, if those same businesses aren’t contributing back to the open source projects they depend upon, then they are increasing their business risk. That risk ranges from projects becoming orphaned and containing potentially vulnerable code, through to implementation changes that break existing applications. The only meaningful way to mitigate that risk comes from assigning resources to contribute back to the open source powering the business. After all, while there are millions of developers contributing to open source, there might just be only one developer working on something critical to your success.” – Tim Mackey, Principal Security Strategist, Synopsys Cybersecurity Research Center
About the Linux Foundation
Founded in 2000, the Linux Foundation and its projects are supported by more than 1,800 members. The Linux Foundation is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, Hyperledger, RISC-V, and more. The Linux Foundation’s methodology focuses on leveraging best practices and addressing the needs of contributors, users, and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.
The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see its trademark usage page: www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.
About The Linux Foundation
The Linux Foundation is the world’s leading home for collaboration on open source software, hardware, standards, and data. Linux Foundation projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, ONAP, PyTorch, RISC-V, SPDX, OpenChain, and more. The Linux Foundation focuses on leveraging best practices and addressing the needs of contributors, users, and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org. The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see its trademark usage page: www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.