Open Source Compliance

This fully updated ebook provides detailed information on issues related to the licensing, development, and reuse of open source software.

The Linux Foundation has released the second edition of Open Source Compliance in the Enterprise by Ibrahim Haddad, which offers organizations a practical guide to using open source code and participating in open source communities while complying with both the spirit and the letter of open source licensing.

This fully updated ebook, with new contributions from Shane Coughlan and Kate Stewart, provides detailed information on issues related to the licensing, development, and reuse of open source software. The new edition also includes all-new chapters on OpenChain, which focuses on increasing open source compliance in the supply chain, and SPDX, which is a set of standard formats for communicating the components, licenses, and copyrights of software packages.

“Open source compliance is the process by which users, integrators, and developers of open source observe copyright notices and satisfy license obligations for their open source software components,” Haddad states in the book.

This 200+ page book encompasses the entire process of open source compliance, including an introduction on how to establish an open source management program, a description of relevant roles and responsibilities, an overview of common compliance tools and processes, and all-new material to help navigate mergers and acquisitions. It offers proven best practices as well as practical checklists to help those responsible for compliance activities create their own processes and policies.

Essential topics covered in this updated ebook include:

  • An introduction to open source compliance
  • Compliance roles and responsibilities
  • Building a compliance program
  • Best practices in compliance management
  • Source code scanning tools

To learn more about the benefits of open source compliance and how to achieve it, download the free ebook today!

SPDX License Identifiers can be used to indicate relevant license information at any level, from package to the source code file level.

Accurately identifying the license for open source software is important for license compliance. However, determining the license can sometimes be difficult due to missing or ambiguous information. Even when some licensing information is present, the lack of a consistent way of expressing the license can make automating license detection very difficult, requiring significant manual effort. Some commercial tools apply machine learning to this problem to reduce false positives and train the license scanners, but a better solution is to fix the problem at the upstream source.

In 2013, the U-Boot project decided to use SPDX license identifiers in each source file instead of the GPL v2.0 or later header boilerplate that had been used up to that point. The initial commit message had an eloquent explanation of the reasons behind this transition.

Licenses: introduce SPDX Unique Lincense Identifiers

Like many other projects, U-Boot has a tradition of including big
blocks of License headers in all files.  This not only blows up the
source code with mostly redundant information, but also makes it very
difficult to generate License Clearing Reports.  An additional problem
is that even the same lincenses are referred to by a number of
slightly varying text blocks (full, abbreviated, different
indentation, line wrapping and/or white space, with obsolete address
information, ...) which makes automatic processing a nightmare.

To make this easier, such license headers in the source files will be
replaced with a single line reference to Unique Lincense Identifiers
as defined by the Linux Foundation's SPDX project [1].  For example,
in a source file the full "GPL v2.0 or later" header text will be
replaced by a single line:

        SPDX-License-Identifier:        GPL-2.0+

We use the SPDX Unique Lincense Identifiers here; these are available
at [2].

. . .

[1] http://spdx.org/
[2] http://spdx.org/licenses/

The SPDX project liked the simplicity of this approach and formally adopted U-Boot’s syntax for embedding SPDX-License-Identifiers into the project. Initially, the syntax was available on the project wiki; it was formalized in SPDX specification version 2.1 as “Appendix V: Using SPDX short identifiers in Source Files”. Since then, other upstream open source projects and repositories have adopted these short identifiers to identify the licenses in use, including GitHub in its Licenses API. In 2017, the Free Software Foundation Europe created a project called REUSE.software that provided guidance for open source projects on how to apply SPDX-License-Identifiers to their code. The REUSE.software guidelines were followed when SPDX-License-Identifiers were added to the Linux kernel later that year.

The SPDX-License-Identifier syntax, used with short-form identifiers from the SPDX License List (referred to here as SPDX LIDs), can indicate relevant license information at any level, from package to the source code file level. The “SPDX-License-Identifier” phrase followed by a license expression formed of SPDX LIDs, placed in a comment, is a precise, concise, and language-neutral way to document the licensing that is simple to process by machine. This leads to source code that is easier to read, which appeals to developers, and it enables the licensing information to travel with the source code.

To use SPDX LIDs in your project’s source code, just add a single line in the following format, tailored to your license(s) and the comment style for that file’s language. For example:

// SPDX-License-Identifier: MIT

/* SPDX-License-Identifier: MIT OR Apache-2.0 */

# SPDX-License-Identifier: GPL-2.0-or-later
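If you want to generate these lines rather than type them by hand, the idea can be sketched in a few lines of Python. This is only an illustration: the `COMMENT_STYLES` table and the `spdx_line` helper are hypothetical names made up for this example, not part of any SPDX or REUSE tooling, and the table covers just a handful of file types.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to (opening, closing) comment text.
# Extend it for the languages used in your own project.
COMMENT_STYLES = {
    ".c": ("/* ", " */"),
    ".h": ("/* ", " */"),
    ".go": ("// ", ""),
    ".rs": ("// ", ""),
    ".py": ("# ", ""),
    ".sh": ("# ", ""),
}


def spdx_line(filename: str, expression: str) -> str:
    """Build a one-line SPDX comment suited to the file's language."""
    suffix = Path(filename).suffix
    if suffix not in COMMENT_STYLES:
        raise ValueError(f"no comment style known for {suffix!r}")
    start, end = COMMENT_STYLES[suffix]
    return f"{start}SPDX-License-Identifier: {expression}{end}"


print(spdx_line("main.go", "MIT"))
# // SPDX-License-Identifier: MIT
print(spdx_line("util.c", "MIT OR Apache-2.0"))
# /* SPDX-License-Identifier: MIT OR Apache-2.0 */
print(spdx_line("build.py", "GPL-2.0-or-later"))
# # SPDX-License-Identifier: GPL-2.0-or-later
```

The helper does not validate the license expression itself; it assumes you pass identifiers taken from the SPDX License List.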

To learn more about how to use SPDX LIDs with your source code, please see the guidance in the SPDX project documentation, REUSE.software, or David Wheeler’s SPDX tutorial.

In addition to U-Boot and Linux transitioning to SPDX LIDs, newer projects like Zephyr and Hyperledger Fabric have adopted them right from the start as a best practice. Indeed, to achieve the Core Infrastructure Initiative’s gold badge, each file in the source code must have a license, and the recommended way is to use an SPDX LID.

The project MUST include a license statement in each source file. This MAY be done by 
including the following inside a comment near the beginning of each file: 
SPDX-License-Identifier: [SPDX license expression for project].
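A check along these lines is easy to automate. The sketch below is one possible approach, not the badge program’s own tooling: the function name `find_spdx_expression` and the choice of the first 10 lines as “near the beginning” are assumptions made for illustration.

```python
import re
from typing import Optional

# Matches the tag and captures the expression after it, dropping a
# trailing "*/" when the tag sits inside a C-style comment.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")


def find_spdx_expression(text: str, max_lines: int = 10) -> Optional[str]:
    """Return the license expression found in the first max_lines lines.

    Returns None when no correctly spelled tag appears there.
    The 10-line default is an assumption, not a rule from any standard.
    """
    for line in text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            return match.group(1)
    return None


source = "/* SPDX-License-Identifier: MIT OR Apache-2.0 */\nint main(void) { return 0; }\n"
print(find_spdx_expression(source))
```

Because the match is on the exact tag text, a misspelled tag is reported as missing, which is usually what you want from a compliance check.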

When SPDX LIDs are used, gathering license information across your project files can become as easy as running grep. If a source file gets reused in a different package, the license information travels with the source, reducing the risk of license identification errors and making license compliance in the recipient project easier. By using SPDX LIDs in license expressions, the meaning of license combinations is understood more accurately. Saying “this file is MPL/MIT” is ambiguous, and leaves recipients unclear about their compliance requirements. Saying “MPL-2.0 AND MIT” or “MPL-2.0 OR MIT” specifies precisely whether the licensee must comply with both licenses, or either license, when redistributing the file.

As illustrated by the transition underway in the Linux kernel, SPDX LIDs can be adopted gradually. You can start by adding SPDX LIDs to new files without changing anything already present in your codebase. A list of projects known to be using SPDX License Identifiers can be found at https://spdx.org/ids-where, and if you know of one that’s missing, please send an email to outreach@lists.spdx.org.
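To make the grep comparison concrete, here is a rough Python sketch that walks a directory tree, tallies the license expressions it finds, and spells out what a simple AND or OR expression asks of the recipient. The names `tally_licenses` and `describe`, and the `NO-SPDX-TAG` label, are invented for this example. The `describe` helper only handles flat expressions with a single kind of operator and no parentheses or WITH exceptions; real projects should use a proper SPDX expression parser.

```python
import re
import tempfile
from collections import Counter
from pathlib import Path

# Multi-line variant of a tag matcher: finds the first tag anywhere in a file.
SPDX_TAG = re.compile(
    r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$", re.MULTILINE
)


def tally_licenses(root: Path) -> Counter:
    """Count files per license expression under root."""
    counts: Counter = Counter()
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        match = SPDX_TAG.search(text)
        counts[match.group(1) if match else "NO-SPDX-TAG"] += 1
    return counts


def describe(expression: str) -> str:
    """Explain a flat expression (no parentheses, one operator kind)."""
    tokens = expression.split()
    if "AND" in tokens and "OR" in tokens:
        return "mixed operators: use a full SPDX expression parser"
    if "AND" in tokens:
        return "comply with all of: " + ", ".join(t for t in tokens if t != "AND")
    if "OR" in tokens:
        return "choose one of: " + ", ".join(t for t in tokens if t != "OR")
    return "comply with: " + expression


if __name__ == "__main__":
    # Demo on a throwaway tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.c").write_text("/* SPDX-License-Identifier: MPL-2.0 AND MIT */\n")
        (root / "b.py").write_text("# SPDX-License-Identifier: MPL-2.0 OR MIT\n")
        (root / "c.txt").write_text("no license information here\n")
        for expression, count in tally_licenses(root).items():
            print(count, expression)
    print(describe("MPL-2.0 AND MIT"))
    print(describe("MPL-2.0 OR MIT"))
```

Files without a tag are counted under their own label rather than skipped, so the tally doubles as a list of what still needs attention.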

Learn more in this presentation at Open Source Summit: Automating the Creation of Open Source BOMs

FOSSology

To help celebrate FOSSology’s 10th anniversary, we look at how the project makes it easier to understand and comply with open source licenses.

FOSSology turns ten this year. Far from winding down, the open source license compliance project is still going strong. The interest in the project among its thriving community has not dampened in the least, and regular contributions and cross-project contributors are steering it toward productive and meaningful iterations.

An example is the recent 3.2 release, which offers significant improvements over previous versions, such as the import of SPDX files and word processor document output summarizing analysis information. Even so, the overall project goal remains the same: to make it easier to understand and comply with the licenses used in open source software.

There are thousands of licenses used in open source software these days, some differing by only a few words and others pertaining to entirely different uses. Together, they present a bewildering quagmire of requirements that must be adhered to as set out in the applicable license(s). Misunderstanding or overlooking those requirements can revert rights to a reserved status and bring distribution to a complete halt.

How FOSSology came to be

In short, there are over 1,000 different ways licensing can go mildly or horribly wrong, creating a desperate need for a single way to make sure everything goes consistently right. Enter FOSSology, which, as the website points out, is all about scanning: it’s a framework, toolbox, and web server application for examining software packages in a multi-user environment.

There have been several important milestones since the first version of the FOSSology project was published in December 2007. The Linux Foundation started hosting it in 2015. The 3.2 release came out in March 2018 and, as mentioned above, provides the ability to import SPDX files. SPDX (Software Package Data Exchange) is another Linux Foundation project that helps reduce complexity by defining standards for reporting and sharing licensing information. FOSSology is the first open source project to consume SPDX in this way.

“This project has been more successful than anticipated, because license compliance was a very special topic, and running it as an open source project is also difficult, because it has a naturally small community,” said Michael C. Jaeger, Senior Research Scientist for Open Source Software at Siemens AG, maintainer of FOSSology and SW360, and trainer at SW Compliance Academy.

When goal and delivery are tightly entwined, as are the benefactors and beneficiaries, good things come from any project.

“License compliance for open source projects is hard, and FOSSology helps here by doing most of the work, such as scanning the files to find licenses, copyright statements and more, to simplify the necessary clearing. It also generates reports which can be used to document the results, which is rather important in the context of larger companies,” said Maximilian Huber, a software consultant at TNG Technology Consulting.

A paper titled “The FOSSology Project: 10 Years Of License Scanning” has been prepared to commemorate the 10th anniversary. Project members will be participating in the FSFE’s Legal and Licensing Workshop in Barcelona this week to present on the project.

The project’s value and who benefits

“It is important because it offers organizations a free software solution for license compliance – an area where commercial products have a very dominant position for more than a decade. However, with free software, especially open source projects can implement license compliance without upfront cost,” explained Jaeger.

FOSSology fits in well with other open source compliance-related projects like SPDX, OpenChain, and SW360. Indeed, there is even community and developer crossover between some of these projects and FOSSology.

“There is one person who is a maintainer in both the SW360 and FOSSology projects, and there are some persons contributing to both projects in different roles,” said Jaeger. “Consequently and naturally, there is good coordination between both projects. The FOSSology project also has a long history for supporting SPDX since it represents the de facto standard for exchanging license compliance information.”

“With its review functionality, FOSSology was one of the first supporters of the concept of concluding a license in SPDX,” he added. “It was also the first project which allowed for importing SPDX descriptions, another elementary support because the “X” in SPDX stands for exchange and not “eXport.” As far as I know, OpenChain is not concerned very much with particular tooling; however, FOSSology helps to implement OpenChain conformance.”

And the momentum continues. More changes are on the horizon and some new obstacles as well.

“In the future, more and more open source projects will be straightforwardly licensed and the strong scan correction functionality and file review functionality of FOSSology will move to the background,” said Jaeger.

“However, questions still arise because of incompatibilities of licensings, or in considering obligations of licensing. Therefore, FOSSology needs to shift its focus from correcting scan results of not-well-formed licensing to licensing analysis and license problems on the component level.”

Modernization efforts are also under consideration.

“An important goal is to modularize the parts of FOSSology, to allow a smooth transition to a more modern architecture and software stack,” added Huber.

As even more licensing and related tools cross the horizon, simplifying the information exchange between them and FOSSology will be an ongoing task. That in turn will further cement FOSSology’s place in the license compliance ecosystem.

But today, all attention is on a decade of successes and the community that’s responsible for so many wins.

Happy anniversary, FOSSology!