What Do the Most Successful Open Source Projects Have In Common?

Thriving open source projects have many users, and the most active have thousands of authors contributing. There are now more than 60 million open source repositories, but the vast majority are just a public workspace for a single individual. What differentiates the most successful open source projects? One commonality is that most of them are backed by either one company or a group of companies working together.

I run the Cloud Native Computing Foundation, which is part of the Linux Foundation and is best known for hosting Kubernetes. Developer Łukasz Gryglicki and I worked together to visualize the 30 highest-velocity open source projects. Open source projects exhibit natural increasing returns to scale. That’s because most developers are interested in using and participating in the largest projects, and the projects with the most developers are more likely to quickly fix bugs, add features and work reliably across the largest number of platforms. So, tracking the projects with the highest developer velocity can help illuminate promising areas in which to get involved, and which platforms are likely to succeed over the next several years.

Rather than debate whether to measure high-velocity projects via commits, authors, or comments and pull requests, we use a bubble chart to show all 3 axes of data, and plot on a log-log chart to show the data across large scales. In the graph, the bubbles’ area is proportional to the number of authors, the y-axis (height) is the total number of pull requests & issues, and the x-axis is the number of commits.
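As a rough illustration of the chart layout described above, here is a minimal Python sketch. It is not taken from the cncf/velocity scripts; the function names and the sample numbers are hypothetical, chosen only to show how each project maps to a log-log position and a bubble whose area is proportional to its author count.

```python
import math

# Hypothetical sample data, NOT figures from the article:
# (name, commits, pull requests + issues, authors)
PROJECTS = [
    ("project-a", 50_000, 120_000, 3_000),
    ("project-b", 5_000, 8_000, 400),
    ("project-c", 500, 1_200, 25),
]


def bubble_radius(authors, scale=1.0):
    """Return a radius such that the bubble's area (pi * r^2) is
    proportional to the number of authors."""
    return scale * math.sqrt(authors / math.pi)


def log_log_position(commits, prs_and_issues):
    """Map a project onto log10 axes: x from commits,
    y from pull requests plus issues."""
    return math.log10(commits), math.log10(prs_and_issues)


def layout(projects):
    """Compute the plotted position and bubble radius for each project."""
    rows = []
    for name, commits, prs_and_issues, authors in projects:
        x, y = log_log_position(commits, prs_and_issues)
        rows.append(
            {"name": name, "x": x, "y": y, "radius": bubble_radius(authors)}
        )
    return rows


if __name__ == "__main__":
    for row in layout(PROJECTS):
        print(
            f"{row['name']}: x={row['x']:.2f} "
            f"y={row['y']:.2f} r={row['radius']:.2f}"
        )
```

Scaling the radius by the square root of the author count keeps the area, rather than the diameter, proportional to the number of authors, so a project with four times the authors appears four times as large and not sixteen times.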

Who’s Backing the 30 Highest-Velocity Open Source Projects?

The top 30 projects break down as follows:

Foundations stand behind the Linux kernel (Linux Foundation), Kubernetes (CNCF), Cloud Foundry (Cloud Foundry Foundation), .NET (.NET Foundation), Nova, Neutron and Cinder (OpenStack), Node.js (Node.js Foundation) and Mesos (Apache Software Foundation). That’s nine.

Chromium, TensorFlow and AngularJS are backed by Google, React by Facebook, Docker/Moby by Docker, VS Code and Office Developer by Microsoft, Ansible by Red Hat, Elasticsearch by Elastic, Auth0 by Auth0, GitLab by GitLab, Ruby on Rails by Basecamp, Ionic by Ionic, Terraform by HashiCorp and Chef by Chef Software. In all cases, there are major contributions by developers from outside companies and independent developers, but many or most of the maintainers are employed by one company. That’s 15.

That leaves six projects that are not mainly backed by one company or software foundation: Homebrew, DefinitelyTyped, Vue.js, NixOS, Home Assistant and The Odin Project. Interestingly, Homebrew, DefinitelyTyped, NixOS and Home Assistant all represent a special kind of project where once the core infrastructure is in place, most of the value comes from “recipes” that can be individually contributed and updated by hundreds of independent contributors. The Odin Project is collaborative documentation. Vue.js seems to be something of a special case, in that as a front-end framework it competes directly with React (backed by Facebook) and AngularJS (backed by Google) but has achieved wide adoption and high levels of contribution without any corporate or consortium backing.

Structure and Funding Are Key to Growth

So, what’s the takeaway? Software development is hard. Running a large open source project is even harder. So, it is often helpful to have backing from an individual company or a consortium of them working through a software foundation.

Managing a successful project includes a lot of what Linux Foundation Executive Director Jim Zemlin calls “janitor” functions, such as triaging bugs, answering questions from new users and new developers, dealing with trademark and license issues, and generally being available to smooth over the inevitable frictions that occur in any kind of large collaboration. The structure and funding from a foundation or corporate sponsor provide more confidence that the project will remain active and stable over the long term. Ideally, this creates a positive feedback loop where high-velocity projects become the core of successful products or services. That adoption helps produce profits for companies, and the companies are able to reinvest those profits by hiring people to work on further incremental development of the project. The Linux Foundation’s goal is to help create these feedback loops.

Most projects need or want some help with these activities. Nevertheless, a small number of projects have reached the highest levels of collaboration without any corporate or organizational backing. For open source “recipe” projects like Homebrew and DefinitelyTyped, being able to use GitHub’s infrastructure for free has certainly been a major advantage.

All of the scripts used to generate this data are at https://github.com/cncf/velocity (under an Apache 2.0 license). If you see any errors, please open an issue there. What’s your biggest takeaway? Please join the discussion on Hacker News and let us know.

Dan Kohn

Dan is the Executive Director of the Cloud Native Computing Foundation.