The January 2021 Linux Foundation Newsletter

Preventing Supply Chain Attacks like SolarWinds

In late 2020, it was revealed that the SolarWinds Orion software, which is in use by numerous US Government agencies and many private organizations, was severely compromised. This was an incredibly dangerous set of supply chain compromises that the information technology community (including the Open Source community) needs to learn from and take action on.

The US Cybersecurity and Infrastructure Security Agency (CISA) released an alert noting that the SolarWinds Orion software had included malicious functionality since March 2020, but it was not detected until December 2020. CISA’s Emergency Directive 21-01 stated that it was being actively exploited, had a high potential for compromise, and could gravely impact entire organizations when compromised. Indeed, because Orion deployments typically control an entire organization’s networks, this is a grave problem. The more people look, the worse it gets; as I write this, it appears that second and third pieces of malware have been identified in Orion.

Why the SolarWinds Attack Is Particularly Noteworthy

What’s especially noteworthy is how the malicious code was inserted into Orion: the attackers subverted something called the build environment. When software is being developed, it is converted (compiled) from source code (the text that software developers write and update) into an executable package using a “build process.” For example, the source code of many open source software projects is built, compiled, and redistributed by other organizations so that it is ready to install and run on various computing platforms. In the case of SolarWinds’ Orion, CrowdStrike found a piece of malware called Sunspot that watched the build server for build commands and silently replaced source code files inside the Orion app with files that loaded the Sunburst malware. The SolarWinds Orion compromise by Sunspot isn’t the first example of this kind of attack, but it has demonstrated just how dangerous these attacks can be when they compromise widely-used software.
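To make the terms concrete, here is a minimal sketch of a build process (the file and artifact names are hypothetical, not SolarWinds’):

  # compile source text into the executable that users actually install
  cc -O2 -o example-agent src/agent.c
  # package the executable for distribution
  tar -czf example-agent-1.0.tar.gz example-agent

Sunspot targeted exactly this step: by silently swapping the files under src/ while the compiler ran, it produced a compromised artifact even though the source code repository itself remained clean.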

Unfortunately, a lot of conventional security advice cannot counter this kind of attack: 

SolarWinds’ Orion is not open source software. Only the company’s developers can legally review, modify, or redistribute its source code or its build system and configurations. If we needed further evidence that obscurity of software source code doesn’t automatically provide security, this is it.

Recommendations from The Linux Foundation 

Organizations need to harden their build environments against attackers. SolarWinds followed some poor practices, such as using the insecure ftp protocol and publicly revealing passwords, which may have made these attacks especially easy. The build system is a critical production system, and it should be treated like one, with the same or higher security requirements as its production environments. This is an important short-term step that organizations should already be doing. However, it’s not clear that these particular weaknesses were exploited or that such hardening would have made any difference. Assuming a system can “never be broken into” is a failing strategy.

In the longer term, I know of only one strong countermeasure for this kind of attack: verified reproducible builds. A “reproducible build” is a build that always produces the same outputs given the same inputs, so that the build results can be verified. A verified reproducible build is a process where independent organizations produce a build from the source code and verify that the built results come from the claimed source code. Almost all software today is not reproducible, but there’s work underway to change this. The Linux Foundation and the Civil Infrastructure Platform have been funding work, including the Reproducible Builds project, to make verified reproducible builds possible.
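As a hedged sketch of what independent verification could look like (the project URL, tag, and build script are hypothetical):

  # each verifier independently rebuilds from the claimed source
  git clone https://example.com/project.git
  cd project && git checkout v1.2.3       # the exact source being verified
  ./build.sh                              # the project's documented build steps
  sha256sum release/project-1.2.3.tar.gz  # hash of the resulting artifact

If the build is reproducible, every verifier gets bit-for-bit identical output and the checksums match. A build server that injected malware the way Sunspot did would produce a mismatched hash that any single independent verifier could flag.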

The software industry needs to begin shifting towards implementing and requiring verified reproducible builds. This will not be easy. Most software today is not designed to be reproducible in its build environment, so it may take years to make software reproducible. Many changes must be made to get there, so resources (time and money) are often needed. And there’s a lot of software that needs to be reproducible, including operating system packages and library-level packages. There are package distribution systems that would need to be reviewed and likely modified. I would expect some of the most critical software to become reproducible first, with coverage extending to less critical software over time as pressure increases to make more software verifiably reproducible. It would be wise to develop widely-applicable standards and best practices for creating reproducible builds. Once software is reproducible, others will need to verify the build results for given source code to counter these kinds of attacks. Reproducible builds are much easier for open source software (OSS) because there’s no legal impediment to having many verifiers. Closed source software developers will face added challenges; their business models often depend on hiding source code. It’s still possible to have “trusted rebuilders” worldwide verify closed source software, even though it’s more challenging and the number of rebuilders would necessarily be smaller.
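Much of this work is removing nondeterminism from build steps. For example, the Reproducible Builds project defines the SOURCE_DATE_EPOCH environment variable so that tools embed a fixed timestamp rather than the current time; combined with a few GNU tar options, archive creation becomes deterministic (a sketch, with hypothetical paths):

  # pin embedded timestamps to the last commit rather than to "now"
  export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
  # create an archive whose bytes do not depend on file order, ownership, or mtimes
  tar --sort=name --owner=0 --group=0 --numeric-owner \
      --mtime="@${SOURCE_DATE_EPOCH}" -cf project.tar src/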

The information technology industry is generally moving away from “black boxes” that cannot be inspected and verified and towards components that can be reviewed. So this is part of a general industry trend; it’s a trend that needs to be accelerated.

This is not unprecedented. Auditors have access to the financial data and review the financial systems of most enterprises; an audit is an independent entity’s verification of data and systems for the benefit of the whole ecosystem. There is a similar opportunity for organizations to become independent verifiers of both open source and closed source software and build systems.

Attackers will always take the easiest path, so we can’t ignore other attacks. Today most attacks exploit unintentional vulnerabilities in code, so we need to continue to work to prevent these unintentional vulnerabilities. These mitigations include changing tools & interfaces so those problems won’t happen, educating developers on developing secure software (such as the free courses from OpenSSF on edX), and detecting residual vulnerabilities before deployment through various detection tools. The Open Source Security Foundation (OpenSSF) is working on improving the security of open source software (OSS), including all these points.

Applications are mostly reused software (with a small amount of custom code), so this reused software’s supply chain is critical. Reused components are often extremely out-of-date and thus have many publicly-known unintentional vulnerabilities; in fact, reused components with known vulnerabilities are among the most common problems in web applications. The LF’s LFX security tools, GitHub’s Dependabot, GitLab’s dependency analyzers, and many other tools and services can help detect reused components with known vulnerabilities.
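Most language ecosystems now provide an audit tool that checks a project’s dependencies against public vulnerability databases. Two illustrative examples (which tool applies depends on your ecosystem):

  npm audit    # Node.js: checks the dependency tree against known advisories
  pip-audit    # Python: checks installed packages (install it with: pip install pip-audit)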

Vulnerabilities in widely-reused OSS can cause widespread problems, so the LF is already working to identify such OSS so that it can be reviewed and hardened further (see Vulnerabilities in the Core Preliminary Report and Census II of Open Source Software).

The supply chain matters for malicious code, too; most malicious code gets into applications through library “typosquatting” (that is, by creating a malicious library with a name that looks like a legitimate library). 
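One inexpensive habit that helps: verify the exact spelling of a new dependency before it is ever installed. A hedged sketch for Python (the package name is just an example):

  # fetch a copy of the package for inspection, without installing it
  pip download --no-deps requests -d /tmp/vet
  # confirm the downloaded file matches the exact name and version you intended
  ls /tmp/vet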

That means users need to start asking for a software bill of materials (SBOM) so they will know what they are using. The US National Telecommunications and Information Administration (NTIA) has been encouraging the adoption of SBOMs throughout organizations and the software supply chain process. The Linux Foundation’s Software Package Data Exchange (SPDX) format is an SBOM format used by many. Once you receive SBOM information, examine the versions that are included. If the software has malicious components, or components with known vulnerabilities, start asking why. Some vulnerabilities may not be exploitable, but too many application developers simply don’t update dependencies even when they are. To be fair, there’s a chicken-and-egg problem here: specifications are in the process of being updated, tools are in development, and many software producers aren’t ready to provide SBOMs. So users should not expect that most software producers will have SBOMs ready today. However, they do need to create a demand for SBOMs.
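For a sense of what SBOM data looks like, here is an illustrative fragment of an SPDX tag-value document describing a single dependency (the package details are examples only):

  SPDXVersion: SPDX-2.2
  DataLicense: CC0-1.0
  SPDXID: SPDXRef-DOCUMENT
  DocumentName: example-app-1.0
  PackageName: left-pad
  PackageVersion: 1.3.0
  PackageDownloadLocation: https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz
  PackageLicenseConcluded: MIT

Given entries like these for every component, a user can mechanically check each name and version against vulnerability databases.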

Similarly, software producers should work towards providing SBOM information. For many OSS projects, this can be done, at least in part, by providing package management information that identifies their direct and indirect dependencies (e.g., in package.json, requirements.txt, Gemfile, Gemfile.lock, and similar files). Many tools can combine this information to create more complete SBOM information for larger systems.
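As a simple illustration, one command captures the fully resolved dependency set of a Python project, which SBOM tooling can then fold into a complete SPDX document for a larger system (the output file name is just a convention):

  # record the exact versions of all installed direct and indirect dependencies
  pip freeze > requirements-lock.txt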

Organizations should invest in OpenChain conformance and require their suppliers to implement a process designed to improve trust in the supply chain. OpenChain’s conformance process reveals specifics about the components you depend on, a critical first step toward countering many supply chain attacks.

Conclusion

The attack on SolarWinds’ Orion will have devastating effects for years to come. But we can and should learn from it. 

We can:

  1. Harden software build environments
  2. Move towards verified reproducible builds 
  3. Change tools & interfaces so unintentional vulnerabilities are less likely
  4. Educate developers (such as the free courses from OpenSSF on edX)
  5. Use vulnerability detection tools when developing software
  6. Use tools to detect known-vulnerable components when developing software
  7. Improve widely-used OSS (the OpenSSF is working on this)
  8. Ask for a software bill of materials (SBOM), e.g., in SPDX format. Many software producers aren’t ready to provide one yet, but creating the demand will speed progress
  9. Determine if subcomponents we use have known vulnerabilities 
  10. Work towards providing SBOM information if we produce software for others
  11. Implement OpenChain 

Let’s make it much harder to exploit the future systems we all depend on. Those who do not learn from history are often doomed to repeat it.

David A. Wheeler, Director of Open Source Supply Chain Security at the Linux Foundation

Centaurus Infrastructure Project Joins the Linux Foundation

Centaurus is today becoming a Linux Foundation project. The Centaurus Infrastructure Project is a cloud infrastructure platform for building distributed clouds as well as a platform for modern cloud native computing. It supports applications and workloads for 5G, Edge and AI, and unifies the orchestration, network provisioning and management of cloud compute and network resources at a regional scale.

Founding members include Click2cloud, Distributed Systems, Futurewei, GridGain Systems, Reinvent Labs, SODA Foundation and TU Wien Informatics. Centaurus is an umbrella project for modern distributed computing and hosts both Arktos and Mizar. Arktos is a compute cluster management system designed for large-scale clouds, while Mizar is a high-performance cloud network powered by eXpress Data Path (XDP) and the Geneve protocol for large-scale clouds. More members and projects are expected to be accepted in the coming months.

“The market is changing and customers require a new kind of cloud infrastructure that will cater to modern applications and workloads for 5G, AI and Edge,” said Mike Dolan, senior vice president and general manager for Linux Foundation Projects. “Centaurus is a technical project with strategic vision, and we’re looking forward to a deep collaboration that advances cloud native computing for generations to come.” 

Current cloud infrastructure technology needs are evolving, requiring companies to manage a larger scale of compute and network resources across data centers and to provision those resources more quickly. Centaurus unifies management across bare metal, VMs, containers and serverless, while reducing operational costs and delivering on the low latency and data privacy requirements of edge networks. Centaurus offers a consistent API experience to provision and manage virtual machines, containers, serverless and other types of cloud resources by combining the Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) layers into one common infrastructure platform that can simplify cloud management.

“The Linux Foundation’s support in expanding the Centaurus community will accelerate cloud native infrastructure for the most pressing compute and networking demands,” said Dr. Xiong Ying, the current acting TSC chair, Centaurus Infrastructure Project. “Its large network of open source developers and projects already supporting this future will enable mass collaboration and important integrations for 5G, AI and Edge workloads.” 

To contribute to Centaurus, please visit: https://www.centauruscloud.io/

Supporting Member Quotes

Click2cloud
“Click2cloud has been part of the development of Centaurus, which is world-class software that will lead organizations to a clear transition from IaaS to cloud native infrastructure. Click2cloud has already started a development program to enable the journey from IaaS (OpenStack) to cloud native migration, and a 5G cloud based on the Centaurus reference architecture to support the partner ecosystem. We are very excited for Centaurus to be a part of the Linux Foundation,” said Prashant Mishra, CEO, Click2cloud. 

Futurewei
“Distributed cloud architecture is a natural evolution for cloud computing infrastructure. Centaurus is a cloud native infrastructure platform aiming to unify management and orchestration of virtual machines, containers, and other forms of cloud resources natively at scale and at the edge. We have seen many enterprise users and partners wanting a unified solution to build their distributed cloud to manage virtual machines, containers or bare metal-based applications running at cloud as well as at edge sites. We are very pleased to see, today, the Centaurus Infrastructure project becomes a Linux Foundation open-source project, providing an option for community and enterprise users to build their cloud infrastructure to run and manage next generation applications such as AI, 5G and IoT. We look forward to working with the open-source community to realize the vision of Centaurus,” said Dr. Xiong Ying, Sr. Technical VP, Head of Cloud Lab, Futurewei. 

GridGain Systems
“Creating and managing a unified and scalable distributed cloud infrastructure that extends from cloud to edge is increasingly a challenge for organizations worldwide. GridGain Systems has been a proud sponsor and active participant in the development of in-memory computing solutions to support the Centaurus project. We look forward to helping organizations realize the benefits of Centaurus and continuing to help extend its scalability and adoption,” said Nikita Ivanov, Co-Founder and CTO, GridGain Systems. 

Reinvent Labs
“We are a young company, which specializes in cloud computing and delivering cloud-native solutions to our customers across various industries. As such, we are ever stronger witnessing the need to manage cloud services and applications that span across complex and heterogeneous infrastructures, which combine containers, VMs and serverless functions. What is more, such infrastructures are also starting to grow beyond traditional cloud platforms towards the edge on the network. Being part of the Centaurus project will not only allow us to innovate in this space and deliver a platform for unified management of infrastructure resources across both large Cloud platforms and the Edge, but it will also enable us to connect and collaborate with like-minded members for thought leadership and industry best practices,” said Dr. Stefan Nastic, founder and CEO of Reinvent Labs GmbH. 

The SODA Foundation
“The SODA Open Data Framework is an open source data and storage management framework that goes from the edge to the core to the cloud. Centaurus offers the opportunity for SODA to be deployed in the next generation cloud infrastructure for 5G, AI and Edge, and allows both communities to innovate together,” said Steven Tan, SODA Foundation Chairman and VP & CTO Cloud Solution, Storage at Futurewei. 

TU Wien
“We are very excited to be part of the Centaurus ecosystem and honored to be part of this open source movement, contributing in the fields of IoT, Edge intelligence, and Edge and Cloud computing, including networking and communication aspects, as well as orchestration, resource allocation, and task scheduling,” said Prof. Schahram Dustdar, IEEE Fellow, Member of Academia Europaea, and Professor of Distributed Systems, TU Wien, Austria.

Reporting Security Vulnerabilities

We at The Linux Foundation (LF) work to develop secure software in our foundations and projects, and we also work to secure the infrastructure we use. But we’re all human, and mistakes can happen.

So if you discover a security vulnerability in something we do, please tell us!

If you find a security vulnerability in software developed by one of our foundations or projects, please report the vulnerability directly to that foundation or project. For example, Linux kernel security vulnerabilities should be reported to <security@kernel.org>, as described in the kernel’s security bugs documentation. If a foundation or project doesn’t state how to report vulnerabilities, please ask them to do so. In many cases, one way to report vulnerabilities is to send an email to <security@DOMAIN>.

If you find a security vulnerability in the Linux Foundation’s infrastructure as a whole, please report it to <security@linuxfoundation.org>, as noted on our contact page.

For example, security researcher Hanno Böck recently alerted us that some of the retired linuxfoundation.org service subdomains were left delegated to some cloud services, making them potentially vulnerable to a subdomain takeover. Once we were alerted to that, the LF IT Ops Team quickly worked to eliminate the problem and will also be working on a way to monitor and alert about such problems in the future. We thank Hanno for alerting us!

We’re also working to make open source software (OSS) more secure in general. The Open Source Security Foundation (OpenSSF) is a broad initiative to secure the OSS that we all depend on. Please check out the OpenSSF if you’re interested in learning more.

David A. Wheeler

Director, Open Source Supply Chain Security, The Linux Foundation

Introducing the Open Governance Network Model

Background

The Linux Foundation has long served as the home for many of the world’s most important open source software projects. We act as the vendor-neutral steward of the collaborative processes that developers engage in to create high quality and trustworthy code. We also work to build the developer and commercial communities around that code, along with the membership that sponsors each project. We’ve learned that finding ways for all sorts of companies to benefit from using and contributing back to open source software development is key to a project’s sustainability. 

Over the last few years, we have also added a series of projects focused on lightweight open standards efforts — recognizing the critical complementary role that standards play in building the open technology landscape. Linux would not have been relevant if not for POSIX, nor would the Apache HTTPD server have mattered were it not for the HTTP specification. And just as with our open source software projects, commercial participants’ involvement has been critical to driving adoption and sustainability.

On the horizon, we envision another category of collaboration, one which does not have a well-established term to define it, but which we are today calling “Open Governance Networks.” Before describing it, let’s talk about an example.

Consider ICANN, the agency that arose from demands to evolve the global domain name system (DNS) away from single-vendor control by Network Solutions. With ICANN, DNS became more vendor-neutral, international, and accountable to the Internet community. It evolved to develop and manage the “root” of the domain name system, independent of any company or nation. ICANN’s control over the DNS comes primarily through its operating agreement with domain name registrars, which establishes rules for registrations, guarantees that your domain names are portable, and provides a uniform dispute resolution policy (the UDRP) for times when a domain name conflicts with an established trademark or causes other issues. 

ICANN is not a standards body; it happily uses the DNS standards developed at the IETF. Nor does it create software beyond what is incidental to its mission (it may also fund some DNS software development, but that is not its core). ICANN is not where all DNS requests go to get resolved to IP addresses, nor even where everyone goes to register a domain name; that is all pushed out to registrars and distributed name servers. In this way, ICANN is not fully decentralized but practices something you might call “minimum viable centralization.” Its management of the DNS has not been without critics, but by pushing as much of the hard work to the edge and focusing on being a neutral core, it has helped the DNS and the Internet achieve a degree of consistency, operational success, and trust that would have been hard to imagine building any other way. 

There are similar organizations that interface with open standards and software but perform governance functions. A prime example is the CA/Browser Forum, which manages the root certificates for the SSL/TLS web security infrastructure.

Do we need such organizations? Can’t we go completely decentralized? While some cryptocurrency networks claim not to need formal human governance, it’s clear that there are governance roles performed by individuals and organizations within those communities. Quite a bit of governance can be automated via smart contracts, but repairing damage from exploits of those contracts, promoting the platform’s adoption to new users, onboarding new organizations, and even coordinating hard-fork upgrades still require humans in the mix. This is especially important in environments where competitors must participate in the network to succeed, but do not trust any one competitor to make the decisions.

Network governance is not a solved problem

Network governance is not just an issue for the technical layers. As one moves up the stack into more domain-specific applications, it turns out that there are network governance challenges there as well, and they look very familiar.

Consider a typical distributed application pattern: supply chain traceability, where participants in the network can view, on a distributed database or ledger, the history of an object’s movement from source to destination, and update the network when they receive or send an object. You might be a raw materials supplier, a manufacturer, a distributor, or a retailer. In any case, you have a vested interest not only in being able to trust this distributed ledger as an accurate and faithful representation of the truth; you also want the version you see to be the same ledger everyone else sees, to be able to write to it fairly, and to understand what happens if things go wrong. Achieving all of these desired characteristics requires network governance!

You may be thinking that none of this is strictly needed if only everyone agreed to use one organization’s centralized database to serve as the system of record. Perhaps that is a company like eBay, Amazon, Airbnb, or Uber; or perhaps a non-profit charity or government agency could run this database for us. There are some great examples of shared databases managed by non-profits, such as Wikipedia, run by the Wikimedia Foundation. This scenario might work for a distributed crowdsourced encyclopedia, but would it work for a supply chain? 

This participation model requires everyone engaging in the application ecosystem to trust that singular institution to perform a very critical role, and not be hacked, corrupted, or otherwise use that position of power to unfair ends. There is also a need to trust that the entity will not become insolvent or otherwise unable to meet the community’s needs. How many Wikipedia entries have been hijacked or subjected to “edit wars” that go on forever? Could a company trust such an approach for its supply chain? Probably not.

Over the last ten years, we’ve seen the development of new tools that allow us to build better distributed data networks without that critical need for a centralized database or institution holding all the keys and trust. Most of these new tools use distributed ledger technology (“DLT”, or “blockchain”) to build a single source of truth across a network of cooperating peers and to embed programmatic functionality as “smart contracts” or “chaincode” across the network. 

The Linux Foundation has been very active in DLT, starting with the launch of Hyperledger in December of 2015. The Trust over IP Foundation, launched earlier this year, focuses on the application of self-sovereign identity, which in many cases uses a DLT as the underlying utility network. 

As these efforts have focused on software, they left the development, deployment, and management of these DLT networks to others. Hundreds of such networks built on top of Hyperledger’s family of different protocol frameworks have launched, some of which (like the Food Trust Network) have grown to hundreds of participating organizations. Many of these networks were never intended to extend beyond an initial set of stakeholders, and they are seeing very successful outcomes. 

However, many of these networks need a critical mass of industry participants and have faced difficulty achieving their goal. A frequently cited reason is the lack of clear or vendor-neutral governance of the network. No business wants to place its data, or the data it depends upon, in the hands of a competitor; and many are wary even of non-competitors if it locks down competition or creates a dependency on a market participant. For example, what if the company doesn’t do well and decides to exit this business segment? And at the same time, for most applications, you need a large percentage of any given market to make it worthwhile, so addressing these kinds of business, risk, or political objections to the network structure is just as important as ensuring the software works as advertised.

In many ways, this resembles the evolution of successful open source projects, where developers working at a particular company realize that just posting their source code to a public repository isn’t sufficient. Nor even is putting their development processes online and saying “patches welcome.” 

To take an open source project to the point where it becomes the reference solution for the problem being solved and can be trusted for mission-critical purposes, you need to show how its governance and sustainability are not dependent upon a single vendor, corporate largess, or charity. That usually means a project looks for a neutral home at a place like the Linux Foundation, to provide not just that neutrality, but also competent stewarding of the community and commercial ecosystem.

Announcing LF Open Governance Networks

To address this need, today, we are announcing that the Linux Foundation is adding “Open Governance Networks” to the types of projects we host. We have several such projects in development that will be announced before the end of the year. These projects will operate very similarly to the Linux Foundation’s open source software projects, but with some additional key functions. Their core activities will include:

  • Hosting a technical steering committee to specify the software and standards used to build the network, to monitor the network’s health, and to coordinate upgrades, configurations, and critical bug fixes
  • Hosting a policy and legal committee to specify a network operating agreement the organizations must agree to for connecting their nodes to the network
  • Running a system for identity on the network, so that participants can trust that other participants are who they say they are, monitoring the network for health, and taking corrective action if required
  • Building out a set of vendors who can be hired to deploy peers-as-a-service on behalf of members, in addition to allowing members’ technical staff to run their own if preferred.
  • Convening a Governing Board composed of sponsoring members who oversee the budget and priorities
  • Advocating for the network’s adoption by the relevant industry, including engaging relevant regulators and secondary users who don’t run their own peers
  • Potentially managing an open “app store” approach to offering vetted, reusable, deployable smart contracts or add-on apps for network users

These projects will be sustained through membership dues set by the Governing Board on each project, which will be kept to what’s needed for self-sufficiency. Some may also choose to establish transaction fees to compensate operators of peers if usage patterns suggest that would be beneficial. Projects will have complete autonomy regarding technical and software choices – there are no requirements to use other Linux Foundation technologies. 

To ensure that these efforts live up to the word “open” and the Linux Foundation’s pedigree, the vast majority of technical activity on these projects, including development of all the code and configurations required to run the network’s core software, will be done publicly. The source code and documentation will be published under suitable open source licenses, allowing for public engagement in the development process and leading to better long-term trust among participants, better code quality, and more successful outcomes. Hopefully, this will also result in less “bike-shedding” and thrash, better visibility into progress and activity, and an exit strategy should the cooperation efforts hit a snag. 

Depending on the industry it serves, the ledger itself might or might not be public. It may contain information authorized for sharing only between the parties involved on the network, or it may need to account for GDPR or other regulatory compliance. However, we will certainly encourage long-term approaches that do not treat the ledger data as sensitive. Also, an organization must typically be a member of the network to run peers on the network, to see the ledger, and especially to write to it or participate in consensus.

Across these Open Governance Network projects, there will be a shared operational, project management, marketing, and other logistical support provided by Linux Foundation personnel who will be well-versed in the platform issues and the unique legal and operational issues that arise, no matter which specific technology is chosen.

These networks will create substantial commercial opportunity:

  • For software companies building DLT-based applications, this will help you focus on the truly value-delivering apps on top of such a shared network, rather than the mechanics of forming these networks.
  • For systems integrators, DLT integration with back-office databases and ERP is expected to grow to be billions of dollars in annual activity.
  • For end-user organizations, the benefits of automating thankless, non-differentiating, perhaps even regulatorily-required functions could result in huge cost savings and resource optimization.

For those organizations acting as governing bodies on such networks today, we can help you evolve those projects to reach an even wider audience while taking off your hands the low margin, often politically challenging, grunt work of managing such networks.

And for those developers concerned before about whether such “private” permissioned networks would lead to dead cul-de-sacs of software and wasted effort or lost opportunity, having the Linux Foundation’s bedrock of open source principles and collaboration techniques behind the development of these networks should help ensure success.

We also recognize that not all networks should be under this model. We expect a diversity of approaches that will be sustainable over the long term, and we encourage these networks to find a model that works for them. Let’s talk to see if it would be appropriate.

LF Governance Networks will enable our communities to establish their own Open Governance Network and have an entity to process agreements and collect transaction fees. This new entity is a Delaware nonprofit, a nonstock corporation that will maximize utility and not profit. Through agreements with the Linux Foundation, LF Governance Networks will be available to Open Governance Networks hosted at the Linux Foundation. 

If you’re interested in learning more about hosting an Open Governance Network at the Linux Foundation, please contact us at governancenetworks@linuxfoundation.org

Thanks!

Brian

The one-millionth commit: The search for the lucky Linux kernel contributor

This week has been “a week of millions” for the Linux Foundation, with our announcement that over 1 million people have taken our free Introduction to Linux course. As part of the research for our recently published 2020 Linux Kernel History Report, the Kernel Project itself determined that it had surpassed one million code commits. Here is how we established the identity of this lucky Kernel Project contributor. 

Methodology:

The historical BitKeeper repo (converted to Git) has 63,428 commits. We then looked for the first merge at which Linus Torvalds’ repo reached at least 936,572 commits, since 63,428 + 936,572 = 1,000,000.

At commit 92c59e126b21fd212195358a0d296e787e444087 the repo had 936,456 commits (116 shy of the needed 936,572):

> git checkout 92c59e126b21fd212195358a0d296e787e444087

> git log --oneline | wc

 936456 7483489 62991540


The next merge, 2f3fbfdaf77f3ac417d0511fac221f76af79f6fc, passed that number with 937,105 commits:

> git checkout 2f3fbfdaf77f3ac417d0511fac221f76af79f6fc

> git log --oneline | wc

 937105 7489456 63037625

So at merge 2f3fbfdaf77f3ac417d0511fac221f76af79f6fc, the kernel’s combined history passed the 1M mark (to be precise, 1,000,533 commits including the 63,428 from the BitKeeper era):

commit 2f3fbfdaf77f3ac417d0511fac221f76af79f6fc (HEAD)

Merge: 92c59e126b21 f510ca05271b

Author: Linus Torvalds <torvalds@linux-foundation.org>

Date:   Mon Aug 3 19:19:34 2020 -0700


    Merge tag 'arm-dt-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

At this point, we can simply list the 936,572nd commit in the log:

> git log --oneline | tail -936572 | head -1

85b23fbc7d88 x86/cpufeatures: Add enumeration for SERIALIZE instruction

And the author of that commit is…

> git log -1 85b23fbc7d88

commit 85b23fbc7d88f8c6e3951721802d7845bc39663d

Author: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

Date:   Sun Jul 26 21:31:29 2020 -0700

    x86/cpufeatures: Add enumeration for SERIALIZE instruction

Ricardo’s momentous commit to the Kernel was to add enumeration support for the SERIALIZE instruction, supported in Intel’s forthcoming Sapphire Rapids and Alder Lake microarchitectures for their 10-nanometer server and workstation chips. Ricardo is a software engineer who has been working on Linux feature support for Intel’s microprocessors for 12 years as part of the company’s CPU enabling team.

For more about Intel Corporation’s Ricardo Neri, the one-millionth Linux Kernel code committer, please read and watch our interview, conducted by Swapnil Bhartiya on Linux.com.

US Export Regulations and Open Collaboration

The Linux Foundation would like to reiterate its statements and analysis of the application of US Export Control regulations to public, open collaboration projects (for example, open source software, open standards, open hardware, and open data) and the importance of open collaboration to the successful, global development of the world’s most important technologies.

Today’s announcement of prohibited transactions by the Department of Commerce regarding WeChat and TikTok in the United States confirms our initial impact analysis for open source collaboration. Nothing in the orders prevents or impacts our communities’ ability to openly collaborate with two valued members of our open source ecosystem, Tencent and ByteDance. From around the world, our members and participants engage in open collaboration because it is open and transparent, and those participants are clear that they desire to continue collaborating with their peers around the world.

As a reminder, we would like to point anyone with questions to our prior blog post on US export regulations, which links to our more detailed analysis of the topic. Both are available in English and Simplified Chinese for the convenience of our audiences.