Delta Lake's expanding ecosystem of connectors

This post originally appeared on the Delta Lake blog

We are happy to announce the release of Delta Lake 2.0 (pypi, maven, release notes) on Apache Spark™ 3.2. Its features include, but are not limited to, those reviewed in this post.

The significance of Delta Lake 2.0 is not just a number, though it is timed quite nicely with Delta Lake’s 3rd birthday. It reiterates our collective commitment to the open-sourcing of Delta Lake, as announced in Michael Armbrust’s Day 1 keynote at Data + AI Summit 2022.

What’s new in Delta Lake 2.0?

There have been a lot of new features released in the last year between Delta Lake 1.0, 1.2, and now 2.0. This blog will review a few of these specific features that are going to have a large impact on your workload.

Delta 1.2 vs Delta 2.0 chart

Improving data skipping

When exploring or slicing data using dashboards, data practitioners will often run queries with a specific filter in place. As a result, the matching data is often buried in a large table, requiring Delta Lake to read a significant amount of data. With data skipping via column statistics and Z-Order, the data can be clustered by the most common filters used in queries — sorting the table to skip irrelevant data, which can dramatically increase query performance.

Support for data skipping via column statistics

When querying any table from HDFS or cloud object storage, by default, your query engine will scan all of the files that make up your table. This can be inefficient, especially if you only need a smaller subset of data. To improve this process, as part of the Delta Lake 1.2 release, we included support for data skipping by utilizing the Delta table’s column statistics.

For example, when running the following query, you do not want to unnecessarily read files outside of the year or uid ranges.

Select * from events example

When Delta Lake writes a table, it automatically collects the minimum and maximum column values for each data file and stores them directly in the Delta log (i.e., column statistics). Therefore, when a query engine reads the transaction log, read queries can skip files whose min/max ranges do not contain the filter values, as visualized below.

code example

This approach is more efficient than row-group filtering within the Parquet file itself, as you do not need to read the Parquet footer. For more information on the latter process, please refer to How Apache Spark™ performs a fast count using the parquet metadata. For more information on data skipping, please refer to data skipping.
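
To make the idea concrete, here is a minimal Python sketch of min/max file skipping. It is a toy model under stated assumptions, not Delta Lake's implementation: the `FileStats` class, the `files_to_read` helper, the file names, and the `year`/`uid` values are all made up for illustration, and only equality filters are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStats:
    """Hypothetical per-file column statistics (a toy stand-in for the Delta log)."""
    path: str
    min_values: dict
    max_values: dict


def files_to_read(files, filters):
    """Return paths whose [min, max] range could contain every filter value.

    `filters` maps a column name to the value an equality predicate asks for.
    """
    selected = []
    for f in files:
        if all(f.min_values[col] <= value <= f.max_values[col]
               for col, value in filters.items()):
            selected.append(f.path)
    return selected


# Made-up statistics for three data files.
files = [
    FileStats("part-0.parquet", {"year": 2018, "uid": 0}, {"year": 2019, "uid": 9999}),
    FileStats("part-1.parquet", {"year": 2020, "uid": 0}, {"year": 2020, "uid": 19999}),
    FileStats("part-2.parquet", {"year": 2020, "uid": 20000}, {"year": 2021, "uid": 39999}),
]

# Only part-2.parquet has ranges that contain both filter values.
print(files_to_read(files, {"year": 2020, "uid": 24000}))
```

In this sketch, two of the three files are ruled out from the statistics alone, without opening any data file.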

Support Z-Order clustering of data to reduce the amount of data read

But data skipping using column statistics is only one part of the solution. To maximize data skipping, what is also needed is the ability to skip with data clustering. As implied previously, data skipping is most effective when files have a very small minimum/maximum range. While sorting the data can help, this is most effective when applied to a single column.

OPTIMIZE deltaTable ZORDER BY (x, y)

Regular sorting of data by primary and secondary columns (left) and 2-dimensional Z-order data clustering for two columns (right).

But with Z-order, its space-filling curve provides better multi-column data clustering. This data clustering allows column stats to be more effective in skipping data based on filters in a query. See the documentation and this blog for more details.
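
The effect of a space-filling curve on per-file min/max ranges can be illustrated with a small Python sketch. It uses plain Morton (bit-interleaved) keys on a 4x4 grid; this is an illustration of the general Z-order idea under that assumption, not a description of how Delta Lake computes its clustering. The `interleave_bits` and `min_max_per_file` helpers are hypothetical names.

```python
def interleave_bits(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)        # x takes the even bit positions
        key |= ((y >> i) & 1) << (2 * i + 1)    # y takes the odd bit positions
    return key


def min_max_per_file(points, rows_per_file):
    """Split points into consecutive 'files' and report each file's x and y ranges."""
    ranges = []
    for start in range(0, len(points), rows_per_file):
        chunk = points[start:start + rows_per_file]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        ranges.append(((min(xs), max(xs)), (min(ys), max(ys))))
    return ranges


points = [(x, y) for x in range(4) for y in range(4)]   # a 4x4 grid of (x, y) rows

linear = sorted(points)                                  # sort by x, then by y
z_ordered = sorted(points, key=lambda p: interleave_bits(*p))

print(min_max_per_file(linear, 4))      # narrow x ranges, but every file spans all y
print(min_max_per_file(z_ordered, 4))   # each file covers a 2x2 block in x and y
```

With the linear sort, a filter on `y` alone cannot rule out any of the four files, because each one spans the full `y` range. With the Z-ordered layout, the same filter rules out two of the four files, at the cost of slightly wider `x` ranges.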

Support Change Data Feed on Delta tables

One of the biggest value propositions of Delta Lake is its ability to maintain data reliability in the face of changing records brought on by data streams. However, identifying those changes has required scanning and reading the entire table, creating significant overhead that can slow performance.

With Change Data Feed (CDF), you can now read a Delta table’s change feed at the row level rather than the entire table to capture and manage changes for up-to-date silver and gold tables. This improves your data pipeline performance and simplifies its operations.

To enable CDF, you must explicitly use one of the following methods:

  • New table: Set the table property delta.enableChangeDataFeed = true in the CREATE TABLE command.

    CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)
  • Existing table: Set the table property delta.enableChangeDataFeed = true in the ALTER TABLE command.

    ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
  • All new tables:

    set spark.databricks.delta.properties.defaults.enableChangeDataFeed = true;

An important thing to remember is once you enable the change data feed option for a table, you can no longer write to the table using Delta Lake 1.2.1 or below. However, you can always read the table. In addition, only changes made after you enable the change data feed are recorded; past changes to a table are not captured.

So when should you enable Change Data Feed? The following use cases should drive when you enable the change data feed.

  • Silver and Gold tables: When you want to improve Delta Lake performance by streaming row-level changes for up-to-date silver and gold tables. This is especially useful following MERGE, UPDATE, or DELETE operations, where it accelerates and simplifies ETL operations.
  • Transmit changes: Send a change data feed to downstream systems such as Kafka or RDBMS that can use the feed to process later stages of data pipelines incrementally.
  • Audit trail table: Capturing the change data feed as a Delta table provides perpetual storage and efficient query capability to see all changes over time, including when deletes occur and what updates were made.

Table (v1), change data (merged as v2), and change data feed output

See the documentation for more details.
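
As a rough illustration of the "transmit changes" use case, the following Python sketch applies a list of row-level change records to a downstream copy of a table. It is a simplified model, not the Delta Lake API: the `apply_change_feed` helper and the sample rows are hypothetical, records are plain dictionaries, and only the `_change_type` and `_commit_version` metadata columns are modeled (the real feed also carries a commit timestamp).

```python
def apply_change_feed(target, changes, key="id"):
    """Apply row-level change records to `target`, a dict keyed by primary key.

    Records are processed in commit-version order. Pre-images are skipped
    because the matching post-image carries the new row.
    """
    metadata = {"_change_type", "_commit_version"}
    for change in sorted(changes, key=lambda c: c["_commit_version"]):
        kind = change["_change_type"]
        row = {k: v for k, v in change.items() if k not in metadata}
        if kind in ("insert", "update_postimage"):
            target[row[key]] = row
        elif kind == "delete":
            target.pop(row[key], None)
        # "update_preimage" rows describe the old value and need no action here
    return target


# A downstream copy holding one row, and made-up changes from two later commits.
silver = {1: {"id": 1, "name": "Ann", "age": 20}}
changes = [
    {"id": 1, "name": "Ann", "age": 20, "_change_type": "update_preimage", "_commit_version": 2},
    {"id": 1, "name": "Ann", "age": 21, "_change_type": "update_postimage", "_commit_version": 2},
    {"id": 2, "name": "Bob", "age": 30, "_change_type": "insert", "_commit_version": 2},
    {"id": 2, "name": "Bob", "age": 30, "_change_type": "delete", "_commit_version": 3},
    {"id": 3, "name": "Cy", "age": 25, "_change_type": "insert", "_commit_version": 3},
]

print(apply_change_feed(silver, changes))
```

The downstream copy is brought up to date by touching only the three affected keys, which is the incremental processing the use cases above rely on.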

Support for dropping columns

For versions of Delta Lake prior to 1.2, there was a requirement for Parquet files to store data with the same column name as the table schema. Delta Lake 1.2 introduced a mapping between the logical column name and the physical column name in those Parquet files. Because the physical names remain unique, renaming a logical column becomes a simple change in the mapping, and logical column names can contain arbitrary characters while the physical names remain Parquet-compliant.

Before column mapping and with column mapping

As part of the Delta Lake 2.0 release, we leveraged column mapping so that dropping a column is a metadata operation. Therefore, instead of physically modifying all of the files of the underlying table to drop a column, this can be a simple modification to the Delta transaction log (i.e. a metadata operation) to reflect the column removal. Run the following SQL command to drop a column:

ALTER TABLE myDeltaTable DROP COLUMN myColumn

See the documentation for more details.
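
The following Python sketch shows why a logical-to-physical mapping turns renames and drops into metadata-only operations. It is a toy model under stated assumptions: the `MappedTable` class, the physical names such as `col-1`, and the sample row are hypothetical, and a list of dictionaries stands in for the Parquet files.

```python
class MappedTable:
    """Toy table: rows are stored under physical names, readers see logical names."""

    def __init__(self, logical_to_physical, rows):
        self.mapping = dict(logical_to_physical)   # metadata (toy transaction log)
        self.rows = rows                           # stands in for the Parquet files

    def rename_column(self, old, new):
        # Only the mapping changes; stored rows are untouched.
        self.mapping[new] = self.mapping.pop(old)

    def drop_column(self, name):
        # The physical values stay in the stored rows but become unreachable.
        del self.mapping[name]

    def read(self):
        return [{logical: row[physical] for logical, physical in self.mapping.items()}
                for row in self.rows]


stored = [{"col-1": 1, "col-2": "Ann", "col-3": 20}]
table = MappedTable({"id": "col-1", "name": "col-2", "age": "col-3"}, stored)

table.rename_column("name", "full name")   # arbitrary characters in the logical name
table.drop_column("age")

print(table.read())   # readers see only the logical columns that remain
print(stored)         # the stored data was never rewritten
```

Neither operation rewrites `stored`; both only edit the mapping, which mirrors the metadata-only behavior described above.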

Support for Dynamic Partition Overwrites

In addition, Delta Lake 2.0 now supports Delta dynamic partition overwrite mode for partitioned tables; that is, overwrite only the partitions with data written into them at runtime.

When in dynamic partition overwrite mode, we overwrite all existing data in each logical partition for which the write will commit new data. Any existing logical partitions for which the write does not contain data will remain unchanged. This mode is only applicable when data is being written in overwrite mode: either INSERT OVERWRITE in SQL, or a DataFrame write with df.write.mode("overwrite"). In SQL, you can run the following commands:

SET spark.sql.sources.partitionOverwriteMode=dynamic;
INSERT OVERWRITE TABLE default.people10m SELECT * FROM morePeople;

Note that dynamic partition overwrite conflicts with the option replaceWhere for partitioned tables. For more information, see the documentation.
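
The difference between overwriting the whole table and overwriting only the touched partitions can be sketched in a few lines of Python. This is a toy model of the semantics described above, not Delta Lake code: the `overwrite` helper, the `country` partition column, and the sample rows are made up, and a dictionary from partition value to rows stands in for the table.

```python
def overwrite(table, new_rows, partition_col, dynamic):
    """Overwrite `table`, a dict mapping partition value -> list of rows.

    Static mode replaces the whole table; dynamic mode replaces only the
    partitions that appear in `new_rows`.
    """
    incoming = {}
    for row in new_rows:
        incoming.setdefault(row[partition_col], []).append(row)
    if not dynamic:
        return incoming
    result = dict(table)      # untouched partitions are carried over as-is
    result.update(incoming)   # touched partitions are fully replaced
    return result


people = {
    "US": [{"name": "Ann", "country": "US"}, {"name": "Bob", "country": "US"}],
    "DE": [{"name": "Cy", "country": "DE"}],
}
more_people = [{"name": "Dee", "country": "US"}]

print(overwrite(people, more_people, "country", dynamic=True))    # DE survives
print(overwrite(people, more_people, "country", dynamic=False))   # only US remains
```

In dynamic mode, the `US` partition is replaced in full while `DE`, which received no new data, is left unchanged.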

Additional Features in Delta Lake 2.0

In the spirit of performance optimizations, Delta Lake 2.0.0 also includes these additional features:

  • Support for idempotent writes to Delta tables to enable fault-tolerant retry of Delta table writing jobs without writing the data multiple times to the table. See the documentation for more details.
  • Experimental support for multi-part checkpoints to split the Delta Lake checkpoint into multiple parts to speed up writing and reading the checkpoints. See the documentation for more details.
  • Other notable changes
    • Improve generated column data skipping by adding support for skipping on generated columns that are based on nested columns.
    • Improve the table schema validation by blocking the unsupported data types in Delta Lake.
    • Support creating a Delta Lake table with an empty schema.
    • Change the behavior of DROP CONSTRAINT to throw an error when the constraint does not exist. Before this version, the command used to return silently.
    • Fix the symlink manifest generation when partition values contain spaces.
    • Fix an issue where incorrect commit stats are collected.
    • More ways to access the Delta table OPTIMIZE file compaction command.
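
To illustrate the idempotent-writes item in the list above, here is a minimal Python sketch of the general technique: the writer tags each write with an application identifier and a version, and the table ignores a write it has already committed. This is an assumption-laden toy, not Delta Lake's implementation; the `ToyTable` class and its `app_id` and `version` parameters are hypothetical names.

```python
class ToyTable:
    """Toy table that remembers the highest version written by each application."""

    def __init__(self):
        self.rows = []
        self.last_version = {}   # application id -> last committed version

    def write(self, rows, app_id, version):
        """Append rows unless this (app_id, version) was already committed."""
        if version <= self.last_version.get(app_id, -1):
            return False          # duplicate from a retry: skip it
        self.rows.extend(rows)
        self.last_version[app_id] = version
        return True


table = ToyTable()
batch = [{"id": 1}, {"id": 2}]

first = table.write(batch, app_id="nightly-job", version=7)
retry = table.write(batch, app_id="nightly-job", version=7)   # job retried after a failure

print(first, retry, len(table.rows))
```

The retried write is recognized as a duplicate, so the table ends up with two rows rather than four.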

Building a Robust Data Ecosystem

As noted in Michael Armbrust’s Day 1 keynote and our Dive into Delta Lake 2.0 session, a fundamental aspect of Delta Lake is the robustness of its data ecosystem.

Optimize ZOrder

As data volume and variety continue to rise, the need to integrate with the most common ingestion engines is critical. For example, we’ve recently announced integrations with Apache Flink, Presto, and Trino — allowing you to read and write to Delta Lake directly from these popular engines. Check out Delta Lake > Integrations for the latest integrations.

Delta's expanding ecosystem of connectors

Delta Lake will be relied on even more to bring reliability and improved performance to data lakes by providing ACID transactions and unifying streaming and batch transactions on top of existing cloud data stores. By building connectors with the most popular compute engines and technologies, the appeal of Delta Lake will continue to increase — driving more growth in the community and rapid adoption of the technology across the most innovative and largest enterprises in the world.

Updates on Community Expansion and Growth

We are proud of the community and the tremendous work over the years to deliver the most reliable, scalable, and performant table storage format for the lakehouse to ensure consistent high-quality data. None of this would be possible without the contributions from the open-source community. In the span of a year, we have seen the number of downloads skyrocket from 685K monthly downloads to over 7M downloads/month. As noted in the following figure, this growth is in no small part due to the quickly expanding Delta ecosystem.

The most widely used lakehouse format in the world

All of this activity and the growth in unique contributions — including commits, PRs, changesets, and bug fixes — has culminated in an increase in contributor strength by 633% during the last three years (Source: The Linux Foundation Insights).

But it is important to remember that we could not have done this without the contributions of the community.

Credits

With that said, we want to give a quick shout-out to all of those involved with the release of Delta Lake 2.0: Adam Binford, Alkis Evlogimenos, Allison Portis, Ankur Dave, Bingkun Pan, Burak Yilmaz, Chang Yong Lik, Chen Qingzhi, Denny Lee, Eric Chang, Felipe Pessoto, Fred Liu, Fu Chen, Gaurav Rupnar, Grzegorz Kołakowski, Hussein Nagree, Jacek Laskowski, Jackie Zhang, Jiaan Geng, Jintao Shen, Jintian Liang, John O’Dwyer, Junyong Lee, Kam Cheung Ting, Karen Feng, Koert Kuipers, Lars Kroll, Liwen Sun, Lukas Rupprecht, Max Gekk, Michael Mengarelli, Min Yang, Naga Raju Bhanoori, Nick Grigoriev, Nick Karpov, Ole Sasse, Patrick Grandjean, Peng Zhong, Prakhar Jain, Rahul Shivu Mahadev, Rajesh Parangi, Ruslan Dautkhanov, Sabir Akhadov, Scott Sandre, Serge Rielau, Shixiong Zhu, Shoumik Palkar, Tathagata Das, Terry Kim, Tyson Condie, Venki Korukanti, Vini Jaiswal, Wenchen Fan, Xinyi, Yijia Cui, Yousry Mohamed.

We’d also like to thank Nick Karpov and Scott Sandre for their help with this post.

How can you help?

We’re always excited to work with current and new community members. If you’re interested in helping the Delta Lake project, please join our community today through many forums, including GitHub, Slack, Twitter, LinkedIn, YouTube, and Google Groups.

Join the community today

OSPO mind map animation

TODO Group is proud to announce a new release of the OSPO Mind Map. The mind map shows an Open Source Program Office’s (OSPO) responsibilities, roles, behavior, and team size within an organization. This post highlights the major improvements made by the community in this new version of the OSPO Mind Map.

Updates on Responsibilities section

The OSPO Mind Map’s Responsibilities section now has new OSPO-specific topics and different sub-sections defined, including:

  • 📘 Develop and Execute Open Source Strategy
  • 🧭 Eliminate Friction from Using and Contributing to Open Source
  • 🖥️ Manage Open Source IT Infrastructure
  • 📚 Give Advice on Open Source
  • 🫶 Grow and Retain Open Source Talent Inside the Organization
  • 🤝 Implement InnerSource Practices
  • ⏱️ Track Performance Metrics
  • 🤝 Collaborate with Open Source Organizations
  • 📈 Prioritize and Drive Open Source Upstream Development
  • 📝 Establish and Improve Open Source Policies and Processes
  • 🔍 Oversee Open Source Compliance
  • 📒 Support Corporate Development Activities

The initial pull request with these changes can be found here.

Welcoming Contributors 👋

The TODO Community welcomes more contributors to the OSPO Mind Map to bring together the various communities involved in OSPO-specific topics. This will help to improve open source professionals’ guidance across the OSPO ecosystem (e.g., topics like “InnerSource”, “Open Source metrics”, “Open Source Compliance” and more).

Updates on display

Initially, the OSPO Mind Map displayed all sections by default, showing a huge mind map image. Now, when people access https://ospomindmap.todogroup.org/ the display view will only show the first 2 levels, so people can expand specific sections, avoiding unnecessary information and focusing on what matters to them at that time.

Welcoming Contributors 👋

We are looking for tech contributors to work on a process to automatically deploy new versions of the OSPO Mind Map to the website. If you’d be interested in contributing, please open a PR!

About OSPO Mind Map and OSPOlogy

This Mind Map is part of the TODO Group’s OSPOlogy repository which encapsulates a set of open initiatives (including the OSPO Mind Map, virtual global & regional meetings, an OSPO discussion forum, monthly OSPO News, and now, in-person workshops) to work in collaboration and study the status of OSPOs.

Acknowledgments

Thanks to OSPO Mind Map’s v2.0 contributors and reviewers!

  • Thomas Steenbergen (EPAM)
  • Ana Jiménez (Linux Foundation)
  • Jari Koivisto
  • Josep Prat (Aiven)
  • Gergely Csatari (Nokia)

Special thanks to Ibrahim Haddad (Linux Foundation); we were inspired by the OSPO responsibilities section in A Close Look at Open Source Program Offices: Structure, Roles and Responsibilities.

This post originally appeared on LF Energy’s blog. LF Energy is a project at the Linux Foundation that provides a neutral, collaborative community to build the shared digital investments that will transform the world’s relationship to energy.

The energy sector is in the midst of a huge transformation that will impact the entire world, and grid operators need new innovations to keep pace with it.

That’s why we’re especially excited to see the recognition awarded to Antonello Monti, Director of the Institute for Automation of Complex Power Systems at RWTH Aachen University and Group Leader at the Center for Digital Energy, Fraunhofer FIT, for his leadership with SOGNO, the “Service-based Open-source Grid automation platform for Network Operation” of the future.

Monti received the second most prestigious award given by the German government, the innovation prize of North Rhine-Westphalia. Awarded annually, this prize recognizes outstanding achievements and excellent research.

We are so proud of the work that Monti, who also serves as the Technical Advisory Committee Chair for LF Energy, and Markus Mirz have undertaken. We also want to extend our congratulations to the many individuals, companies, and the European Commission who funded the original work for SOGNO (meaning “dream” in Italian).

SOGNO is an LF Energy project that is creating plug-and-play, cloud-native microservices to implement our next generation of data-driven monitoring and control systems. It will simplify the life of distribution utilities by enabling them to optimize their network operations through open source and deliver cost-effective, seamless, and secure power to customers.

A breakthrough innovation is that SOGNO introduces the idea of grid automation as a modular system in which components can be added through time. This is in opposition to classical monolithic solutions, which weren’t constructed with today’s energy landscape in mind.

Today, as more renewables come onto the grid, energy no longer flows just one way, as it did in the past, but both onto and off the grid. In the future, power system networks will be composed of assets whose profiles may shift between loads, resources, and the ability to provide flexibility back to the grid.

Reinforcing the current system is not sufficient to deal with the increasing complexity of distribution systems. Rather, we are on the cusp of needing to deploy advanced distribution management systems that can be implemented as a centralized or, even better, a distributed architecture.

We reiterate our deep gratitude and support for this project, and the people and entities who’re making it happen.

Read here for more information

Microsoft joins over 25 organizations committed to democratizing 3D software development for games and simulations

SAN FRANCISCO – April 29, 2022 – The Open 3D Foundation (O3DF) is proud to welcome Microsoft as a Premier member alongside Adobe, AWS, Huawei, Intel, and Niantic. Microsoft’s participation in the project brings a wealth of knowledge and thought leadership, and it reinforces how strongly the industry believes in making a high-fidelity, fully-featured open-source 3D engine available to every industry, unencumbered by commercial terms.

Microsoft Principal Group Program Manager Paul Oliver will join the Governing Board of O3DF, supporting the Foundation’s commitment to ensure balanced collaboration and feedback that meets the needs of the Open 3D community. The Governing Board cultivates innovative relationships among stakeholders to drive the Foundation’s strategic direction and its stewardship of 3D visualization and simulation projects. 

“Microsoft’s roots in creativity run deep, and we want to help creators wherever they are, whoever they are, and whatever platform they’re creating for. Having the Linux Foundation create the Open 3D Foundation is a fantastic step towards helping more creators everywhere and we are excited to be a part of it.”

This move builds on Microsoft’s continued commitment to democratizing game development and making its tools and technologies available to game creators worldwide. Last year, the company made its Game Development Kit available to all developers through GitHub. With its new engagement with O3DF, Microsoft is extending a commitment to opening up technology to everyone.

“We are elated to have Microsoft join the Open 3D Foundation as a Premier member,” said Royal O’Brien, Executive Director of O3DF and General Manager of Games and Digital Media at the Linux Foundation. “Having incredible industry veterans like Microsoft contributing and helping drive innovation with the community for 3D engines is a huge benefit to the open-source community and the companies that use it alike.”

A Growing Community

Microsoft is one of 25 companies that have become members since the public announcement of the Open 3D Foundation in July 2021. In November 2021, Open 3D Engine (O3DE) announced its first major release. The 21.11 Release allows simulation developers to create 3D content with the new O3DE Linux editor and engine runtime. This release also added a new Debian package and Windows installer that provides a faster route to getting started with the engine. The O3DE community is very active, averaging up to 2 million line changes and 350-450 commits monthly from 60-100 authors across 41 repos.

Where to See the Open 3D Engine Next

On June 20, the Open 3D Foundation will host Open 3D Connect, a half-day interactive meet-up, co-located with the Linux Foundation’s Open Source Summit North America in Austin, Texas. Learn more here.

Additionally, on October 18-19, the Open 3D Foundation will host its flagship conference, bringing together technology leaders, indie and independent 3D developers, and the academic community to share ideas, discuss hot topics and foster the future of 3D development across a variety of industries and disciplines. For those interested in sponsoring this event, please contact pr@o3d.foundation

Anyone interested in the Open 3D Engine is invited to get involved and connect with the community on Discord.com/invite/o3de and GitHub.com/o3de

About the Open 3D Engine (O3DE) project

The Open 3D Engine (O3DE) is the flagship project managed by the Open 3D Foundation (O3DF). The open-source project is a modular, cross-platform 3D engine built to power anything from AAA games to cinema-quality 3D worlds to high-fidelity simulations. The code is hosted on GitHub under the Apache 2.0 license. To learn more, please visit o3de.org.

About the Open 3D Foundation

Established in July 2021, the mission of the Open 3D Foundation (O3DF) is to make an open-source, fully-featured, high-fidelity, real-time 3D engine for building games and simulations, available to every industry. The Open 3D Foundation is home to the O3DE project. To learn more, please visit o3d.foundation.

About the Linux Foundation

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation’s projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, and more. The Linux Foundation’s methodology focuses on leveraging best practices and addressing the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.

Media Inquiries:

pr@o3d.foundation

(Updated July 15, 2022)

For many of us, it has been several years since we’ve been in conference settings, or around many people at all. As we close in on a broader return to in-person events next month, this is the perfect time to reiterate that our events are gatherings intended for professional networking and collaboration for the open source community, that exist to encourage the open exchange of ideas. Thus, they require an environment that recognizes the inherent worth of every person and group. All event participants, whether they are attending an in-person or a virtual event, are expected to behave in accordance with our Event Code of Conduct. In short: Be kind. Be professional. Treat everyone with respect. 

The importance of a diverse, welcoming and inclusive open source community has been widely understood for some time. Progress is slowly being made, but there is a long way to go. We created our Event Code of Conduct in 2011 as one of many ways we at the Linux Foundation could help create a more welcoming community. Events play a huge role in how open source communities collaborate, and it is critical that these are safe spaces, free of harassment and discrimination. 

In the earlier years of our Event Code of Conduct, we received very few incident reports, but that number has grown, especially in recent years. This is a good thing. It means our event participants feel more comfortable speaking up. And the more people speak up, the sooner we can reach our shared goal of a truly inclusive community. 

To that end, we will begin publishing a round-up of Event Code of Conduct reports, starting with this 2021 summary. We only held a few in-person events in 2021, so expect these reports to be longer in the future as we continue to hold more in-person events. Moving forward, these reports will be published bi-annually. We will also publish event-specific reports for events with 2,000+ in-person attendees.

We look forward to seeing you all soon, online or in person.  

The Linux Foundation Events Team
events@linuxfoundation.org

———

2021 Code of Conduct Incidents By Event

KubeCon Europe (Virtual) 

  • 2 reports of concern that several CNCF ambassadors were airing grievances about not having talks accepted at the event, which belittled the work of the program committee
  • 1 report of inappropriate sexual advance in a virtual session via chat
    • Resolution: A warning was issued

Open Source Summit North America (In Person + Virtual)

  • 1 person videotaping other attendees without their consent (In Person)
    • Resolution: A warning was issued
  • 1 report of attendee violating the mask mandate
    • Resolution: A warning was issued

KubeCon North America (In Person + Virtual) 

  • 1 person videotaping other attendees without their consent (In Person) 
    • Resolution: A second and final warning was issued, and the person was informed that their action is illegal in California
  • 2 reports of attendees violating the mask mandate 
    • Resolution: Warnings were issued
  • 1 report of staff at a sponsor booth ignoring a woman attendee
    • Resolution: A warning was issued
  • 1 person who had been banned from attending the event due to behavior prior to the event showed up at the JW Marriott multiple times
    • Resolution: The individual was escorted out of the venue each time
  • 1 attendee was speaking unprofessionally to a member of the LF staff when asked to abide by Covid health + safety protocols
    • Resolution: A warning was issued
  • 2 sponsors were handing out collateral with profanity on it
    • Resolution: A warning was issued, and they refrained from passing out the offending materials thereafter
  • 1 attendee reported (on social media) a staff member at the JW Marriott restaurant was racially profiling them
    • Resolution: LF notified JW Marriott hotel management, and LF staff followed up with the attendee who alerted LF of the issue
  • Multiple reports of harassment were received against the same attendee. Additional reports were received post-KubeCon as well, for a total of 5 reports.
    • Immediate Temporary Suspension: Promptly after the complaints were received, the accused person was suspended from participating in LF communities, including attending LF Events, to protect the community from risk of harm while the investigation was ongoing.  The accused person violated this temporary suspension by traveling to the city where a Linux Foundation conference was being held and engaging in social activities with conference attendees off-premises, outside of official conference spaces that LF Events staff and security were able to monitor and control.  LF Events staff did not learn that the accused violated the temporary suspension until after the conference had ended.   Additionally, because the accused person had previously been engaged to perform work for the CNCF community as a contractor, LF and CNCF immediately suspended the accused person’s service.  
    • The Investigation: The LF engaged a neutral outside professional investigator to conduct an in-depth investigation.  The investigator reviewed extensive documentary evidence and interviewed 15 people, including all reporters and the accused.  Due to the challenges of scheduling interviews with 15 witnesses and obtaining evidence from third parties including conference vendors, it took several months to complete the investigation.  Because the reported incidents took place at or in connection with LF and CNCF events, all aspects of the investigation and its outcome were managed by LF Events staff and the external investigator rather than by CNCF staff.
    • No Conflicts of Interest: No individual staff member on the Linux Foundation Events team or CNCF leadership team had either a close friendship or personal business relationship with the accused person that could have created a conflict of interest or impacted the investigation or its outcome, despite rumors that have come to our attention suggesting that such a relationship existed.  Even though no actual conflicts of interest existed, individuals who were the subject of these rumors did not participate in the investigation or any aspect of decision-making regarding its outcome.
    • Final Resolution: At the conclusion of the investigation, the decision was to ban the accused person from attending any future Linux Foundation or LF project events, and from participating in any committee or holding any leadership position with any Linux Foundation project (including but not limited to CNCF). Additionally, LF and CNCF permanently ended their contractual relationship with the accused individual.  The individual was notified of these decisions.
    • Confidentiality and Information Sharing: While the investigation was still in progress, we asked participants in the investigation to honor the confidential nature of the investigation in order to prevent attempts to influence other witnesses, tamper with evidence, or otherwise undermine the integrity of the investigation.  However, as soon as the investigation was completed, those risks no longer existed; therefore, reporters and witnesses may now speak freely about the incident and the investigation with anyone they choose to.  

PrestoCon Day (Virtual)

  • 1 Attendee was spamming links to YouTube videos and memes for competitors in the virtual chat.
    • Resolution: LF staff deleted posts and removed the user from the event platform. The attendee’s registration information was fake, so no further follow up could be done.

SAN FRANCISCO, April 6, 2022 — Automotive Grade Linux (AGL), a collaborative cross-industry effort developing an open source platform for all connected car technologies, announces IndyKite, Marelli and Red Hat as new Bronze members.

“Our active community of automakers and suppliers continues to expand and invest resources in AGL, demonstrating the value of participating in the AGL ecosystem,” said Dan Cauchy, Executive Director of Automotive Grade Linux at the Linux Foundation. “We are excited to welcome our new members to the AGL community, and we look forward to working with them as we continue to expand and enhance the AGL platform.”

AGL is an open source project at The Linux Foundation that is bringing together automakers, suppliers and technology companies to accelerate the development and adoption of a fully open, shared software platform for all technology in the vehicle, from infotainment to autonomous driving.

Supported by more than 150 members, including 10 automakers, the AGL Unified Code Base (UCB) is a shared software platform that serves as the de facto industry standard for infotainment, telematics, and instrument cluster applications. Sharing a single software platform across the industry reduces fragmentation and accelerates time-to-market by encouraging the growth of a global ecosystem of developers and application providers that can build a product once and have it work for multiple automakers.

New Member Quotes:

IndyKite
“IndyKite is building the identity layer for Web 3.0, with products that securely manage human, IoT, and machine identity. Based on open source standards, IndyKite’s identity platform leverages machine learning and data graphs to deliver context-aware authorization, dynamic policy decisions, computer vision and edge security, built on a knowledge graph data model,” said Lasse Andresen, Founder and CEO of IndyKite. “We are excited to join AGL and connect and collaborate with the community to build identity services for the next generation of automotive and transport software solutions.”

Marelli
“MARELLI is one of the world’s leading global independent suppliers to the automotive sector. Our mission is to transform the future of mobility through working with customers and partners to create a safer, greener and better-connected world,” said Yannick Hoyau, CTO, Electronic Systems, Marelli Corporation. “One way to achieve our mission is collaboration to further upgrade various automotive systems. Operating systems are becoming increasingly important for vehicles to manage complex vehicle systems. Under these circumstances, we believe that AGL will become one of the standard operating systems in the automotive industry in the near future. We are confident that the wealth of expertise and experience that MARELLI has accumulated in the automotive industry will surely contribute to the further development of AGL.”

Red Hat
“Red Hat is looking forward to working alongside AGL as we bring our open source, Linux-based expertise to the automotive software ecosystem,” said Francis Chow, Vice President, Red Hat In-Vehicle OS. “If we, as a community, set our sights on delivering a safe, reliable and flexible foundation for software-defined vehicles, automakers will be able to focus on open innovation – redefining the customer driving experience.”

###

About Automotive Grade Linux (AGL)

Automotive Grade Linux is a collaborative open source project that is bringing together automakers, suppliers and technology companies to accelerate the development and adoption of a fully open software stack for the connected car. With Linux at its core, AGL is developing an open platform from the ground up that can serve as the de facto industry standard to enable rapid development of new features and technologies. Although initially focused on In-Vehicle-Infotainment (IVI), AGL is the only organization planning to address all software in the vehicle, including instrument cluster, heads up display, telematics, advanced driver assistance systems (ADAS) and autonomous driving. The AGL platform is available to all, and anyone can participate in its development. Automotive Grade Linux is hosted at the Linux Foundation. Learn more at automotivelinux.org.

In a new case study released by Linux Foundation Research, in collaboration with the Academy Software Foundation, entitled Open Source in Entertainment: How the Academy Software Foundation Creates Shared Value, we learn a compelling story of how open technology and the people who create visual effects (VFX) for motion pictures transformed a highly competitive industry.


The Academy Software Foundation (ASWF) was formed as an entertainment industry collaboration with the Academy of Motion Picture Arts & Sciences, the organization behind the Academy Awards (aka the Oscars). ASWF has steadily released contributed software projects since its inception in 2018. Four projects are fully adopted, and six are in incubation.

Adopted Projects

  • OpenVDB is an industry-standard library for manipulating sparse dynamic volumes used by visual effects studios to create realistic volumetric images such as water/liquid simulations and environmental effects like clouds and ice. 
  • OpenColorIO is an industry standard for consistent color management across VFX and animation pipelines used on hundreds of feature film productions. It touches nearly every pixel of every visual effects frame in most major motion pictures. 
  • OpenEXR is a standard HDR image file format for high-quality image processing and storage, one of the foundational technologies in computer imaging. 
  • OpenCue is an open source render management system used to break down complex jobs into individual tasks. 

Incubating Projects

  • OpenTimelineIO is an open source application programming interface and interchange format for editorial timeline information.
  • MaterialX is an open standard for exchanging rich material and look-development content across applications and renderers. 
  • Rez is an open source, cross-platform package manager that creates standalone configured environments for third-party and proprietary digital content creation software. 
  • DPEL is the Digital Production Example Library, a library of digital sample assets that content creators can use for instructional purposes.
  • RawtoACES is a software package that converts digital camera RAW files to ACES container files containing image data encoded according to the Academy Color Encoding Specification (ACES).

The entertainment industry now has a home, process, and governance structure to manage open source projects essential to movie, television, and gaming production. Any new project can be proposed, and projects are managed according to a project lifecycle policy that provides various requirements and project benefits. Many ASWF projects have been foundational to creating visual effects and major motion pictures in their entirety. These elements continue to thrill audiences around the world.


In addition to hosting technologies for the entertainment industry, the ASWF provides a neutral forum to coordinate open source project efforts, a common build and test infrastructure, open governance, more consistent open source licensing, and a clear path to participation for individuals and organizations wanting to advance the open source ecosystem for the motion picture industry. 

In doing so, the ASWF has brought together leading studios such as DreamWorks Animation, Sony Pictures Imageworks, Walt Disney Studios (including Pixar, LucasFilm, Industrial Light & Magic, Blue Sky Studios), Warner Bros., DNEG, Netflix, and technology vendors that support the film and gaming industries. 

Open Source collaboration in the entertainment industry was not always such a pretty picture

Circa 2014, the motion picture industry faced fragmented software infrastructure issues, with proprietary solutions not based on open source software or running on open source operating systems. These platforms were also not providing the innovation needed to create the landmark films and television programs we enjoy today. As a result, each VFX and film studio had to build its own tools.

The studios had a core desire to move from their closed systems to more open ones like Linux. However, the motion picture industry’s challenges were not about accepting open source software but about getting the industry ecosystem to participate and collaborate in open environments. 

As we learn in the case study, at visual effects studios such as SONY Pictures and ILM, there were no common build systems outside any company’s networks, so it became increasingly difficult to figure out the proper instructions to build the open source software that any industry contributor had released. 

It was challenging to align dependencies and versions, leading to “versionitis” as projects required different versions of dependencies. Additionally, when maintainers left a company that “owned the project,” the codebase languished – such was the case with Sony Pictures Imageworks’ OpenColorIO and ILM’s OpenEXR software, as detailed in the report.

As a result, studios were reluctant to depend on other companies’ projects and even more unwilling to contribute their intellectual property back to another company’s project. With one-sided contribution agreements, modifications to standard open source licenses, and other legal impediments layered on top, it was clear the status quo could not scale to meet the industry’s growing needs.

The entertainment industry’s open source ecosystem depends on its people

As detailed in the report, the Academy and Linux Foundation spent nearly two years working with industry stakeholders to build a better, collaborative solution, resulting in the ASWF and its associated projects. None of the success that ASWF now enjoys would have been possible without the engineers, the software developers, and the filmmakers that support the underlying ecosystem. And participating in this ecosystem has tangible benefits for the contributors.

ASWF has also become a focal point for driving new interest in software development in the motion picture industry and recognizing the contributions of its community members thanks to the “Behind the Screens” interview series featuring over two dozen software developers in the industry, along with the launch of a Diversity and Inclusion working group to raise the profile of underrepresented people in these roles.

The ASWF has made great strides since its inception in 2018; although still a young organization, it has found its place in the industry. Its Diversity and Inclusion initiatives are leading the way in educating the entertainment industry and helping it attract more diversity within its ranks. New efforts underway, such as DPEL (formerly Open Asset Repository), will provide sample content to help aspiring content creators learn the trade.

Why is this research so valuable? We’ve seen related examples in telecommunications, energy, automotive, and public health, where many of these projects started as individual efforts looking for a neutral home at the Linux Foundation. Over time, these communities of competitive contributors found it beneficial to collaborate. 

Although the entertainment industry has unique requirements for its vertical applications, the story behind the creation of the ASWF can serve as a “roadmap” for leaders in other industries to get a win-win by shared investment and collaboration in open technologies. Open source in entertainment is another example of open source value creation. Read the full report HERE.

DENT 2.0

The DENT project is an open source network operating system utilizing the Linux Kernel, Switchdev, and other Linux-based projects, hosted under the Linux Foundation. The project has announced that DENT 2.0 is available for immediate download.

The “Beeblebrox” release adds key features utilized by distributed enterprises in retail and remote facilities, providing a secure and scalable Linux-based Network Operating System (NOS) for disaggregated switches adaptable to edge deployment. This means DENT provides a smaller, more lightweight NOS for use at the small, remote edges of enterprise networks.

DENT 2.0 adds secure scaling with Internet Protocol version 6 (IPv6) and Network Address Translation (NAT) to support a broader community of enterprise customers. It also adds Power over Ethernet (PoE) control to allow remote switching, monitoring, and shutting down. Connectivity of IoT, Point of Sale (POS), and other devices is highly valuable to retail storefronts, early adopters of DENT. DENT 2.0 also adds traffic policing, helping mitigate attack situations that overload the CPU. 

“DENT has made great strides this past year and with its edge and native Linux approach, with a rich feature set for distributed enterprises like retail or remote facilities. DENT continues to expand into new use cases and welcomes community input with an open technical community, under the Linux Foundation,” said Arpit Joshipura, GM of Networking & Edge at The Linux Foundation.

DENT 2.0 Main Features to enable secure and scalable development

  • Secure scaling with IPv6 and NAT to appeal to a broader community of SME customers
  • PoE control to allow remote switching, monitoring, and shutting down
  • Rate limiting to protect against broadcast storms, creating a stronger OS under erroneous BUM (Broadcast, Unknown unicast, Multicast) traffic

DENT enables enterprises to transition to disaggregated network switches and to the use cases that distributed enterprise and edge networking make available. The open source NOS provides key technology leverage in retail, a sector that is leading innovation in digital transformation. The Amazon public showcase of DENT hardware at re:Invent in November 2021 reached 20,000+ attendees.

“This new release of DENT 2.0 adds critical updates focused on smaller enterprise needs. This was the goal of DENT all along, and I would like to thank our members and the wider community for this broad, concerted effort to move DENT significantly forward,” said Steven Noble, DENT Technical Steering Committee Chair. “It’s not easy building a flexible, accessible network OS, and this is why I’m proud of all the effort and coordination by so many talented individuals. If you are looking for an open source disaggregated network OS, now is great timing for looking at DENT.”

Retail stores, warehousing, remote locations, enterprise, and Small and Mid-Size Enterprises are all ideal environments for DENT deployment. Wiring closets in many facilities are small. Staff expertise may be limited, and branch-office switches from leading suppliers can require costly contracts. DENT is easily deployed on white-box hardware in small spaces. It can be set up to support dozens of wireless access points and IoT sensors, creating a manageable network to track inventory, monitor shelf real estate, scan customer activity, and perform automated checkouts.

DENT premier members include Amazon, Delta Electronics Inc, Edgecore Networks, and Marvell. Important contributions to the DENT project have come from NVIDIA, Keysight Technologies, and Sartura.

“Delta has built complete white box networking platforms based on DENT technology, helping drive a disaggregation model in edge that offers cost and flexibility benefits to customers looking for OEM solutions,” said Charlie Wu, Vice President, Solution Center at Delta Networks. “The deployment of our 1G and 10G Ethernet switch boxes with Marvell’s Prestera® devices and the DENT OS in real world applications demonstrates the power of open source to accelerate technology innovation in networking.” 

“Edgecore Networks, as the premier member of DENT, is pleased to see the groundbreaking second release of DENT 2.0, enabling DENT community members to use the DENT’s simplified abstracts, APIs, drivers, to lessen development and deployment overhead,” said Taskin Ucpinar, Senior Director of SW Development. “This innovative product development approach enables the community to build robust solutions with minimal effort and immediately help System Integrators deploy a networking solution to remote campuses and retail stores.”

“As the chairing company for DENT Test Working Group, Keysight has partnered with the open-source community to host the system integration test bed in Keysight labs,” said Dean Lee, Senior Director Cloud Solution Team. “Being a neutral test vendor, we have worked with the community to harden the DENT NOS in multi-vendor interoperability, performance, and resiliency. We are delighted to contribute to the success and wide adoption of DENT.”

“Marvell is accelerating the build-out of Ethernet switching infrastructure in emerging edge and borderless enterprise applications, and DENT is a key component to our offerings,” said Guy Azrad, Senior Vice President and General Manager, Switch Business Unit at Marvell. “With DENT incorporated on our Prestera® switch platforms, we are currently enabling retailers to transform physical stores to smart retail connected environments that benefit consumers through easy and efficient in-store experiences.”

Download and test DENT 2.0: https://github.com/dentproject/dentOS



Share and learn by speaking at OSPOCon, joining Work Day activities, and more opportunities from TODO

Do you engage in open source-related tasks within your organization? You know that collaboration is key. Here are three ways to engage and network with your open source peers and leverage your organization’s open source program! 

1) Speak at OSPOCon, the premier event for OSPOs

Aiming to provide continuous education and ease OSPO adoption across organizations, the TODO Group, in collaboration with the Linux Foundation, has launched the OSPOCon 2022 Call for Proposals. OSPOCon is the premier event for Open Source Program Offices to share information, solve problems, and learn how to build effective open source initiatives within organizations.

Why consider submitting a proposal to speak at OSPOCon?

OSPOCon is a go-to place where those working in open source program offices (or similar initiatives) in organizations can:

  • Share best practices, tooling, and lessons learned
  • Learn the newest OSPO trends
  • Connect and learn from the wide diversity of open source professionals’ visions
  • Take part in real-time discussions and give and get feedback from the community

Overall, people can come together to learn and share best practices, experiences, and tools to overcome the challenges of OSPOs and similar open source initiatives.

OSPOCon NA and Europe are in-person and virtual events that are part of the Open Source Summit conference umbrella. Proposals are submitted via the OSSummit CFP (people will also get access to all the other events in the Open Source Summit collection).

Please remember the CFP submission deadlines for each of the events. We hope to see you in the upcoming OSPOCon series!

2) Contribute to OSPO resources with the broader community in the new TODO Work Day Activities

TODO comprises individual community contributors and 70+ organizations with years of experience running open source programs. They all want to collaborate on practices, tools, and other ways to run successful and effective open source projects and programs. We have a wide range of ongoing OSPO initiatives where everyone (from the most seasoned OSPOers to students) can participate and become a contributor.

Why consider attending the next Work Day meeting?

A good way to keep learning from OSPOs is to share knowledge with, and be inspired by, other community participants who run open source initiatives while working together on common tooling and resources.

TODO organizes monthly Work Day meetings to ease community participation and to work together with other OSPOers and open source experts on the various issues and PRs in the TODO Group GitHub organization.

Work Days even have a handful of tasks, sorted by TODO project contribution level, that we expect people to work on during these meetings.

Learn more in the dedicated repo and review the upcoming meeting dates:

  • Wednesday, March 9, 2022, at 4:30 PM UTC
  • Monday, March 14, 2022, at 10:00 AM UTC

3) Study and discuss the status of OSPOs with OSPOlogy and TODO Sync calls

The OSPOlogy repo provides continuous OSPO learning and discussions with other OSPOers thanks to the OSPOlogy monthly community meetings, TODO Sync calls, and OSPO Forum.

Bonus: Resources for practical OSPO implementation

We went through three popular OSPO networking spaces where people can engage with the different professionals involved in open source program offices or similar open source initiatives within organizations. 

The good news is that TODO Group goes far beyond a place to connect with other OSPOers. This group also drives open source education and adoption powered by course materials, research studies, and resources created by experienced professionals to keep learning about OSPOs, anytime.

Here is a list of the most popular resources, which offer guidance and inspiration drawn from the vision of open source professionals.

  • [NEW] The Evolution of the Open Source Program Office Study: provides a set of patterns and directions, as well as a checklist, to help implement an OSPO or an open source initiative within corporate environments. This includes an OSPO maturity model, practical implementation from noted OSPO programs across regions and sectors, and a handful of broad OSPO archetypes (or personas), which drive differentiation in OSPO behavior.
  • TODO Guides: A collection of best practices from the leading companies engaged in open source development that aims to help organizations successfully implement and run an open source program office.
  • OSPO Survey: The TODO Group is committed to running an annual survey of the status of Open Source Program Offices and sharing the results and data with the wider community. People can find the open data and previous results at Linux Foundation Research.
  • OSPONews: Never miss a thing of the newest OSPO trends! This is the monthly newsletter to stay up to date on Open Source Program Office (OSPO) trends.

TODO Group is a great place to begin and advance in the OSPO journey. The open source community is always welcome to be part of TODO. Welcome to the OSPOverse!