Overcoming open data’s unique challenges: Findings from the 2024 World Open Innovation Conference Challenge Session
Anna Hermansen | 06 May 2025
In the ecosystem of open source, we often think first of open code. But open data is a large and increasingly critical component of openness. As we grapple with the opportunities and challenges of AI, it becomes clear that opening up datasets is a key element of our digital future. Across sectors, from financial services to healthcare to marketing, the integration and triangulation of various data sources provide a breadth and depth of insights that make activities such as pharmaceutical R&D, KYC and AML certification, and personalized marketing much more effective. But how do we bring this data together, particularly when the information is sensitive?
At the World Open Innovation Conference in November 2024, Linux Foundation Research hosted a challenge session on this topic: what are the pathways to open data? As described by Marc Prioleau, Executive Director of Overture Maps Foundation, data has characteristics that present unique challenges to its openness. We asked session participants to share their own experiences with these challenges, whether and how they use open data, whether they open up their own data, and some potential avenues toward openness. The session hosts captured the participant discussion and wrote up the findings in a report recently published on the Linux Foundation website.
Key findings from the session include:
- Data silos hamstring research and innovation at a time when open data is key for increasing innovation, reliability, and trust. The qualities that set data apart from software, such as maintenance, quality, privacy, and license diversity, make opening data up challenging. Participants noted these complexities when discussing their experience with accessing data and opening up their own data.
- To build open data, significant human resources are required for cleaning, standardizing, and maintaining a dataset. The costs of this maintenance cause a tradeoff between the quality of the data and the cost of accessing it. Many participants noted the expense of data curation and how this incentivizes data owners to keep their data proprietary – and how, on the flip side, open data often lacks curation.
- Other open data obstacles include data privacy concerns, stemming from compliance with regulations such as GDPR, as well as the desire to have proprietary control over data, which gives companies greater certainty around quality and allows them to maintain their competitive advantage. Participants expressed concern that opening up their data would reduce their advantage over competitors, and that it was safer to keep data proprietary.
- Some examples of success in this area include “semi-open” data platforms, which allow collaborators to share best practices and aggregate data while maintaining their competitive advantage. Overture Maps Foundation is a key example of an open data project: it has built an agnostic and standardized geospatial data platform for data owners and service providers to leverage. Building an open data infrastructure requires reworking current data collection and sharing practices and incentivizing collaboration around a pre-competitive layer.
The WOIC Challenge session highlighted the tradeoffs between open and closed data, along with key concerns and expectations surrounding open databases. The session emphasized the need for a cultural shift toward openness, encouraging collaboration, data sharing, and organizational change despite evolving technological and economic conditions. Data availability is more important than ever, and LF projects exist to help organizations strike a balance between privacy preservation and innovation. For example, check out the Confidential Computing Consortium’s work on protecting data in use.
To read the complete insights from the session, download the report!