Posts

Big Data

A new whitepaper from ODPi helps businesses understand how Business Intelligence can harness Big Data through multi-structured data and advanced analytics.

Companies today are collecting data at an unprecedented rate, but how much of the collected data actually makes an impact on their business? According to ODPi, by 2020, the accumulated volume of Big Data will increase from 4.4 zettabytes to roughly 44 zettabytes or 44 trillion GB.

It’s a tall order for companies to translate this data into ROI, and many businesses still don’t know how to combine Business Intelligence (BI) with Big Data to get insightful business value.

Cupid Chan, CTO of Index Analytics and ODPi lead for the BI & AI Special Interest Group (SIG), tells his clients, “It doesn’t matter how much data you have; unless you can get the insight from it, it is just bits and bytes occupying the storage.”

To help such businesses understand how BI can harness Big Data through multi-structured data and advanced analytics, ODPi has released a new whitepaper called “BI”g Data: How Business Intelligence and Big Data Work Together.

The latest whitepaper shares best practices for combining BI and Big Data. It also shares real end-user perspectives on how businesses are using Big Data tools, the challenges they face, and where they are looking to enhance their investments.

“BI”g Data Highlights

  • Preferred BI/SQL connectors (Hive, Presto, Impala, etc.) for a BI tool to connect to Hadoop (a connection sketch follows this list)
  • Best practices to connect to both Hadoop and RDBMS
  • Recommended BI architecture to query data in Hadoop
  • How BI runs advanced analytics, including Machine Learning algorithms, on Hadoop
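
To make the first bullet concrete, here is a minimal sketch of a Python client querying Hadoop through HiveServer2 using the PyHive library. The host, port, username, and table name are placeholder assumptions, not details from the whitepaper:

```python
# Hedged sketch: a BI-style client querying Hadoop through HiveServer2.
# Host, port, username, and table name are placeholders for your cluster.
from pyhive import hive

conn = hive.connect(
    host="hadoop-edge.example.com",  # placeholder edge node
    port=10000,                      # default HiveServer2 port
    username="analyst",              # placeholder user
)
cursor = conn.cursor()
cursor.execute("SELECT region, SUM(sales) FROM sales_facts GROUP BY region")
for region, total in cursor.fetchall():
    print(region, total)
cursor.close()
conn.close()
```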

Chan said that even though vendors vary in tackling the Big Data problem, there are some common themes:

  1. The traditional way to store data in RDBMS is fading, and people are leveraging more and more Big Data platforms. Therefore, BI has to adapt in order to meet customer expectations.
  2. Users want results now, not hours after a query kicks off a batch job. BI vendors therefore need to respond creatively, with approaches such as proprietary connectors, in-memory processing, and hybrid architectures.
  3. Instead of creating a brand new standard, vendors are integrating with existing industry standards, such as R and Python, for advanced analytics to allow users to leverage broader community support (see the Python sketch after this list).
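
As a concrete illustration of the third theme, here is a hedged sketch of the kind of Python-ecosystem analytics a BI tool can hand off to, using pandas and scikit-learn. The CSV export and column names are illustrative assumptions:

```python
# Hedged sketch: Python-standard analytics downstream of a BI export.
# The CSV file and the "churned" label column are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("customer_features.csv")  # placeholder BI export
X = df.drop(columns=["churned"])           # hypothetical feature columns
y = df["churned"]                          # hypothetical label column

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```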

How much of this data has value?

One trend we have noticed is that companies are collecting massive amounts of data without actually knowing the value of that data and what to do with it. Chan agrees that this is true, especially for those companies who have the budget to ingest as much data as they want.

“Even though this may not be the optimal way to do analytics, it’s not wrong either. In fact, another argument for this practice is unless you have such data available for further analysis, there is no way to prove that the data is worthless,” said Chan.

Chan came up with the “AI + BI = CI” concept, which he first presented at the Conference on Health IT and Analytics (CHITA) organized by the University of Maryland. He believes the true intelligence we should pursue is Cognitive Intelligence (CI), which can be achieved by combining the Speed of Machine Learning (provided by AI) with the Direction Intuited from Human Insight (provided by BI). “If companies can focus more on putting the right subject matter experts on the domain being examined, we can be more efficient in pulling the right data for the analytics,” he explained.

When asked which Big Data/ML platforms and frameworks these companies should take advantage of, Chan said that for data, the most prominent tools are Apache Hadoop (Cloudera/Hortonworks), AWS (S3, EBS, etc.), Azure Storage (Block Blobs, Azure Data Lake Storage, etc.), and Google Cloud (Bigtable, Cloud Storage, etc.). For ML, TensorFlow, Keras, PyTorch, and Apache MXNet are all popular.

According to Chan, companies that are just getting started with this effort can pick any of these frameworks to begin their journey. Companies that have already started should leverage their existing resources in-house first, before deciding to overhaul what they already have, he noted.
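
As a rough illustration of how small that first step can be, here is a minimal sketch using one of the frameworks Chan mentions (Keras, via TensorFlow). The synthetic data, layer sizes, and training settings are illustrative assumptions, not recommendations from the article:

```python
# Hedged sketch: a first experiment with one popular framework (Keras).
# The synthetic data and model shape are illustrative, not a recommendation.
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 20)          # stand-in for real business data
y = (X.sum(axis=1) > 10).astype(int)  # toy binary label

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(20,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```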

Data is the new soil

Modern companies must include Big Data/ML as part of their digital transformation strategy if they want to succeed. “Companies should look at Big Data/ML today the way they looked at building a website 25 years ago. It was expensive to build a website because it was the ‘cutting-edge’ technology. Could you delay building a website in your ‘digital transformation strategy’? Yes, but the result is you will lose the lead to your competitor. Not having Big Data/ML in your digital transformation strategy will be even more impactful due to the fast and furious nature of the technology. So it’s better to have the plan now, and improve it incrementally in an agile fashion,” he said.

You may have heard that “data is the new oil.” Chan, however, prefers the view that data is the new soil. “You can have a very fruitful result if you plant your business model properly, but do not expect the fruit to come overnight. And it requires more than soil for your business to bloom. You also need the DevSecOps sunlight to provide photosynthesis, financial support as the fertilizer, the proper temperature of the industry trend, and managerial dedication to water consistently, even though the result can’t be seen immediately. All of these need to work together to reap the fruit of a new business model,” he said.

Hosted by The Linux Foundation, ODPi aims to be a standard for simplifying, sharing and developing an open big data ecosystem. Through a vendor-neutral, industry-wide approach to data governance and data science, ODPi members bring maturity, choice and collaboration to an open ecosystem.

Open Mainframe

To learn more about open source and mainframe, join us May 15 at 1:00 pm ET for a webinar led by Open Mainframe Project members Steven Dickens of IBM, Len Santalucia of Vicom Infinity, and Mike Riggs of The Supreme Court of Virginia.

When I mention the word “mainframe” to someone, the natural response is colored by a view of an architecture of days gone by — perhaps even invoking a memory of the Epcot Spaceship Earth ride. This is the heritage of mainframe, but it is certainly not its present state.

From the days of the System/360 in the mid-1960s through to the modern mainframe of the z14, the systems have been designed around four guiding principles: security, availability, performance, and scalability. This is exactly why mainframes are entrenched in industries where those principles are top-level requirements — think banking, insurance, healthcare, transportation, government, and retail. You can’t go a single day without being impacted by a mainframe — whether that’s getting a paycheck, shopping in a store, going to the doctor, or taking a trip.

What often surprises people is how massive open source is on mainframe. Ninety percent of mainframe customers leverage Linux on their mainframe, with broad support across all the top Linux distributions along with a growing number of community distributions. Key open source applications such as MongoDB, Hyperledger, Docker, and PostgreSQL thrive on the architecture and are actively used in production. And DevOps culture is strong on mainframe, with tools such as Chef, Kubernetes, and OpenStack used to manage mainframe infrastructure alongside cloud and distributed systems.

Learn more

You can learn more about open source on mainframe, from its history to the current and future states, in our upcoming presentation. Join us May 15 at 1:00 pm ET for a session led by Open Mainframe Project members Steven Dickens of IBM, Len Santalucia of Vicom Infinity, and Mike Riggs of The Supreme Court of Virginia.

In the meantime, check out our podcast series “I Am A Mainframer” on both iTunes and Stitcher to learn more about the people who work with mainframes and where they see mainframe heading.

Open Data

NOAA is working to make all of its data available to an even wider group of people and make it more easily understood (Image: NOAA).

The goal of the National Oceanic and Atmospheric Administration (NOAA) is to put all of its data — data about weather, climate, oceans and coasts, fisheries, and ecosystems — into the hands of the people who need it most. The trick is translating the hard data and making it useful to people who aren’t necessarily subject matter experts, said Edward Kearns, NOAA’s first data officer, speaking at the recent Open Source Leadership Summit (OSLS).

NOAA’s mission is similar to NASA’s in that it is science based, but “our mission is operations; to get the quality information to the American people that they need to run their businesses, to protect their lives and property, to manage their water resources, to manage their ocean resources,” said Kearns, during his talk titled “Realizing the Full Potential of NOAA’s Open Data.”

He said that NOAA was doing Big Data long before the term was coined and that the agency has way too much of it – to the tune of 30 petabytes in its archives with another 200 petabytes of data in a working data store. Not surprisingly, NOAA officials have a hard time moving it around and managing it, Kearns said.

Data Sharing

NOAA is a big consumer of open source, and sharing everything openly is part of the organization’s modus operandi. On a global level, “the agency has been a leader for the entire United States in trying to broker data sharing among countries,” Kearns said. One of the most successful examples has been through the United Nations, with the World Meteorological Organization (WMO).

Agency officials tend to default to making their products accessible in the public domain, something Kearns said he’d like to change. By adopting some modern licensing practices, he believes NOAA could share even more information with the public. “The Linux Foundation has made progress on the Community Data License Agreement. This is one of the things I’d like to possibly consider adopting for our organization,” he added.

One of NOAA’s great successes in getting critical data to the public came after Hurricane Irma hit Florida in September 2017, he said.

“As you can imagine, there were a lot of American citizens that were hungry for information and were hitting the NOAA websites very hard and data sites very hard,” he said. “Typically, we have a hard time keeping up with that kind of demand.” The National Hurricane Center is part of the NOAA, and the agency took the NHC’s website and put it on Amazon Cloud.

This gave the agency the ability to handle over a billion hits a day during the peak hurricane season. But, he continued, “we are still … just starting to get into how to adopt some of these more modern technologies to do our job better.”

Equal Access

Now the NOAA is looking to find a way to make the data available to an even wider group of people and make it more easily understood. Those are their two biggest challenges: how to disseminate data and how to help people understand it, Kearns said.

“We’re getting hammered every day by a lot of companies that want the data… and we have to make sure everybody’s got an equal chance of getting the data,” he said.

This is becoming a harder job because demand is growing exponentially, he said. “Our costs are going up because we need more servers, we need more networks,” and it’s a problem due to budget constraints.

The agency decided that partnering with industry would help facilitate the delivery of data.

The NOAA is going into the fourth year of a deal it signed with Amazon, Microsoft, IBM, Google, and a nonprofit out of the University of Chicago called the Open Commons Consortium (OCC), Kearns said. The agreement is that NOAA data will remain free and open, and the OCC will host it at no cost to taxpayers and monetize services around the data.

The agency is using an academic partner acting as a data broker to help it “flip this data and figure out how to drop it into all of our collaborators’ cloud platforms, and they turn it around and serve many consumers from that,” Kearns explained. “We went from a one-to-many model to a one-to-a-few-to-many model of distribution.”

People trust NOAA’s data today because they get it from a NOAA data service, he said. Now the agency is asking them to trust the NOAA data that exists outside the federal system on a partner system.

On AWS alone, the NOAA has seen more than a twofold increase in the number of people using the data, he said. The agency, in turn, has seen a 50 percent reduction in hits on the NOAA servers.

Google has loaded a lot of the agency’s climate data into its BigQuery data warehouse, “and they’ve been able to move petabytes of this data just in a few months, just because the data now has been loaded into a tool people are already using.”

This “reduces that obstacle of understanding,” Kearns noted. “You don’t have to understand a scientific data format, you can go right into BigQuery… and do analyses.”
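
As a hedged illustration of the workflow Kearns describes, a few lines of Python can analyze NOAA climate data directly in BigQuery. The snippet assumes the public noaa_gsod (Global Surface Summary of the Day) dataset; verify the table and column names before relying on them:

```python
# Hedged sketch: analyzing NOAA data directly in BigQuery from Python.
# Assumes the public bigquery-public-data.noaa_gsod dataset; table and
# column names reflect that dataset as published and should be verified.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default Google Cloud credentials
sql = """
    SELECT stn, AVG(temp) AS avg_temp_f
    FROM `bigquery-public-data.noaa_gsod.gsod2017`
    GROUP BY stn
    ORDER BY avg_temp_f DESC
    LIMIT 10
"""
for row in client.query(sql):  # iterating the job waits for results
    print(row.stn, row.avg_temp_f)
```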

Data Trust

Being able to trust data is also an important component of any shared initiative, and through the NOAA’s Big Data Project, the agency is seeking ways of ensuring that the trust that comes with the NOAA brand is conveyed with the data, he said, so people continue to trust it as they use it.  

“We have a very proud history of this open data leadership, we’re continuing on that path, and we’re trying to see how we can amplify that,” Kearns said.

NOAA officials are now wondering whether making the data available through these modern cloud platforms will make it easier for users to create information products for themselves and their customers.

“Of course, we’re also looking for other ways of just doing our business better,” he added. But the agency wants to figure out whether it makes sense to continue this experiment with its partners. That, he said, they will likely know by early next year.


New members support efforts to advance data governance and data science approaches

Berlin, Germany – April 16, 2018 – DataWorks Summit — ODPi, a nonprofit organization accelerating the open ecosystem of big data solutions, today announced that Attunity and ING have joined the initiative to advance data governance and data science approaches.

Many vendors have focused on productizing Apache Hadoop® as a distribution, which led to inconsistency that increased the cost and complexity for application vendors and end users to fully embrace Apache Hadoop. Founded in 2015, ODPi is an industry effort to accelerate the adoption of Apache Hadoop and related big data technologies. ODPi’s members aim to accelerate Apache Hadoop adoption through a neutral, industry-wide approach to data governance and data science. Together, they are supporting the mission of creating an open data ecosystem through collaboration with subject matter experts and data platform and tools vendors.

“The Big Data market has, in part due to efforts by ODPi and its members, achieved the desired simplification of the Apache Hadoop landscape. However, barriers to broader and more rapid enterprise Hadoop adoption exist and can benefit from a neutral, industry-wide approach to data governance and data science,” said John Mertic, director of program management, ODPi. “We are thrilled to have Attunity and ING on board as ODPi members to help us further these industry-wide approaches.”

The new ODPi members will join a diverse and growing group of members that include well-known Apache Hadoop software companies, service providers and end users, as well as a rapidly growing community.

ING Information Architect and Application Developer Maryna Strelchuk and ODPi Director of Program Management John Mertic will co-present at DataWorks Summit on “The rise of big data governance: Insight on this emerging trend from active open source initiatives.”

About the newest members:

Attunity is a leading provider of modern data integration and Big Data management software solutions that enable availability, delivery, and management of data across heterogeneous enterprise platforms in organizations worldwide. Its flagship solution, with change data capture technology, offers real-time data integration and ingestion across all databases, data warehouses, Hadoop and the cloud. Leading businesses choose Attunity to enable data lakes for real-time analytics, and ultimately, maximize the value of their IT and data investments.

“Attunity is excited to become a member of ODPi, helping to set a vision and technology ecosystem for metadata management that will benefit enterprises building modern data architectures,” said Itamar Ankorion, Chief Marketing Officer at Attunity. “Attunity shares ODPi’s belief that automated discovery and maintenance of metadata has to be an integral part of all modern data integration tools like ours that access, change and move information. We look forward to being part of ODPi’s efforts to standardize, support and accelerate growth of the Big Data Ecosystem.”

ING is a global financial institution with a strong European base, offering banking services. We draw on our experience and expertise, our commitment to excellent service and our global scale to meet the needs of a broad customer base, comprising individuals, families, small businesses, large corporations, institutions and governments. Our customers are at the heart of what we do.

“ING decided to become a member of ODPi to help drive standardization around open metadata,” said Ferd Scheepers, Chief Information Architect at ING. “Analytics is one of our strategic priorities, and we believe that standardization of metadata is a key enabler to be successful with analytics. ODPi as an independent group plays a key role in helping standardization across vendors, which for ING is the key reason to join and support ODPi.”


About ODPi

ODPi is a nonprofit organization committed to simplification and standardization of the big data ecosystem with a common reference platform called ODPi Core. As a shared industry effort, ODPi members represent big data technology, solution provider and end user organizations focused on promoting and advancing the state of Apache Hadoop® and big data technologies for the enterprise. For more information about ODPi, please visit: http://www.ODPi.org

###

Media Contact:

Natasha Woods

ODPi

(415) 312-5289

pr@odpi.org


Through a collaborative effort from enterprises and communities invested in cloud, big data, and standard APIs, I’m excited to welcome the OpenMessaging project to The Linux Foundation. The OpenMessaging community’s goal is to create a globally adopted, vendor-neutral, and open standard for distributed messaging that can be deployed in cloud, on-premise, and hybrid use cases.

Alibaba, Yahoo!, Didi, and Streamlio are the founding project contributors. The Linux Foundation has worked with the initial project community to establish a governance model and structure for the long-term benefit of the ecosystem working on a messaging API standard.

As more companies and developers move toward cloud-native applications, challenges are emerging around running messaging and streaming applications at scale. These include interoperability issues between platforms, incompatible wire-level protocols, and a lack of standard benchmarking across systems.

In particular, when data transfers across different messaging and streaming platforms, compatibility problems arise, adding work and maintenance cost. Existing solutions lack standardized guidelines for load balancing, fault tolerance, administration, security, and streaming features. Current systems don’t satisfy the needs of modern cloud-oriented messaging and streaming applications. This can lead to redundant work for developers and makes it difficult or impossible to meet cutting-edge business demands around IoT, edge computing, smart cities, and more.

Contributors to OpenMessaging are looking to improve distributed messaging by:

  • Creating a global, cloud-oriented, vendor-neutral industry standard for distributed messaging
  • Facilitating a standard benchmark for testing applications
  • Enabling platform independence
  • Targeting cloud data streaming and messaging requirements with scalability, flexibility, isolation, and security built in
  • Fostering a growing community of contributing developers

You can learn more about the new project and how to participate here: http://openmessaging.cloud
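
To give a feel for what such a standard would enable, here is a purely illustrative Python sketch of a vendor-neutral producer/consumer interface. It is not the OpenMessaging specification itself, just the shape of code that a common API would let developers write once and run against any compliant broker:

```python
# Illustrative only: the shape of a vendor-neutral messaging interface.
# This is NOT the actual OpenMessaging API, just a sketch of the idea.
from abc import ABC, abstractmethod

class Producer(ABC):
    @abstractmethod
    def send(self, queue: str, body: bytes) -> None:
        """Send one message to the named queue."""

class Consumer(ABC):
    @abstractmethod
    def receive(self, queue: str, timeout_ms: int) -> bytes:
        """Receive one message, blocking up to timeout_ms."""

def relay(src: Consumer, dst: Producer, queue: str) -> None:
    # Code written against the standard interface would run unchanged
    # whether the broker behind it is RocketMQ, Pulsar, or another system.
    dst.send(queue, src.receive(queue, timeout_ms=1000))
```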

These are some of the organizations supporting OpenMessaging:

“We have focused on the messaging and streaming field for years, during which we explored CORBA notification, JMS and other standards to try to solve our stickiest business requirements. After evaluating the available alternatives, Alibaba chose to create a new cloud-oriented messaging standard, OpenMessaging, which is vendor-neutral and language-independent and provides industrial guidelines for areas like finance, e-commerce, IoT, and big data. Moreover, it aims to develop messaging and streaming applications across heterogeneous systems and platforms. We hope it can be open, simple, scalable, and interoperable. In addition, we want to build an ecosystem according to this standard, such as benchmarks, computation, and various connectors. We would welcome new contributions and hope everyone can work together to push the OpenMessaging standard forward.” — Von Gosling, senior architect at Alibaba, co-creator of Apache RocketMQ, and original initiator of OpenMessaging

“As the sophistication and scale of applications’ messaging needs continue to grow, lack of a standard interface has created complexity and inflexibility barriers for developers and organizations. Streamlio is excited to work with other leaders to launch the OpenMessaging standards initiative in order to give customers easy access to high-performance, low-latency messaging solutions like Apache Pulsar that offer the durability, consistency, and availability that organizations require.” — Matteo Merli, software engineer at Streamlio, co-creator of Apache Pulsar, and member of Apache BookKeeper PMC

“Oath, a Verizon subsidiary of leading media and tech brands including Yahoo and AOL, supports open, collaborative initiatives and is glad to join the OpenMessaging project.” — Joe Francis, director, Core Platforms

“At Didi, we have defined a private set of producer and consumer APIs to hide differences among open source MQs such as Apache Kafka and Apache RocketMQ, as well as to provide additional customized features. We are planning to release these to the open source community. So far, we have accumulated a lot of experience with MQs and API unification, and we are willing to work within OpenMessaging to construct a common standard of APIs together with others. We sincerely believe that a unified and widely accepted API standard can benefit MQ technology and the applications that rely on it.” — Neil Qi, architect at Didi

“There are many different open source messaging solutions, including Apache ActiveMQ, Apache RocketMQ, Apache Pulsar, and Apache Kafka. The lack of an industry-wide, scalable messaging standard makes evaluating a suitable solution difficult. We are excited to support the joint effort from multiple open source projects working together to define a scalable, open messaging specification. Apache BookKeeper has been successfully deployed in production at Yahoo (via Apache Pulsar) and Twitter (via Apache DistributedLog) as their durable, high-performance, low-latency storage foundation for their enterprise-grade messaging systems. We are excited to join the OpenMessaging effort to help other projects address common problems like low-latency durability, consistency and availability in messaging solutions.” — Sijie Guo, co-founder of Streamlio, PMC chair of Apache BookKeeper, and co-creator of Apache DistributedLog

2016 was a pivotal year for Apache Hadoop, a year in which enterprises across a variety of industries moved the technology out of PoCs and the lab and into production. Look no further than AtScale’s latest Big Data Maturity survey, in which 73 percent of respondents report running Hadoop in production.

ODPi recently ran a series of its own Twitter polls and found that 41 percent of respondents do not use Hadoop in production, while another 41 percent said they do. This split may partly be due to the fact that the concept of “production” Hadoop can be misleading. For instance, pilot deployments and enterprise-wide deployments are both considered “production,” but they are vastly different in terms of DataOps, as Table 1 below illustrates.


Table 1: DataOps Considerations from Lab to Enterprise-wide Production.

As businesses move Apache Hadoop and Big Data out of proofs of concept (PoCs) and into enterprise-wide production, hybrid deployments are the norm and several important considerations must be addressed.

Dive into this topic further on June 28 in a free webinar in which John Mertic, Director of ODPi at The Linux Foundation, hosts Tamara Dull, Director of Emerging Technologies at SAS Institute.

The webinar will discuss ODPi’s recent 2017 Preview: The Year of Enterprise-wide Production Hadoop, explore DataOps at scale, and cover the considerations businesses need to make as they move Apache Hadoop and Big Data out of PoCs and into enterprise-wide, hybrid production deployments.

Register for the webinar here.

As a sneak peek to the webinar, we sat down with Mertic to learn a little more about production Hadoop needs.

Why is it that the deployment and management techniques that work in limited production may not scale when you go enterprise-wide?

IT policies kick in as you move from Mode 2 IT — which tends to focus on fast-moving, experimental projects such as Hadoop deployments — to Mode 1 IT — which controls stable, enterprise-wide deployments of software. Mode 1 IT has to consider not only enterprise security and access requirements but also data regulations that impact how a tool is used. On top of that, cost and efficiency come into play, as Mode 1 IT is cost conscious.

What are some of the step-change DataOps requirements that come when you take Hadoop into enterprise-wide production? 

Integrating into Mode 1 IT’s existing toolset is the biggest requirement. Mode 1 IT doesn’t want to manage tools it’s not familiar with, nor those it doesn’t feel it can integrate into the management tools the enterprise is already using. The more uniformly Hadoop fits into existing DevOps patterns, the more successful it will be.

Register for the webinar now.

This week in open source news, The Linux Foundation’s Open Networking Summit unites software-defined networking and network functions virtualization (SDN/NFV) pros, academics, and enthusiasts for announcements and collaboration, Microsoft has announced the end of CodePlex in favor of GitHub, & more! Keep reading to get caught up on the biggest headlines in open source this week.
 
1) At the annual Open Networking Summit, SDN & NFV leaders gather. Announcements included the Data Plane Development Kit (DPDK) becoming a Linux Foundation Project and CORD Project working on new OSS service delivery platform.
2) Microsoft “acknowledges that GitHub is the go-to option for project hosting” and announces the end of CodePlex in Q4.
 
3) ONAP Project names SVP of AT&T Labs Chris Rice as chair.
 
4) “After six years of pitching the dream of a converged Linux desktop experience that crosses desktop, mobile, server and cloud, Canonical pulls the plug.”
 
5) Uber’s open source deck.gl tool for data visualization is getting scalability updates after being released in November.

We’re pleased to kick off 2017 by announcing that JanusGraph, a scalable graph database project, is joining The Linux Foundation. The project is starting with an initial codebase based on the Titan graph database project. Today we see strong interest in the project among developers who are looking to bring the graph database community together, as well as support from organizations such as Expero, Google, GRAKN.AI, Hortonworks, IBM and others. We look forward to working with them to help create a path forward for this exciting project.

Several members of the JanusGraph community, including developers from Expero, GRAKN.AI and IBM, will be at Graph Day Texas this weekend and invite discussion about the project.

JanusGraph is able to support thousands of concurrent users in real time. Its features include elastic and linear scalability, data distribution and replication for performance and fault tolerance, high availability and hot backups, integration with big data platforms such as Apache Spark, Apache Giraph and Apache Hadoop, and more.
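
For a sense of what client code looks like, here is a hedged sketch using Apache TinkerPop’s gremlinpython driver, which JanusGraph supports. The endpoint assumes a locally running Gremlin Server with default settings:

```python
# Hedged sketch: talking to JanusGraph with Apache TinkerPop's
# gremlinpython driver. The endpoint assumes a local Gremlin Server
# running with default settings.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection("ws://localhost:8182/gremlin", "g")
g = traversal().withRemote(conn)

g.addV("person").property("name", "alice").iterate()     # write a vertex
print(g.V().hasLabel("person").values("name").toList())  # read it back

conn.close()
```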

To learn more and get involved, visit https://github.com/JanusGraph/janusgraph.

The Linux Foundation’s Hadoop project, ODPi, and Enterprise Strategy Group (ESG) are teaming up on November 7 for a can’t-miss webinar for Chief Data Officers and their Big Data teams.


As a bonus, all registrants will receive a free copy of Nik’s latest Big Data report.

Join ESG analyst Nik Rouda and ODPi Director John Mertic for “Taking the Complexity out of Hadoop and Big Data” to learn:

  1. How ODPi pulls complexity out of Hadoop, freeing enterprises and their vendors to innovate in the application space

  2. How CDOs and app vendors can port apps easily across cloud, on-prem, and Hadoop distros. Nik reveals ESG’s latest research on where enterprises are deploying net-new Hadoop installs across on-premises, public, private, and hybrid cloud

  3. What big data industry leaders are focusing on in the coming months

Removing Complexity

As ESG’s Nik Rouda observes, “Hadoop is not one thing, but rather a collection of critical and complementary components. At its core are MapReduce for distributed analytics jobs processing, YARN to manage cluster resources, and the HDFS file system. Beyond those elements, Hadoop has proven to be marvelously adaptable to different data management tasks. Unfortunately, too much variety in the core makes it harder for stakeholders (and in particular, their developers) to expand their Hadoop-enhancing capabilities.”

The ODPi Compliant certification program ensures greater simplicity and predictability for everyone downstream of Hadoop Core: SIs, app vendors, and end users.
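
For readers newer to the stack, here is a hedged sketch of the pattern Rouda describes: a MapReduce-style word count, expressed in PySpark (a common substitute for hand-written MapReduce jobs), reading files from HDFS on a YARN-managed cluster. The input path is a placeholder assumption:

```python
# Hedged sketch: a MapReduce-style word count expressed in PySpark,
# running over HDFS data on a YARN-managed cluster. The input path is
# a placeholder for your cluster's data.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("wordcount-sketch").getOrCreate()
lines = spark.sparkContext.textFile("hdfs:///data/logs/app.log")

counts = (lines.flatMap(lambda line: line.split())  # "map" phase
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))    # "reduce" phase

for word, n in counts.take(10):
    print(word, n)
spark.stop()
```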

Application Portability

ESG reveals its latest findings on how enterprises are deploying Hadoop, and you may be surprised at the percentage moving to the cloud. Find out who’s deploying on-premises (dedicated and shared), who’s using pre-configured on-prem infrastructure, and what percentage are moving to private, public, and hybrid cloud.

Where Industry Leaders are Headed

ESG interviewed leaders like Capgemini, VMware, and more as part of this ODPi research – let their thinking light your way as you develop your Hadoop and Big Data strategy.

Reserve your spot for this informative webinar. 
