Big Data

A new whitepaper from ODPi helps businesses understand how Business Intelligence challenges can be addressed with Big Data through multi-structured data and advanced analytics.

Companies today are collecting data at an unprecedented rate, but how much of that data actually makes an impact on their business? According to ODPi, the accumulated volume of Big Data will grow from 4.4 zettabytes to roughly 44 zettabytes (44 trillion GB) by 2020.

It’s a tall order for companies to translate this data into ROI, and many businesses still don’t know how to combine Business Intelligence (BI) with Big Data to get insightful business value.

Cupid Chan, CTO of Index Analytics and ODPi lead for the BI & AI Special Interest Group (SIG), tells his clients, “It doesn’t matter how much data you have; unless you can get the insight from it, it is just bits and bytes occupying the storage.”

To help such businesses understand how BI challenges can be addressed with Big Data through multi-structured data and advanced analytics, ODPi has released a new whitepaper called “BI”g Data – How Business Intelligence and Big Data Work Together.

The whitepaper shares best practices for combining BI and Big Data. It also offers real end-user perspectives on how businesses are using Big Data tools, the challenges they face, and where they are looking to enhance their investments.

“BI”g Data Highlights

  • Preferred BI/SQL connectors (Hive, Presto, Impala, etc.) for a BI tool to connect to Hadoop (see the connection sketch after this list)
  • Best practices to connect to both Hadoop and RDBMS
  • Recommended BI architecture to query data in Hadoop
  • How BI runs advanced analytics, including machine learning algorithms, on Hadoop
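
As a concrete illustration of the connector theme, here is a minimal sketch of a BI-style aggregation pushed down to Hadoop through HiveServer2, using the open source PyHive client. The host, port, user, database, and table are placeholders, not details from the whitepaper.

```python
# Minimal sketch: a BI-style query against Hadoop via the Hive connector.
# Host, port, user, database, and table names are placeholders.
from pyhive import hive  # pip install 'pyhive[hive]'

conn = hive.connect(host="hadoop-edge.example.com", port=10000,
                    username="analyst", database="sales")
cursor = conn.cursor()

# A typical BI aggregation, executed on the cluster rather than client-side.
cursor.execute("""
    SELECT region, SUM(revenue) AS total_revenue
    FROM orders
    GROUP BY region
""")
for region, total in cursor.fetchall():
    print(region, total)

cursor.close()
conn.close()
```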

Chan said that even though vendors vary in tackling the Big Data problem, there are some common themes:

  1. The traditional way to store data in RDBMS is fading, and people are leveraging more and more Big Data platforms. Therefore, BI has to adapt in order to meet customer expectations.
  2. Users want results immediately, not hours of batch processing after a query is executed. Therefore, BI vendors need to respond creatively, with approaches including proprietary connectors, in-memory processing, and hybrid architectures.
  3. Instead of creating a brand new standard, vendors are integrating with existing industry standards, such as R and Python, for advanced analytics to allow users to leverage broader community support.

How much of this data has value?

One trend we have noticed is that companies are collecting massive amounts of data without actually knowing the value of that data or what to do with it. Chan agrees that this is true, especially for companies that have the budget to ingest as much data as they want.

“Even though this may not be the optimal way to do analytics, it’s not wrong either. In fact, another argument for this practice is unless you have such data available for further analysis, there is no way to prove that the data is worthless,” said Chan.

Chan came up with the “AI + BI = CI” concept, which he first presented at the Conference on Health IT and Analytics (CHITA) organized by the University of Maryland. He is of the opinion that the true intelligence we should pursue is Cognitive Intelligence (CI), which can be achieved by combining the speed of machine learning (provided by AI) with the direction intuited from human insight (provided by BI). “If companies can focus more on putting the right subject matter experts on the domain that needs to be examined, we can be more efficient in pulling the right data for the analytics,” he explained.

When asked which Big Data/ML platforms and frameworks these companies should take advantage of, Chan said that for data, the most prominent tools are Apache Hadoop (Cloudera/Hortonworks), AWS (S3, EBS, etc.), Azure Storage (Block Blobs, Azure Data Lake Storage, etc.), and Google Cloud (Bigtable, Cloud Storage, etc.). For ML, TensorFlow, Keras, PyTorch, and Apache MXNet are all popular.

According to Chan, companies that are just getting started with this effort can pick any of these frameworks to begin their journey. Companies that have already started should leverage their existing resources in-house first, before deciding to overhaul what they already have, he noted.

Data is the new soil

Modern companies must include Big Data/ML as part of their digital transformation strategy if they want to succeed. “Companies should look at Big Data/ML today the way they looked at building a website 25 years ago. It was expensive to build a website because it was the ‘cutting-edge’ technology. Could you delay building a website in your ‘digital transformation strategy’? Yes, but the result is you will lose the lead to your competitor. Not having Big Data/ML in your digital transformation strategy will be even more impactful due to the fast and furious nature of the technology. So it’s better to have the plan now, and improve it incrementally in an agile fashion,” he said.

You may have heard that “data is the new oil.” Chan, however, prefers the view that data is the new soil. “You can have a very fruitful result if you plant your business model properly, but do not expect the fruit to come overnight. And it requires more than soil for your business to bloom. You also need DevSecOps sunlight to provide photosynthesis, financial support as the fertilizer, the proper temperature of the industry trend, and managerial dedication to water consistently, even though the result can’t be seen immediately. All of these need to work together to reap the fruit of a new business model,” he said.

Hosted by The Linux Foundation, ODPi aims to be a standard for simplifying, sharing and developing an open big data ecosystem. Through a vendor-neutral, industry-wide approach to data governance and data science, ODPi members bring maturity, choice and collaboration to an open ecosystem.

Contributed by Uber, Pyro enables flexible and expressive deep probabilistic modeling

SAN FRANCISCO – February 21, 2019 – The LF Deep Learning Foundation (LF DL), a Linux Foundation project that supports and sustains open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL), announces the Pyro project, started by Uber, as its newest incubation project. Built on top of the PyTorch framework, Pyro is a deep probabilistic programming framework that facilitates large-scale exploration of AI models, making deep learning model development and testing quicker and more seamless. This is the second project LF DL has voted in from Uber, following last December’s Horovod announcement.

Pyro is used by large companies like Siemens, IBM, and Uber, and startups like Noodle.AI, in addition to Harvard University, MIT, Stanford University, University of Oxford, University of Cambridge, and The Broad Institute. At Uber, Pyro solves a range of problems including sensor fusion, time series forecasting, ad campaign optimization and data augmentation for deep image understanding.

Pyro is the fifth project to join LF DL, which provides financial and intellectual resources, infrastructure, marketing, research, creative services and events support. This rich, neutral environment spurs the rapid advancement of its projects, including Acumos AI, the Angel and EDL projects, and Horovod, by encouraging additional contributors as well as broader collaboration across the open source community.

“The LF Deep Learning Foundation is excited to welcome Pyro to our family of projects. Today’s announcement of Uber’s contribution of the project brings us closer to our goal of building a comprehensive ecosystem of AI, machine learning and deep learning,” said Ibrahim Haddad, Executive Director of the LF DL. “We look forward to helping to grow the community contributing to and using Pyro to further improve forecasting and other capabilities.”

Pyro was designed with four key principles in mind:

  • Universal: Pyro can represent any computable probability distribution.
  • Scalable: Pyro scales to large data sets with little overhead.
  • Minimal: Pyro is implemented with a small core of powerful, composable abstractions.
  • Flexible: Pyro aims for automation when you want it, control when you need it.

“Pyro was originally created at Uber AI Labs to help make deep probabilistic programming faster and more seamless for AI practitioners in both industry and academia,” said Zoubin Ghahramani, Head of Uber AI Labs. “By incorporating Pyro into the LF DL portfolio, we hope to facilitate greater opportunities for researchers worldwide and make deep learning and Bayesian modeling more accessible.”

Pyro joins existing LF DL projects: Acumos AI, a platform and open source AI framework; Angel, a high-performance distributed machine learning platform based on Parameter Server; EDL, an Elastic Deep Learning framework designed to help cloud service providers to build cluster cloud services using deep learning frameworks; and Horovod, a distributed training framework for TensorFlow, Keras, and PyTorch.

Pyro Background
Pyro provides a language for probabilistic modeling and inference, together with well-tested, scalable implementations of inference algorithms including Stochastic Variational Inference and Hamiltonian Monte Carlo. The project was developed at Uber AI Labs as a platform for research in deep Bayesian models, including Bayesian neural nets and amortized Bayesian inference. The project currently has nearly 1,500 commits from 50 committers, and is licensed under the MIT license. More information on Pyro can be found on the Uber Engineering Blog. Uber also recently joined the Linux Foundation as a Gold member and contributed Jaeger, an open source distributed tracing system, to the Cloud Native Computing Foundation.
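
To give a flavor of the programming model, here is a minimal Pyro sketch that infers the mean of Gaussian data with Stochastic Variational Inference. The model, guide, and hyperparameters are illustrative choices, not code from the project itself.

```python
# Minimal Pyro sketch: infer the mean of Gaussian data with SVI.
# Model, guide, and hyperparameters are illustrative only.
import torch
import pyro
import pyro.distributions as dist
from torch.distributions import constraints
from pyro.infer import SVI, Trace_ELBO
from pyro.optim import Adam

data = torch.randn(100) + 3.0  # synthetic observations centered near 3

def model(data):
    mu = pyro.sample("mu", dist.Normal(0.0, 10.0))        # prior over the mean
    with pyro.plate("data", len(data)):
        pyro.sample("obs", dist.Normal(mu, 1.0), obs=data)

def guide(data):
    loc = pyro.param("loc", torch.tensor(0.0))
    scale = pyro.param("scale", torch.tensor(1.0), constraint=constraints.positive)
    pyro.sample("mu", dist.Normal(loc, scale))            # variational posterior

svi = SVI(model, guide, Adam({"lr": 0.05}), loss=Trace_ELBO())
for step in range(1000):
    svi.step(data)

print("posterior mean estimate:", pyro.param("loc").item())
```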

About LF Deep Learning
The LF Deep Learning Foundation, a Linux Foundation project, accelerates and sustains the growth of artificial intelligence, machine learning and deep learning open source projects. Backed by many of the world’s largest technology leaders, LF Deep Learning is a neutral space for harmonization and ecosystem engagement to advance AI, DL and ML innovation. To get involved with the LF Deep Learning Foundation, please visit https://www.deeplearningfoundation.org.

About The Linux Foundation
The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

# # #

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.

Media Contact
Nancy McGrory
The Linux Foundation
nmcgrory@linuxfoundation.org

Contributed by Uber, Horovod makes distributed deep learning fast and easy to use

SEATTLE – KubeCon + CloudNativeCon North America – December 13, 2018 – The LF Deep Learning Foundation, a community umbrella project of The Linux Foundation that supports and sustains open source innovation in artificial intelligence, machine learning, and deep learning, announces the Horovod project, started by Uber, as its newest project. Horovod, a distributed training framework for TensorFlow, Keras and PyTorch, improves speed, scale and resource allocation in machine learning training activities.

“The LF Deep Learning Foundation is focused on building an ecosystem of AI, deep learning and machine learning projects. Today’s announcement of Uber’s contribution of the Horovod project represents significant progress toward achieving this vision,” said Ibrahim Haddad, Linux Foundation Director of Research. “This project has proven highly effective in training machine learning models quickly and efficiently, and we look forward to working to further grow the Horovod community and encourage adoption of this exciting project.”

Horovod makes it easy to take a single-GPU TensorFlow program and train it on many GPUs, and it significantly improves GPU resource utilization. The project uses advanced algorithms and leverages features of high-performance networks to provide data scientists, researchers and AI developers with tooling to scale their deep learning models with ease and high performance. In benchmarking Horovod against standard distributed TensorFlow, Uber has observed large improvements in its ability to scale, with Horovod coming in roughly twice as fast.
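
As a rough sketch of that single-GPU-to-many-GPUs workflow, the following uses Horovod's Keras integration; the model and data pipeline are placeholders, and details vary across Horovod and TensorFlow versions.

```python
# Sketch of Horovod's Keras workflow; the model and dataset are placeholders.
# Launch with one process per GPU, e.g.: horovodrun -np 4 python train.py
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()

# Pin each process to its own GPU.
gpus = tf.config.experimental.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])

# Scale the learning rate by the number of workers and wrap the optimizer
# so gradients are averaged across processes with ring-allreduce.
opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))
model.compile(optimizer=opt, loss="sparse_categorical_crossentropy")

# Keep workers consistent by broadcasting rank 0's initial weights.
callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
# model.fit(train_dataset, callbacks=callbacks, epochs=...)  # dataset omitted
```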

Real-world activities Uber has used Horovod to support include self-driving vehicles, fraud detection, and trip forecasting. It is also being used by Alibaba, Amazon and NVIDIA. Contributors to the project outside Uber include Amazon, IBM, Intel and NVIDIA.

“Uber built Horovod to make deep learning model training faster and more intuitive for AI researchers across industries,” said Alex Sergeev, Horovod Project Lead. “In this spirit, we are honored to contribute Horovod to the deep learning community as the LF Deep Learning Foundation’s newest project. As Horovod continues to mature in its functionalities and applications, this collaboration will enable us to further scale its impact in the open source ecosystem for the advancement of AI.”

Horovod joins existing LF Deep Learning projects: Acumos AI, a platform and open source AI framework; Angel, a high-performance distributed machine learning platform based on Parameter Server; and EDL, an Elastic Deep Learning framework designed to help cloud service providers to build cluster cloud services using deep learning frameworks. Horovod complements these existing projects and future collaboration is anticipated between them.

Horovod Background

Contributed to the LF Deep Learning Foundation by Uber, the project currently has 175 commits from 26 committers, and is licensed under Apache-2.0.

Horovod, which has secured a Linux Foundation Core Infrastructure Initiative Best Practices Badge, is also included in deep learning distributions including AWS Deep Learning AMI, Azure Data Science VM, Databricks Runtime, GCP Deep Learning VM, IBM FfDL, IBM Watson Studio and NVIDIA GPU Cloud. More information on Horovod can be found on the Uber Engineering blog and in this Q&A with Horovod creator, Alex Sergeev.

Following recent news of Uber joining the Linux Foundation as a Gold member, Uber continues to deepen its contributions to open source technology. Another hallmark open source technology from Uber, Jaeger, is a Cloud Native Computing Foundation project.

Organizations and developers interested in contributing projects and learning more about the LF Deep Learning Foundation can go to www.deeplearningfoundation.org.

About LF Deep Learning

The LF Deep Learning Foundation, a Linux Foundation project, accelerates and sustains the growth of artificial intelligence, machine learning and deep learning open source projects. The initiative’s Acumos AI Project is a platform and open source framework that makes it easy to build, share and deploy AI models. Backed by many of the world’s largest technology leaders, LF Deep Learning is a neutral space for harmonization and ecosystem engagement to advance AI, DL and ML innovation. To get involved with the LF Deep Learning Foundation, please visit https://www.deeplearningfoundation.org.

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

# # #

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.

Machine Learning

Patrick Ball, Director of Research, Human Rights Data Analysis Group, offered examples of when statistics and machine learning have proved useful and when they’ve failed in this presentation from Open Source Summit Europe.

Machine learning and statistics are playing a pivotal role in finding the truth in human rights cases around the world – and serving as a voice for victims, Patrick Ball, director of Research for the Human Rights Data Analysis Group, told the audience at Open Source Summit Europe.

Ball began his keynote, “Digital Echoes: Understanding Mass Violence with Data and Statistics,” with background on his career, which started in 1991 in El Salvador, building databases. While working with truth commissions from El Salvador to South Africa to East Timor, with international criminal tribunals as well as local groups searching for lost family members, he said, “one of the things that we work with every single time is trying to figure out what the truth means.”

In the course of the work, “we’re always facing people who apologize for mass violence. They tell us grotesque lies that they use to attempt to excuse this violence. They deny that it happened. They blame the victims. This is common, of course, in our world today.”

Human rights campaigns “speak with the moral voice of the victims,” he said. It is therefore critical that statistics, including machine learning, are accurate, Ball said.

He gave three examples of when statistics and machine learning proved to be useful, and where they failed.

Finding missing prisoners

In the first example, Ball recalled his participation as an expert witness in the trial of a war criminal, the former president of Chad, Hissène Habré. Thousands of documents were presented, which had been discovered in a pile of trash in an abandoned prison and which turned out to be the operational records of the secret police.

The team homed in on one type of document, which detailed the number of prisoners held at the beginning of the day, the number held at the end of the day, and how the difference broke down among prisoners who were released, new prisoners brought in, prisoners transferred to other places, and those who had died during the course of the day. Dividing the number of people who died during the day by the number alive in the morning produces the crude mortality rate, he said.
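
In code, the calculation from a single day's record is a one-liner; the figures below are invented for illustration and are not numbers from the trial.

```python
# Crude mortality rate from one day's prison record.
# The figures are invented for illustration; they are not trial data.
alive_in_morning = 120   # prisoners held at the start of the day
died_during_day = 3      # deaths recorded that day

crude_mortality_rate = died_during_day / alive_in_morning
print(f"daily crude mortality rate: {crude_mortality_rate:.4f}")  # 0.0250
```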

The status of the prisoners of war was critical in the trial of Habré because the crude mortality rate was “extraordinarily high,” he said.

“What we’re doing in human rights data analysis is … trying to push back on apologies for mass violence. In fact, the judges in the [Chad] case saw precisely that usage and cited our evidence … to reject President Habré’s defense that conditions in the prison were nothing extraordinary.”

That’s a win, Ball stressed, since human rights advocates don’t see many wins, and the former head of state was sentenced to spend the rest of his life in prison.

Hidden graves in Mexico

In a more current case, the goal is to find hidden graves in Mexico of the bodies of people who have disappeared after being kidnapped and then murdered. Ball said they are using a machine learning model to predict where searchers are likely to find those graves in order to focus and prioritize searches.

Since they have a lot of information, his team decided to randomly split the cases into test and training sets and then train a model. “We’ll predict the test data and then we’ll iterate that split, train, test process 1,000 times,” he explained. “What we’ll find is that over the course of four years that we’ve been looking at, more than a third of the time we can perfectly predict the counties that have graves.”
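
A minimal sketch of that split, train, test loop might look like the following; the features, labels, and classifier are synthetic stand-ins, not details of HRDAG's actual model.

```python
# Sketch of the repeated split/train/test procedure described above.
# Data and classifier are synthetic stand-ins, not HRDAG's model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 12))        # county-level features (placeholder)
y = rng.integers(0, 2, 500)      # 1 = graves found in that county (placeholder)

perfect = 0
for seed in range(1000):         # iterate the split/train/test cycle 1,000 times
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    perfect += int((clf.predict(X_te) == y_te).all())

print(f"splits predicted perfectly: {perfect} of 1000")
```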

“Machine learning models are really good at predicting things that are like the things they were trained on,” Ball said.

A machine learning model can visualize the probability of finding mass graves by county, which generates press attention and helps with the advocacy campaign to bring state authorities into the search process, he said.

“That’s machine learning, contributing positively to society,” he said. Yet, that doesn’t mean that machine learning is necessarily positive for society as a whole.

Predictive Policing

Many machine learning applications “are terribly detrimental to human rights and society,” Ball stressed. In his final example, he talked about predictive policing: the use of machine learning to predict where crime is going to occur.

For example, Ball and his team looked at drug crimes in Oakland, California. He displayed a heat map of the density of drug use in Oakland, based on a public health survey, showing the highest drug use close to the University of California.

Ball and his colleagues re-implemented one of the most popular predictive policing algorithms and used it to predict crimes based on this data. He then showed the model running in animation, with dots on the grid representing drug arrests; the model made its predictions in precisely the same locations where the arrests had been observed, he said.

If the underlying data turns out to be biased, then “we recycle that bias. Now, biased data leads to biased predictions.” Ball went on to clarify that he was using the term bias in a technical, not racial sense.

When bias in data occurs, he said, it “means that we’re over-predicting one thing and under-predicting something else. In fact, what we’re under-predicting here is white crime,” he said. The machine learning model then teaches police dispatchers that they should go to the places they went before. “It assumes the future is like the past,” he said.
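
The feedback loop Ball describes can be reproduced in a toy simulation; everything below is synthetic and exaggerated purely to show how skewed observation gets recycled into skewed patrols, and it is not the algorithm his team re-implemented.

```python
# Toy bias-recycling loop: crime rates are identical in neighborhoods A and B,
# but arrests can only be observed where police patrol. All numbers synthetic.
import random

true_crime_rate = {"A": 0.10, "B": 0.10}   # identical underlying rates
patrol_share = {"A": 0.8, "B": 0.2}        # patrols start skewed toward A

for day in range(30):
    arrests = {}
    for n in true_crime_rate:
        # Observation probability depends on patrol presence, not true rates.
        p = min(1.0, true_crime_rate[n] * patrol_share[n] * 10)
        arrests[n] = 1 if random.random() < p else 0
    # "Predictive" step: send tomorrow's patrols where today's arrests were.
    total = sum(arrests.values()) or 1
    for n in patrol_share:
        patrol_share[n] = 0.9 * patrol_share[n] + 0.1 * (arrests[n] / total)

print(patrol_share)  # drifts further toward A despite equal crime rates
```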

“Machine learning in this context does not simply recycle racial disparities in policing, [it] amplifies the racial disparities in policing.” This, Ball said, “is catastrophic. Policing [is] already facing a crisis of legitimacy in the United States as a consequence of decades, or some might argue centuries, of unfair policing. ML makes it worse.”

“In predictive policing, a false positive means that a neighborhood can be systematically over policed, contributing to the perception of the citizens in that neighborhood that they’re being harassed. That erodes trust between the police and the community. Furthermore, a false negative means that police may fail to respond quickly to real crime,” he said.

When machine learning gets it wrong

Machine learning models produce variances and random errors, Ball said, but bias is a bigger problem. “If we have data that is unrepresentative of a population to which we intend to apply the model, the model is unlikely to be correct. It is likely to reproduce whatever that bias is in the input side.”

We want to know where a crime has occurred, “but our pattern of observation is systematically distorted. It’s not that [we] simply under-observe the crime, but under-observe some crime at a much greater rate than other crimes.” In the United States, he said, that tends to be distributed by race. Biased models are the end result of that.

The cost of machine learning being wrong can also destroy people’s lives, Ball said, and it raises the question of who bears that cost. You can hear more from Ball and learn more about his work in the complete video presentation below.

Hilary Mason, general manager for machine learning at Cloudera, discussed AI in the real world in her keynote at the recent Open FinTech Forum.

We are living in the future – it is just unevenly distributed with “an outstanding amount of hype and this anthropomorphization of what [AI] technology can actually provide for us,” observed Hilary Mason, general manager for machine learning at Cloudera, who led a keynote on “AI in the Real World: Today and Tomorrow,” at the recent Open FinTech Forum.

AI has existed as an academic field of research since the mid-1950s, and if the forum had been held 10 years ago, we would have been talking about big data, she said. But today, we have machine learning and feedback loops that allow systems to continue improving with the introduction of more data.

Machine learning provides a set of techniques that fall under the broad umbrella of data science. AI has returned, from a terminology perspective, Mason said, because of the rise of deep learning, a subset of machine learning techniques based around neural networks. Deep learning has provided not just more efficient capabilities but the ability to do things we couldn’t do at all five years ago.

Imagine the future

All of this “creates a technical foundation on which we can start to imagine the future,” she said. Her favorite machine learning application is Google Maps: Google gets real-time data from people’s smartphones and integrates it with public data sets, so the app can make predictions based on historical data, she noted.

Getting this right, however, is really hard. Mason shared an anecdote about how her name is a “machine learning edge case.” She shares her name with a British actress who passed away around 2005 after a very successful career.

Late in her career, the actress played the role of an ugly witch, and a search engine from 2009 combined photos with text results. At the time, Mason was working as a professor, and her bio was paired with the actress’s picture in that role. “Here she is, the ugly hag… and the implication here is obvious,” Mason said. “This named entity disambiguation problem is still a problem for us in machine learning in every domain.”

This example illustrates that “this technology has a tremendous amount of potential to make our lives more efficient, to build new products. But it also has limitations, and when we have conferences like this, we tend to talk about the potential, but not about the limitations, and not about where things tend to go a bit wrong.”

Machine learning in FinTech

Large companies operating complex businesses have a huge amount of human and technical expertise about where the ROI in machine learning would be, she said, because they also have huge amounts of data, generally created in the course of operating those businesses for some time. Mason’s rule of thumb when she works with companies is to find a clear ROI from a cost savings or process improvement using machine learning.

“Lots of people, in FinTech especially, want to start in security, anti-money laundering, and fraud detection. These are really fruitful areas because a small percentage improvement is very high impact.”

Other areas where machine learning can be useful are understanding your customers, churn analysis and marketing techniques, all of which are pretty easy to get started in, she said.

“But if you only think about the ROI in the terms of cost reduction, you put a boundary on the amount of potential your use of AI will have. Think also about new revenue opportunities, new growth opportunities that can come out of the same technologies. That’s where the real potential is.”

Getting started

The first thing to do, she said, is to “drink coffee, have ideas.” Mason said she visits lots of companies, and their lists of projects are always full of good ideas. “I get very worried, because you are missing out on a huge amount of opportunity that would likely look like bad ideas on the surface.”

It’s important to “validate against robust criteria” and create a broad sweep of ideas, then go through and validate capabilities. Some of the questions to ask: Is there research activity relevant to what you’re doing? Is there work in one domain you can transfer to another? Has somebody done something in another industry, or in an academic context, that you can use?

Organizations also need to figure out whether systems are becoming commoditized in open source, meaning “you have a robust software and infrastructure you can build on without having to own and create it yourself.” Then, the organization must figure out if data is available, either within the company or available to purchase.

Then it’s time to “progressively explore the risky capabilities. That means have a phased investment plan,” Mason explained. In machine learning, this is done in three phases, starting with validation and exploration: Does the data exist? Can you build a very simple model in a week?

“At each [phase], you have a cost gate to make sure you’re not investing in things that aren’t ready and to make sure that your people are happy, making progress, and not going down little rabbit holes that are technically interesting, but ultimately not tied to the application.”

That said, Mason noted that predicting the future is, of course, very hard, so people write reports on different technologies that are designed to be six months to two years ahead of what they would put in production.

Looking ahead

As progress is made in the development of AI, machine learning and deep learning, there are still things we need to keep in mind, Mason said. “One of the biggest topics in our field right now is how we incorporate ethics, how we comply with expectations of privacy in the practice of data science.”

She gave a plug to a short, free ebook called “Data Driven: Creating a Data Culture,” that she co-authored with DJ Patil, who worked as chief data scientist for President Barack Obama. Their goal, she said, is “to try and get folks who are practicing out in the world of machine learning and data science to think about their tools [and] for them to practice ethics in the context of their work.”

Mason ended her presentation on an optimistic note, observing that “AI will find its way into many fundamental processes of the businesses that we all run. So when I say, ‘Let’s make it boring,’ I actually think that’s what makes it more exciting.”

You can watch the complete presentation below:

Acumos

First Acumos AI release enables deployment of AI applications in private/public cloud environments, and enhances the platform user interface experience

SHANGHAI (KUBECON + CLOUDNATIVECON CHINA) – November 14, 2018 – The LF Deep Learning Foundation, a project of The Linux Foundation that supports open source innovation in artificial intelligence (AI), machine learning (ML), and deep learning (DL), today announced the availability of its first software release of the Acumos AI Project – Athena.

Acumos AI is a platform and open source framework that makes it easy to build, share and deploy AI applications. Acumos AI standardizes the infrastructure stack and components required to run an out-of-the-box general AI environment. This frees data scientists and model trainers to focus on their core competencies and accelerate innovation.

“The Acumos Athena release represents a significant step forward in making AI models more accessible for builders of AI applications and models along with users and trainers of those models and applications,” said Scott Nicholas, senior director of strategic planning at The Linux Foundation. “This furthers the goal of LF Deep Learning and the Acumos project of accelerating overall AI innovation.”

Major highlights of the Athena release include:

  • One-click deployment of the platform utilizing Docker or Kubernetes;
  • The ability to deploy models into a public or private cloud infrastructure or in a Kubernetes environment on users’ own hardware including servers and virtual machines;
  • A design studio, which is a graphical interface for chaining together multiple models, data translation tools, filters and output adapters into a full end-to-end solution;
  • Use of a security token to allow simple onboarding of models from an external toolkit directly to an Acumos AI repository;
  • Decoupling of microservices generation from the model onboarding process to easily repurpose models for different environments and hardware; and
  • An advanced user portal with the ability to personalize the marketplace view by theme, display data on model authorship, and share models privately or publicly, along with user experience upgrades.

All of these features are designed to make it quick and easy to get started deploying and sharing Acumos AI applications.

Full release notes can be accessed at https://wiki.acumos.org/display/REL/Athena+Release.

“LF Deep Learning members, including Amdocs, AT&T, Orange, Tech Mahindra and others, are contributing to the evolution of the platform to ease the onboarding and the deployment of AI models,” said LF Deep Learning Outreach Committee Chair Jamil Chawki. “The Acumos AI Marketplace, a catalog of community-contributed AI models with common APIs that can be shared securely across multiple systems, remains open and accessible to anyone who wishes to download or contribute models and applications.”

“The LF Deep Learning Foundation is focused on building an ecosystem of AI, deep learning and machine learning projects, and today’s announcement represents a significant milestone toward achieving this vision,” said LF Deep Learning Technical Advisory Council Chair Ofer Hermoni of Amdocs.

“We’re already inspired and energized by the progress of the Acumos AI Project since its initial launch earlier this year,” said Mazin Gilbert, Vice President of Advanced Technology and Systems at AT&T and Governing Board Chair of LF Deep Learning. “Athena is the next step in harmonizing the AI community, furthering adoption and accelerating innovation.”

What’s Next for Acumos AI

The developer community for Acumos AI is already working on the next release, which will be available in mid-2019, introducing convenient model training as well as data extraction pipelines to make models more flexible. Additionally, the next release will include updates to assist closed-source model developers, including secure and reliable licensing components to provide execution control and performance feedback across the community.

Organizations interested in contributing projects or seeking more information about the LF Deep Learning Foundation can go to www.deeplearningfoundation.org.

Supporting Quotes

“Amdocs is proud to be an active member of The Linux Foundation, and in particular, a Founding Member of the Acumos AI project. The Acumos AI Athena release is a big milestone for open source AI and its role in driving intelligence, automation and machine learning in the communication and media industries.  Acumos AI is positioned to become the de-facto marketplace for machine learning models, both for open source as well as peer to peer (company to company) models, delivering faster time to innovation.” – Anthony Goonetilleke, Group President of Amdocs Technology

“As one of the founding members of LF Deep Learning Foundation, Huawei is excited to see this growing AI developer ecosystem continue to help the Acumos AI platform mature, and reach the first major technical milestone – the Acumos Athena release! The Acumos project offers significant value to Huawei SoftCOM AI strategy – building autonomous networks with the goal of automation, self-optimization, and self-healing to help operators significantly improve network utilization and maintenance efficiency. Huawei is proud to be part of this project and will continue to work with the Acumos developer community to further enhance the Acumos AI platform and unleash the power that AI can bring to our operators.” – Xiaoli Jiang, GM, Cloud Open Source Development Team, Huawei

“Orange has been actively involved in Acumos since April 2018 through a Project Team Leader for the model onboarding module to manage and drive evolutions in the onboarding capabilities. This involvement shows the willingness of Orange to take part and promote the AI ecosystem in the telecom domain. Orange also considered the coherency of integrating Acumos in the continuity of all the works performed in the network automation LFN/ONAP project. Acumos is seen as a common platform that can bridge existing AI technologies and new ones through its openness. It can also favor cross business AI-based developments through its federative approach and thanks to its marketplace.” – François Jezequel, Head of ITsation, Procurement and Operators, Orange

“As a key contributor to the Acumos Athena release we are excited to foster a collaborative ecosystem through the Acumos AI Marketplace. It is also a testimony to our COPA framework (Co-Create using Open Source to create Platforms and bring in Automation and AI). With companies and customers increasingly adopting Acumos, industrialization of AI through a common standard and marketplace like Acumos is bound to gain more traction.” – Dr. Satish Pai Sr. Vice President, Americas Communications, Media and Entertainment, Tech Mahindra

About LF Deep Learning

The LF Deep Learning Foundation, a Linux Foundation project, accelerates and sustains the growth of artificial intelligence, machine learning and deep learning open source projects. Backed by many of the world’s largest technology leaders, LF Deep Learning is a neutral space for harmonization and ecosystem engagement to advance AI, DL and ML innovation. To get involved with the LF Deep Learning Foundation, please visit https://www.deeplearningfoundation.org.

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

# # #

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.

Contributed by Baidu and Tencent, new projects accelerate machine learning, artificial intelligence and deep learning innovation and opportunity

SAN FRANCISCO and RIO DE JANEIRO (VLDB 2018) – August 27, 2018 – The LF Deep Learning Foundation, a community umbrella project of The Linux Foundation that supports and sustains open source innovation in artificial intelligence, machine learning, and deep learning, announces that two additional projects have been accepted into the foundation: the Angel Project and the EDL Project. The LF Deep Learning Foundation is focused on building an ecosystem of AI, deep learning and machine learning projects, and today’s announcement represents a significant milestone toward achieving this vision.

The LF Deep Learning Foundation already includes the Acumos AI platform and open source framework, which makes it easy to build, share and deploy AI apps. With these two new projects, the foundation adds technology finely tuned for big data and for deep learning on clusters, via Baidu’s PaddlePaddle and Kubernetes container orchestration.

Angel Project Background

The Angel Project is a high-performance distributed machine learning platform based on Parameter Server, running on YARN and Apache Spark. It is tuned for performance with big data and provides advantages in handling higher dimension models. It supports big and complex models with billions of parameters, partitions parameters of complex models into multiple parameter-server nodes and implements a variety of machine learning algorithms using efficient model-updating interfaces and functions, as well as flexible consistency models for synchronization.
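
For readers unfamiliar with the pattern, here is a schematic sketch of the parameter-server update cycle that platforms like Angel build on. It illustrates the general architecture in single-process form only; it is not Angel's API.

```python
# Schematic parameter-server update cycle (the general pattern, not Angel's API).
import numpy as np

class ParameterServer:
    """Holds a partition of the model and applies updates pushed by workers."""
    def __init__(self, dim):
        self.weights = np.zeros(dim)

    def pull(self):
        return self.weights.copy()

    def push(self, gradient, lr=0.1):
        self.weights -= lr * gradient      # apply a worker's gradient

def worker_step(server, X, y):
    w = server.pull()                      # fetch current parameters
    grad = X.T @ (X @ w - y) / len(y)      # least-squares gradient on local shard
    server.push(grad)                      # send the update back

rng = np.random.default_rng(0)
X, true_w = rng.normal(size=(200, 3)), np.array([1.0, -2.0, 0.5])
y = X @ true_w

server = ParameterServer(dim=3)
for epoch in range(50):
    for shard in np.array_split(np.arange(200), 4):   # four "workers"
        worker_step(server, X[shard], y[shard])

print(server.weights)  # converges toward [1.0, -2.0, 0.5]
```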

Contributed to the LF Deep Learning Foundation by member Tencent, the project currently has more than 1,000 commits and is licensed under Apache-2.0.

“Angel shares a common goal with the LF Deep Learning Foundation: to make deep learning easier to use. By becoming a part of the LF Deep Learning Foundation, we believe Angel will be more active in the open source community, accumulate more use cases, expand usage scenarios and actively cooperate with other partners,” said Xiaolong Zhu, Tencent senior AI researcher and TAC member of the LF Deep Learning Foundation. “As a new project under the Foundation, Angel will continue working on a consistent and continuous user experience to make deep learning technology easier to apply and develop.”

The system is designed for efficient iterative computation so that machine learning algorithms can benefit from it. Algorithms in Angel work out of the box, so analysts and data scientists can submit jobs without writing a single line of code.

EDL Project Background

EDL is an Elastic Deep Learning framework designed to help cloud service providers build cluster cloud services using deep learning frameworks such as PaddlePaddle and TensorFlow.

EDL includes a Kubernetes controller and a PaddlePaddle auto-scaler, which adjusts the number of processes in a distributed job to match the idle hardware resources in the cluster, as well as a new fault-tolerant architecture.
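
The elasticity idea reduces to a simple scaling decision, sketched schematically below; this illustrates the concept only and is not EDL's actual controller logic.

```python
# Schematic elastic-scaling decision: grow a training job into idle capacity,
# shrink it under pressure. Illustrative only, not EDL's controller code.
def rescale_job(current_procs, idle_cpus, min_procs=1, max_procs=64):
    """Return the new process count for a distributed training job."""
    if idle_cpus > 0:
        return min(current_procs + idle_cpus, max_procs)  # expand into idle CPUs
    return max(current_procs - 1, min_procs)              # yield resources

print(rescale_job(current_procs=4, idle_cpus=6))  # -> 10: cluster has slack
print(rescale_job(current_procs=4, idle_cpus=0))  # -> 3: cluster is full
```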

Contributed by member Baidu, the project currently has nearly 1,000 commits and uses the Apache-2.0 license.

“We are excited to see that EDL has been accepted to LF Deep Learning Foundation,” said Yanjun Ma, Head of Deep Learning Technology Department, Baidu. “As an elastic deep learning framework for PaddlePaddle, we believe that EDL will substantially benefit the deployment of large-scale deep learning services, and the broader deep learning open source community.”

Organizations interested in contributing projects and learning more about the LF Deep Learning Foundation can go to www.deeplearningfoundation.org.

About LF Deep Learning

The LF Deep Learning Foundation, a Linux Foundation project, accelerates and sustains the growth of artificial intelligence, machine learning and deep learning open source projects. The initiative’s Acumos AI Project is a platform and open source framework that makes it easy to build, share and deploy AI models. Backed by many of the world’s largest technology leaders, LF Deep Learning is a neutral space for harmonization and ecosystem engagement to advance AI, DL and ML innovation. To get involved with the LF Deep Learning Foundation, please visit https://www.deeplearningfoundation.org.

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

# # #

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.

New event hosted by The Linux Foundation brings together leaders in open source technology and the financial services industry

SAN FRANCISCO, August 23, 2018 – The Linux Foundation, the nonprofit organization enabling mass innovation through open source, today announced the agenda of sessions and speakers for Open FinTech Forum, taking place October 10-11 in New York.

Featured sessions include:

  • Enterprise Blockchain Adoption – Trends and Predictions – Saurabh Gupta, HfS Research
  • Build Intelligent Applications with Azure Cognitive Service and CNTK – Bhakthi Liyanage, Bank of America
  • Why Two Sigma Contributes to Open Source – Julia Meinwald, Two Sigma
  • Real-World Kubernetes Use Cases in Financial Services: Lessons learned from Capital One, BlackRock and Bloomberg – Jeffrey Odom, Capital One; Michael Francis, BlackRock; Kevin Fleming, Bloomberg; Paris Pittman, Google; and Ron Miller, TechCrunch
  • Smart Money Bets on Open Source Adoption in AI/ML Fintech Applications – Laila Paszti, GTC Law Group P.C.
  • Three Cs to an Open Source Program Office – Justin Rackliffe, Fidelity Investments
  • Distributed Ledger Technology Deployments & Use Cases in Financial Services – Hanna Zubko, IntellectEU; Jesse Chenard, MonetaGo; Umar Farooq, JP Morgan; Julio Faura, Santander Bank; and Robert Hackett, Fortune

Updated keynotes include:

  • AI in the Real World: Today and Tomorrow – Hilary Mason, Cloudera
  • Adapting Kubernetes for Machine Learning Workflows – Ania Musial and Keith Laban, Bloomberg

The full agenda is available here.

Focusing on the intersection of financial services and open source, Open FinTech Forum will provide CIOs and senior technologists guidance on building internal open source programs as well as an in-depth look at cutting-edge open source technologies, including AI, blockchain/distributed ledger and Kubernetes/containers, which drive efficiencies and flexibility.

Registration is $995 for both days of the event. Those wishing to attend only the first day, which consists of keynotes along with a series of tutorials and working discussions around containers, cloud native, blockchain, establishing an open source program office, using open source, complying with open source licenses and contributing to open source, may do so for $449. Registration for only the second day, which includes technical tracks and keynotes, is $649. Additional academic, non-profit and CIO discounts are available as well; details are available on the event registration page.

The Linux Foundation events are where the world’s leading technologists meet, collaborate, learn and network in order to advance innovations that support the world’s largest shared technologies.

Members of the press who would like to request a press pass to attend should contact Dan Brown at dbrown@linuxfoundation.org.

Open FinTech Forum is made possible thanks to sponsors Black Duck by Synopsys, Cloud Native Computing Foundation, GitLab and Sensu.

Additional Resources

YouTube: Why Attend Linux Foundation Events (https://youtu.be/X_rLxfmLlYY)

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage.

Linux is a registered trademark of Linus Torvalds.

# # #

Foundation welcomes 5 new members and new Governing Board Chair Mazin Gilbert to expand global footprint to accelerate AI innovation

SAN FRANCISCO, August 8, 2018 – The LF Deep Learning Foundation, an umbrella organization of The Linux Foundation that supports and sustains open source innovation in artificial intelligence, machine learning, and deep learning, today announced five new members: Ciena, DiDi, Intel, Orange and Red Hat. The support of these new members will provide additional resources to the community to develop and expand open source AI, ML and DL projects, such as the Acumos AI Project, the foundation’s comprehensive platform for AI model discovery, development and sharing.

These companies join founding members Amdocs, AT&T, B.Yond, Baidu, Huawei, Nokia, Tech Mahindra, Tencent, Univa and ZTE. The LF Deep Learning Foundation is a neutral space for harmonization and acceleration of separate technical projects focused on AI, ML and DL technologies.

“We are very pleased to build off the launch momentum of the LF Deep Learning Foundation and welcome new members with vast resources and technical expertise to support our growing community and ecosystem of AI projects,” said Lisbeth McNabb, Chief Operating Officer of The Linux Foundation.

Mazin Gilbert, Vice President of Advanced Technology and Systems at AT&T, has also been elected to the role of Governing Board Chair of LF Deep Learning. This position leads the board in supporting various AI and ML open source projects, including infrastructure and support initiatives related to each project.

“The Deep Learning Foundation is a significant achievement by the open source community to drive harmonization among tools and platforms in deep learning and artificial intelligence,” said Mazin Gilbert, Vice President of Advanced Technology and Systems at AT&T. “This effort will enable an open marketplace of analytics and machine learning capabilities to help expedite adoption and deployments of DL solutions worldwide.”

LF Deep Learning will also host a half-day workshop at the upcoming Open Source Summit, August 28 in Vancouver, BC, which will provide a scope review of the Acumos AI project, a full-stack AI overview, an overview of new projects and details on how to get involved.

Additionally, the inaugural Acumos AI Day was recently hosted by Orange at its Paris offices and garnered so much interest that a worldwide tour of these events is planned.

Support Quotes from New LF Deep Learning Members

Ciena

“The progression of artificial intelligence and machine learning technologies calls for a shift in how we design and implement networks and services,” said Adan Pope, Chief Information Technology Officer at Ciena. “Joining the LF Deep Learning community supports our AI strategy, reinforces our mission to drive intelligent network and service automation, and puts us in a stronger position to help shape the future of this evolving industry.”

DiDi

“DiDi is excited to support the LF Deep Learning Foundation, which is helping fill an important gap in the AI space by providing a space for the open source community to innovate,” said Wensong Zhang, Senior Vice President at DiDi. “We look forward to collaborating with the foundation and global open source and AI communities to develop useful solutions now and into the future.”

Intel Corporation

“Intel is a longtime believer in democratizing technology and making it accessible to all developers. We’re taking that same approach to our AI work today and look forward to working closely with the LF Deep Learning Foundation and AI community,” said Carlos Morales, Senior Director of Deep Learning Systems, AI Product Group, Intel.

Orange

“We believe that the LF Deep Learning Foundation and the Acumos project will accelerate the development of telecom use cases, in an open source environment for communication services, networks, security, and customer care,” said Jamil Chawki, Director IT of Cloud Standards and Open Source at Orange. “By providing a common framework for machine learning and deep learning, LFDL will contribute to lowering the barriers to AI innovation for Telcos.”

Red Hat

“Deep learning has the potential to change everything about how we learn from data,” said Chris Wright, Vice President and CTO at Red Hat. “Open source communities are at the heart of advancing deep learning frameworks and we’re excited to see further collaboration with the LF Deep Learning Foundation around model discovery, development, and lifecycles, and bringing open source software development best practices to deep learning models.”

About LF Deep Learning

The LF Deep Learning Foundation, a Linux Foundation project, accelerates and sustains the growth of artificial intelligence, machine learning and deep learning open source projects. The initiative’s Acumos AI Project is a platform and open source framework that makes it easy to build, share and deploy AI models. Backed by many of the world’s largest technology leaders, LF Deep Learning is a neutral space for harmonization and ecosystem engagement to advance AI, DL and ML innovation. To get involved with the LF Deep Learning Foundation, please visit https://www.deeplearningfoundation.org.

About The Linux Foundation

The Linux Foundation is the organization of choice for the world’s top developers and companies to build ecosystems that accelerate open technology development and industry adoption. Together with the worldwide open source community, it is solving the hardest technology problems by creating the largest shared technology investment in history. Founded in 2000, The Linux Foundation today provides tools, training and events to scale any open source project, which together deliver an economic impact not achievable by any one company. More information can be found at www.linuxfoundation.org.

# # #

The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.