Posts

When people talk about cloud native applications, you almost inevitably hear a reference to a success story built on Apache Mesos as an application delivery framework at tremendous scale. With adoption at Twitter, Uber, Netflix, and other companies looking for scale and flexibility, Mesos abstracts resources (CPU, memory, storage, etc.) in a way that lets distributed applications run in fault-tolerant and elastic environments. The Mesos kernel exposes these abstractions through APIs and scheduling capabilities in much the same way the Linux kernel does, but geared toward consumption at the application layer rather than the systems layer.

Benjamin Hindman (@benh), the co-creator of Apache Mesos, developed the open source powerhouse as a Ph.D. student at UC Berkeley before bringing it to Twitter. The software now runs on tens of thousands of machines powering Twitter's data centers and is often credited with killing the fail whale and providing the scale Twitter needed to serve its growing base of over 300 million users. It's also driving a huge groundswell of companies developing cloud native applications.

Ben, now founder of Mesosphere, will give the welcome address at MesosCon North America, the Apache Mesos conference taking place in Denver on June 1-2. The event is a veritable who's who from across the industry of those using Mesos as a framework for developing cloud native applications.

MesosCon is a great place to learn how to design application clusters running on Apache Mesos from engineers who have done it, like Craig Neth (@cneth), distinguished member of the technical staff at Verizon, who will walk attendees through how Verizon got a 600-node Mesos cluster powered up and running tasks in 14 days.

Your Uber has arrived, thanks to Open Source Software

Traditionally, machines at Uber were statically partitioned across the different services. In an effort to increase machine utilization, Uber has recently started transitioning most of its services, including storage services, to run on top of Apache Mesos.

At MesosCon, Uber engineers will describe their initial experience building and operating a framework for running Cassandra on top of Mesos across multiple data centers. This framework automates several Cassandra operations such as node repairs, the addition of new nodes, and backup/restore. It improves efficiency by co-locating CPU-intensive services as well as multiple Cassandra nodes on the same Mesos agent. It also handles failure and restart of Mesos agents by using persistent volumes and dynamic reservations.

Running Cassandra on Apache Mesos Across Multiple Datacenters at Uber at MesosCon

Microservices: Allowing Us to Binge-Watch House of Cards on Netflix

Netflix customers worldwide streamed more than forty-two billion hours of content last year. Service-style applications, batch jobs, and stream processing alike, across a variety of use cases at Netflix, rely on running container-based applications in multi-tenant clusters powered by Apache Mesos and Fenzo, a Java scheduler library for Apache Mesos frameworks. These applications consume microservices, which allows Netflix to build composable applications at massive scale.

Drawing on experiences from the Netflix projects Mantis and Titus, Netflix software engineer Sharma Podila (@podila) will share his experience running Docker- and cgroups-based containers in a cloud native environment.

Lessons from Netflix Mesos Clusters at MesosCon.

How Microservices Are Being Implemented at Adobe

Dragos Dascalita Haut is a solutions architect on Adobe's API platform, adobe.io, building a high-scale distributed API gateway running in the cloud. He realized that as the number of microservices increases, the communication between them becomes more complicated, which brings new questions to light:

How do microservices authenticate?
How do we monitor who’s using the APIs they expose?
How do we protect them from attacks?
How do we set throttling and rate limiting rules across a cluster of microservices?
How do we control which services allow public access and which ones we want to keep private?
And what about Mesos APIs and frameworks? Can they benefit from these features as well?

The answer to these questions was to use an API management layer on Mesos to expose microservices in a secure, managed, and highly available way.

Let Dragos teach you to Be a Microservices Hero at MesosCon.

MesosCon in the Mile High City, June 1-2

If you are interested in hearing how Apache Mesos is being developed and deployed by the world's most interesting and progressive companies, the place to be is MesosCon, June 1-2 in Denver. The conference features two days of sessions covering the Apache Mesos core, the ecosystem that has developed around the project, and related technologies. The program includes workshops for getting started with Apache Mesos, keynotes from industry leaders, and sessions led by adopters and contributors.


ApacheCon North America and Apache Big Data are coming up in just a few weeks, and they are events that Apache and open source community members won't want to miss.

Apache products power half the Internet, manage exabytes of data, execute teraflops of operations, store billions of objects in virtually every industry, and enhance the lives of countless users and developers worldwide. And behind those projects is a thriving community of more than 4,500 committers from around the world.

ApacheCon, the annual conference of The Apache Software Foundation, is the place where all of those users and contributors can meet to collaborate on the next generation of cloud, Internet, and big data technologies.

Here, five attendees of last year's ApacheCon and Apache Big Data explain how they benefited from the conferences.

1. Learn from experienced developers

“You meet the best people around the globe who share the same passion for software and sharing. It’s great listening to experienced senior programmers and the interesting use cases they have been solving.” – Yash Sharma, a contributor to Apache Drill and Apache Calcite, and a committer to Apache Lens.

2. Reach consensus faster

“You’re able to meet with some of the folks and talk about things that may take more time than on the (mailing) lists. You’re able to exchange ideas before bringing them to the community. Face to face can have a huge impact on attitude and interaction moving forward. Sometimes it’s tough to put tone in email, so it’s good to share in a personal manner.” – Jeff Genender, who is involved in several Apache projects including Camel, CXF, ServiceMix, Mina, TomEE, and ActiveMQ.

3. Meet your ecosystem partners

“I had the opportunity to talk with committers and PMC members of other projects that are built on top of Apache jclouds. At the time of ApacheCon we had to make some unpopular decisions such as dropping support for unmaintained providers, or rejecting some pull requests that had little hope to progress, and one of the objectives I had was to directly discuss with the jclouds ecosystem which impact that could have, how the projects could collaborate better, and how we could better align our roadmaps.” – Ignasi Barrera, Chair of Apache jclouds.

4. Explore other open source projects

“For me ApacheCon is all about community. I met so many great people, had a lot of thoughtful conversations, and heard about dozens of very interesting projects I had no idea existed.” – Andriy Redko, who participates in Apache CXF.

5. Meet your family

“Only after the ApacheCon did I understand the real power of Apache. For me, before ApacheCon it was just a group of geeks who try to write awesome code to make the world a better place, but now I feel like I’m a member of a huge family who cares very much for each other. It was like, what it seems to be a code base became home for me and now I’m not just trying to improve the code base but rather to make the family bigger in every aspect.” – Dammina Sahabandu, who’s involved in Apache Bloodhound.

ApacheCon North America and Apache Big Data take place May 11-13 in Vancouver, B.C.


Register Now for ApacheCon North America

Register Now for Apache: Big Data


While our updated Linux.com boasts a clean look and fresh interface for our users, there's also an entirely new infrastructure stack that we're happy to take you on a tour of. Linux.com serves over two million page views a month, providing news and technical articles as well as hosting a dynamic community of logged-in users in the forums and Q&A sections of our site.

The previous platform running Linux.com suffered from several scalability problems. Most significantly, it had no native ability to cache and re-serve the same pages to anonymous visitors, but beyond that, the underlying web application and custom code were also slow to generate each individual page view.

The new Linux.com is built on Drupal, an open source content management platform (or web development framework, depending on your perspective). By default, Drupal serves content in such a way as to ensure that pages served to anonymous users are general enough (not based on sessions or cookies) and carry the correct flags (HTTP cache-control headers) to allow Drupal to be placed behind a stack of standards-compliant caches. This improves the performance and reliability of both page content (for anonymous visitors) and static content like images (for all visitors, including logged-in users).
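
As a quick illustration of what this looks like from the outside, the sketch below fetches the front page and prints the response headers a shared cache would act on. It is illustrative only; the exact headers and values depend on the Drupal version and cache settings.

    # Illustrative only: inspect the cache-related headers an anonymous request
    # receives. Exact header names and values depend on Drupal version and config.
    import requests

    resp = requests.get("https://www.linux.com/", timeout=10)

    for header in ("Cache-Control", "Expires", "Vary", "Age", "X-Cache"):
        print(f"{header}: {resp.headers.get(header, '<not set>')}")

    # A page that upstream caches can safely re-serve to anonymous visitors
    # typically advertises something like "Cache-Control: public, max-age=..."
    # and does not set a per-user session cookie.
    print("Set-Cookie present:", "Set-Cookie" in resp.headers)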

The Drupal open source ecosystem provides many modular components that can be assembled in different ways to build functionality. One advantage of reusable smaller modules is the combined development effort of the many developers building sites with Drupal who use, reuse, and improve the same modules. While developers may appreciate more features or fewer bugs in code refined by years of development, on the operations side this often translates into consistent use of performance best practices, like widespread application caching and extensible backends that can swap between basic and high-availability configurations.

Linux.com takes the performance-ready features of Drupal and combines them with the speed and agility of the Fastly CDN network to re-serve our content from locations around the world that are closer to our site visitors and community. Fastly is a contributor to and supporter of open source software, including the open source Varnish cache. Their use of Varnish provides an extra level of transparency into the caching configuration, making it easier to integrate with the rest of our stack. Fastly provides a very flexible cache configuration interface and, as an added bonus, lets you add your own custom Varnish VCL configuration. The Drupal ecosystem already provides a module to integrate with Fastly, which in typical Drupal fashion doesn't reinvent the wheel but leverages the Expire module, a robust community module that provides configurable cache-clearing triggers and external integrations and is used on over 25,000 sites (as of April 2016).
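
To give a feel for what those cache-clearing triggers do at the edge, here is a rough Python illustration of purging a single URL from Fastly by sending it an HTTP PURGE request. In our stack the Expire and Fastly modules handle this automatically from PHP; the URL and the optional API key below are assumptions for the sketch, not our actual configuration.

    # Rough illustration: Fastly can purge a single cached URL when it receives
    # an HTTP PURGE request for it. The Drupal Expire + Fastly modules do this
    # for us automatically; the URL and API key here are placeholders.
    import requests

    def purge_url(url, fastly_api_key=None):
        headers = {}
        if fastly_api_key:
            # Some service configurations require authenticated purges.
            headers["Fastly-Key"] = fastly_api_key
        resp = requests.request("PURGE", url, headers=headers, timeout=10)
        return resp.status_code

    if __name__ == "__main__":
        # e.g. purge the front page after an article is published or updated
        print(purge_url("https://www.linux.com/"))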

While Varnish provides a very powerful cache configuration language, Linux.com also uses another caching reverse proxy, NGINX, as an application load balancer in front of our FastCGI Drupal application servers. While NGINX is less flexible for advanced caching scenarios, it is also a full-featured web server. This allows us to use NGINX to re-serve some cached dynamic content from our origin to Fastly while also serving the static portions of our site (like uploads and aggregated CSS and JS, which are shared between NGINX and our PHP backends over NFS). We run two bare-metal NGINX load balancers to distribute this load, with Pacemaker providing highly available virtual IPs. We also use separate bare-metal servers to horizontally scale out our Drupal application servers, which run PHP-FPM, the PHP FastCGI Process Manager. Our NGINX load balancers maintain a pool of FastCGI connections to all the application backends (that's right, no Apache httpd needed!).

We're scaling out the default Drupal caching system by using Redis, which provides much faster key/value storage than keeping the cache in a relational database. We've opted for a Redis master/slave replication configuration, with Redis Sentinel handling master failover and providing a configuration store that Drupal queries for the current master. Each Drupal node runs its own Redis Sentinel process for near-instant master lookups. Of course, the cache isn't designed to store everything, so we have separate database servers holding Linux.com's data. These are in a fairly typical MySQL replication setup, using slaves to scale out reads and for failover.
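
Drupal's Redis client performs that lookup in PHP; the same Sentinel pattern, sketched minimally in Python with the redis-py library, looks roughly like this (the master name, host, and port are placeholders, not our actual configuration):

    # Minimal sketch of Sentinel-based master discovery using redis-py.
    # The master name, host, and port are placeholders.
    from redis.sentinel import Sentinel

    # Each application node talks to its local Sentinel for near-instant lookups.
    sentinel = Sentinel([("127.0.0.1", 26379)], socket_timeout=0.5)

    # Ask Sentinel which node currently holds the master role...
    host, port = sentinel.discover_master("mymaster")
    print(f"current master: {host}:{port}")

    # ...or get a client that transparently follows failovers.
    master = sentinel.master_for("mymaster", socket_timeout=0.5)
    master.set("cache:example", "value")
    print(master.get("cache:example"))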

Finally, we've replaced the default Drupal search system with a search index powered by SolrCloud: multiple Solr servers in replication, with cluster coordination provided by ZooKeeper. We're using the Drupal Search API with the Solr backend module, which points to an NGINX HTTP reverse proxy that load balances across the Solr servers.
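
For a sense of what the application sees, a query against Solr's standard select handler through that proxy looks roughly like the following; the proxy hostname and core name are placeholders rather than our real configuration.

    # Illustrative query against Solr's standard /select handler, routed through
    # the NGINX reverse proxy in front of the SolrCloud replicas. The proxy
    # hostname and core name are placeholders.
    import requests

    SOLR_BASE = "http://solr-proxy.example.internal/solr/linuxdotcom"

    params = {
        "q": "mesos",   # full-text query
        "rows": 5,      # number of results to return
        "wt": "json",   # response format
    }
    resp = requests.get(SOLR_BASE + "/select", params=params, timeout=10)
    resp.raise_for_status()

    for doc in resp.json()["response"]["docs"]:
        print(doc.get("id"))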

I hope you’ve enjoyed this tour and that it sparks some ideas for your own infrastructure projects. I’m proud of the engineering that went into assembling these—configuration nuances, tuning, testing, and myriad additional supporting services—but it’s also hard to do a project like this and not appreciate all the work done by the individual developers and companies who contribute to open source and have created incredible open source technologies. The next time the latest open source pro blog or technology news loads snappily on Linux.com, you can be grateful for this too!