The Question of Docker, The Future of OS Virtualization

In this article I’m going to take a look at Docker and OS virtualization independently of each other. There’s a reason for that, which will unfold as I dig through some data and look at what is and isn’t happening in the virtualization space.

It’s also important to note what methods were used to obtain the information provided in this article. I have gathered information by speaking with Docker employees and key executives, including Ben Golub and founder Solomon Hykes, over the years since the founding of Docker (and its previous incarnation dotCloud, before the pivot and name change to Docker).

Beyond communicating directly with the Docker team and gaining insight from them I have also done a number of interviews over the course of 4 days. These interviews have followed a fairly standard set of questions and conversation about the Docker technology, including but not limited to the following questions.

  • What is your current use of Docker virtualization technologies?
  • What is your future intended use of Docker technologies?
  • What is the general current configuration and setup of your development team(s) and the tooling they use (e.g. stack: .NET, Java, Python, Node.js, etc.)?
  • Do you find it helps you to move forward faster than without?

The History of OS-Level Virtualization

First, let’s take a look at where virtualization has been, then I’ll dive into where it is now, and then I’ll take a look at where it appears to be going in the future and derive some information from the interviews and discussions that I’ve had with various teams over the last 4 days.

The Short of It

OS-level virtualization allows software to be installed into a complete file system, just as it would be on a hypervisor-based virtual server, but with dramatically faster installation and prospectively better speed overall, because it shares the host OS instead of booting a separate guest OS. This cuts down on redundancy between the core system and the respective virtual clients on the host.

Virtualization as a concept has been around since the 1960s, with IBM being heavily involved at the Cambridge Scientific Center. Developments continued over time, but the real breakthrough in pushing virtualization into the market was VMware in 1999 with their virtual platform. Hypervisor-level virtualization then grew into a huge industry with the help of VMware.

However, OS-level virtualization, which is what Docker is based on, didn’t take off immediately when introduced. Many product options came out over time around OS-level virtualization, but nothing made a splash in the industry similar to what Docker has. Docker was released in 2013, and developer demand and usage have been increasing ever since.

Timeline of Virtualization Development

Docker really brought OS-level virtualization to the developer community at the right time, given the demands around web development and new ways to implement effective continuous delivery of applications. Docker has been one of the most extensively used OS-level virtualization tools to implement immutable infrastructure, continuous build, integration, and deployment environments, and to use as a general virtual environment to spool up resources as needed for development.

Where We Are With Virtualization

Currently Docker holds a pretty dominant position in the OS-level virtualization market space. Let’s take a quick review of their community statistics and involvement from just a few days ago.

The Stats: Docker on Github -> https://github.com/docker/docker

Watchers: 2017
Starred: 22941
Forks: 5617

16,472 Commits
3 Branches
102 Releases
983 Contributors

Just from that data we can ascertain that the Docker community is active. We can also take a deeper look into the forks, pull requests, their acceptance rate, and related data to find that the overall codebase is healthy with involvement. This is good to know, since at one point there were questions about whether Docker had the capability to manage the open source legions pushing the product forward while maintaining the integrity, reputation, and quality of the product.
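
The watcher, star, and fork counts above can be pulled programmatically rather than read off the page. The following is a minimal sketch, assuming the public GitHub REST API endpoint https://api.github.com/repos/{owner}/{repo} and its subscribers_count, stargazers_count, and forks_count fields. The helper names are my own, and commit, release, and contributor counts would need separate calls that are not shown here.

```python
import json
import urllib.request


def summarize_repo(payload):
    """Pick the community counters out of a GitHub repository payload."""
    return {
        # What the GitHub UI calls "watchers" is subscribers_count in the API.
        "watchers": payload["subscribers_count"],
        "starred": payload["stargazers_count"],
        "forks": payload["forks_count"],
    }


def fetch_repo_stats(owner, repo):
    """Fetch live counters; needs network access and is subject to rate limits."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return summarize_repo(json.load(response))


if __name__ == "__main__":
    # Sample payload built from the numbers quoted in this article, not live data.
    sample = {
        "subscribers_count": 2017,
        "stargazers_count": 22941,
        "forks_count": 5617,
    }
    print(summarize_repo(sample))
```

Calling fetch_repo_stats("docker", "docker") would return current numbers, which will differ from the snapshot quoted in this article.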

Now let’s take a look at what that position is based on, considering the interviews I’ve had in the last 4 days. Out of the 17 people I spoke with, all knew what Docker is. That’s a great position to be in compared to just a few years ago.

Out of the 17 people I spoke with, 15 are working on teams that have implemented Docker, are implementing it, or are somewhere in between in their respective environments.

Of the 17, 13 said they were concerned in some significant way about Docker security. All of these individuals were working on teams attempting to figure out a way to use Docker in production, instead of only in development or related uses.

The list of uses that the 17 want to use or are using Docker for varies as much as the individual work that each is currently doing. There are, however, some core similarities in what they’re working on where Docker comes into play.

The most common similarity among Docker uses is simply as a platform to build out development and testing environments or test servers. This is routinely a database server or simple distributed database like Cassandra or Riak that can be built immutably, then destroyed and recreated whenever it is needed again for test and development. Some of the build-outs are done with Docker specifically to work up a mock distributed database environment for testing. Mind you, I’m probably hearing about and seeing this because of my past work with Basho and other distributed systems programmers, companies, and efforts around this type of technology. It’s still interesting and very telling nonetheless.
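
To make the build, destroy, and recreate cycle concrete, here is a minimal sketch that wraps the standard docker run and docker rm commands in a Python context manager. The container name, the use of the cassandra image from Docker Hub, port 9042, and the helper names are all assumptions for illustration and did not come from the interviews.

```python
import subprocess
from contextlib import contextmanager


def run_command(args):
    """Default runner: execute a command and fail loudly on a non-zero exit."""
    subprocess.run(args, check=True)


@contextmanager
def disposable_container(name, image, host_port, container_port, runner=run_command):
    """Start a detached container, yield its name, and always remove it afterwards."""
    runner(["docker", "run", "-d", "--name", name,
            "-p", f"{host_port}:{container_port}", image])
    try:
        yield name
    finally:
        # -f stops the container first if it is still running.
        runner(["docker", "rm", "-f", name])


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    def show(args):
        print(" ".join(args))

    with disposable_container("test-cassandra", "cassandra", 9042, 9042, runner=show):
        print("run tests against localhost:9042 here")
```

Passing a different runner keeps the sketch usable on a machine without Docker installed. With the default runner, the container is removed even if the tests inside the with block raise an exception.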

The second most common usage is for Docker to be used somewhere in the continuous delivery chain. The push to move the continuous integration and delivery process to a more immutable, repeatable, and reliable process has made for a perfect marriage between Docker and these needs. The ability to spin up entire environments in a matter of seconds and destroy them on a whim, creating them again moments later, has made continuous delivery more powerful and more possible than it has ever been.
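
One build-test-cleanup cycle in such a chain can be sketched as three docker CLI calls. This is only an illustration under my own assumptions: the image tag myapp:ci, the npm test command, and the function name are hypothetical, and a real pipeline would hand these commands to whatever CI server the team already uses.

```python
def delivery_steps(image_tag, test_command, context_dir="."):
    """Return the docker CLI invocations for one build-test-cleanup cycle."""
    return [
        # Build a fresh image from the Dockerfile in context_dir.
        ["docker", "build", "-t", image_tag, context_dir],
        # Run the tests in a throwaway container; --rm deletes it on exit.
        ["docker", "run", "--rm", image_tag] + list(test_command),
        # Remove the image so the next run starts from a clean slate.
        ["docker", "rmi", image_tag],
    ]


if __name__ == "__main__":
    for step in delivery_steps("myapp:ci", ["npm", "test"]):
        print(" ".join(step))
```

Because every run starts from a freshly built image and leaves nothing behind, each cycle is repeatable in the sense described above.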

Some of the less common, yet still key, uses of Docker that came up during the interviews included in-memory cache servers, network virtualization, and distributed systems.

Virtualization’s Future

Pathing

With the history covered and the core uses of Docker discussed, let’s put those on the table with the acquisitions. The acquisitions by Docker have provided some insight into the future direction of the company. The acquisitions so far include: Kitematic, SocketPlane, Koality, and Orchard.

From a high-level strategic view, the path Docker is pushing forward into is a future of continued virtualization around, as the hipsters might say, “all the things”. Their purchases of Kitematic and SocketPlane will both help Docker expand past only OS virtualization and push more toward systemic virtualization of network environments with programmatic capabilities and more. These are capabilities that are needed to move past the legacy IT environments of yesteryear, which will open up more enterprise possibilities too.

To further their core use that exists today, Docker has purchased Koality. Koality provides parallelizable continuous integration, deployment, and related services. This enables Docker to provide more built-out services around this very important use.

The other acquisition was Orchard (orchardup.com). This is a startup that provides a Docker host in the cloud, instantly. This purchase is similar to the Koality one: it bulks up capabilities that Docker already had at some level. It also pushes them forward with two branches of capabilities: SaaS based on the web and prospectively offering something behind the firewall, in which the Koality acquisition might also have some part to play.

Threat Vectors

Even though the pathways toward the future seem clear for Docker in many ways, in other ways they seem dramatically less clear. For one, there are a number of competitive options that are in play now, gaining momentum, or on the horizon. One big threat is that Google’s lack of interest in Docker has led them to build competing tooling. If they push hard into the OS-level virtualization space they could become a substantial threat.

The other threat vector is the simple unknown of what could become a threat. Something like Mesos might explode in popularity, determine it doesn’t want to use Docker, and focus on another virtualization path. In the same sense, Mesos could commoditize Docker to a point that the value add at that level of virtualization doesn’t retain a business market value that would sustain Docker.

The invisible threat around this area right now is fairly large. There’s no better way to determine this than to just get into a conversation with some developers about Docker. In one sense they love what it allows them to do, but the laundry list of things they’d like would allow for a disruptor to come in and steal the Docker thunder pretty easily. To put it simply, there isn’t a magical allegiance to Docker; developers will pick what helps them move the ball forward the fastest and easiest.

Another prospective threat is a massive purchase by a legacy software company like Oracle, Microsoft, or someone else. This could effectively destabilize the OSS aspects of the product and slow down development and progress, yet it could increase corporate adoption many times over what it is now. So this possibility is something that shouldn’t be ruled out.

Summary

Docker has two major threats: the direct competitor and the prospect of being leapfrogged by another level of virtualization. The other prospective threat to part of the company is acquisition of Docker itself, even though it could mean a huge increase in enterprise penetration. Along the path the company and technology are moving forward on, there will be continued growth in usage and capabilities. That growth will hold among the leading technology startups and companies of this kind, while the mid-size and larger corporate environments will continue to adopt and deploy at a slower pace.

A Question for You

I’ve put together what I’ve noticed, and I’d love to see things that you, dear reader, might notice about the Docker momentum machine. Do you see networking as a strength, other levels of virtualization, deployment of machines, integration or delivery, or some other part of this space as the way forward into the future? Let me know what your thoughts are on Twitter or whatever medium you feel like reaching out on. Of course, I’d also love to know if you think I’m wrong about anything I’ve written here.

Docker Red Hat and Containerization Wreck Virtualization

Conversation has popped up around a few tweets from Alex Williams regarding virtualization at the Red Hat Summit.

Paraphrased, the discussion has been shaped around asking,

“Why has OS-level virtualization via containers (namely Docker) become such a massive hot topic?”

With that, it got me thinking about some of the shifts around containerization and virtualization at the OS level versus at the hypervisor level. Here are some of my thoughts, which seemed to match more than a few thoughts at Red Hat.

  1. Virtualization at the hypervisor level is starting to die from an app usage level in favor of app deployment via OS-level virtualization. Virtualization at the OS level is dramatically more effective in almost every single scenario where virtualization is used today for application development. I’m *kind of* seeing this; it’s interesting that RH is, I suppose, seeing this much more.
  2. Having a known and identified container, such as what Docker works with, provides a dramatically improved speed and method for deployment over traditional hypervisor-based virtualized or pure OS-based deployment.

There are other points that have been brought up, but this also got me thinking on a slight side track: what are the big cloud providers doing now? Windows Azure, AWS, Rackspace or GCE, you name it, and they’re all using a slightly different implementation of virtualized environments. Not always ideally efficient, fast or effective, but they’re always working on them.

…well, random thoughts aside, I’m off to the next session and some hacking in between. Feel free to throw your thoughts into the fray on Twitter or in the comments below.

New Relic, The King Makers, MS Open Tech, Riak VMs and Life Gets Easier Today

Today Microsoft released the VM Depot, in partnership with a number of companies including Basho, Hupstream and Bitnami. I’ve always followed Bitnami, so it’s really cool to see their VM releases for Jenkins (CI build server), WordPress, the Ruby 1.9.3 stack, Node.js and about everything you can imagine out there alongside our Basho Riak CentOS image. If you want a great way to get kick-started with Riak and you’re set up with Windows Azure, now there is an even easier way to get rolling.

Over on the Basho blog we’ve announced the MS Open Tech and Basho Collaboration. I won’t repeat what was stated there, but want to point out two important things:

  1. Once you get a Riak image going, remember there’s the whole community and the Basho team itself that is there to help you get things rolling via the mailing list. If you’re looking for answers, you’ll be able to get them there. Even if you get everything running smoothly, join in anyway and at least just lurk. 🙂
  2. The RTFM value factor is absolutely huge for Riak. Basho has a superb documentation site here. So definitely, when jumping into or researching Riak as software you may want to build on, use for your distributed systems or the Riak Key Value Databases, check out the documentation. Super easy to find things, super easy to read, and really easy to get going with.

So give Riak a try on Windows Azure via the VM Depot. It gets easier by the day, and gives you even more data storage options, distribution capabilities and high availability that is hard to imagine.

New Relic & The Rise of the New Kingmakers

In other news, my good friends at New Relic, in partnership with RedMonk analyst Stephen O’Grady, have released a book he’s written titled The New Kingmakers, How Developers Conquered the World. You may know New Relic as the huge developer advocates that they are, with the great analytics tools they provide. Either way, give it a look and read the book. It’s not a giant thousand-page tome, so it just takes a nice lunch break and you’ll get the pleasure of flipping the pages of the book Stephen has put together. You might have read the blog entry that started the whole “Kingmakers” statement; if you haven’t, give that a read first.

I personally love the statement, and have used it a few times myself. In relation to the saying and the book, I’ll have a short review and more to say in the very near future. Until then…

Cheers, enjoy the read, the virtual images and happy hacking.

Sputtering Windows Instances

I had a concern about Windows OS being used for cloud computing. The instances in Windows Azure take a significant amount of time to boot up. In Amazon Web Services the Windows EC2 instances also take a long time to boot up. Compared to Linux, Windows takes several times longer to spool up in the cloud (compare a boot time of about 1 minute for Linux in EC2 versus 8-15 minutes for Windows).
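
As a quick sanity check on the boot times quoted above, taking them at face value (about 1 minute for Linux, 8 to 15 minutes for Windows), the implied slowdown factor can be worked out directly. The function name is my own and the inputs are just the figures from this post, not new measurements.

```python
def slowdown_range(baseline_minutes, slow_min_minutes, slow_max_minutes):
    """Return the (low, high) slowdown factors relative to the baseline boot time."""
    return (slow_min_minutes / baseline_minutes,
            slow_max_minutes / baseline_minutes)


if __name__ == "__main__":
    low, high = slowdown_range(1, 8, 15)
    print(f"Windows boots roughly {low:.0f}x to {high:.0f}x slower than Linux")
```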

Before today, this just seemed like it might be a problem I was experiencing.  I tend to believe I’m doing something wrong before I go on the warpath, but today that concern that I’d done something wrong has ended.  RightScale posted a blog entry about the difficulties of Windows in EC2.  They’re seeing the same issues I was.

Another issue that they noticed, which I too noticed, was the clocks being off. This is a similar problem to Windows being used with VMware and setting up images. The clock just doesn’t sync the first time, or subsequent times. Usually a few manual attempts need to be made.

In another entry I caught another list of issues with Windows that Linux just doesn’t have.  None of these are work stoppage issues, but they are all very annoying and would push one toward using Linux instead if at all possible.

Putting Windows Azure and Amazon Web Services EC2 side by side, Network World has found them to be on a collision course.

Boiling it Down, Where Does Windows Stand?

After some serious analysis by various individuals of Windows running in cloud environments, it appears that Windows just isn’t as suited to running in virtualized environments as Linux. A number of friends have pointed out to me how much friendlier Linux is in virtualized spaces such as VMware’s ESX environment.

Also, based on hard analysis of VMware versus Hyper-V, the latter doesn’t appear to be as sophisticated or as capable for virtualized hosting. Is this going to cause a price point issue for Windows Azure versus AWS EC2? Just from the perspective of requiring more hardware for Hyper-V virtualization versus VMware and Amazon’s AMI virtualization, it makes me ponder if this could be a major competitive advantage for Linux-based clouds. Already there are the licensing price points, so how does MS own up to that?

I would be curious to see what others have experienced.  Have you seen virtualized differences that cause issues hosting Linux vs. Windows in VMWare, Hyper-V, or AWS?  Do you foresee any other problems that could become big problems?