See I view this as akin to me currently deciding containers are king and really learning them inside out right now. Maybe I’m right but then I’ve got Rym and Scott saying this is a solution to the problem of not knowing how computers work.
Eh, it depends.
Containers, at least as implemented by Docker on Linux, are basically just a bunch of wrapper scripts and programs around stuff that’s already built in to Linux like cgroups, chroot jails, and so on. You could go ahead and implement the same functionality manually if you’d like, but there is no real reason to do so as long as you have Docker’s wrappers.
That said, it is a good idea to learn just what the heck Docker is doing under the covers, both to make sure you really understand what’s going on and to help you deal with fixing problems in case somehow you or the Docker wrappers screwed things up.
So, what does it mean to “learn containers inside and out”? I’d argue that it consists of both learning how to use Docker and learning what it’s doing under the covers. By learning what Docker does under the covers, I’d say you are also fulfilling Rym and Scott’s desire that you should “know how computers work.”
The only time I’ve ever found containers useful was a cyber defense competition.
I could be wrong, but my understanding of Scrym’s criticism is that only idiots use Docker, because the only problem it solves is that you don’t know how computers work or don’t employ people who do. If that’s true, then the world will figure that out and Docker won’t be widely used. So worst case, I pick up a new skill that has limited use. Best case, every major project for the next 30 years is containerized and I end up a total pro at something universally used.
It seems similar to me. So I guess I’m doubling down.
I’m sorry but Chroot Jails sound like bizarrely-but-not-unappealing tasting European cigarettes.
Protip: take the Amazon ops automated screening exam.
It’s been a while since I heard what precisely their criticism of Docker was. If memory serves me, I think it was the fact that you can do all it does without using Docker itself and not the precise concept of “containerization” itself as a lighter weight alternative to full-on virtualization.
If you understand what Docker does under the covers, you will know a lot more than just how to do containerization, though, so it would be a very well-rounded set of skills to have. For example, my company’s product uses cgroups all the time to put resource limitations on the various processes we run on our hardware. Cgroups are one of the built-in Linux components that makes Docker work, so if you know how they work, you can use that skill for general purpose resource allocation on any modern Linux system.
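To give a rough idea of what using cgroups directly looks like, here is a minimal sketch. It assumes cgroup v2 mounted at /sys/fs/cgroup, the memory and cpu controllers enabled for the parent group, and root privileges. The function name limit_process and the group name are made up for illustration; memory.max, cpu.max, and cgroup.procs are the standard cgroup v2 interface files. This is not how Docker itself is implemented, just the same kernel feature driven by hand.

```python
from pathlib import Path

# Assumption: cgroup v2 unified hierarchy mounted at the usual location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def limit_process(name, pid, memory_bytes, cpu_fraction,
                  root=CGROUP_ROOT, period_us=100_000):
    """Create a cgroup called `name`, set limits, and move `pid` into it.

    `limit_process` is a hypothetical helper, not part of any library.
    """
    group = Path(root) / name
    # On a real cgroup filesystem, creating a directory creates the cgroup.
    group.mkdir(exist_ok=True)

    # memory.max: hard memory limit in bytes.
    (group / "memory.max").write_text(f"{memory_bytes}\n")

    # cpu.max: "<quota> <period>" in microseconds; quota/period is the
    # share of one CPU the group may use.
    quota_us = int(cpu_fraction * period_us)
    (group / "cpu.max").write_text(f"{quota_us} {period_us}\n")

    # cgroup.procs: writing a PID moves that process into the group.
    (group / "cgroup.procs").write_text(f"{pid}\n")
    return group
```

Something like limit_process("worker", os.getpid(), 256 * 1024 * 1024, 0.5) would cap the calling process at 256 MiB of memory and half a CPU, assuming the conditions above hold on your system.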
The point is that with cloud hosting we are already using virtual machines for everything. There’s no reason to use Docker or other containers to separate application A from application B if you just don’t install them on the same “computer.” For example, if you are using AWS, don’t use one EC2 instance with 2 containers. Just use 2 (or more) EC2 instances.
The only problem that I have found containers useful for is setting up development environments. Container systems like Docker allow developers who know nothing about *NIX and can’t set up their own local dev environments to set them up very quickly and easily without learning anything. They can just git clone and docker-compose up and start working right away.
Even if the developers are all skilled, it can help to guarantee that every developer’s environment matches production identically, especially when their local machines differ greatly.
Containers in production are pretty much bad news. Systems that offer cloud hosting for containerized applications charge more money than you would pay for the equivalent service if you just set up your application normally. Instead of architecting your own system, you let them magically host your containers in the cloud, and you pay a price in money to make up for your lack of knowledge.
This may depend on pricing. How much does one EC2 instance with 2 containers cost vs. 2 EC2 instances? Each EC2 instance gives you additional compute as well as process/environment separation, but if all you need is separation and a single EC2 instance with containers is cheaper than 2 EC2 instances, an argument could be made to use containers.
That is pretty true, so I agree with you there. We kind of have a “homebrew” container process using chroot here for our dev build environment. While we use Ubuntu for our build workstations, our product runs on CentOS, so we use our “homebrew container” to automatically set up a CentOS build environment on our Ubuntu machines.
Also, besides guaranteeing environments that match production, it’s convenient even for skilled developers who could set everything up by hand: they just type a single command to create said environment instead of going through multiple steps to set things up just right.
In what way are they bad news, or is it just bad news to use them with services that charge for cloud hosting of containers? That may be more to do with hosts that rip you off instead of leaving things up to you, but I guess it depends on just what they mean by “container cloud hosting.” If all they do is stick your containers in the same ol’ VMs you’d be using anyway and they charge extra for it, that is certainly a rip off. If they offer some sort of automatic migration of containers to new VMs as more compute is needed, then it might be worth paying for that service, depending on how much the other options are.
Again, it comes down to price and use cases. Containers are super hyped up and while they aren’t the end-all, be-all, they certainly may be useful for certain situations.
Most cloud hosts offer VMs of a very large number of sizes and configurations. Look at how many fucking EC2 instance types there are.
They range from nano for $0.0052 per hour to p3.16xlarge for $24.48 per hour. No matter what application you have, there is an instance (or combination of instances) that is exactly the right size. Running multiple apps on one instance just decreases stability, reliability, and security, because that one instance becomes a single point of failure for all of them. It also creates more labor now that you have to work on configuring container/isolation systems to get the apps running on the same instance. If you used more instances, you could just do nothing and work on your app instead. You already have virtual machines that solve the problem of isolation. You don’t need two tools to solve the same problem!
It’s bad news because your architecture is now a magical black box. It’s bad news because you may now be dependent on a particular hosting company’s implementation. It’s bad news because things like Kubernetes and other container orchestration systems are janky as fuck. It’s bad news because every problem these things solve is a problem that is already solved by just doing things the normal way. The only additional problem they solve is allowing someone who doesn’t know how to administer or architect systems to get their app running without acquiring more skills or knowledge.
When I think of containers, I’m thinking of something like using AWS Fargate (which I haven’t yet) to handle one-off tasks that don’t need a dedicated server.
Amazon’s price scaling does seem fair. I mean two nano instances cost the same as one “double-sized nano” instance, so it would make more sense to go with the two nanos instead of the double-nano with containers in that case. In this scenario, I agree that containers are probably not the wisest choice to make. The only possibly valid use for containers in this scenario is maybe if you have an extremely good reason why you don’t want things on different VMs (which, to be honest, I can’t think of any) or if a single nano really is all you need to run all your applications (which is likely not to be the case for long if your application is in any way successful).
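The arithmetic behind that comparison can be sketched as follows. The $0.0052 per hour figure is the nano price quoted earlier in the thread; the assumption that the next size up costs exactly double, and the 730 hours per month, are mine for illustration, so check the current price list before relying on any of this.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

NANO_HOURLY = Decimal("0.0052")       # figure quoted earlier in the thread
DOUBLE_NANO_HOURLY = NANO_HOURLY * 2  # assumption: price scales linearly with size


def monthly_cost(hourly_rate, instance_count=1):
    """Monthly cost of running `instance_count` instances around the clock."""
    return hourly_rate * instance_count * HOURS_PER_MONTH


two_nanos = monthly_cost(NANO_HOURLY, instance_count=2)
one_double = monthly_cost(DOUBLE_NANO_HOURLY)

print(f"2 x nano:        ${two_nanos:.2f}/month")
print(f"1 x double-nano: ${one_double:.2f}/month")
```

If the two totals match, as they do under the linear-pricing assumption, the single bigger instance with containers buys you nothing on price, and the two separate instances give you isolation for free.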
Well, if you’re dealing with nearly any cloud hosting provider, containers or not, you’re dealing with a black box. I mean, do you know 100% of the details as to how EC2 instances work, are allocated, scale up or down, etc.? I don’t see how that matters with respect to VMs vs. containers.
You may be correct in that Kubernetes and friends may be janky as all hell. I’ve never worked with any of them, so I can’t judge one way or the other.
I assume the “normal way” is just spinning up a new VM? Well, there may be scenarios where spinning up a new VM may not be the best way to do these sorts of things. If you’re using Amazon’s cloud hosting, you’re right in that there are very few, if any, scenarios where this makes sense. However, if you’re doing some sort of on-premise hosting, then you can possibly make some arguments to use containers vs. spinning up a new VM.
I’ll say this about the whole “getting apps up and running without acquiring more skills or knowledge” thing. Someone has to set up the container and not everyone necessarily needs the skills or knowledge to fully administer the system it runs on. The UI/UX expert whose only job is to make a functional, usable, and attractive interface for the web app almost certainly doesn’t need to know how to administer it. Now, coders working on parts of the web app closer to the OS almost certainly do, but they’ll be the ones making the containers, or at least collaborating with the people who eventually end up making them.
If you are using on-premises hosting you can basically turn it into a “cloud” using something like KVM.
Or VMWare or VirtualBox (ugh) or tons of other products out there. This is not new. This tech has been around for about 20 years on PC-style hardware and since at least the 1970’s on IBM mainframes. However, the overhead of an entire VM is much greater than that of a container within a VM. At the very least, you’ll have multiple instances of the OS in RAM, on disk, and so on (although some hypervisors can at least copy-on-write identical portions of RAM used between VMs). Now, if you’re Amazon with essentially infinite resources to throw at the problem and an entire business unit where people pay you to throw said resources at the problem, this overhead is negligible. However, if you don’t have the resources of an Amazon, it may be worth looking at containers as a lighter-weight and more cost-effective alternative. Again, it comes down to budget and use cases.
This kind of virtualization used by cloud hosts like Linode, EC2, Digital Ocean, Azure, and others is not nearly as resource intensive as when you run VMWare or VirtualBox locally. The performance is also way way faster. In my experience I get more performance out of a tiny Linode than I get out of VirtualBox on a very powerful desktop. They’re both virtual machines, so make of that what you will.
Well, VirtualBox is crap relative to server hypervisors, and VMWare Workstation isn’t something you’d be using for your servers anyway. A better comparison to KVM would be VMWare vSphere.
Whatever virtualization the big cloud providers are running may not necessarily be available to a company wanting to do that stuff in-house.
Of course, this may all go back to the old software adage when it comes to optimization: measure first, then optimize.
I love this discussion because it kinda gets to the point where Amazon is selling you something that is less than the “bare metal” you want. And I mean, it’s probably good for you.
It’s kinda the problem of a software developer. There exists a bare-metal solution to your problem that’s almost infinitely better, but you choose abstractions because it’s easier. Then they get compounded. Modern cloud infrastructure depends on your ability to realize the moving parts and figure out “how” they should actually work in the cloud.
Very true. And a good software developer would only switch to the bare metal (or at least closer to the metal) solutions if the chosen abstractions somehow aren’t up to snuff.
It’s like those guys who insist you need to write everything in assembly in order to get peak performance. Never mind that:
- Your choice of algorithm almost always matters more to performance than the language or level of abstraction you write the algorithm in (e.g. on any reasonably large input, quick sort in Python will pretty much always blow the doors off of bubble sort in assembly).
- Modern compilers are usually much better at optimizing for modern CPU architectures (and often optimizing in general) than the vast majority of human beings. One of the most amazing things I’ve ever seen with respect to compiler optimization was when the Clang C++ compiler recognized a naive loop-based implementation of an algorithm to sum all the numbers from 1 to N and automatically replaced it with Gauss’s formula for summing up numbers.
- Even when you absolutely do need to get close to the bare metal, it’s often for only a small part of your code, so you only have to worry about it for that small part, i.e. the 80/20 rule where 80% of your program’s time is spent in 20% of the code.
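Here’s a rough sketch of the first two points. It counts comparisons rather than measuring wall-clock time, so it only shows the gap in how the two sorts scale, not an actual Python vs. assembly race. The quick sort is a simple first-element-pivot version written for illustration, and the two sum functions show the kind of rewrite described above (a loop replaced with Gauss’s closed form) done by hand rather than by a compiler.

```python
import random


def bubble_sort(items):
    """Return (sorted copy, number of comparisons) using plain bubble sort."""
    data = list(items)
    comparisons = 0
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


def quick_sort(items):
    """Return (sorted copy, number of comparisons) using a simple quick sort."""
    data = list(items)
    if len(data) <= 1:
        return data, 0
    pivot, rest = data[0], data[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    left, left_count = quick_sort(smaller)
    right, right_count = quick_sort(larger)
    # Count one pivot comparison per remaining element at this level.
    return left + [pivot] + right, len(rest) + left_count + right_count


def sum_loop(n):
    """Naive O(n) sum of 1..n."""
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_gauss(n):
    """Gauss's closed form for the sum of 1..n, O(1)."""
    return n * (n + 1) // 2


random.seed(0)
values = [random.random() for _ in range(500)]
bubble_sorted, bubble_count = bubble_sort(values)
quick_sorted, quick_count = quick_sort(values)
print(f"bubble sort comparisons: {bubble_count}")
print(f"quick sort comparisons:  {quick_count}")
print(sum_loop(100) == sum_gauss(100))
```

Bubble sort always does n(n-1)/2 comparisons here, while quick sort on random input grows roughly like n log n, so the gap widens quickly as n grows. No amount of hand-tuned assembly changes that growth rate.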
If I’m correctly remembering the Bryan Cantrill talks I watched videos of a few years ago, another one was that Docker really only ran securely in VMs and there was a big need for containers/jails on the metal instead. I haven’t kept up with the current state of that but it seemed like a problem only the hugest hosting operations really worried about, rather than shops just needing a quick and easy dev environment.