I never understood how to use Docker, what makes it so special? I would really like to use it on my Rapsberry Pi 3 Model B+ to ease the setup process of selfhosting different things.
I’m currently running these things without Docker:
- Mumble server with a Discord bridge and a music bot
- Maubot, a plugin-based Matrix bot
- FTP server
- Two Discord Music bots
All of these things are running as systemd services in the background. Should I change this? A lot of the things I’m hosting offer Docker images.
It would also be great if someone could give me a quick-start guide for Docker. Thanks in advance!
IMHO with docker and containerization in general you are trading drive space for consistency and relative simplicity.
a hypothetical:
You set up your mumble server and it requires the leftpad 3.7 package to run. you install it and everything is fine.
Now you install your ftp server but it needs leftpad 5.5. what do you do? hope the function that mumble uses in 3.7 still exists in 5.5? run each app in its own venv?Docker and containerization resolve this by running each app in its own mini virtual machine. A container running mumble and leftpad 3.7 can coexist on host that also has a container running a ftp server with leftpad 5.5.
Here is a good video on what hole docker and containerization looks to fill
https://www.youtube.com/watch?v=Nm1tfmZDqo8Docker containers aren’t running in a virtual machine. They’re running what amounts to a fancy chroot jail… It’s just an isolated environment that takes advantage of several kernel security features to make software running inside the environment think everything is normal despite being locked down.
This is a very important distinction because it means that docker containers are very light weight compared to a VM. They use but a fraction of the resources a VM would and can be brought up and down in milliseconds since there’s no hardware to emulate.
FYI docker engine can use different runtimes and there is are lightweight vm runtimes like kata or firecracker. I hope one day docker will default with that technology as it would be better for the overall security of containers.
To put it in simpler terms, I’d say that containers virtualise only the operating system rather than the whole underlying machine.I guess not then.
It virtualises only parts of operating system (namely processes and network namespaces with ability to passthru devices and mount points). It is still using host kernel, for example.
I wouldn’t say that namespaces are virtualization either. Container don’t virtualize anything, namespaces are all inherited from the root namespaces and therefore completely visible from the host (with the right privileges). It’s just a completely different technology.
The word you’re all looking for is sandboxing. That’s what containers are - sandboxes. And while they a different approach to VMs they do rely on some similar principals.
I never said that it is a virtualization. Yet for easy understanding I named created namespaces “virtualized”. Here I mean “virtualized” = “isolated”. Systemd able to do that with every process btw.
Also, some “smart individuals” called comtainerization as type 3 hypervisors, that makes me laugh so hard :)
The operating system is explicitly not virtualised with containers.
What you’ve described is closer to paravirtualisation where it’s still a separate operating system in the guest but the hardware doesn’t pretend to be physical anymore and is explicitly a software interface.
Not exactly IMO, as containers themselves can simultaneously access devices and filesystems from the host system natively (such as VAAPI devices used for hardware encoding & decoding) or even the docker socket to control the host system’s Docker daemon.
They also can launch directly into a program you specify, bypassing any kind of init system requirement.
OC’s suggestion of a chroot jail is the closest explanation I can think of too, if things were to be simplified
I would also add security, or at least accessible security. Containers provide a number of isolation features out-of-the-box or extremely easy to configure which other systems require way more effort to achieve, or can’t achieve.
Ironically, after some conversation on the topic here on Lemmy I compiled a blog post about it.
Tbf, systemd also makes it relatively easy to sandbox processes. But it’s opt-in, while for containers it’s opt-out.
Yeah, and it also requires quite many options, some with harder-to-predict outcomes. For example RootDirectory can be used to effectively chroot the process, but that carries implications such as the application not having access to CA certificates anymore, which in general in containers is a solved problem.
Here is an alternative Piped link(s):
https://www.piped.video/watch?v=Nm1tfmZDqo8
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I’m open-source; check me out at GitHub.
Doesn’t that mean that docker containers use up much more resources since you’re installing numerous instances & versions of each program like mumble and leftpad?
Kinda, but it depends on the size of the dependencies, with drive space bing so cheap these days do you really worry about 50Mb of storage being wasted on 4 different versions of glib or leftpad
Docker and containerization resolve this by running each app in its own mini virtual machine
While what you’ve written is technically wrong, I get why you did the comparison that way. Now there are tons of other containerization solutions that can exactly what you’re describing without the dark side of Docker.
There have been some great answers on this so far, but I want to highlight my favourite part of Docker: the disposability.
When you have a running Docker container, you can hop in, fuck about with files, break stuff as you try to figure something out, and then kill the container and all of the mess you’ve created is gone. Now tweak your config and spin up a fresh one exactly the way you need it.
You’ve been running a service for 6 months and there’s a new upgrade. Delete your instance and just start up the new one. Worried that there might be some cruft left over from before? Don’t be! Every new instance is a clean slate. Regular, reproducible deployments are the norm now.
As a developer it’s even better: the thing you develop locally is identical to the thing that’s built, tested, and deployed in CI.
I <3 Docker!
What about your preferences/configs/files (when you spun up a fresh one)?
The most popular way of configuring containers are by using environment variables that live outside the container. But for apps that use files to store configuration, you can designate directories on your host that will be available inside the container (called “volumes” in Docker land). It’s also possible to link multiple containers together, so you can have a database container running alongside the app.
If you have all of that set up then, what benefit is there to blowing away your container and spinning up a ‘fresh’ one? I’ve never been able to wrap my head around docker, and I think this is a big part of it.
There’s a lot more to an application than its configuration. It may require certain specific system libraries, need a certain way of starting up, or a whole host of other special things. With a container, the app dev can precreate a perfect environment for their program and save you LOADS of hassle trying to set it up.
The benefit of all this is that you can know exactly where application state is stored, know that you’re running the app in it’s right environment, and it becomes turbo easy to install updates, or roll back if needed.
Totally spin up a VM, install docker on it, and deploy 2-3 web apps. You’ll notice that you use the same way of configuring them, starting and stopping them, and you might not want to look back ;)
I’ve played with it a bit. I think I was using something called DockStarter and Portainer. Like I said though, I could never quite grasp what was going on. Now for my home webapps I use Yunohost, and for my media server I use Swizzin CE. I’ve found these to be a lot easier, but I will try Docker again sometime.
It’s virtual machines but faster, more configurable with a considerably larger set of automation, and it consumes less computer resources than a traditional VM. Additionally, in software development it helps solve a problem summarized as “works on my machine.” A lot of traditional server creation and management relied on systems that need to be set up perfectly identical every deployment to prevent dumb defects based on whose machine was used to write it on. With Docker, it’s stupid easy to copy the automated configuration from “my machine” to “your machine.” Now everyone, including the production systems, are running from “my machine.” That’s kind of a big deal, even if it could be done in other ways naturally on Linux operating systems. They don’t have the ease of use or the same shareability.
What you’re doing is perfectly expected. That’s a great way of getting around using Docker. You aren’t forced into using it. It’s just easier for most people
This is exactly the answer.
I’d just expand on one thing: many systems have multiple apps that need to run at the same time. Each app has its own dependencies, sometimes requiring a specific version of a library.
In this situation, it’s very easy for one app to need v1 of MyCleverLibrary (and fails with v2) and another needs v2 (and fails with v1). And then at the next OS update, the distro updates to v2.5 and breaks everything.
In this situation, before containers, you will be stuck, or have some difficult workrounds including different LD_LIBRARY_PATH settings that then break at the next update.
Using containers, each app has its own libraries at the correct and tested versions. These subtle interdependencies are eliminated and packages ‘just work’.
I approve of this expanded answer. I may have been too ELI5 in my post.
If the OP has read this far, I’m not telling you to use docker, but you could consider it if you want to store all of your services and their configurations in a backup somewhere on your network so if you have to set up a new raspberry pi for any reason, now it’s a simple sequence of docker commands (or one docker-compose command) to get back up and running. You won’t need to remember how to reinstall all of the dependencies.
I can also add that if you want to run multiple programs that each have a web interface it’s easy to direct each interface to the port you want instead of having to go through various config files that are different for each program or worst case having to change a hardcoded port in some software. With docker you have the same easy config options for each service you want to run. Same with storage paths. Various software stores their files at seemingly random places. With docker you just map a folder and all you files are stored there without any further configs.
This blog post explains it well:
https://cosmicbyt.es/posts/demistifying-containers-part-1/
Essentially, containers are means of creating environments in which you can run software, and those environments are:
- isolated, which makes it a very controlled environment. Much harder to run into errors
- reproducible: we have tools that reproduce the same container from an image file
- easy to distribute: just have the container image.
- little to no compromises on performance (at least on Linux)
It is essentially a way for you to run a program without having to worry how to set up the environment, why it didn’t work as expected, what dependencies you’re missing, etc.
I feel that a lot of people here are missing the point. Docker is popular for selfhosted services for a few main reasons:
- It is one package that can be used on any distribution (or even OS with a Linux VM).
- The package contains all dependencies required to run the software so it is pretty reliable.
- It provides some basic sandboxing against non-malicious services. Basically the service can’t scribble all over your filesystem. It can only write to specific directories that you have given it access to (via volumes) other than by exploiting security vulnerabilities.
- The volume system also makes it very obvious what data is important and needs to be backed up or similar, you have a short list.
Docker also has lots of downsides. I would generally say that if your distribution packages software I would prefer the distribution’s package over the docker image. A good distribution package will also solve all of these problems. The main issue you will see with distribution packages is a longer delay before new versions are made available.
What Docker completely dominates was previous cross-distribution packaging options which typically took one of the previous strategies.
- Self-contained compiled tarball. Run the program inside as your user. It probably puts its data in the extracted directory, maybe. How do you upgrade? Extract and copy a data directory? Self-update? Code is mutable and mixed with data, gross.
- Install script. Probably runs as root. Makes who-knows what changes to your system. Where is the data, is the service running? Will it auto-start on boot. Hope that install script supports your distro.
- Source tarball. Figure out the dependencies. Hope they don’t conflict with the versions your distro has. Set up users and setup scripts yourself. Hope the build doesn’t take too long.
Sorry if I’m about 10 years behind Linux development, but how does Docker compare with the latest FlatPak trend in application distribution? How you have described it sounds somewhat similar, outside of also getting segmented access to data and networks.
For desktop apps Flatpak is almost certainly a better option than Docker. Flatpak uses the same core concepts as Docker but Flatpak is more suited for distributing graphical apps.
- Built in support for sharing graphics drivers, display server connections, fonts and themes.
- Most Flatpaks use common base images. Not only will this save disk space if you have lots of, for example GNOME, applications as they will share the same base but it also means that you can ship security updates for common libraries separately from application updates. (Although locked insecure libraries is still a problem in general, it is just improved over the docker case.)
- Better desktop integration via the use of “portals” that allow requesting specific things (screenshot, open file, save file, …) without full access to the user’s system.
- Configuration UIs that are optimized for the desktop usecase. Graphically tools for install, uninstall, manage permissions, …
Generally I would still default to my distro’s packages where possible, but if they are unsuitable for whatever reason (not available, too old, …) then a Flatpak is a great option.
Docker is to servers, as flatpak is to desktop apps.
I would probably run away if i saw flatpak on a headless serverFlatpak has better security features than docker. While its true it’s not designed with server apps in mind, it is possible to use its underlying “bubblewrap” to create isolated environments. Maybe in the future, tooling will improve its features and bridge the gap.
For your use case, consider it to be a packaging format (like AppImage, Flatpak, Deb, RPM, etc.) that includes all the dependencies (including services, not just libraries) for the app in question.
Should I change this?
If it’s not broken don’t fix it.
Use Podman (my preferred - the SystemD approach is awesome), containerd, or Incus. Docker is a graveyard of half-finished pet projects that have no reason for existing. Podman has a Docker-compatible socket, so 100% of Docker tooling will work with it.
I can add, podman was ignored in previous years at my day job because there were some reliability issues either with GPU access or networking I forget, however these issues have been resolved and we’re reimplementing it pretty much effortlessly
Yep, we’re reconsidering it at work as well. it’s grown pretty nicely
Try to run something that requires php7 and something else that requires php8 on the same web server; or python 2 and python 3.
You actually can, but it’s not pretty.
(The thing about a declarative setup isn’t much of a difference, you can do it for any popular Linux distro.)
Doesn’t that mean that docker containers use up much more resources since you’re installing numerous instances & versions of each program like PHP?
Oh, sure, the bloat on your images requires resources from the host.
There is the option of sharing things. But, obviously that conflicts a bit with maintaining your environments isolated.
I have a reason I don’t think is covered. A few programs I have come across that I want to try recommend docker and some only provide instructions for docker. They can spend less time trying to help you with dependencies and installations knowing they’ve included everything you need in the docker file. I don’t have a background in Linux or programming so unless they tell you exactly how to install something, I can struggle. Their installation page is then just the docker compose file with a note on the environment variables you can change.
When I asked this question
So there are many reasons, and this is something I nowadays almost always do. But keep in mind that some of us have used Docker for our applications at work for over half a decade now. Some of these points might be relevant to you, others might seem or be unimportant.
- The first and most important thing you gain is a declarative way to describe the environment (OS, dependencies, environment variables, configuration).
- Then there is the packaging format. Containers are a way to package an application with its dependencies, and distribute it easily through the docker hub (or other registries). Redeploying is a matter of running a script and specifying the image and the tag (never use latest) of the image. You will never ask yourself again “What did I need to do to install this again? Run some random install.sh script off a github URL?”.
- Networking with docker is a bit hit and miss, but the big thing about it is that you can have whatever software running on any port inside the container, and expose it on another port on the host. Eg two apps run on port :8080 natively, and one of them will fail to start due to the port being taken. You can keep them running on their preferred ports, but expose one on 18080 and another on 19080 instead.
- You keep your host simple and empty of installed software and packages. Less of a problem with apps that come packaged as native executables, but there are languages out there which will require you to install a runtime to be able to start the app. Think .NET, Java but there is also Python out there which requires you to install it on the host and have the versions be compatible (there are virtual environments for that but im going into too much detail already).
I am also new to self hosting, check my bio and post history for a giggle at how new I am, but I have taken advantage of all these points. I do use “latest” though, looking forward to seeing how that burns me later on.
But to add one more:- my system is robust, in that I can really break my containers (and I do), and to recover is a couple clicks in Portainer. Then I can try again, no harm done.
The thing with using the “latest” tag is you might get lucky and nothing bad happens (the apps are pretty stable, fault tolerant, and/or backward compatible), but you also might get unlucky and a container update does break something (think a 1.x going to 2.x one day). Without pinning the container to a specific version, you might have an outage suddenly due to that container becoming incompatible with one of your other applications. I’ve seen this happen a number of times. One example is a frontend (UI) container that updates to no longer be compatible with older versions of the backend and crashes as a result.
If all your apps are pretty much standalone and you trust them to update properly every time a new version of the container is downloaded, then you may never run into the problems that make people say “never use latest”. But just keep an eye out for something like that to happen at some point. You’ll save yourself some time if you have records of what versions are running when everything’s working, and take regular backups of all their data.
I used to have this with homeassistant and zwavejs. Every time I’d pull a new homeassistant, the zwave integration would fail, because it required a newer version of zwavejs. Taught me to build the chain of services into one docker-compose, so they’d all update together. That’s become one of the rationales for me to use docker: got a chain of dependent processes? wrap them in a docker so you’re working with (probably) the same dependencies as the devs.
My other rationale is just portability, and docker is just one of many solutions there. In my little home environment, where servers are either retired desktops or gee-that-seems-cool SBCs, it’s nice to be able to easily move stuff independent of architecture or OS.
I guessed it was a “once bitten twice shy” kind of thing. This is all a hobby to me so the cost-benefit, I think, is vastly different, nothing on my setup is critical. Keeping all those records and up to date on what version everything is on, and when updates are available and what those updates do and… sound like a whole lot of effort when currently my efforts can be better spent in other areas.
In my arrogance I just installed Watchtower, and accepted it can all come crashing down. When that happens I’ll probably realise it’s not so much effort after all.
That said I’m currently learning, so if something is going to be breaking my stuff, it’s probably going to be me and not an update. Not to discredit your comment, it was informative and useful.
One of the the main reasons why docker and kubernetes take off is they standardized the deployment process. Say, you have 20 services running on your servers. It’s much easier to maintain those 20 services as a set of yaml files that follow certain standard than 20 config files each with different format. If you only have a couple of services, the advantage is probably not apparent. But as you add more and more services, you’ll start to appreciate it.
Yep, I couldn’t run half of the services in my homelab if they weren’t containerized. Running random, complex installation scripts and maintaining multiple services installed side-by-side would be a nightmare.
The thing that confused me when first learning about docker was, that everybody compares it to a virtual machine. It’s not. Containers dont virtualize anything. They take a (single) process from the host OS and separate that into its own environment. All system calls, memory access, file writes etc are still handled by the same os (same kernel). However the process is separated both on the file system and process level. It can’t see other processes outside of the container and it also doesn’t see the real filesystem. It sees a filesystem provided by the container. This also means it sees different file and user permissions. When you run a alpine Linux docker container on an Ubuntu system, the container only containes the (few) files for alpine but no Linux kernel no desktop environment. A process inside that container only sees the alpine files and not the Ubuntu files. It also means all containers see a filesystem independent of each other and can use libraries and dependencies of different versions (they are only files after all).
For administration it makes running complex services easy. You define how to setup that service (what base Linux distro to use, what packages to install, what commands to run, and how to start the process). You can then be save to assume the setup of that service did not interfere with the setup of any other service. “Service 1 needs a certain system wide config changed? Service 2 needs that config in the default state? And both need a different version of the same library?” In containers you can have all at the same time because they each see a different version of the same config and library.
And all this is provided by the kernel itself. All docker does is provide an “easy” way to create and manage containers but could could do all of that using chroot, runc and a few other.
As a note, containers usually don’t come with systemd as they don’t need an init system. You would run the service directly inside the container and then use systemd outside the container to make sure the container is started/restarted, or just docker as it can already do that.
I found a great article demystifying containers recently
While you are technically right there is very little logical difference between containers and VMs. Really the only fundamental difference is that containers use the same kernel while VMs run their own. (let’s not even worry about para-virtualization right now).
In practice I would say the biggest difference is that there is better memory sharing so total memory usage will often be less. But honestly this mostly comes down to the fact that the average container bundles less software than the average VM image. Easier management of volumes is also nice because typically you will just bind-mount a host directory, but it also isn’t hard to mount a block device on a Linux host.
Install Portainer, it helps you get used to managing docker images and containers before going full command line.
I actually prefer dockge, I only have a few containers and its a lot simpler while still able to do all the basics of docker management. Portainer was overkill for me.
I have a pile of containers both for selfhosting and for dev builds, and still wouldn’t use Portainer.
I learned on portainer. I just wish it worked better. Dockge is a much better solution anyways
One benefit that might be overlooked here is that as long as you don’t use any Docker Volumes (and instead bind mount a local directory) and you’re using Docker Compose, you can migrate a whole service, tech stack and everything, to a new machine super easily. I just did this with a Minecraft server that outgrew the machine it was on. Just tar the whole directory, copy it to the new host, untar, and
docker compose up -d
.This
docker compose up -d
thing is something I don’t understand at all. What exactly does it do? A lot of README.md files from git repos include this command for Docker deployment. And another question: How can you automatically start the Docker container? Do you need a systemd service to rundocker compose up -d
?Docker Compose is basically designed to bring up a tech stack on one machine. So rather than having an Apache machine, a MySQL machine, and a Redis machine, you set up a Docker Compose file with all of those services. It’s easier than using individual Docker commands too. It sets up a network so they can all talk to each other, then opens the ports you tell it to. It’s isolated from other Docker Compose networks, so things won’t interfere with each other. So you can basically isolate a bunch of services with their own tech stacks all on the same machine. I’ve got my Jellyfin server running on the same machine as my Mastodon instance, thanks to Docker Compose.
As long as Docker is configured to run automatically at boot (which it usually is when you install it), it will bring containers back up that are set to be restarted. You can use the “always” or the “unless-stopped” values for the restart option, depending on your needs, then Docker will bring that container back up after a reboot.
Docker Compose is also useful in this context, because you can define dependencies for services. So I can say that the Mastodon container depends on the Postgres container, and Docker Compose will always start the Postgres container first.
You just need the docker and docker-compose packages. You make a
docker-compose.yml
file and there you define all settings for the container (image, ports, volumes, …). Then you rundocker-compose up -d
in the directory where that file is located and it will automatically create the docker container and run it with the settings you defined. If you make changes to the file and run the command again, it will update the container to use the new settings. In this commanddocker-compose
is just the software that allows you to do all this with thedocker-compose.yml
file,up
means it’s bringing the container up (which means starting it) and-d
is for detached, so it does that in the background (it will still tell you in the terminal what it’s doing while creating the container).
I started self-hosting a bit prior to when Docker took off, and getting multiple services running was much harder. Service A wants a certain version of PHP installed with certain plugins while Service B wants a different version. You’d follow a tutorial for installing Service C and desperately hope that it wouldn’t somehow break Service A or B. You installed Service D for a bit despite all the installation pain and now want to uninstall it - I hope you tracked exactly what config changes you made throughout the system so you can undo it.
Docker fixed all of this by making each service independent through containers which made self-hosting 10x easier. I’d also add that I love how easy it is to transfer my setup to a new server - I keep all of my container volumes in a specific directory and my docker-compose files in another and that’s all I need to backup / transfer. Without Docker you’d have to specifically handle each & every configuration file and database location, and if you later upgrade to a newer version of the OS or a different distro you’d have to handle possible conflicts between your versions and what the distro expects.
A lot of people here really do be describing docker like flatpak
They’re similar under the good, but flatpak is optimized for desktop use. Docker targets server applications.