Old school self hoster: scared of the security challenges of modern hosting

phase_change · 2 years ago (edited)

sunaurus@lemm.ee · 2 years ago (edited)

I just want to point out that you don’t need to use either Docker or nginx to run Lemmy.

At the end of the day, the really required pieces are:

lemmy_server binary
lemmy-ui
pict-rs binary
PostgreSQL database

How you get those things to talk to each other is totally up to you. There’s nothing stopping you from just using Apache as a reverse proxy, for example.

One possible Dockerless setup is described here: https://join-lemmy.org/docs/en/administration/from_scratch.html. This should give you some idea of how it could be done.

paco@fedia.io · 2 years ago

I’m with you. Same vintage IT guy, self hosting similarly. I dunno. I throw a lot of stuff up on my xcp-ng box. Some is important. Some isn’t. I’m doing all manner of old-school firewall and perimeter security and not worrying a ton about logging in my containers. I guess I’m just fatalistic. If I get hacked to the point that I’m digging through logs to figure out what happened, I’m kinda fucked. So I focus more on backup and restore. Can I restore to a known good state? But I hear you. Kids these days with their containers and their pipelines and their devops. Back in my day…

phase_change · 2 years ago

Kids these days with their containers and their pipelines and their devops. Back in my day…

Don’t get me started about the internal devs at work. You’ve already got me triggered.

And, I can just imagine the posts they’re making about how the internal IT slows them down and causes issues with the development cycle.

cstine@lemmy.uncomfortable.business · 2 years ago

Regarding your last question: I don’t expose any services directly to the Internet unless the functionality 100% requires that I do so.

I have a couple of tiers of mitigation at the network level - I drop the Spamhaus DROP list at the edge, I have crowdsec security engine on the docker host itself as the next layer, and I otherwise make sure that everything is current and patched.

I also use Cloudflare tunnels for cases where I need a service to be public (like, say, the Lemmy install I set up) but don’t really want to be exposing my personal IP to the world.

The end result is, basically, that I have a single nginx instance exposed on two virtual hosts, a single port for Plex, a single port for Wireguard, and a (relatively) small attack surface with some mitigation against bad networks and reasonable response times to developing threats via crowdsec. There’s also some internal VLAN separation between the docker host, the NAS, and other devices on the network. That won’t prevent someone from hopping around if they get in, but it will at least take a little more effort to figure out the network topology.

Nothing is ever “perfect security”, but this is enough to mitigate the non-stop endless bot and malware noise. It’s likely of only limited use against someone who feels the need to personally target and attempt to compromise me, but honestly, that’s not really a threat I’m particularly concerned with.

phase_change · 2 years ago

I’m intrigued by the phrase “crowdsec security engine on the docker”. Yes, I can Google, but I’d appreciate a bit of comment on what that is and how involved the setup is.

cstine@lemmy.uncomfortable.business · 2 years ago

https://www.crowdsec.net/product/crowdsec-security-engine is what I’m talking about.

It’s not the easiest setup ever, but their documentation is pretty straightforward and easy to follow.

This is running on the host OS I’m using as the docker host.

It basically inspects logs/traffic to determine if a request is malicious and blocks sources of malicious traffic in real-ish time. It’s also crowdsourced data so you can import data on attacks other people have seen.

phase_change · 2 years ago

Nice. I’ll definitely check it out.

ubergeek77@lemmy.ubergeek77.chat · 2 years ago (edited)

How is it ephemeral? My Docker instance for Lemmy logs forever unless I manually clear the logs. My Caddy reverse proxy logs every request too. Both are stored to disk and I’m free to copy them out at any time. They’ll keep increasing in size until I decide to clear them.

They’re logged through the Docker engine, not the container. A malicious actor would require a sophisticated container breakout attack to even attempt to clear them. Those attacks are rare and highly publicized.

Alternatively, an attacker could try to find my real instance IP from behind Cloudflare (probably not going to happen), somehow bypass my provider’s firewall which only allows SSH from my home IP (my home is more likely to be broken into than that), and then somehow defeat SSH authentication on top of both of those (quantum computers aren’t quite there yet).

I’m having trouble seeing the risk you’re concerned about.

phase_change · 2 years ago

My concern is less the VM hosting the docker instance getting compromised and more that Lemmy has an exploit and the Lemmy instance gets compromised. I’m quite certain that Lemmy is getting a closer look by the bad guys. You’ve had hundreds of instances spun up in a week, most that have done nothing more than follow an online example of how to spin up a Lemmy instance.

And, I was under the impression that the container and thus the logs were cleared when restarting or redeploying docker. If I’m wrong, I’m horribly embarrassed and will point at that “old school” in the title. I’ll also be doing some testing.

ubergeek77@lemmy.ubergeek77.chat · 2 years ago (edited)

No worries! There’s always something new to learn.

Docker has different log drivers, and the default one, which Docker Compose (recommended by Lemmy) also uses, is called json-file, which logs to a file on the disk. Those logs survive container restarts and are only deleted when the container itself is removed, which is what happens on a redeploy that recreates it. The local driver also writes to disk, just in a more compact format with rotation built in.
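
As a rough sketch (not from the Lemmy docs), json-file can be kept while capping how much disk the logs use, via /etc/docker/daemon.json. The size and file-count values below are arbitrary assumptions to adjust, and the setting only applies to containers created after the Docker daemon is restarted:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "4"
  }
}
```

Note that the option values are strings, even the numeric one. Without these options json-file does no rotation at all, which is why the logs keep growing until they are cleared by hand.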

If Lemmy itself has an exploit, at the absolute maximum, my instance either has its content cleared, or maybe it gets used for spam. I don’t see a scenario where anything outside of my container can be touched; containers are pretty secure.

Any attack on Lemmy would also be extremely widespread, and a random person like me likely wouldn’t be a target regardless. But if I were, I’d just restore one of my regular backups.

In any case, there is nothing to be done if Lemmy does have a dormant zero day vulnerability - we either use it or we don’t. No one is going to set up enterprise grade WAFs to use this hobbyist federation software, and it would likely break federation regardless.

You’ve had hundreds of instances spun up in a week, most that have done nothing more than follow an online example of how to spin up a Lemmy instance.

That’s true, but relatively speaking, Docker is pretty secure. The configs they provide aren’t bad either. Lemmy is a simple frontend service and a simple set of microservices, pretty much as small of an attack surface as you can get.

And Lemmy is written in Rust - I would be extremely surprised to learn a Rust application has an undetected RCE. But again, even if it did, the damage is contained to Lemmy itself and its backing postgres database. If I wanted to do postmortem forensics, I still have my Docker + access logs, which would include every web request (and thus attack payload) my proxy has ever received.

Anarch157a@lemmy.world · 2 years ago

As @ubergeek77@lemmy.ubergeek77.chat said, Docker has different log drivers. One of them is good old syslog…

Put this in your /etc/docker/daemon.json

{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "unixgram:///dev/log"
  }
}

phase_change · 2 years ago

Perfect! Thanks.

Goldenderp@lemmy.world · 2 years ago

Ha, I can totally feel the pain. It’s a lot to learn! I’ve personally gone the Traefik route instead of nginx. It does a lot of the rewriting by itself just by attaching labels to Docker containers, and there are excellent middlewares available for security measures, like OAuth forwarding or ModSecurity. You can write your own middleware too, and it’s quite simple to do without having to interact with the full HTTP session.

As for logging, you can configure other logging drivers for Docker. If you’re worried about them being too ephemeral, send them to syslog or journald. Or set up fluentd and store them in the cloud.

What makes things less complicated these days, I think, is that we now have “small things doing few things very well” in services with all sorts of containers; you just have to glue them all together.
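
A minimal sketch of the journald option, again as /etc/docker/daemon.json. The tag template is an optional assumption here, not something anyone in the thread specified:

```json
{
  "log-driver": "journald",
  "log-opts": {
    "tag": "{{.Name}}"
  }
}
```

With that in place, something like journalctl CONTAINER_NAME=lemmy should pull the logs for a single container, where lemmy is a hypothetical container name. As with any daemon.json logging change, it only affects containers created after Docker is restarted.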

nii236@lemmy.jtmn.dev · 2 years ago

Why do you want to persist logs? It doesn’t ‘improve’ security.

phase_change · 2 years ago

It doesn’t stop you from being hacked, but if you are hacked, it helps you to understand how so you can defend against it. So, I agree it doesn’t improve security for your instance, but it can improve security for your future instances.

nii236@lemmy.jtmn.dev · 2 years ago

Yeah but if someone gains root access to your server, you’re pretty boned anyway.

Best bang for buck will be to harden the fuck out of the server. Fail2ban, UFW, SSH keys only.

phase_change · 2 years ago

Agreed on all counts. Of course none of that exists on the Lemmy docker instance.

falcon15500@lemmy.nine-hells.net · 2 years ago

None of them are much use on the Lemmy container because it only exposes the ports required for it to work and it doesn’t have SSH. You would normally have SSH keys only and a firewall on the docker host. You could likely use something like crowdsec on the reverse-proxy logs to catch stuff.

blazarious@mylem.me · 2 years ago

My (short) take on this:

The whole fediverse lacks security right now, much like email did in the 90s. This will only become a problem with mass adoption, again, like it did with email.

Good to start thinking about it early on, though.

nii236@lemmy.jtmn.dev · 2 years ago

Philosophically I think the reason why things are “the way they are” these days, is because of a big push towards stateless compute.

So persistence is bad in this approach. That includes images, files, configs, secrets.

Thanks a lot, AWS!