I have a home server running Docker for all my self-hosted apps. Sometimes I accidentally trigger earlyoom by remotely starting expensive Docker builds, and it kills Docker.

I don’t have access to my server outside of my home network, so I can’t manually restart docker in those situations.

What would be the best way to restart it automatically? I don’t mind doing a full system restart if needed.

  • lemmyng@lemmy.ca

    Use -m and limit the build job’s memory so it doesn’t kill the docker daemon.
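
    A minimal sketch of that idea, assuming the classic (non-BuildKit) builder, where `docker build` accepts `-m`/`--memory`. The wrapper name `build_limited`, the `BUILD_MEM_LIMIT` variable and the `2g` default are made up for illustration. Newer BuildKit-based builds may accept the flag without enforcing it, so check the behaviour on your own setup.

    ```shell
    # Hypothetical wrapper: every build gets a memory cap, so the flag
    # cannot be forgotten. BUILD_MEM_LIMIT and the 2g default are assumptions.
    build_limited() {
        limit="${BUILD_MEM_LIMIT:-2g}"
        docker build --memory "$limit" "$@"
    }
    ```

    With this in a shell profile, `build_limited -t myapp .` runs `docker build --memory 2g -t myapp .`.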

    • RustyNova@lemmy.world (OP)

      Fair enough, but I don’t want a band-aid fix. On top of that, I manage all my Docker setup through Portainer, and the option isn’t there.

      An automatic restart would also be useful if a container has a memory leak and no memory limit.

      • just_another_person@lemmy.world

        This isn’t a band-aid, it’s the literal fix.

        Structuring the available CPU and Memory reservations for containers is LITERALLY the entire reason containers exist. Just because you’re only familiar with the “dumb” way of using them doesn’t mean you should be dismissive when someone offers you advice when you come here asking for it.

        You’re also seemingly just a dick for being lazy, because I looked, and wuddyaknow. So now you’re just rude, dickish, and lazy.

        Take the advice from the original responder, and then go and learn how to use the things you’re asking for help with, along with some manners.

          • Bo7a@lemmy.ca

            You can’t expect people who are knowledgeable about this stuff to just forever accept that someone asks for advice, gets told the solution, and then ignores/belittles the person with knowledge.

            This is our daily life experience. We get hired to be experts, and get told by non-experts that our solutions are not tenable every single day. Only for that solution to eventually be accepted when the user in question figures out their idea was not useful and the expert was correct.

            We have to put up with it at work, we are not obliged to accept it here.

              • Bo7a@lemmy.ca

                In which way am I complaining? I am explaining why calling a valid solution a bandaid might be construed as belittling their very real knowledge of this process. And how that is a regular pattern in a lot of technical fields.

                And don’t give me this shit about ‘I’m not the person you were talking to’. This is an open forum, not a direct/private message.

              • just_another_person@lemmy.world

                I was obliged to respond to let him know that he was actually provided the correct answer, and he didn’t need to respond to the person who provided the correct answer like that. I don’t feel it’s right to sit idly by and let people who are only trying to help for free get snark like that. Obliged, much.

            • RustyNova@lemmy.world (OP)

              There’s a difference between helping people who misunderstand a tool and belittling them for being wrong. It’s just a matter of wording that separates a helpful answer from a toxic one.

              I could tell you “You should actually use Y instead of X. There are numerous benefits like A, B and C. The docs actually have a great example you may have missed or not understood was for this purpose. It will help you a lot more than what you are thinking of doing.” And this would be fine.

              But “Just use Y. X is bad because Y is made for that. You not being willing to use Y shouldn’t make you do X. Even the first Google result shows how to do it” isn’t fine.

              And I have not belittled them at all. I have said that it wasn’t what I was looking for. A lot of times people post questions they think should solve their issue, only to realise that they didn’t understand the full picture and their problem is on a larger scale.

        • RustyNova@lemmy.world (OP)

          Alright, sorry for calling it a “bandaid fix”. It just wasn’t the right term for what I wanted to say. I was referring to how it would only fix the issue for builds and not at actual runtime, which can also be a problem if I am not careful. So yeah, it’s the fix for the issue in the post, but this solution made me realise that this isn’t the only thing I want.

          But as for the second part… just chill. It’s a home server, not a high-availability cluster. I can afford stupid things. Heck, I’m only asking this question because I got stupid and didn’t limit the job count of a cargo build, which took down my server. I don’t care that my build crashes. I just want to not have to manually restart it, because when I’m not here I can’t do it.

          As for the link that you sent, it covers container limits, not image build limits. And I have already set limits on my hungriest container, yet the stats showed that it blew past them, so idk what’s going on there.

          Edit: NVM. This is a bandaid fix. What if you forget to pass the flag? Like it’s been 5 months since the last time and you forgot to apply the same fix? Or you accidentally removed it while editing the command? I’m actually looking for a solution that fixes my problem fully, not a partial solution.

          • just_another_person@lemmy.world

            Then you didn’t explain the issue very well, because what you’re asking for was given to you exactly. Builds also have flags, and you should know that if you’re complaining about advice given to you. I’m not saying that to admonish you, just giving you the info.

            The next step down is that you’re using Portainer, and having user-error issues somehow. So another solution is renaming these actions something with a very obvious prefix like “BUILD ACTION”, but also setting memory limits.

            The very last step is making sure your swap is in order. Allocate 2x your system memory to swap, and this will help alleviate OOM issues to a point, but especially during builds.
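
            A rough sketch of that swap setup, assuming a Linux host with `/proc/meminfo`, `fallocate`, `mkswap` and `swapon` available. The function names, the `/swapfile` path and the 2x factor are illustrative; `fallocate`-backed swap files are not supported on every filesystem, so treat this as a starting point only.

            ```shell
            # Print the swap size in KiB for a given multiple of RAM (default 2x).
            # The optional second argument (a meminfo-style file) exists for testing.
            swap_size_kib() {
                awk -v factor="${1:-2}" '/^MemTotal:/ { printf "%d\n", $2 * factor }' "${2:-/proc/meminfo}"
            }

            # Create and enable a swap file of that size (must run as root).
            make_swapfile() {
                path="${1:-/swapfile}"
                fallocate -l "$(swap_size_kib 2)KiB" "$path"
                chmod 600 "$path"    # swap must not be readable by other users
                mkswap "$path"
                swapon "$path"
            }
            ```

            Swap enabled this way does not survive a reboot unless the file is also listed in `/etc/fstab`.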

            If you come back and say this is a band-aid solution, get a better machine and stop asking questions to solve the impossible in here. It is your fault this is an issue to begin with, you don’t know how to run your machines (regardless of it just being a home server or whatever), and you’re just being rude.

      • Badabinski@kbin.earth

        The other person may have responded with a fair amount of hostility, but they’re absolutely correct. I run Kubernetes clusters hosting millions of containers across hundreds of thousands of VMs at my job, and OOMKills are just a fact of life. Apps will leak memory, and you’re powerless to fix it unless you’re willing to debug the app and fix the leak. It’s better for the container to run out of memory and trigger a cgroup-scoped OOM kill. A system-wide OOM kill will murder the things you love, shit in your hat, and lick your face like David Tennant licked Krysten Ritter.

        • RustyNova@lemmy.world (OP)

          Oh, letting a container get killed is not a problem. It’s perfectly fine. What I want is just to not cripple my whole server because one container did a funny.

          If it keeps Docker and the Portainer VM alive I’ll be 100% ok, because I can just restart the container. I don’t want to have remote access to my server outside of my home for security reasons, so this is just the bare minimum.

          • null@slrpnk.net

            > I don’t want to have remote access to my server outside of my home for security reasons, so this is just the bare minimum

            What are your security concerns?

      • Treczoks@lemmy.world

        This is not a bandaid, this is the solution. What you are trying is, at least for this scenario, the bandaid.

  • KrapKake@lemmy.world

    You should be able to make Docker exempt from earlyoom. Check its GitHub for instructions.
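
    A sketch of what that could look like, assuming the Debian/Ubuntu package, which reads extra arguments from `/etc/default/earlyoom`. `--avoid` takes a regex matched against process names; `dockerd` and `containerd` are assumptions about how the daemons show up on your system. As far as I know, `--avoid` only makes matching processes less likely to be picked, it does not make them fully immune.

    ```shell
    # /etc/default/earlyoom (path assumed from Debian/Ubuntu packaging)
    # Ask earlyoom to avoid the Docker daemons when it picks a victim.
    EARLYOOM_ARGS="--avoid '(^|/)(dockerd|containerd)$'"

    # Then apply it with: sudo systemctl restart earlyoom
    ```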

  • GravitySpoiled@lemmy.ml

    I don’t know the best way but I would use cron and start docker every minute (if it’s not running).

    • atzanteol

      > I don’t know the best way

      Apparently…

      Don’t do this. Either don’t go OOM to begin with (somebody else told you how to limit container memory usage) and/or configure systemd to restart Docker if it quits. I’m surprised systemd isn’t doing that already.

        • atzanteol

          > Seems like the best solution.

          Over using a system tool designed to monitor and restart services that stop?

        • atzanteol

          It’s fairly obvious I feel.

          You’re saying rather than use a system tool that does the exact thing that you want you should bodge together a cron job that accomplishes your goal but doesn’t actually do what you want.

          Like, say you want to stop the docker service for some reason. `systemctl stop docker` will do that. Then your cron job will restart it. That’s not the desired outcome. You want the service running IF the service SHOULD be running, which is a different thing than “always running”. And it’s exactly what you get for free with systemd without any silly custom BS.
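
          A minimal sketch of the systemd route, assuming the unit is called `docker.service`. The drop-in file name `restart.conf` and the 5 second delay are made up; `Restart=always` and `RestartSec=` are standard `[Service]` options. Many packaged Docker units may already set `Restart=always`, which `systemctl cat docker` would show.

          ```shell
          # Write a drop-in telling systemd to restart Docker whenever the service
          # exits, except when it was stopped through systemctl.
          # The file name restart.conf and the 5s delay are assumptions.
          write_restart_dropin() {
              dir="${1:-/etc/systemd/system/docker.service.d}"
              mkdir -p "$dir"
              printf '[Service]\nRestart=always\nRestartSec=5\n' > "$dir/restart.conf"
          }

          # As root: write_restart_dropin && systemctl daemon-reload
          ```

          A unit stopped with `systemctl stop docker` stays stopped under this setting, which is the difference from the cron approach.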

    • RustyNova@lemmy.world (OP)

      I’ll try that. I know that systemctl has a reload-or-restart command, but is there a “start-or-ignore” command? Or a start flag?
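
      A sketch of a “start-or-ignore”, for what it’s worth. As far as I know, plain `systemctl start` already does nothing for a unit that is running, so the explicit `is-active` check below is mostly there for clarity. The function name `ensure_running` is made up.

      ```shell
      # Start a unit only when systemd does not report it as active.
      ensure_running() {
          unit="$1"
          if systemctl is-active --quiet "$unit"; then
              return 0    # already running, nothing to do
          fi
          systemctl start "$unit"
      }
      ```

      Called as `ensure_running docker` from cron, this would also bring back a service that was stopped on purpose.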