I am currently out of town, and my server went down. All my services go through nginx, and they suddenly started returning 502 errors. SSH won’t let me in either. I had my sister reboot the server, and it still doesn’t work. I apologize for the lack of details, but that is all I know, and I can’t access the logs. I’ve cleared my cache and tried a VPN in case fail2ban had banned me. I recently got a TP-Link router, so it could be something with that, but it had been working for a while. I will have her do another reboot, and if that doesn’t work I will have her power off and unplug the server in case it was hacked.

Edit: I have absolutely no clue why, but it works now. I literally did nothing. As far as I know, my sister hasn’t touched it today. It just started working. Computers, man…

Edit 2: Actually she said she did something. Not sure what, but it works now.

  • xantoxis@lemmy.world
    4 months ago

    Some troubleshooting thoughts:

    What do you mean when you say SSH is “down”:

    1. connection refused (fail2ban’s activity could result in a connection refused, but a VPN should have avoided that problem, as you said)
    2. connection timeout. probably a failure at the port forwarding level.
    3. connection succeeded but closed; this can happen for a few reasons, such as the system still being in an early boot-up state. there’s usually a message in this case.
    4. connection succeeded but auth rejected. this can happen if your os failed to boot but came up in a fallback state of some kind.

    Knowing which one of these it is can give you a lot more information about what’s wrong:

    System can’t get past initial boot = Maybe your NAS is unplugged? Maybe your home DNS cache is down?

    Connection refused = either fail2ban or possibly your home IP has moved and you’re trying to connect to somebody else’s computer? (nginx is very popular after all, it’s not impossible somebody else at your ISP has it running). This can also be a port forwarding failure = something’s wrong with your router.

    Connection succeeded + closed is similar to “can’t get past initial boot”

    Auth rejected might give you a fallback option if you can figure out a default username/password, although you should hope that’s not the case because it means anyone else can also get in when your system is in fallback.
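    If it helps, here’s a rough sketch of how you could tell the first three cases apart from the outside, using nothing but a raw TCP connection. The function name `classify_ssh` and the labels it returns are made up for illustration, and it only looks at the connection and the SSH version banner; it can’t tell you anything about auth (case 4), since that happens after the banner.

```python
import socket


def classify_ssh(host, port=22, timeout=5.0):
    """Roughly label how a TCP connection to an SSH port behaves.

    Hypothetical helper for illustration; the labels are not standard.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # Case 1: something actively rejected us (nothing listening,
        # or a firewall/fail2ban rule that rejects the connection).
        return "refused"
    except (socket.timeout, TimeoutError):
        # Case 2: packets vanish, e.g. broken port forwarding.
        return "timeout"
    except OSError as exc:
        # DNS failure, no route to host, and similar.
        return f"error: {exc}"

    with sock:
        sock.settimeout(timeout)
        try:
            banner = sock.recv(256)
        except (socket.timeout, TimeoutError):
            return "connected, no banner"
        except ConnectionResetError:
            return "closed"
        if not banner:
            # Case 3: the peer accepted the connection, then hung up.
            return "closed"
        if banner.startswith(b"SSH-"):
            # sshd is alive; any failure after this point is about auth.
            return "banner"
        return "unexpected data"
```

    Running something like `print(classify_ssh("your.home.address"))` from wherever you are (the hostname is a placeholder) should tell you which branch of the list above you’re in.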

    Very few of these things are actually fixable remotely, btw. I suggest having your sister unplug everything related to your setup, one device at a time. Internet router, raspberry pi, NAS, your VM host, etc. Make sure to give them a minute to cool down. Hardware, particularly cheap hardware, tends to fail when it gets hot, and this can take a while to happen, and, well, it’s been hot.

    Here’s a few things with a high likelihood of failing when you’re away from home:

    • heat, as previously mentioned.
    • running out of disk space. Maybe you’re logging too much; throw some more disk in there and tune down the logging. This can definitely affect SSH, and definitely won’t be fixed by a reboot.
    • OOM failures (or other resource leaks). This isn’t likely to affect your bare metal ssh, but it could. Some things leak memory, and this can lead to cascading process destruction by the OS. In this scenario you’d probably be able to connect to things in the first few minutes after a reboot, though.
    • shitty cabling. Sometimes stuff just falls out of the socket, if it wasn’t plugged in perfectly to begin with. (Heat can also contribute to this one.)
    • reliance on a cloud service that’s currently down. (This can include: you didn’t pay the bill.) Hopefully your OS boot doesn’t fail due to a cloud service, but I’ve definitely seen setups that could.
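    Since the disk-space one is easy to catch ahead of time, here’s a minimal sketch of a check you could run on the server itself (say, from a scheduled job) before you next leave. The name `disk_report` and the 90% threshold are arbitrary choices of mine, not anything standard, and what you do with the warning (email, push notification, whatever) is left out.

```python
import shutil


def disk_report(path="/", warn_pct=90.0):
    """Summarize disk usage for the filesystem containing `path`.

    Hypothetical helper; the 90% default threshold is an arbitrary assumption.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0
    return {
        "path": path,
        "used_pct": round(used_pct, 1),
        "free_gib": round(usage.free / 2**30, 2),
        "warn": used_pct >= warn_pct,
    }


if __name__ == "__main__":
    report = disk_report("/")
    print(report)
    if report["warn"]:
        print("Disk is getting full; check logs before it takes services down.")
```

    Point it at whichever mount holds your logs if that’s separate from the root filesystem.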
    • shnizmuffin@lemmy.inbutts.lol
      4 months ago

      running out of disk space

      This would be my first guess. Nothing shuts down arbitrary services quite like a full /var/log.

      • HumanPersonOP
        4 months ago

        I’ve got a 1 TB boot drive and it isn’t used for much, but stuff happens, so… idk.

    • HumanPersonOP
      4 months ago

      It says connection closed. There is no message beyond that. I think it is likely that it is failing to boot. I might video call my sister and have her try to boot it so I can see any errors.

      Edit: Also, thanks very much for your response. It was very detailed and informative.

      • Shimitar@feddit.it
        4 months ago

        Connection closed means something is listening on the port but is failing or unwilling to reply.

        Unless some network middleman is closing your connection (ssh should be on port > 1024 to be safe from ISP throttling), either your ssh server is severely strained (OOM, disk full…) or your fail2ban is kicking in.