After reading this article, I had a few dissenting thoughts; maybe someone can offer their perspective?
The article suggests not running critical workloads virtually, based on a failure scenario of the hosting environment (such as ransomware on the hypervisor).
That does invite the ‘all your eggs in one basket’ phrase, so I agree that running at least one instance of a service physically could be justified, but threat actors will try to time the execution of attacks against both if possible. Added complexity cuts both ways here.
I don’t really agree with the comments about patching, however. The premise that the physical workload or instance would be patched or updated more often than the virtual one seems unfounded. A hesitance to patch systems is more about uptime vs. downtime vs. breakage vs. risk, in my opinion.
Is your organization running critical workloads virtually like everything else, a combination of physical and virtual, or a combination of both plus cloud (off-prem) solutions?
I work for a newspaper. It has been published without fail every single day since 1945 (when my country was still basically just rubble, deservedly).
So even if all our systems are encrypted by ransomware, the newspaper MUST still be printable as a matter of principle.
We run all our systems virtualized, because everything else would be unmaintainable and it’s a 24/7 operation.
But we also have a copy of the most essential systems running on bare metal, completely air-gapped from everything else, including the internet.
Even I as the admin can’t access them remotely in any way. If I want to, I have to walk over to another building.
In case of a ransomware attack, the core team meets in a room with internal-only Wi-Fi and is given emergency laptops from storage with our software preinstalled. They produce the files for the paper, save them to a USB stick, and deliver it to the printing press.
Seems like your org has taken resilience and response planning seriously. I like it.
Another newspaper in our region was unprepared and got ransomwared. They’re still not back to normal, over a year later.
After that, our IT basically got a blank check from the executives to do whatever is necessary.
Funny how that often seems to be the case. They need to see the consequences, not just be warned. An ‘I told you so’ moment…
I’m just glad they got to see the consequences in another company.
Their senior IT admin had a heart attack a month after the ransomware attack.
…which is also kept with the air-gapped system and tossed once used, I assume…
There are several, for redundancy, in their original packaging, locked in a safe, and replaced yearly.
How do you keep the air-gapped system in sync?
We don’t. It’s a separate, simplified system that only lets the core team members access the layout, editing, and typesetting software that is installed locally on the bare-metal servers.
In emergency mode, they get written articles and images from the reporters via otherwise unused, remotely hosted email addresses, and as a second backup, Signal.
They build the pages from that, send them to the printers, and the paper is printed old-school using photographic plates.
That’s a very high degree of BCDR planning, and quite costly, I assume.
It’s less than the cost of our cybersecurity insurance, which will probably drop us on a technicality when the day comes.
And it’s not entirely an economic decision. The paper is family-owned in its third generation, historically significant as one of the oldest papers in the country, and absolutely no one wants to be the one in charge when it fails to print for the first time ever.