Hacker News: stego-tech's comments

These sorts of core-density increases are how I win cloud debates in an org.

* Identify the workloads that haven't scaled in a year. Your ERPs, your HRIS, your dev/stage/test environments, DBs, Microsoft estate, core infrastructure, etc. (EDIT, from zbentley: also identify, and exclude, any cross-system processing where data will transfer from the cloud back to your private estate, so you don't get murdered with egress charges)

* Run the cost analysis of reserved instances in AWS/Azure/GCP for those workloads over three years

* Do the same for one of these high-core "pizza boxes", but amortized over seven years

* Realize the savings to be had moving "fixed infra" back on-premises or into a colo versus sticking with a public cloud provider
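The reserved-vs-amortized comparison above boils down to arithmetic you can sanity-check in a few lines; every number below is a hypothetical placeholder, not a real quote:

```python
# Hypothetical placeholder figures; plug in your own vendor and cloud quotes.
cloud_monthly_ri = 4_000   # 3-yr reserved instances, $/month for the workload
hw_capex = 120_000         # one HA cluster of high-core 2U "pizza boxes"
colo_monthly = 1_500       # power, space, bandwidth, support contracts

years = 7                  # amortize hardware over its realistic service life
cloud_total = cloud_monthly_ri * 12 * years
onprem_total = hw_capex + colo_monthly * 12 * years

print(f"cloud:   ${cloud_total:,}")
print(f"on-prem: ${onprem_total:,}")
print(f"savings: ${cloud_total - onprem_total:,}")
```

The point of running it over seven years rather than three is that the hardware doesn't evaporate when the reserved-instance term ends.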

Seriously, what took a full rack or two of 2U dual-socket servers just a decade ago can be replaced with three 2U boxes with full HA/clustering. It's insane.

Back in the late '10s, I made a case to my org at the time that a global hypervisor hardware refresh and accompanying VMware licenses would have an ROI of 2.5yrs versus comparable AWS infrastructure, even assuming a 50% YoY rate of license inflation (this was pre-Broadcom; nowadays, I'd be eyeballing Nutanix, Virtuozzo, Apache Cloudstack, or yes, even Proxmox, assuming we weren't already a Microsoft shop w/ Hyper-V) - and give us an additional 20% headroom to boot. The only thing giving me pause on that argument today is the current RAM/NAND shortage, but even that's (hopefully) temporary - and doesn't hurt the orgs who built around a longer timeline with the option for an additional support runway (like the three-year extended support contracts available through VARs).

If we can't bill a customer for it, and it's not scaling regularly, then it shouldn't be in the public cloud. That's my take, anyway. It sucks the wind from the sails of folks gung-ho on the "fringe benefits" of public cloud spend (box seats, junkets, conference tickets, etc...), but the finance teams tend to love such clear numbers.


The main cost with on-prem is not the price of the gear but the price of acquiring talent to manage the gear. Most companies simply don't have the skillset internally to properly manage these servers, or even the internal talent to know whether they are hiring a good infrastructure engineer or not during the interview process.

For those that do, your scaling example works against you. If today you can merge three services into one, then why do you need full time infrastructure staff to manage so few servers? And remember, you want 24/7 monitoring, replication for disaster recovery, etc. Most businesses do not have IT infrastructure as a core skill or differentiator, and so they want to farm it out.


> even the internal talent to know whether they are hiring a good infrastructure engineer or not during the interview process.

This is really the core problem. Every time I've done the math on a sizable cloud vs on-prem deployment, there is so much money left on the table that the orgs could afford to pay FAANG-level salaries for several good SREs, but never have we been able to find people to fill the roles, or even to know whether we had found them.

The numbers are so much worse now with GPUs. The cost of reserved instances (let alone on-demand) for an 8x H100 pod, even with NVIDIA Enterprise licenses included, leaves tens of thousands of dollars per pod per year for the salary of the employees managing it. Assuming one SRE can manage at least four racks, the hardware pays for itself, if you can find even a single qualified person.


I work in SRE and the way you describe it would give me pause.

The first is that SRE team size primarily scales with the number of applications and level of support. It does scale with hardware but sublinearly, where number of applications usually scales super linearly. It takes a ton less effort to manage 100 instances of a single app than 1 instance of 100 separate apps (presuming SRE has any support responsibilities for the app). Talking purely in terms of hardware would make me concerned that I’m looking at an impossible task.

The second (which you probably know, but interacts with my next point) is that you never have single person SRE teams because of oncall. Three is basically the minimum, four if you want to avoid oncall burnout.

The last is that I don’t know many SREs (maybe none at all) that are well-versed enough in all the hardware disciplines to manage a footprint the size we’re talking. If each SRE is 4 racks and a minimum team size is 4, that’s 16 racks. You’d need each SRE to be comfortable enough with networking, storage, operating system, compute scheduling (k8s, VMWare, etc) to manage each of those aspects for a 16 rack system. In reality, it’s probably 3 teams, each of them needs 4 members for oncall, so a floor of like 48 racks. Depending on how many applications you run on 48 racks, it might be more SREs that split into more specialized roles (a team for databases, a team for load balancers, etc).

Numbers obviously vary by level of application support. If support ends at the compute layer with not a ton of app-specific config/features, that’s fewer folks. If you want SRE to be able to trace why a particular endpoint is slow right now, that’s more folks.


> The last is that I don’t know many SREs (maybe none at all) that are well-versed enough in all the hardware disciplines to manage a footprint the size we’re talking. If each SRE is 4 racks and a minimum team size is 4, that’s 16 racks. You’d need each SRE to be comfortable enough with networking, storage, operating system, compute scheduling (k8s, VMWare, etc) to manage each of those aspects for a 16 rack system. In reality, it’s probably 3 teams, each of them needs 4 members for oncall, so a floor of like 48 racks. Depending on how many applications you run on 48 racks, it might be more SREs that split into more specialized roles (a team for databases, a team for load balancers, etc).

That's vastly overstating it. You hit the nail on the head in the previous paragraphs: it's the number of apps (or, more generally, environments) that you manage; everything else is secondary.

And that is especially true with modern automation tools. Doubling the rack count means a big chunk of initial time spent moving hardware, of course, but after that there is almost no difference in the time spent maintaining them.

In general, time spent per server will shrink, because the bigger you grow, the more automation you will use, and some tasks can be grouped together better.

Like, at my previous job, a server was installed manually, because it was a rare event.

At my current job it's just "boot from network, pick the install option, enter the hostname, press enter". Reinstalling a whole rack would take you maybe an hour; everything else in the install is automated. You write a manifest for one type/role once, test it, and then it doesn't matter whether it's 2 or 20 servers.

If we grew the server fleet, say, 5-fold, we'd hire... one extra person for a team of 3. If the number of different applications went up 5-fold, we'd probably have to triple the team size, because there are still some things that could be made more streamlined.

Tasks like "go replace a failed drive" might be more common, but we usually do that once a week (we have enough redundancy) for all the servers that might have died. If we had 5x the number of servers, the time would be nearly the same, because getting there dominates the 30 seconds needed to replace one drive.


Noteworthy: the number of apps isn't affected by whether the machines are in your datacenter or Amazon's.

So your definition of SRE is anybody that works on infra?

> The first is that SRE team size primarily scales with the number of applications and level of support. It does scale with hardware but sublinearly, where number of applications usually scales super linearly. It takes a ton less effort to manage 100 instances of a single app than 1 instance of 100 separate apps (presuming SRE has any support responsibilities for the app). Talking purely in terms of hardware would make me concerned that I’m looking at an impossible task.

Never been an SRE but interact with them all the time…

My own personal experience is that there is commonly a division between App SREs, who look after the app layer, and Infra SREs, who look after the infrastructure layer (K8s, storage, network, etc)

The App SRE role absolutely scales with the number of distinct apps. The extent to which the Infra SRE role does depends on how diverse the apps are in terms of their infrastructure demands


I disagree with on-prem being ideal for GPU for most people.

If you're doing regular inference for a product with very flat throughput requirements (and you're doing on-prem already), on-prem GPUs can make a lot of sense.

But if you're doing a lot of training, you have very bursty requirements. And the H100s are specifically for training.

If your H100 fleet sits below roughly 38% utilization over time, you're losing money versus renting.

If you have batch throughput you can run on the H100s when you're not training, you're probably closer to wanting on-prem.

But the other thing to keep in mind is that AWS is not the only provider. It is a particularly expensive provider, and you can buy capacity from other neoclouds if you are cost-sensitive.
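A break-even figure like that falls out of the ratio of hourly costs. A minimal sketch, with both rates as stated assumptions rather than real pricing:

```python
# Hypothetical hourly rates; swap in your own quotes.
onprem_cost_per_hour = 10.0   # amortized hardware + power/cooling
cloud_cost_per_hour = 26.0    # reserved-instance rate for the same pod

# On-prem you pay every hour whether the GPUs are busy or idle; in the
# cloud you (ideally) pay only for the hours you actually consume.
breakeven = onprem_cost_per_hour / cloud_cost_per_hour
print(f"break-even utilization: {breakeven:.0%}")  # below this, renting wins
```

With these assumed rates the break-even lands around 38%; burstier training schedules push real utilization well below that.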


You didn’t find people because SREs don’t do that.

You wanted sysadmins / IT / data center technicians.


Self-hosted 8xH100 is ~$250k, depreciated across three years => $80k/year, with power and cooling => $90k/year (~$10/hour total).

AWS charges $55/hour for EC2 p5.48xlarge instance, which goes down with 1 or 3 year commitments.

With 1 year commitment, it costs ~$30/hour => $262k per year.

3-year commitment brings price down to $24/hour => $210k per year.

This price does NOT include egress, and other fees.

So, yeah, there is a $120k-$175k difference that can pay for a full-time on-site SRE, even if you only need one 8xH100 server.

Numbers get better if you need more than one server like that.
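Spelled out, using the rates quoted above (with the on-prem overhead approximated as a flat $10k/year for power and cooling):

```python
HOURS_PER_YEAR = 24 * 365

# Figures quoted in the comment above (approximate).
capex = 250_000                       # self-hosted 8x H100 server
onprem_per_year = capex / 3 + 10_000  # 3-year depreciation + power/cooling

aws_rates = {"on-demand": 55, "1-yr commit": 30, "3-yr commit": 24}
for term, dollars_per_hour in aws_rates.items():
    cloud_per_year = dollars_per_hour * HOURS_PER_YEAR
    print(f"{term:>11}: ${cloud_per_year:>7,.0f}/yr, "
          f"${cloud_per_year - onprem_per_year:,.0f}/yr over on-prem")
```

Note this assumes the instance runs 24/7; if your workload is bursty, the cloud numbers shrink in proportion to utilization.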


$120K isn't going to cover the fully loaded costs of an SRE who can set up and run that.

Hiring 1 person to run the infrastructure means that 1 person is on-call 24/7 forever.

If there's an issue with the server while they're sick or on vacation, you just stop and wait.

If they take a new job, you need to find someone to take over or very quickly hire a replacement.

There's a second bus factor: What happens when that 8xH100 starts to get flakey? You can't move the jobs to another server because you only have one. You can start diagnosing things and replacing parts and hope it gets to the root issue, but that's more downtime.

Going on-prem like this is highly risky. It works well until the hardware starts developing problems or the person in charge gets a new job. The weeks and months lost to dealing with the server start to become a problem. The SRE team starts to get tired of having to do all of their work on weekends because they can't block active use during the week. Teams start complaining that they need to use cloud to keep their project moving forward.


> $120K isn't going to cover the fully loaded costs of an SRE who can set up and run that.

> Hiring 1 person to run the infrastructure means that 1 person is on-call 24/7 forever.

> If there's an issue with the server while they're sick or on vacation, you just stop and wait.

Very much depends on what you're doing, of course, but "you just stop and wait" for sickness/vacation is sometimes actually good enough uptime -- especially if it keeps costs down. I've had that role before... That said, it's usually better to have two or three people who know the systems (even if they're not full-time dedicated to them) to reduce the bus factor.


So the entire business was happy to go offline for two or three weeks whenever their infra person fancied going off on their summer holiday?

By doing this, you're guaranteeing a bus factor of 1. I can't think of any business that wouldn't see that as a completely unacceptable risk.


> There's a second bus factor: What happens when that 8xH100 starts to get flakey? You can't move the jobs to another server because you only have one.

You can still use cloud for excess capacity when needed. E.g. use on-prem for base load, and spin up cloud instances for peaks in load.


> There's a second bus factor: What happens when that 8xH100 starts to get flakey? You can't move the jobs to another server because you only have one. You can start diagnosing things and replacing parts and hope it gets to the root issue, but that's more downtime.

They come with a warranty, often with a technician guaranteed to arrive within a few hours, or at most a day. Also, if the SHTF, just getting cloud capacity to cover the gap isn't hard.


If a business requires at least a quarter million bucks worth of hardware for basic operation, yet can't pay the market rate for someone to operate it, maybe the fundamentals of that business aren't okay?

> There's a second bus factor: What happens when that 8xH100 starts to get flakey?

These come in a non-flakey variant?


It's called a warranty.

And the other argument: every company I've ever known to do AWS has an AWS sysadmin (sorry, "devops"), same for Azure. Even for small deployments. And departments want their own person/team.


Out of all the comments on numbers, SREs, and scaling, yours is the one meeting numbers with numbers!

> $120K isn't going to cover the fully loaded costs of an SRE who can set up and run that.

Literally this. I can do SRE on-prem and cloud, and my 50/30/20 budget break-even point (as in, needs and savings but no wants - so 70%) is $170k before taxes. Rent is astonishingly high right now, and the sort of mid-career professional you want to handle SRE for your single DC is going to take $150k in this market before fucking off to the first $200k job they get.

Know your market, and pay accordingly. You cannot fuck around with SREs.

> Hiring 1 person to run the infrastructure means that 1 person is on-call 24/7 forever.

This is less of an issue than you might think, but strongly dependent upon the quality of talent you’ve retained and the budget you’ve given them. Shitbox hardware or cheap-ass talent means you’ll need to double or triple up locally, but a quality candidate with discretion can easily be supported by a counterpart at another office or site, at least short-term. Ideally though, yeah, you’ll need two engineers to manage this stack, but AWS savings on even a modest (~700 VMs) estate will cover their TC inside of six months, generally.

> There's a second bus factor: What happens when that 8xH100 starts to get flakey? You can't move the jobs to another server because you only have one. You can start diagnosing things and replacing parts and hope it gets to the root issue, but that's more downtime.

This strikes at another workload I neglected to mention, and one I highly recommend keeping in the public cloud: GPUs.

GPUs on-prem suck. Drivers are finicky, firmware is flaky, vendor support is inconsistent, and SR-IOV is a pain in the ass to manage at scale. They suck harder than HBAs, which I didn't think was possible.

If you’re consuming GPUs 24x7 and can afford to support them on-prem, you’re definitely not here on HN killing time. For everyone else, tune your scaling controls on your cloud provider of choice to use what you need, when you need it, and accept the reality that hyperscalers are better suited for GPU workloads - for now.

> Going on-prem like this is highly risky.

Every transaction is risky, but the risk calculus for “static” (ADDS) or “stable” (ERP, HRIS, dev/test) work makes on-prem uniquely appealing when done right. Segment out your resources (resist the urge for HPC or HCI), build sensible redundancies (on-prem or in the cloud), and lean on workhorse products over newer, fancier platforms (bulletproof hypervisors instead of fragile K8s clusters), and you can make the move successful and sensible. The more cowboy you go with GPUs, K8s, or local Terraform, the more delicate your infra becomes on-prem - and thus the riskier it is to keep there.

Keep it simple, silly.


> Out of all the comments on numbers, SREs, and scaling, you get the response for meeting numbers with numbers!

>> $120K isn't going to cover the fully loaded costs of an SRE who can set up and run that.

> Literally this. I can do SRE on-prem and cloud, and my 50/30/20 budget break-even point (as in, needs and savings but no wants - so 70%) is $170k before taxes. Rent is astonishingly high right now, and the sort of mid-career professional you want to handle SRE for your single DC is going to take $150k in this market before fucking off to the first $200k job they get.

That's $120k per pod. Four pods per rack at 50kW.

What universe are we living in that a single SRE can't manage even a single rack for less than half a million in total comp?


I am not an SRE, merely a sysadmin.

And somehow I have the impression that GPUs on Slurm/PBS could not be simpler.

You can use a VM for the head node; you don't even need the clustering, really, if you can accept taking 20 minutes to restore a VM. And the rest of the hardware is homogeneous: you set up one node right and the rest are identical.

And it's a cluster with a job queue; one node going down is not the end of the world.

OK, if you have PCIe GPUs, sometimes you have to re-seat them and it's a pain. Otherwise, if your H200s or disks fail, you just replace them, under warranty or not...


>If there's an issue with the server while they're sick or on vacation, you just stop and wait.

You can ask AI to troubleshoot and fix the issue.


This factually did not play out that way in my experience.

The company needed the exact same people to manage AWS anyway. And the cost difference was so high that it could have paid for 5 more people, who weren't needed anyway.

Not only the cost but not needing to worry about going over the bandwidth limit and having soo much extra compute power made a very big difference.

Imo the cloud stuff is just too full of itself if you are trying to solve a problem that requires compute, like hosting databases or similar. Just renting a machine from a provider like Hetzner and starting from there is by far the best option.


> The company did need the same exact people to manage AWS anyway.

That is incorrect. On AWS you need a couple of DevOps people who will string together the already existing services.

With on-premise, you need someone who will install racks, change disks, set up high-availability block storage or object storage, etc. Those are not DevOps people.


> With on-premise, you need someone who will install racks, change disks, set up high-availability block storage or object storage, etc. Those are not DevOps people.

We have 7 racks and 3 people. The things you mentioned aren't even 5% of the workload.

There are things you figure out once, bake into automation, and just use.

You install a server once and remove it after 5-10 years, depending on how you want to depreciate it. Drives die rarely enough that it's roughly a once-every-two-months event at our size.

The biggest expense is setting up the automation (if I were re-doing our core infrastructure from scratch, I'd probably need a good 2 months of grind), but after that it's smooth sailing. The biggest disadvantage is "we need a bunch of compute, now", but depending on the business that might never be a problem, and you have enough savings to overbuild a little and still be ahead. Or just get the temporary compute from the cloud.


> Biggest disadvantage is "we need a bunch of compute, now"

And depending on the problem set in question, one can also potentially leverage "the cloud" for the big bursty compute needs and have the cheap colo for the day to day stuff.

For instance, in a past life the team I worked on needed to run some big ML jobs while having most things on extremely cheap colo infra. Extract the datasets, upload the extracted and well-formatted data to $cloud_provider, have VPN connectivity for the small amount of other database traffic, and we can burst to have whatever compute needed to get the computations done really quick. Copy the results artifact back down, deploy to cheap boxes back at the datacenter to host for clients stupid-cheap.


Moving around the physical hardware is a truly tiny part of the actual job; it's really not relevant (especially nowadays; see the top-level comment about how you can do an insane amount, probably more than the median cloud deployment, with a fraction of a rack).

To be clear, I'm not writing about on-premise. I mean the difference between managed cloud and renting dedicated servers.

Even if you do include physical server setup and maintenance, one or two days per month is probably enough for a couple hundred rack units.

Ah sorry, yes, that makes sense.

People will install racks and swap drives for significantly less money than DevOps, lol. People who can build LEGO sets are cheaper than software developers.

"Those are not DevOps people."

Real Devops people are competent from physical layer to software layer.

Signed,

Aerospace Devop


> Real Devops people are competent from physical layer to software layer.

This is usually not the case, because DevOps people have often worked mostly on cloud services and Kubernetes clusters, not real hardware, since most companies don't have on-premise hardware anymore.


What a naïve take. Real™ DevOps know what they need to know.

Ops people are typically more useful given you probably already have devs.

> The main cost with on-prem is not the price of the gear but the price of acquiring talent to manage the gear. Most companies simply don't have the skillset internally to properly manage these servers, or even the internal talent to know whether they are hiring a good infrastructure engineer or not during the interview process.

That's partially true; managing cloud also takes skill, and most people forget that, with the end result being "well, we saved on hiring sysadmins, but had to have more devops guys". Hell, I manage mostly physical infrastructure (a few racks, a few hundred VMs) and a good 80% of my work is completely unrelated to that; it's just the devops gluing stuff together and helping developers set their stuff up, which isn't all that different than it would be in the cloud.

> And remember, you want 24/7 monitoring, replication for disaster recovery, etc.

And remember, you need that for the cloud too. There are plenty of cloud disaster stories where they copy-pasted some tutorial, thought that was enough, and then: surprise.

There is also the halfway option of just getting some dedicated servers from, say, OVH: you cut a bit of the hardware management out of the skillset, and you don't have the CAPEX to deal with.

But yes, if it is less than a rack, it's probably not worth looking at on-prem unless you have a really specific use case that is much cheaper there (I mean, less than half the usual cost).


This is not the case. We had to double staff count going from three cages to AWS. And AWS was a lot more expensive. And now we're stuck.

On top of that no one really knows what the fuck they are doing in AWS anyway.


You need the exact same people to run the infra in the cloud. If they don't have IT at all, they aren't spinning up cloud VMs. You're mixing together SaaS and actual cloud infra.

I'm one of those people, and I don't agree.

Before I drop 5 figures on a single server, I'd like to have some confidence in the performance numbers I'm likely to see. I'd expect folk who are experienced with on-prem have a good intuition about this - after a decade of cloud-only work, I don't.

Also, cloud networking offers a bunch of really nice primitives which I'm not clear how I'd replicate on-prem.

I've estimated our IT workload would roughly double if we were to add physically racking machines, replacing failed disks, monitoring backups/SMART errors etc. That's... not cheap in staff time.

Moving things on-prem starts making financial sense around the point your cloud bills hit the cost of one engineer's salary.


> Also, cloud networking offers a bunch of really nice primitives which I'm not clear how I'd replicate on-prem.

Like what?


IAM comes to mind, with fine grained control over everything.

S3 has excellent legal-hold and audit settings for data, as well as automatic data retention policies.

KMS is a very secure and well done service. I dare you to find an equivalent on-prem solution that offers as much security.

And then there's the whole DR idea. Failing over to another AWS region is largely trivial if you set it up correctly - on prem is typically custom to each organization, so you need to train new staff with your organizations workflows. Whereas in AWS, Route53 fail-over routing (for example) is the same across every organization. This reduces cost in training and hiring.


I've worked at many enterprises that have done and do these very things. Some for fixed workloads at scale, some for data creation/use locality issues, some for performance. I think there is about a 15 year knowledge gap in on-prem competence and what the newest shiniest is on prem for some people. Yes, some of the vendors and gear are VERY bad, but not all, and there's always eBPF :)

The biggest one for me is the way AWS security groups & IAM work.

In AWS, it's straightforward to say e.g. "permit traffic on port X from instances in security group Y" (rather than juggling IP ranges).

You can easily e.g. get the firewall rules for all your ec2 instances in a structured format.

I really would not look forward to building something even 1/10th as functional as that.
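For instance, flattening security-group rules into structured rows is a short script. The response literal below only mirrors the shape that boto3's `describe_security_groups` returns, with made-up group names and IDs; in real use you'd fetch it with `boto3.client("ec2").describe_security_groups()`:

```python
# Hypothetical response in the shape boto3's describe_security_groups returns.
response = {
    "SecurityGroups": [
        {
            "GroupName": "app",
            "IpPermissions": [
                {
                    "IpProtocol": "tcp",
                    "FromPort": 5432,
                    "ToPort": 5432,
                    # Source is another security group, not an IP range.
                    "UserIdGroupPairs": [{"GroupId": "sg-db-clients"}],
                }
            ],
        }
    ]
}

def flatten_rules(resp):
    """Flatten nested SG rules into one row per (group, port, source group)."""
    rows = []
    for sg in resp["SecurityGroups"]:
        for perm in sg.get("IpPermissions", []):
            for pair in perm.get("UserIdGroupPairs", []):
                rows.append((sg["GroupName"], perm.get("FromPort"), pair["GroupId"]))
    return rows

print(flatten_rules(response))
```

From there it's a one-liner to diff the rule set against what you expect, which is the part that's genuinely painful to replicate against a pile of iptables configs.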


And you think just anyone can set that up? No sys admin/infra guy needed? Seems pretty risky.

I mean, not just anyone, but it's far less complicated than dealing with arcane iptables commands, and yet far more powerful: being able to just say "instances like this can talk to instances like this in these particular ways; reject everything else". You don't need subnet rules or whatever; it's all about the identity of the actual things.

Meanwhile, lots of enterprise firewalls barely even have a concept of "zones". It's practically not even a comparison for most deployments. Maybe with extremely fancy firewall stacks and $MAX_INT service contracts one can do something similar. But I guess with on-prem stuff things are often less ephemeral, so there's slightly less need.


I could type your arcane iptables commands for a couple hundred an hour. That stuff is easy compared to some software development tasks. I have sometimes struggled, but I've always found a solution after a few hours max.

> I guess with on-prem stuff things are often less ephemeral, so there's slightly less need

Kubernetes is running on bare metal quite a lot of places.


I would probably just build the infra in crossplane which standardizes a lot of features across the board and gives developers a set of APIs to use / dashboard against. Different deployments and orgs have different needs and desire different features though.

BGP based routing is a major pain in the ass to do on-prem. If you want true HA in the datacenter you are going to need to utilize BGP.

I mean, BGP EVPN is the datacenter standard. (Linux infra / k8s / networking guy)

> The main cost with on-prem is not the price of the gear but the price of acquiring talent to manage the gear. Most companies simply don't have the skillset internally to properly manage these servers

This comes up again and again. It was the original sales pitch from cloud vendors.

Often the very same companies repeating this messaging are recruiting and paying large teams of platform developers to manage their cloud…and pay for them to be on call.


As opposed to talent to manage the AWS? Sorry, AWS loses here as well.

I know of AWS's reputation as a business and what the devs who work there say, so I have no argument against your point, except that they do manage to make it work. Somewhere in there must be some unsung heroes keeping the whole thing online.

The point being that AWS runs AWS, they don't run your business on AWS. You still need someone to actually set up AWS to do what you want, much like you would need someone to run your on-premises servers. And in my experience, the difference is not much.

The biggest issue is that with colo you're building a skill pool that can be used forever, with AWS you're building a skill pool centered around a corporate entity's business strategies and an inscrutable, closed-source system, which is not sustainable.

To me this doesn't sound logical, because you still have to hire someone to manage your cloud deployments, which is an entire specialized discipline. Yeah, you get some leeway from the job being fully remote, I guess, but ultimately you aren't reducing headcount as linearly as you seem to imply by going cloud over on-prem.

What about the cost of k8s and AWS experts etc.?

> price of acquiring talent to manage the gear

Is it still a problem in 2026 when unemployment in IT is rising? Reasons can be argued (the end of ZIRP or AI) but hiring should be easier than it was at any time during the last 10 years.


Hiring people is still fucked in 2026 in my experience. HR processes are extremely dysfunctional at many organizations...

People with that set of skills are never looking for a job for long.

hiring in 2026 is 100x harder than ever before

While I agree with you, some solutions, such as Oxide Computing, could come pretty close to having all the ease of cloud, one whole rack of computers at a time.

Managing AWS is a ton of work anyway

> main cost with on-prem is not the price of the gear but the price of acquiring talent to manage the gear

Not quite. If you hire bad talent to manage your 'cloud gear', then you will find what the mistakes that would cost you nothing on-premises end up costing you in the cloud. Sometimes a lot.


Given how good Apple Silicon is these days, why not just buy a spec'd out Mac Studio (or a few) for $15k (512 GB RAM, 8 TB NVMe), maybe pay for S3 only to sync data across machines. No talent required to manage the gear. AWS EC2 costs for similar hardware would net out in something ridiculous like 4 months.

That’s definitely the right call in some cases. But as soon as there’s any high-interconnect-rate system that has to be in cloud (appliances with locked in cloud billing contracts, compute that does need to elastically scale and talks to your DB’s pizza box, edge/CDN/cache services with lots of fallthrough to sources of truth on-prem), the cloud bandwidth costs start to kill you.

I’ve had success with this approach by keeping it to only the business process management stacks (CRMs, AD, and so on—examples just like the ones you listed). But as soon as there’s any need for bridging cloud/onprem for any data rate beyond “cronned sync” or “metadata only”, it starts to hurt a lot sooner than you’d expect, I’ve found.


Yep, 100%, but that's why identifying compatible workloads first is key. A lot of orgs skip right to the savings pitch, ignorant of how their applications communicate with one another - and you hit the nail on the head that applications doing even some processing in a cloud provider will murder you on egress fees by trying to hybrid your app across them.

Folks wanting one or the other miss savings had by effectively leveraging both.


Any experience with the mid-to-small cloud providers that provide un-metered network ports and/or free interconnect with partner providers?

(For various reasons, I just care about VPS/bare metal, and S3-compatiblity.)

I'm looking at those because I'm having difficulty forecasting bandwidth usage, and the pessimistic scenarios seem to have me inside the acceptable use policies of the small providers while still predicting AWS would cost 5-10x more for the same workload.


Vultr and Digital Ocean both offer Direct Connects. I've had good experience with their VPSes.

Netcup and OVH provide free un-metered ports. There are actually lots of options available on the market. BuyVM is another good one.

What has surprised me about the cloud is that per-core prices keep climbing. The market is moving in the opposite direction: what used to be 1/2 or 1/4 of a box is now 1/256 of one, and faster, yet the cloud price for that core has only gone up. I think the business plan is to wipe out all the people who used to maintain the on-premise machines, and then keep charging similar prices for something that is only getting cheaper.

It's hard drive and SSD prices that stagger me on the cloud. A cloud server CPU might be only about 2x the price of buying one yourself over a few years (albeit usually with less clock speed), but the drive space is at least 10-100x the price of doing it locally. It has a bit more redundancy built in, but for that overhead you could replicate your data many times over.

As the hardware has gained more cores over time, the cloud deal has only gotten worse.


I just don't know if the human capital is there.

At my job we use Hyper-V, and finding someone who actually knows Hyper-V is difficult and expensive. Throw in Cisco networking, storage appliances, etc., to hit 99.99% uptime...

Also, that means you have just one person; you need at least two if you don't want gaps in staffing, more likely three.

Then you still need all the cloud folks to run that.

We have a hybrid setup like this, and you do get a bit of best of both worlds, but ultimately managing onprem or colo infra is a huge pain in the ass. We only do it due to our business environment.


I think you're hitting on a general problem statement a lot of orgs run into, even ignoring the uptime figure...

All of the complexity of on-prem, especially the failover/HA side, can get tricky, particularly if you are in a Wintel env like a lot of shops are.

i.e. lots of companies are doing sloppy "just move the box to an EC2 instance" migrations because of how VMware jacked their pricing up, and now suddenly EC2/EBS/etc. costing is so cheap it's a no-brainer.

I think the knowledge needed to set up a minimal-cost solution is too hard to come by for the benefit to materialize versus all the layers (as you almost touched on: licensing at every layer, versus a cloud provider managing it...)

That said, rug pulls are still a risk; I try to push for 'agnostic' workloads in architecture, if nothing else because I've seen too many cases where SaaS/PaaS/etc decide to jack up the price of a service that was cheap, and sure you could have done your own thing agnostically, but now you're there, and migrating away has a new cost.

IOW, I agree; I don't think the human capital is there as far as infra folks who know how to properly set up such environments, especially hitting the 'secure+productive' side of the triangle.


> I just don't know if the human capital is there.

> At my job we use HyperV, and finding someone who actually knows HyperV is difficult and expensive...

Try offering significantly higher pay.


Or even try to educate people. Learning programs used to be common, but nowadays managers only complain that you cannot find cheap experts.

"We educated the people and they left because they could get better elsewhere" - Some Manager

Do note, though, that AIUI these are all E-cores: they have poor single-threaded performance and won't support things like AVX-512. That is going to skew your performance testing a lot. Some workloads will be fine, but for many users who are actually USING the hardware they buy, this is likely to be a problem.

If that's you, then the Granite Rapids AP platform that launched prior to this can hit similar numbers of threads (256 for the 6980P). There are a couple of caveats, though: firstly, there are "only" 128 physical cores, and if you're using VMs you probably don't want to share a physical core across VMs; secondly, it has a 500W TDP and retails north of $17,000, if you can even find one for sale.

Overall once you're really comparing like to like, especially when you start trying to have 100+GbE networking and so on, it gets a lot harder to beat cloud providers - yes they have a nice fat markup but they're also paying a lot less for the hardware than you will be.

Most of the time when I see takes like this it's because the org has all these fast, modern CPUs for applications that get barely any real load, and the machines are mostly sitting idle on networks that can never handle 1/100th of the traffic the machine is capable of delivering. Solving that is largely a non-technical problem not a "cloud is bad" problem.


These Intel Darkmont cores are in a different performance class than the (Crestmont) E-cores used in the previous generation of Sierra Forest Xeon CPUs. For certain workloads they may have close to double the performance per core.

Darkmont is a slightly improved variant of the Skymont cores used in Arrow Lake/Lunar Lake and it has a performance very similar to the Arm Neoverse V3 cores used in Graviton5, the latest generation of custom AWS CPUs.

However, a Clearwater Forest Xeon CPU has much more cores per socket than Graviton5 and it also supports dual-socket motherboards.

Darkmont also has greater performance than the older big Intel cores, like all the Skylake derivatives, including for AVX-using programs, so it is no longer comparable to the Atom series of cores from which it evolved.

Darkmont is not competitive in absolute performance with AMD Zen 5, but for the programs that do not use AVX-512 it has better performance per watt.

However, since AMD has started to offer AVX-512 for the masses, the number of programs that have been updated to be able to benefit from AVX-512 is increasing steadily, and among them are also applications where it was not obvious that using array operations may enhance performance.

Because of this pressure from AMD, it seems that this Clearwater Forest Xeon is the final product from Intel that does not support AVX-512. The next two Intel CPUs both support AVX-512: the Diamond Rapids Xeon, which might be launched before the end of the year, and the desktop and laptop CPU Nova Lake, whose launch has been delayed to next year (together with the desktop Zen 6, presumably due to the memory shortage and production allocations at TSMC).


E-cores aren't that slow, yesteryear ones were already around Skylake levels of performance (clock for clock). Now one might say that's a 10+ year old uarch, true, but those ten years were the slowest ten years in computing since the beginning of computing, at least as far as sequential programs are concerned.

Cloud = the right choice when just starting. It isn't about infra cost, it is about mental cost. Setting up infra is just another thing that hurts velocity. By the time you are serving a real load for the first time though you need to have the discussion about a longer term strategy and these points are valid as part of that discussion.

I guess it depends, but infra is also a lot simpler when starting out. It really isn't much harder (easier, even?) to set up services on a box or two than to manage AWS.

I'm pretty sure a box like this could run our whole startup - hosting PG, k8s, our backend APIs, etc. - would be way easier to set up, and wouldn't cost 2 devops and $40,000 a month to do it.


Is infra really that hard to set up? It seems like infra is something a infra expert could establish to get the infra going and then your infra would be set up and you would always have infra.

You are correct but it still takes time. You can start using cloud today but you need to:

* sign the papers for server colo

* get a quote and order servers (which might take a few weeks to deliver!), and nearly always a pair of switches

* set them up, install OSes, and set up basic services inside the network (DNS, often netboot/DHCP if you want install-over-network, and often a few others like an image repository, monitoring, etc.)

It's a "we have product and cashflow, let's give someone the task of doing it" thing, not a "we're a startup, we barely have a PoC" thing.


As a big on-prem guy, I think cloud makes sense for early startups. Lead time on servers and networking setup can be significant, and if you don't know how much you need yet you will either be resource starved or burn all your cash on unneeded capacity.

On-prem wins for a stable organization every time though.


You can rent a VPS or a dedicated server if you need something immediately, without going to the big cloud providers.

Secure and reliable infrastructure is hard to set up, and hard to keep secure and reliable over time.

You have to pay that infra person and shield them from "infra works, why are we paying so much for IT staff" layoffs. Then you have ongoing maintenance costs like UPS battery replacement and redundant internet connections, on top of the usual hardware attrition.

It's unfortunately not so cut and dry


Based on the evidence, not only is infrastructure really hard to set up in the first place, it is incredibly error-prone to adjust to new demand.

Man, how do you get box seats out of AWS, I'm missing out

Is your calculation also taking into account the cost of energy and the personnel that keep your own infra running?

Is that personnel cost more than running on someone else's infra? Just counting the number of people a company now needs to maintain their cloud/Kubernetes/whatever setup, paired with "devops" meaning all devs now have to spend time on this stuff, I'd almost wager we would spend less on personnel if we just chucked a few laptops in a closet and SSHed in.

Is using virtualization the only good way of taking a 288-core box and splitting it up into multiple parallel workloads? One time I rented a 384-core AMD EPYC baremetal VM in GCP and I could not for the life of me get parallelized workloads to scale just using baremetal linux. I wanted to run a bunch of CPU inference jobs in parallel (with each one getting 16 cores), but the scaling was atrocious - the more parallel jobs you tried to add, the slower all of them ran. When I checked htop the CPU was very underutilized, so my theory was that there was a memory bottleneck somewhere happening with ONNX/torch (something to do with NUMA nodes?) Anyway, I wasn't able to test using proxmox or vmware on there to split up cpu/memory resources; we decided instead to just buy a bunch of smaller-core-count AMD Ryzen 1Us instead, which scaled way better with my naive approach.
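For what it's worth, the usual non-virtualization route is explicit CPU pinning rather than trusting the kernel scheduler to spread jobs sensibly. A minimal sketch (Linux-only; the 16-core slice size and `infer.py` worker command are placeholders):

```python
import os
import subprocess

def core_slices(total_cores, cores_per_job):
    """Partition [0, total_cores) into contiguous per-job CPU sets."""
    return [set(range(i, i + cores_per_job))
            for i in range(0, total_cores - cores_per_job + 1, cores_per_job)]

def launch_pinned(cmd, cores):
    """Spawn cmd restricted to the given CPU set.

    preexec_fn runs in the child before exec, so only that worker is
    confined to its slice; the parent and siblings are unaffected.
    """
    return subprocess.Popen(cmd, preexec_fn=lambda: os.sched_setaffinity(0, cores))

# e.g. on a 384-thread box, one 16-core slice per inference job:
# for cores in core_slices(os.cpu_count(), 16):
#     launch_pinned(["python", "infer.py"], cores)  # infer.py is a placeholder
```

`numactl --cpunodebind=N --membind=N` goes a step further by also keeping each worker's allocations on its local memory controller, which is often the actual bottleneck for CPU inference.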

They are used for VMs because the load is pretty spiky and usually not that memory-heavy. For running a single app, smaller-core-count but higher-clocked parts are usually more optimal.

>Anyway, I wasn't able to test using proxmox or vmware on there to split up cpu/memory resources; we decided instead to just buy a bunch of smaller-core-count AMD Ryzen 1Us instead, which scaled way better with my naive approac

If that was a single 384-thread (192 cores times 2 for hyperthreading) CPU, you are getting "only" 12 DDR5 channels, so one RAM channel is shared by 16c/32t.

So a plain 16-core desktop Ryzen will have double the memory bandwidth per core.
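A back-of-envelope version of that math (the ~38.4 GB/s per-channel figure assumes DDR5-4800; swap in your real DIMM speed):

```python
# Rough memory bandwidth per physical core.
# 38.4 GB/s assumes DDR5-4800: 4800 MT/s x 8 bytes per channel.
GBS_PER_CHANNEL = 38.4

def gbs_per_core(channels, cores):
    return channels * GBS_PER_CHANNEL / cores

epyc = gbs_per_core(12, 192)   # 12-channel EPYC, 192 physical cores
ryzen = gbs_per_core(2, 16)    # dual-channel desktop Ryzen, 16 cores
print(f"EPYC:  {epyc:.1f} GB/s per core")   # 2.4
print(f"Ryzen: {ryzen:.1f} GB/s per core")  # 4.8
```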


How did the speed of one or two jobs on the EPYC compare to the Ryzen?

And 384 actual cores or 384 hyperthreading cores?

Inference is so memory bandwidth heavy that my expectations are low. An EPYC getting 12 memory channels instead of 2 only goes so far when it has 24x as many cores.


It seems a lot of people have forgotten how BigCorp IT used to work.

- request some HW to run $service

- the "IT dept" (really, a self-interested gatekeeper) might give you something now, or in two weeks, or god help you if they need to order new hardware, then it's two months, best case

- there will be various weird rules on how the on-prem HW is run, who has access etc, hindering developer productivity even further

- the hardware might get insanely oversubscribed so your service gets half a cpu core with 1GB RAM, because perverse incentives mean the "IT dept" gets rewarded for minimizing cost, while the price is paid by someone else

- and so on...

The cloud is a way around this political minefield.


> The cloud is a way around this political minefield.

Until the bills _really_ start skyrocketing...


> These sorts of core-density increases are how I win cloud debates in an org.

AMD has had these sorts of densities available for a minute.

> Identify the workloads that haven't scaled in a year.

I have done this math recently, and you need to stop cherry picking and move everything. And build a redundant data center to boot.

Compute is NOT the major issue for this sort of move:

Switching and bandwidth will be major costs. 400GbE is the minimum for interconnects, and for most orgs you are going to need at least that much bandwidth at top of rack.

Storage remains problematic. You might be able to amortize compute over this time scale, but not storage. 5 years would be pushing it (depending on use). And data center storage at scale was expensive before the recent price spike. Spinning rust is viable for some tasks (backup) but will not cut it for others.

Human capital: Figuring out how to support the hardware you own is going to be far more expensive than you think. You need to expect failures and staff accordingly, that means resources who are going to be, for the most part, idle.


> These sorts of core-density increases are how I win cloud debates in an org.

The core density is bullshit when each core is so slow that it can't do any meaningful work. The reality is that Intel is 3 times behind AMD/TSMC on performance vs power consumption ratio.

People would be better off having a look at the high-frequency models (the 9xx5F models like the 9575F); that was the first generation of server CPUs to reach ~5 GHz and sustain it on 32+ cores.


Intel seem to be deliberately hiding the clock frequency of this thing, the xeon-6-plus-product-deck.pdf has no mention of clock frequency or how LLC is shared.

That only works if purchasers in the organisation are immune to kickbacks.

on prem = capex

cloud = opex

The accounting dept will always win this debate.


> If we can't bill a customer for it, and it's not scaling regularly, then it shouldn't be in the public cloud. That's my take, anyway. It sucks the wind from the sails of folks gung-ho on the "fringe benefits" of public cloud spend (box seats, junkets, conference tickets, etc...), but the finance teams tend to love such clear numbers.

I agree, but.

For one, it's not just the machines themselves. You also need to budget in power, cooling, space, the cost of providing redundant connectivity and side gear (e.g. routers, firewalls, UPS).

Then, you need a second site, no matter what. At least for backups, ideally as a full failover. Either your second site is some sort of cloud, which can be a PITA to set up without introducing security risks, or a second physical site, which means double the expenses.

If you're a publicly listed company, or live in jurisdictions like Europe, or you want to have cybersecurity insurance, you have data retention, GDPR, SOX and a whole bunch of other compliance to worry about as well. Sure, you can do that on-prem, but you'll have a much harder time explaining to auditors how your system works when it's a bunch of on-prem stuff vs. "here's our AWS Backup plans covering all servers and other data sources, here is the immutability stuff, here are plans how we prevent backup expiry aka legal hold".

Then, all of that needs to be maintained, which means additional staff on payroll, if you own the stuff outright your finance team will whine about depreciation and capex, and you need to have vendors on support contracts just to get firmware updates and timely exchanges for hardware under warranty.

Long story short, as much as I prefer on-prem hardware vs the cloud, particularly given current political tensions - unless you are a 200+ employee shop, the overhead associated with on-prem infrastructure isn't worth it.


> Then, you need a second site, no matter what. At least for backups, ideally as a full failover. Either your second site is some sort of cloud, which can be a PITA to set up without introducing security risks, or a second physical site, which means double the expenses.

You can technically use Backblaze's unlimited backup option, which costs around $7 per machine. It's more intended for Windows, but people have made it work with daily backups, and it should be fine for GDPR (https://www.backblaze.com/company/policy/gdpr). If GDPR worries you, look at something like Hetzner, or OVH storage boxes (36 TB for ~$55, IIRC, is a good backup box), and you should try to follow a 3-2-1 strategy.

> Then, all of that needs to be maintained, which means additional staff on payroll, if you own the stuff outright your finance team will whine about depreciation and capex, and you need to have vendors on support contracts just to get firmware updates and timely exchanges for hardware under warranty.

I can't speak for certain, but with companies like Dell, IIRC, it's possible to get hardware on a monthly basis, and you can simply colocate in a decent datacenter. Plus points: you can now get 10-50 Gb ports if you are bandwidth-hungry, everything is a lot more customizable, and the hardware is already pretty nice, as GP observed. (Yes, RAM prices are high; let's hope that is temporary, as GP noted too.)

I can't speak about firmware updates or timely exchanges for hardware under warranty.

That being said, I am not saying this is for everyone. It essentially boils down to whether a company has (or can acquire) expertise in this field for less than its AWS bills. With many large AWS bills running into the tens of thousands of dollars, if not hundreds of thousands, I think far more companies would be better off with the above strategy than with AWS.


> You can technically use Backblaze's unlimited backup option, which costs around $7 per machine. It's more intended for Windows, but people have made it work with daily backups, and it should be fine for GDPR (https://www.backblaze.com/company/policy/gdpr). If GDPR worries you, look at something like Hetzner, or OVH storage boxes (36 TB for ~$55, IIRC, is a good backup box), and you should try to follow a 3-2-1 strategy.

Sure, but it doesn't solve the issue of "the datacenter is on fire" - neither if you're fully on prem or if you use colocation. You still need to acquire a new set of hardware, rack it, reconfigure the networking hardware and then restore from backups. That's an awful lot of work, and yes, I've been there.


> The only thing giving me pause on that argument today is the current RAM/NAND shortage

Not a shortage - price gouging. And it would mean an increase in "cloud" prices too, because they need to refresh the HW as well. So by the summer the equation would be back where it was.


This is why I align on comp ranges rather than title. I've been a "Lead" where all I contributed was a new imaging pipeline and introducing NAT to the product line, a "Manager" of a failing company where I had no managerial authority or direct reports, and a "Senior" at a SV firm where I actually behaved a level above a Senior Engineer - owning outcomes, doing research, mentoring juniors, building relationships across silos, governance councils, etc.

Titles are fungible, but your comp isn't. Don't let a company sell you on a better title for less comp, especially when the JD or role doesn't align with the title; the next place won't give a shit what your title was if all you did was Junior-level work because you bought into someone else's narrative rather than control your own.


I've worked with several "Directors" that all had between 0 and 3 reports. Vanity titles make people feel good and look nice on a resume, but that's about it.

"Lead" is a funny one, because its just not a level that exists where I currently work.

A few teams have a "lead" role, but its mostly ceremonial.


That animated graph at the top is awful; does not render well on macOS Safari.

That being said, I am morbidly curious about traffic from RSS subscribers: has that gone up, gone down, or remained roughly the same in the same time period?


It’s the IT career requirement to keep abreast of everything in tech while being cursed to know most of it is various forms of unnecessary for the supermajority of IT/Enterprise use cases. A few highlights:

* Most infrastructure does not need Kubernetes. Your ERP doesn’t need Helm charts, your internal Confluence doesn’t need HA K8s clusters, your Grafana is cheaper on ECS than GKE, and your zScaler estate flatly doesn’t support it. Kubernetes is amazingly awesome but the equivalent of using nuclear weapons to go duck hunting for most folks.

* AI, for all its power and capability, is too unreliable for wholesale automation - especially when you can just use it to generate the code or software to run the same automations infinitely with deterministic outputs. Your entire org doesn’t need Claude Code Pro Max 20x subs, you just need to get better at getting the code needed for infinite repetition without the AI sub

* Your fridge, oven, microwave, coffee maker, toaster oven, furnace filter, and dishwasher don’t need WiFi, Bluetooth, or Cloud Connectivity. They just don’t.

* Public Cloud didn’t let you reduce your infrastructure headcount, but it did make it easier for shadow IT to consume more spend, make your headcount more expensive and specialized, and put your infrastructure eggs in the same basket as millions of others, which surely will never become a problem. (/s)

* If you can run basic infra (compute, VPCs, storage, networking) for one public cloud provider, you can run them all. If you’re requiring Architect certs just to run VMs in a landing zone, you’re spending way too much for way too little.

* VLANs and a firewall are enough for 90% of use cases. The only reason you need a NGFW for layer 7 filtering is because vendors stopped publishing what IP ranges, FQDNs, and ports their stuff uses, and that’s less a justification for NGFW’s and more a damning indictment of shitty security practices industry-wide.

* VMs are fine. Containers are nice and efficient, but VMs are still perfectly fine. I am tired of having this conversation with folks who don’t know what containers do but think they’re God’s answer to the myriad of faults of VMs (that they also can’t identify)

* You don’t need Ansible, Terraform, CloudFormation, or Pulumi to “automate” workflows. Oftentimes all you need are cronjobs and webhooks, rather than another whole-ass set of sludgepipes.

* You don’t need a data lake, you need to do a better job identifying which data points are meaningful in a context and capturing them efficiently.
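To make the cron-plus-webhook point concrete, here's a hypothetical example of the kind of job that often gets wrapped in a whole IaC pipeline (the webhook URL, threshold, and schedule are all placeholders):

```python
# Hypothetical cron target: check disk usage and fire a chat webhook on
# breach. Schedule it with something like
#   */15 * * * * /usr/bin/python3 /opt/checks/disk_alert.py
import json
import shutil
import urllib.request

WEBHOOK_URL = "https://chat.example.com/hooks/infra-alerts"  # placeholder

def usage_percent(path="/"):
    total, used, _free = shutil.disk_usage(path)
    return 100.0 * used / total

def payload_for(pct, threshold=90.0):
    """Return the webhook payload if usage crosses the threshold, else None."""
    if pct < threshold:
        return None
    return {"text": f"Disk usage at {pct:.1f}%, over the {threshold:.0f}% threshold"}

if __name__ == "__main__":
    body = payload_for(usage_percent())
    if body is not None:  # only bother the channel on a breach
        req = urllib.request.Request(
            WEBHOOK_URL,
            data=json.dumps(body).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)
```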

I love technology. I love learning about technology. I love solving problems with technology.

I hate the insistence that everything be maximally technological in nature and that every product must be adopted in order to not be left behind.

I hate the lack of discipline, is what I’m saying.


Amen!

Serious question: might the solution be a satellite broadcast in the clear, a la DVB-S but for data, audio, or video?

Weather radio is a critical service, and even if traditional AM/FM or RF signals are deprecated, there should still be a way for anyone - no matter how remote - to get safety and meteorology information from the government. Given that its constant availability is more important than latency or bandwidth, it feels like an appropriate use for GEO satellites broadcasting down over a large area in the clear, such that any basic SDR and a cheap dish could grab the signal with minimal fuss.


Serious answer: no.

Requiring line-of-sight outdoors to a satellite does fuck-all in emergency situations, especially one you're trying to shelter in place from, likely underground.

In the US, these broadcasts are localized, usually a county or multiple county area.


We probably shouldn't deprecate AM for emergency broadcasts given you can listen to AM radio with grass https://www.youtube.com/watch?v=b9UO9tn4MpI

Only at the point of emission however...

Yeah. I'm down on commercial AM/FM radio being touted as an emergency service, because there's so rarely enough behind the scenes to make it reliable or even minimally usable as such, but this is something purpose-built to fill that role, and shutting it down means there's nothing which can fill it, given how worthless commercial radio is at the task:

https://en.wikipedia.org/wiki/Minot_train_derailment

> Because it was the middle of the night, there were few people at local radio stations, all operated by Clear Channel with mostly automated programming. No formal emergency warnings were issued for several hours while Minot officials located station managers at home. North Dakota's public radio network, Prairie Public Broadcasting, was notified and did broadcast warnings to citizens.

If you wanted to make commercial radio even minimally acceptable as an emergency alert system you'd be... guess what... reinventing EAS and EAS-a-likes, except more expensive and less responsive! EAS never has to "Interrupt This Program" it can just get to the meat.


> If you wanted to make commercial radio even minimally acceptable as an emergency alert system you'd be... guess what... reinventing EAS and EAS-a-likes, except more expensive

Exactly, whatever it costs to operate and maintain the emergency and weather services commercial radio would need to make enough money to pay for that, and then also make enough money on top of that to stuff their pockets with profit. The people shouldn't be on the hook for those extra expenses while private companies do everything in their power to degrade the service in order to lower their costs to increase profits even farther.


I would argue that in a digital world, copyright should be inversely scalable to the size of the creator - that is, individual works by independent artists intended for exhibition rather than reproduction should receive more favorable terms than movies or games created by huge conglomerates intended for mass reproduction, licensing, and sale.

Or more simply: if you’re not selling it presently, you don’t get copyright on it. There, abandonware and lost media rights are solved, and we can all move on.


This is my fundamental problem with some of the propositions on this topic here.

I fundamentally disagree with, to take just one example from a thread here, a copyright term of only 5 years for a book author. Many authors take longer than that to finish a series, so their first books would enter the public domain before the series was done.

On the other hand, everything created by corporations - i.e., where a corporation rather than a single human holds the copyright - can get fucked.


Exactly. This is something I’ve chewed on constantly for nigh on 20 years, and this is the best compromise I’ve been able to come up with. Smaller teams or individual creators need more copyright protections than large corporations, but the law doesn’t reflect that - and it’s why copyright is so widely abused as a result.

This ain’t working for the interests of the public anymore, and AI has exacerbated it (large corps getting settlements, smaller creators getting shafted). We need a new model entirely that addresses these issues.


Corporations don't create copyrighted work. Authors do and assign their rights over. I continue to think that people pontificating on this space would be well served to inform themselves about how the business is generally conducted, as I see so many comments made from assumptions about principles and not actual reference to actual copyright law.

I am genuinely shocked that a tech company actually stood on principle. My doubts about AI, Anthropic, and Mr. Amodei remain, but man, I got the warm and fuzzies seeing them stick to their principles on this - even if one clause (autonomous weapons) is less principled and more, “it’s not ready yet”.

Honestly, the remixes this generation suck compared to priors.

"This time will be different," they said about the Metaverse, ignoring the vast tranches of MUCKs, MUDs, MMOs, LSGs, and repeated digital real estate gold rushes of the past half-century. Billions burned on something anyone who played Second Life, Entropia, FFXIV, EQ2, VRChat, or fucking Furcadia could've told you wasn't going to succeed, because it wasn't different, it just had more money behind it this time.

"NFTs are different", as collectors of trading cards, art prints, coins, postage stamps, and an infinite glut of collectibles looked at each other with that knowing, "oh lord, here we go" glance.

"Crypto is different", as those who paid attention to history remembered corporate scrip, gift cards, hedge funds, the S&L crisis, Enron, the MBS crisis, and the multitude of prior currency-related crises and grifts bristled at the impending glut of fraud and abuse by those too risky to engage in traditional commerce.

And thus, here we are again. "This time is different," as those of us who remember the code generators of yore polluting our floppy drives, and the sales grifters who convinced our bosses that their program could replace those expensive programmers, roll our eyes at the obvious bullshit on naked display, then vomit from stress as over a trillion dollars is diverted from anything of value into their modern equivalent - with all the same problems as before.

I truly hate how stupidly people with money actually behave.


Is this “nothing ever happens”

Short of insider trading, betting on things not happening is apparently the best way to make money on those shady betting sites.

I had no idea what I was using were called “EM-dashes” until the AI bubble. I just used them to reflect pauses in my speech for tangents - an old habit from my IRC days.

Incidentally, some folks reported my stuff for potential AI generation and I had to respond to the mods about it. So that was kinda funny, if also sad to hear that some folks thought I was a bot.

I’m a dinosaur, not a robot dinosaur. I’m nowhere near that cool, alas.


But the em-dash is a different character. I think even those that use a pause would just opt for - on their keyboard, whereas the em-dash — requires additional work on most (all?) keyboard layouts. It's _not_ more work for an AI though hence why it's a tell.

> I just used them to reflect pauses in my speech for tangents - an old habit from my IRC days.

The tell here is that you used a hyphen, not an em-dash.


Okay, see, that's context even I forget, but you're right and bears repeating:

This `-` is a hyphen, which I love, even if I'm fairly sure I'm not using it correctly in grammar a lot of the time.

This `--` is an EM-Dash, apparently, which is also what I never use but I also thought was just a hyphen in a different context (incorrect!).


No, there are actually four different punctuation marks, all which look remarkably similar to the untrained eye.

1. We have the hyphen, which is most commonly used to create multi-part words, such as one-and-one-thousand.

2. We have the EN-DASH, which is most commonly used to denote spans or ranges. As an example, Barack Obama was President 2009–2017.

3. Then we have the recently maligned EM-DASH, which can be used in place of a variety of other punctuation marks, such as commas, colons, and parentheses. Very frequently, AI will use the em-dash as a way to separate two clauses and provide forward motion. AI uses it for the same reason that writers do: the em-dash is just a nicer punctuation mark compared to the colon.

4. Lastly, we have the minus sign, which is slightly different than the hyphen, though on most keyboards they're combined into the hyphen-minus.

By the by, they're called the em-dash and the en-dash because they match the length of an uppercase M or N, respectively.
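If you ever want to check which of these look-alikes you're actually dealing with, the Python stdlib can name them:

```python
# Print the Unicode identity of each look-alike character.
import unicodedata

for ch in ["\u002D", "\u2010", "\u2013", "\u2014", "\u2212"]:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
# U+002D  HYPHEN-MINUS
# U+2010  HYPHEN
# U+2013  EN DASH
# U+2014  EM DASH
# U+2212  MINUS SIGN
```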


I am so here for this lesson in punctuation and grammar right now. One of today’s lucky 10,000.

It is probably even a hyphen-minus, so called because on early keyboards a single character had to serve as both a hyphen and a minus. In Unicode, there is a separate code point for an unambiguous hyphen. There is also a non-breaking hyphen, as well as the various dashes discussed here.

And "--" is absolutely just two hyphen-minuses, not an em-dash (—).


How did you make the character without googling it?

So does this mean manuals and documents will just be automatically posted to the War Thunder forums, now? Man, what a win for efficiency!
