
I'm curious how many engineers per year this costs to maintain


> I'm curious how many engineers per year this costs to maintain

The end of the article has this:

> Consider custom infrastructure when you have both: sufficient scale for meaningful cost savings, and specific constraints that enable a simple solution. The engineering effort to build and maintain your system must be less than the infrastructure costs it eliminates. In our case, specific requirements (ephemeral storage, loss tolerance, S3 fallback) let us build something simple enough that maintenance costs stay low. Without both factors, stick with managed services.

Seems they were well aware of the tradeoffs.
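The cost half of that rule can be written down as a one-line comparison. This is only a rough sketch: the function name, its parameters, and the numbers in the example are all made up for illustration and are not from the article. It also says nothing about the second condition (constraints that make a simple solution possible).

```python
def custom_infra_pays_off(annual_savings, fte_fraction, loaded_cost_per_engineer,
                          amortized_build_cost=0.0):
    """Return True when yearly engineering cost is below the infrastructure cost eliminated.

    All inputs are per-year amounts in the same currency; fte_fraction is the
    share of one full-time engineer spent maintaining the system.
    """
    maintenance_cost = fte_fraction * loaded_cost_per_engineer
    return maintenance_cost + amortized_build_cost < annual_savings


# Hypothetical numbers: 0.25 of an engineer at 200k/year against 500k/year saved.
print(custom_infra_pays_off(500_000, 0.25, 200_000))
```

With those made-up inputs the maintenance side is 50k against 500k saved, so the check passes. The interesting cases are the ones where the savings are small or the fraction of an engineer is not.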


And I am curious how many engineer years it requires to port code to cloud services and deal with multiple issues you cannot even debug due to not having root privileges in the cloud.

Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to DB. And no weird network issues to debug.
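For reference, a minimal sketch of what "write the file + add a record to DB" might look like. Assumptions not in the comment above: SQLite as the database, a `files` table, and the helper names `open_db` and `save_file`. The temp-file-then-rename step is an addition so that readers never see a half-written file.

```python
import os
import sqlite3
import tempfile


def open_db(db_path=":memory:"):
    """Open (or create) the metadata database. The schema is hypothetical."""
    db = sqlite3.connect(db_path)
    db.execute("CREATE TABLE IF NOT EXISTS files (name TEXT PRIMARY KEY, size INTEGER)")
    return db


def save_file(root, name, data, db):
    """Write bytes under root and record the file in the metadata table."""
    path = os.path.join(root, name)
    # Write to a temporary file in the same directory, then rename it into place.
    fd, tmp_path = tempfile.mkstemp(dir=root)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the bytes to disk before renaming
    os.replace(tmp_path, path)
    db.execute("INSERT OR REPLACE INTO files (name, size) VALUES (?, ?)", (name, len(data)))
    db.commit()
    return path
```

Even this small version already has to decide what happens on a crash between the rename and the commit, which is part of what the replies below are arguing about.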


> as simple as "with open(...) as f: f.write(data)"

Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...

Without on-prem, saving a file is as simple as s3.put_object() !
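A sketch of that call, assuming `client` is a boto3 S3 client. The wrapper name `save_to_s3` and the bucket and key in the usage comment are hypothetical. The client is passed in so the function can be exercised without AWS credentials.

```python
def save_to_s3(client, bucket, key, data):
    """Upload bytes as a single object and return the ETag from the response."""
    response = client.put_object(Bucket=bucket, Key=key, Body=data)
    return response.get("ETag")


# Usage (needs boto3 and AWS credentials, so it is not executed here):
#   import boto3
#   save_to_s3(boto3.client("s3"), "my-hypothetical-bucket", "uploads/a.bin", b"hello")
```

The one-liner hides the same questions in a different place: the bucket, its region, its policy, and the credentials all have to exist before the call works.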


>> Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to DB.

> Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...

Most of these concerns can be addressed with ZFS[0] provided by FreeBSD systems hosted in triple-A data centers.

See also iSCSI[1].

0 - https://docs.freebsd.org/en/books/handbook/zfs/

1 - https://en.wikipedia.org/wiki/ISCSI


Except running ZFS on FreeBSD would certainly require a dedicated devops person with a very specific skillset that the majority of people on the market don't have.


I don't think any of those mattered for their use case. That's why they didn't actually need S3.


With s3, you cannot use ls, grep and other tools.

> Save where? With what redundancy? With what access policies? With what backup strategy? With what network topology? With what storage equipment and file system and HVAC system and...

Wow that's a lot to learn before using s3... I wonder how much it costs in salaries.

> With what network topology?

You don't need to care about this when using SSDs/HDDs.

> With what access policies?

Whichever is defined in your code, no restrictions unlike in S3. No need to study complicated AWS documentation and navigate through multiple consoles (this also costs you salaries by the way). No risk of leaking files due to misconfigured cloud services.

> With what backup strategy?

Automatically backed up with the rest of your server data, no need to spend time on this.


> You don't need to care about this when using SSDs/HDDs.

You do need to care when you move beyond a single server in a closet that runs your database, webserver and storage.

> No risk of leaking files due to misconfigured cloud services.

One misconfigured .htaccess file for example, could result in leaking files.


> One misconfigured .htaccess

First, I hope nobody is using Apache anymore; second, you typically store files outside of the web directory.


Why should nobody use Apache? I rediscovered it to be great in many use cases. And there are LLMs to help with the config syntax.


Performance is not great compared to nginx.


>> No risk of leaking files due to misconfigured cloud services.

> One misconfigured .htaccess file for example, could result in leaking files.

I don't think you are making a compelling case here, since both scenarios result in an undesirable exposure. Unless your point is both cloud services and local file systems can be equally exploited?


With bare-metal machines you can go very far before needing to scale beyond one machine.


It sounds like you’re not at the scale where cloud storage is obviously useful. By the time you definitely need S3/GCS you have problems making sure files are accessible everywhere. “Grep” is a ludicrous proposition against large blob stores.


I mean you can easily mount the S3 bucket to the local filesystem (e.g. using s3fs-fuse) and then use standard command line tools such as ls and grep.


I inherited an S3 bucket where hundreds of thousands of files were written to the bucket root. Every filename was just a uuid. ls might work after waiting to page through to get every file. To grep you would need to download 5 TB.
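To make the paging concrete: a single ListObjectsV2 request returns at most 1,000 keys, so a bucket like that takes hundreds of round trips just to enumerate. A rough sketch, assuming `client` is a boto3 S3 client; the function name `summarize_bucket` is made up.

```python
def summarize_bucket(client, bucket):
    """Count objects and total bytes by walking every page of the listing."""
    paginator = client.get_paginator("list_objects_v2")
    count = 0
    total_bytes = 0
    for page in paginator.paginate(Bucket=bucket):
        # "Contents" is missing from a page when no keys match.
        for obj in page.get("Contents", []):
            count += 1
            total_bytes += obj["Size"]
    return count, total_bytes
```

This only reads metadata. Searching inside the objects still means fetching every body, which is where the 5 TB download comes from.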


It's probably going to be dog slow. I dealt with HDDs where just iterating through all files and directories takes hours, and network storage is going to be even slower at this scale.


You can't ever definitively answer most of those questions on someone else's cloud. You just take Amazon's word for whatever number of nines they claim it has.


Not needing to ask the questions is the selling point.


Bro, were you off grid last week? Your questions equally apply to AWS; you just magically handwave away all those questions as if AWS/GCP/Azure outages aren’t a thing.


Until it goes down because aws STILL hasn't made themselves completely multi-region or can't figure out their DNS.


A lot of reductive anti-cloud stuff gets posted here, but this might be the granddaddy of them all.


> Without cloud, saving a file is as simple as "with open(...) as f: f.write(data)" + adding a record to DB. And no weird network issues to debug.

There may be some additional features that S3 has over a direct filesystem write to a SSD in your closet. The people paying for cloud spend are paying for those features.


Ah that is where logging and traceability comes in! But not to worry, the cloud has excellent tools for that! The fact that logging and tracing will become half your cloud cost, oh well let's just sweep that under the rug.


Variation on an old classic.

Question: How do you save a small fortune in cloud savings?

Answer: First start with a large fortune.


A small fraction of 1, probably? It sounds like a fairly simple service that shouldn't require much ongoing development.


Especially if you have access to LLMs.


You're going to run a production system with a bus number of 1?

I think you mean a small fraction of 3 engineers. And small fractions aren't that small.


So far I have seen a lot more production systems with a bus factor of zero than production systems with a bus factor greater than one.


The cost being a fraction of 1 does not imply it's one person. 3 people each spending 2 weeks a year on the service is still a fraction of 1.
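Spelling out that arithmetic, assuming a 52-week year with no allowance for holidays or vacation (the constant and function name are illustrative):

```python
WORK_WEEKS_PER_YEAR = 52  # assumption: ignores holidays and vacation


def fte_fraction(people, weeks_each, weeks_per_year=WORK_WEEKS_PER_YEAR):
    """Total effort expressed as a fraction of one full-time engineer-year."""
    return people * weeks_each / weeks_per_year


# 3 people * 2 weeks each = 6 of 52 weeks
print(round(fte_fraction(3, 2), 3))  # prints 0.115
```

So three people at two weeks each comes to roughly a ninth of one engineer-year, spread across a bus factor of three.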


It is three opportunity costs. No free lunches.


Nobody implied it was free. Yes there are opportunity costs, and they add up to less than one sysadmin of opportunity.


Yes, that was my thought as well. Breakeven might be like 1 (give or take 2x)?


Anything worth doing needs three people. Even if they also are used for other things.


What I notice is that large companies use their own private clouds and datacenters. At their scale, it is cheaper to have their own storage. As a side business, they also sell cloud services themselves. And small companies probably don't have enough data to justify paying for a cloud instead of buying several SSDs/HDDs or creating an SMB share on their Windows server.



