
We use [insert very large application performance monitoring tool here] for workloads running in [insert very, very large cloud provider here] and, after examining our deployments, concluded that we were spending nearly $13k/mo on data transfer out because the monitoring agents have crazy aggressive defaults. Seems like running our own (which may be worthwhile) would alleviate anything like that.
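
For a sense of scale, here is a rough back-of-envelope sketch that converts a monthly data transfer bill into an implied volume. The per-GB rate is purely an assumption for illustration (the comment names neither the provider nor its pricing), and `egress_gb_for_cost` is a made-up helper name.

```python
def egress_gb_for_cost(monthly_cost_usd: float, usd_per_gb: float) -> float:
    """Return the monthly egress volume in GB implied by a bill and a flat per-GB rate."""
    if usd_per_gb <= 0:
        raise ValueError("usd_per_gb must be positive")
    return monthly_cost_usd / usd_per_gb


# ASSUMPTION: a flat rate of $0.09 per GB, chosen only for illustration.
# Real pricing is tiered and varies by provider, region, and destination.
ASSUMED_USD_PER_GB = 0.09

volume_gb = egress_gb_for_cost(13_000, ASSUMED_USD_PER_GB)
print(f"{volume_gb:,.0f} GB/month at ${ASSUMED_USD_PER_GB}/GB")
```

Swap in the rate from your own bill; the point is only that a bill of this size implies a very large volume of agent traffic leaving the cloud.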


Tip: if you happen to be using Datadog, make sure the Datadog agent's own logs are not being ingested into Datadog.

If you can disable them at the agent level and avoid the data out that would be even better.

At a previous employer, the agent's logs under the default settings were quite literally half of the log volume we were paying for. I was doing a sanity check before renewing our Datadog contract and was not at all pleased to discover that.
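
A sanity check like that can be sketched in a few lines, assuming you can export (source, bytes) pairs for your ingested logs from somewhere. The record format, the source names, and the numbers below are all hypothetical, not anything Datadog provides in this shape.

```python
from collections import defaultdict


def volume_share_by_source(records):
    """Return each source's fraction of total log bytes.

    `records` is an iterable of (source_name, size_bytes) pairs.
    """
    totals = defaultdict(int)
    for source, size_bytes in records:
        totals[source] += size_bytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {source: total / grand_total for source, total in totals.items()}


# Hypothetical sample data: the agent's own logs versus two application services.
sample = [("datadog-agent", 500), ("web", 300), ("worker", 200)]
shares = volume_share_by_source(sample)

# Print the largest contributors first.
for source, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{source}: {share:.0%}")
```

If one source dominates the list and it is the monitoring agent itself, that is the first thing to filter before renewing.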


We’re about to release a Datadog-compatible API so you can point your Datadog agent at Opstrace instead (stay tuned for the blog post). Our goal is to be able to tell you exactly how much data the agent is sending and how much that is costing you (and, for example, which services/containers are responsible for the bulk of the cost). Here’s a list of the PRs: https://github.com/opstrace/opstrace/pulls?q=is%3Apr+is%3Acl...


I even opened a support ticket about their stupid Python agent logging its connection-refused tracebacks on every metrics poll and was told "too bad".

They really don't give one whit about log discipline or about letting the user influence the agent's log levels.


Perhaps on a related note, see this discussion about the power of incentives here: https://news.ycombinator.com/item?id=25994653


Lol no, the other really large one. Five-minute CloudWatch polling defaults are just overkill, even in production.


You can hurt yourself that way too -- happened to us, even with not a lot of data, and all down to Thanos aggregating/reducing/whatever-ing meeeeeeeeeelions of metrics inside an S3 bucket to the tune of about 7k a month :/


Yes, that is frustrating indeed. On top of paying your external vendor, you are punished by the egress cost you have to pay to your infrastructure cloud provider. This is one of the problems we wanted to solve. Feel free to contact me at seb@opstrace.com.


It feels like the large monitoring applications should run aggregators in large cloud providers to reduce traffic for everyone.


Haha, sure. I suppose AWS, for example, has little incentive to allow, say, Datadog to offer a special per-AZ endpoint. But hey, this is where we come into play :).



