We use [insert very large application performance monitoring tool here] for work...

nrmitchi · on Feb 1, 2021

Tip: if you happen to be using Datadog, make sure the Datadog agent's own logs are excluded from ingestion into Datadog.

If you can disable them at the agent level and avoid the outbound data transfer too, that would be even better.

At a previous employer, the agent logs ingested under the default settings were quite literally half of our log volume, all of which we were paying for. I was doing a sanity check before renewing our Datadog contract and was not at all pleased to discover that.
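A sanity check of this kind can be sketched in a few lines. This is only an illustration, not anything Datadog provides: the function name `volume_share_by_source`, the `(source, bytes)` record shape, and the sample numbers are all hypothetical, and in practice the byte counts would come from whatever usage export or log pipeline you have available.

```python
from collections import defaultdict


def volume_share_by_source(records):
    """Return {source: fraction of total bytes}, largest share first.

    `records` is an iterable of (source_name, num_bytes) pairs.
    """
    totals = defaultdict(int)
    for source, num_bytes in records:
        totals[source] += num_bytes

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}

    # Sort so the biggest contributors are listed first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return {source: size / grand_total for source, size in ranked}


# Hypothetical sample: byte counts per log source, made up for illustration.
sample = [
    ("datadog-agent", 400),
    ("api", 250),
    ("datadog-agent", 100),
    ("worker", 250),
]

shares = volume_share_by_source(sample)
for source, share in shares.items():
    print(f"{source}: {share:.0%}")
```

If one source, such as the monitoring agent itself, dominates the ranking, that is the first thing to exclude or turn down before a contract renewal.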

fat-apple · on Feb 1, 2021

We’re about to release a Datadog-compatible API so you can point your Datadog agent at Opstrace instead (stay tuned for the blog post). Our goal is to be able to tell you exactly how much data the agent is sending and how much that is costing you (and, for example, which services/containers are responsible for the bulk of the cost). Here’s a list of the PRs: https://github.com/opstrace/opstrace/pulls?q=is%3Apr+is%3Acl...

mdaniel · on Feb 1, 2021

I even opened a support ticket about their stupid Python agent logging its connection-refused tracebacks on every metrics poll, and was told "too bad".

They really don't give one whit about log discipline or about letting the user influence the agent's log levels.

englambert · on Feb 1, 2021

Perhaps on a related note, see this discussion about the power of incentives here: https://news.ycombinator.com/item?id=25994653

brodouevencode · on Feb 2, 2021

Lol no, the other really large one. Five-minute CloudWatch polling defaults are just overkill, even in production.

cyberpunk · on Feb 2, 2021

You can hurt yourself that way too -- happened to us, but with not a lot of data, and all down to Thanos aggregating/reducing/whatever-ing meeeeeeeeeelions of metrics inside an S3 bucket to the tune of about 7k a month :/

spahl · on Feb 1, 2021

Yes, that is frustrating indeed. On top of paying your external vendor, you are punished by the egress cost you have to pay to your infrastructure cloud provider. This is one of the problems we wanted to solve. Feel free to contact me at seb@opstrace.com.

alexchamberlain · on Feb 2, 2021

It feels like the large monitoring applications should run aggregators in large cloud providers to reduce traffic for everyone.

jgehrcke · on Feb 2, 2021

Haha, sure. I suppose that AWS, for example, has little incentive to allow a vendor like Datadog to offer a special per-AZ endpoint. But hey, this is where we come into play :).