I work for a startup; we have what I think is a fairly typical setup: metrics ingested from a variety of sources, fed into industry-standard metrics/dashboard solutions, triggering escalations to humans. It's fine and I'm happy we have it, but...
The highest value source of alerting right now is one of our growth marketers who pays close attention to our CRM and product analytics tool and notices when key product funnels are underperforming.
Our next highest value signals are a handful of ad hoc alerting channels, mostly in Slack, either directly from a partner telling us that something suspicious happened on their side (think: fraud) or from in-product instrumentation sent to a channel for non-engineering visibility. Members of our business/product/operations team pay attention in these places and make decisions based on their business context.
After that, our support team is increasingly able to filter customer issues and differentiate between bugs, missing features, etc.
I know someone is going to argue that these are all a sign that we haven't instrumented the right things. Fair, but also misses the point. The decision makers in these flows don't (and won't) live in traditional alerting systems and wouldn't have helped us understand breakages without these other, ad hoc processes.
My theory is that it's relatively easy to offer a technical product that moves alerts around or that manages escalation paths. It's quite hard to design a product that surfaces detail to a non-technical expert and that makes it easy to build systematic rules.
You are defining the art of setting up SLOs for end-user workflows. This is typically achieved with contract monitoring (top-down). This article is focused on the bottom-up approach of setting up and fine-tuning alerts.
Thanks for the pointer to SLOs; I've read more than a few posts about them but it never clicked, will look more closely.
My point, I think, is still that the tools I've seen overwhelmingly focus on the kind of fine-tuning/setup you are describing and not on the things that I find most valuable. And I think that part of the problem is that it's easier to build technology around mechanics than around judgement.
We stumbled across much the same thing building out a query layer of composable join clauses. In previous efforts at something similar, I've used CTEs, but found the ergonomics worse because the query layer had to differentiate between cte clauses and regular ones.
Skimming the original article, I didn't really understand why the author didn't discuss "WITH" CTEs (for SQL newbies: common table expressions, see https://modern-sql.com/feature/with ) as an alternative composition mechanism.
Or even SQL views. But your ergonomics comment makes sense to me.
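For anyone trying to picture the ergonomics issue, here is a rough sketch of it, not the actual query layer from either comment. The `build_query` helper, the `users`/`orders` tables, and the `spend` CTE are all made up for illustration. The thing to notice is that the caller has to decide up front whether a fragment goes in the `ctes` bucket or the `joins` bucket, which is the kind of differentiation a join-only layer avoids.

```python
import sqlite3
from typing import Dict, List


def build_query(select: str, ctes: Dict[str, str], joins: List[str], tail: str = "") -> str:
    """Assemble a query from fragments (hypothetical helper, for illustration only).

    CTE fragments and join fragments are kept in separate buckets because
    they are emitted at different places in the final statement.
    """
    parts = []
    if ctes:
        defs = ", ".join(f"{name} AS ({body})" for name, body in ctes.items())
        parts.append(f"WITH {defs}")
    parts.append(select)
    parts.extend(joins)
    if tail:
        parts.append(tail)
    return " ".join(parts)


# Made-up schema and data, just enough to run the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (user_id INTEGER, total INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO orders VALUES (1, 10), (1, 5), (2, 7);
""")

sql = build_query(
    select="SELECT u.name, s.total FROM users u",
    # This fragment must be registered as a CTE...
    ctes={"spend": "SELECT user_id, SUM(total) AS total FROM orders GROUP BY user_id"},
    # ...while this one is a plain join clause that refers to it by name.
    joins=["JOIN spend s ON s.user_id = u.id"],
    tail="ORDER BY u.name",
)
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

A view would push the `spend` definition into the schema instead of the statement, which removes the second bucket at the cost of managing DDL.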
I'm not sure about "without changing code" but I have definitely seen the belief that Figma represents something authoritative about the product instead of, say, the product being authoritative for itself.
Perhaps because I have a similar bio to yours, I am allergic to this view.
QA should not be forced into an engineering or automation track because the incentives are wrong. You end up with test code becoming the goal and then it usually rots due to most QA not having the experience to create a codebase that scales.
I don't think the industry today understands how to treat QA and I think that leads to a lot of assumptions that it's not useful.
I get excited because I went to school with one of Vaughan Jones' children and was (and still am) into math and was blown away when I understood that he was significant.
Years and years ago (pre-smart phone), I built a mobile map and navigation product. Labeling streets was one of the more interesting side quests and the solution I found took a similar approach of generating a large number of candidates, picking one solution, and iterating. It worked quite well in practice.
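A minimal sketch of that general shape (generate candidates, pick a starting solution, iterate), assuming axis-aligned label boxes and a simple "count the overlaps" score. This is not the code from that product; `place_labels`, the box format, and the greedy improvement rule are assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def conflicts(choice: List[int], candidates: List[List[Box]]) -> int:
    """Count overlapping pairs among the currently chosen boxes."""
    boxes = [candidates[i][c] for i, c in enumerate(choice)]
    return sum(overlaps(a, b) for a, b in combinations(boxes, 2))


def place_labels(candidates: List[List[Box]], max_rounds: int = 20) -> Tuple[List[int], int]:
    """Greedy local search over per-label candidate positions.

    Start with each label's first candidate, then repeatedly switch a single
    label to another candidate whenever that lowers the conflict count.
    """
    choice = [0] * len(candidates)
    best = conflicts(choice, candidates)
    for _ in range(max_rounds):
        improved = False
        for i, options in enumerate(candidates):
            for c in range(len(options)):
                trial = choice[:i] + [c] + choice[i + 1:]
                score = conflicts(trial, candidates)
                if score < best:
                    choice, best, improved = trial, score, True
        if not improved or best == 0:
            break
    return choice, best


# Two labels with two candidate positions each; the first candidates collide.
candidates_demo = [
    [(0, 0, 2, 1), (0, 2, 2, 3)],
    [(1, 0, 3, 1), (1, 4, 3, 5)],
]
result = place_labels(candidates_demo)
print(result)
```

A greedy pass like this can get stuck in a local minimum; random restarts or simulated annealing are the usual next step if that happens.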