Hacker News | past | comments | ask | show | jobs | submit | OsrsNeedsf2P's comments

Not my product, but I wanted to share it. Everyone seems to use Calendly, even though Cal.com is a free, open-source alternative.

Been using it for years. Highly recommend.


Why isn't Claude doing QA testing for you?

Why isn't it doing it for Anthropic?

What makes you think it isn't?

They just have a lot of users doing QA too, and they ignore their issues like true champs


I can't tell if this is sarcasm, but if not, you can't rely on the thing that produced invalid output to validate its own output. That is fundamentally insufficient, despite it potentially catching some errors.

Damn. Guess I'll stop QAing my own work from now on.

This but unironically. Of course review your own work. But QA is best done by people other than those who develop the product. Having another set of eyes to check your work is as old as science.

That is often how software development has been done for the past several decades, yeah...

Not to say that you don't review your own work, but it's good practice for others (or at least one other person) to review it/QA it as well.


You're making a false equivalence between a human being with agency and intelligence, and a machine.

Are humans not machines?

That’s something that more than half of humans would disagree with (exact numbers vary but most polls show that more than 75% of people globally believe that humans have a soul or spirit).

But ignoring that, if humans are machines, they are sufficiently advanced machines that we have only a very modest understanding of and no way to replicate. Our understanding of ourselves is so limited that we might as well be magic.


So good we're magic. So bad we think we're magic.

>if humans are machines, they are sufficiently advanced machines that we have only a very modest understanding of and no way to replicate

Well, ignoring the whole literal replication thing humans do.


Obviously by replicate I meant building a synthetic human.

Yes. That’s not a best practice. That’s why PRs, peer reviews, and test automation suites exist.

I think it is common for one to write their own tests, though

He said QA. QA is more than just unit tests.

Whatever level of automated testing, it’s all usually done by the same people who wrote the software to begin with

I mean, there is some wisdom to that. Most teams separate dev and QA, and writers aren't their own editors, precisely because it's hard for the author of a thing to spot their own mistakes.

When you merge them into one, it's usually a cost-saving measure that accepts quality control will take a hit.


Uh, yeah, this has been considered bad practice for decades.

Yeah, someone should invent code review.

What if "the thing" is a human, and another human validates the output? Is that its own output (i.e. that of a human) or not? Doesn't this apply to LLMs too? You don't review the code within the same session that you used to generate it.

I think a human and an LLM are fundamentally different things, so no. Otherwise you could make the argument that only something extraterrestrial could validate our work, since LLMs, like all machines, are also our outputs.

The problem now is that it’s a human using Claude to write the code and another using Claude to review it.

I have had other LLMs QA the work of Claude Code and they find bugs. It's a good cycle, but the bugs almost never get fixed in one-shot without causing chaos in the codebase or vast swaths of rewritten code for no reason.

Products don't have to be perfect. If they can be less buggy than before AI, you can't call that anything but a win.

> you can't rely on the thing that produced invalid output to validate its own output

I've been coding an app with the help of AI. At first it created some pretty awful unit tests and then over time, as more tests were created, it got better and better at creating tests. What I noticed was that AI would use the context from the tests to create valid output. When I'd find bugs it created, and have AI fix the bugs (with more tests), it would then do it the right way. So it actually was validating the invalid output because it could rely on other behaviors in the tests to find its own issues.

The project is now at the point that I've pretty much stopped writing the tests myself. I'm sure it isn't perfect, but it feels pretty comprehensive at 693 tests. Feel free to look at the code yourself [0].

[0] https://github.com/OrangeJuiceExtension/OrangeJuice/actions/...
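The "find a bug, have the AI fix it with more tests" loop described above can be sketched as a plain regression test. Everything here is a hypothetical stand-in, not code from the linked repo:

```python
# Hypothetical example of the loop: a bug is found, a regression test
# pins the correct behavior, and any future (AI-written) change must
# keep satisfying it.

def parse_version(tag):
    """Parse 'v1.2.3' into (1, 2, 3).

    The imagined buggy version assumed a leading 'v' and crashed on
    bare tags like '1.2.3'; stripping it first fixes that case.
    """
    return tuple(int(part) for part in tag.lstrip("v").split("."))

# Regression tests added after the bug was found. The AI now has these
# in context, so the same mistake can't silently come back.
assert parse_version("v1.2.3") == (1, 2, 3)
assert parse_version("1.2.3") == (1, 2, 3)  # the case that used to crash
```

The point is less the specific function than the ratchet: each fixed bug leaves behind a test the model can read and must not break.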


I'm not saying you can't do it, I'm just saying it's not sufficient on its own. I run my code through an LLM and it occasionally catches stuff I missed.

Thanks for the clarification. That's the difference though, I don't need it to catch stuff I missed, I catch stuff it misses and I tell it to add it, which it dutifully does.

I can't tell if that is sarcasm. Of course you can use the same model to write tests. That's a different problem, with a different series of prompts altogether!

When it comes to code review, though, it can be a good idea to pit multiple models against each other. I've relied on that trick from day 1.
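The cross-model trick can be sketched as a small loop. The "models" below are stub callables standing in for real provider APIs (an assumption for illustration, not any actual SDK):

```python
# Hedged sketch of pitting models against each other for code review.
# Each "model" is a stub callable that returns a list of issue strings;
# in practice each would wrap a different provider's chat endpoint.

def cross_review(diff, author_name, reviewers):
    """Collect findings only from models that did NOT write the code."""
    findings = []
    for name, model in reviewers.items():
        if name == author_name:
            continue  # never let the author model be its own sole QA
        findings.extend((name, issue) for issue in model(diff))
    return findings

# Stub "models": trivially keyed on substrings, purely for illustration.
stub_a = lambda diff: ["possible off-by-one in loop"] if "range(len" in diff else []
stub_b = lambda diff: ["missing error handling"] if "open(" in diff else []

diff = "for i in range(len(xs)): f = open(xs[i])"
issues = cross_review(diff, author_name="model_a",
                      reviewers={"model_a": stub_a, "model_b": stub_b})
# issues == [("model_b", "missing error handling")]
```

The design choice is the `author_name` skip: the model that generated the diff is excluded from reviewing it, which is the whole point of the trick.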


That's why you get Codex to do it. /s

I remember seeing this market on Dread many years back. Crazy to know it was run by the fed

It's really not so crazy

Plasma is one of the reasons Linux is so awesome

How do you know this was Graphene OS' fault?

This has been my experience with Windows too. Airpods connect out of the box on Linux, but on Windows they would stop pairing every couple minutes until I fixed some drivers

Why would I work hard once I found out I was going to be defaulted X years from now?

This honestly just tells me that Panagram is hot garbage

These days I'm always wondering whether what I'm reading is LLM-slop or the actual writing of a person who contracted AI-isms by spending hours a day talking to them.

Great read. One downside: it was so long that by the time I came back to upvote, this article had already fallen off trending.

This is great. I look forward to more "strict" languages whose deterministic compilers give LLMs a tight feedback loop to catch bugs.
