What I missed from the writeup were some specific cases, and how you tested that all this orchestration delivers worthwhile data (actionable and complete/correct).
E.g. you have a screenshot of the AI supply chain - more of these would be useful, and also some info about how you tested that this supply chain agrees with reality.
Unless the goal of the project was to just play with agent architecture - then congrats :)
For demo purposes and to attract attention, I was primarily picking cases with cool visuals (like the screenshot of the AI supply chain you mentioned). We have some internal evals and will try to add more cases to the public repo for reference.
More signs of the AI bubble. Completely unprofessional behavior ("cool visuals" not "real results"). And don't give me that "hacker culture" bullshit, these people are targeting Wall Street as paying customers.
Would it be more professional, in your opinion, if I claimed I make $xxxxx via this tool? I thought I had clearly stated that the cool visuals are for demo purposes and to attract attention. I do not want to post any dramatic statements to trick people into using it. This is an early-stage open source project to help investors and traders organize their thoughts, not an automatic money-making machine that guarantees profit. It's the mind of the person using the tool that decides whether they profit from the market.
>And don't give me that "hacker culture" bullshit
I couldn’t help but be genuinely curious: if you believe AI is a bubble and aren’t a fan of hacker culture, then why are you here on Hacker News?
First of all this project is great and finance is ready for a disruption like this. I'm sure a lot of good research and development went into this.
Quality research indeed doesn't always make money, so I agree that it doesn't make sense to present those kinds of metrics. But at the same time, it will be hard to trust this sort of thing immediately without having a way to validate its output.
At the very least I would like to know that the financial metrics it calculates (especially those based on 20/30 data points) are correct. It looks like there is some transparency built in, and that's a good thing.
But people who are not pros in investment research wouldn't know that it messed up a certain metric, and that the output is therefore different from what it tells them. Or maybe it is not messing up entirely, but a certain sector-specific detail doesn't get picked up, making a signal less strong than the output led you to believe. Maybe you already have this, but if not, you could add some sort of validation layer; that could also serve as a customisable calculation engine. I'd use it right away.
In this case the reason for dropping support is most likely that the only DRM they can support on that older hardware has been broken. There's no technical reason why it can't be supported, and I doubt it would cost them much (or even anything) to continue support.
Meanwhile, I can still read physical books I've had since I was a child, 40 years ago. The Kindle is undeniably more convenient than physical books, but this is absolutely an unnecessary sunset of these devices.
My Kindle 4 hardware works great, I still read it nightly. Since it doesn’t feel like it’s obsolete (in fact it has physical buttons so may be slightly better than a modern Kindle), it feels like a blatant cash grab by Amazon to get us to buy new devices that probably are laden with ads or other revenue generators.
Since the November/December Opus and Claude Code releases, I've found I don't need to read the code any more. Architecture overview, sure, and testing, yes, but not reading the code directly any more.
My friends and I inspect code indirectly now - telling agents to write reports about certain aspects of the code and architecture, etc.
I do regularly read the code that Claude outputs. And about 25% of the time the tests it writes will reimplement the code under test in the test.
Another 25% of the time the tests are wrong in some other way. Usually mocking something in a way that doesn't match reality.
And maybe 5% of the time Claude does some testing that requires a database, it will find some other database lying around and try to use that instead of what it's supposed to be doing.
And even if Claude writes a correct test, it will generally have it skip the test if a dependency isn't there--no matter how fervently I tell it not to.
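To illustrate, here's a minimal sketch of that skip pattern - the dependency name "fancy_db_driver" is made up, but this is the shape it tends to produce:

```python
import importlib.util
import unittest

# "fancy_db_driver" is a hypothetical dependency for illustration.
HAS_DRIVER = importlib.util.find_spec("fancy_db_driver") is not None

class TestTotals(unittest.TestCase):
    @unittest.skipUnless(HAS_DRIVER, "fancy_db_driver not installed")
    def test_totals_against_real_db(self):
        # Would exercise the database here; fails loudly if it ever runs.
        self.fail("this machine has the driver, so the test actually ran")

# On a machine without the driver, the run is "successful": 0 failures, 1 skip.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTotals)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

CI images without that dependency will show green even though the test never executed, which is exactly the failure mode I'm complaining about.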
If you're not looking at the code at all, you're building a house of cards. If you're not reading the tests, you're not even building - you're just covering the floor in a big sloppy pile of runny shit.
> I do regularly read the code that Claude outputs
You probably could have s/Claude/Human/ in your rant and been just as accurate. I don't know how many times I've flagged these issues in code reviews. And that's only assuming the human even bothered to write tests...
What I find is that when I ask AI to write tests it writes too many, and I agree with you that a lot of them are useless. But then I just tell it that, and it agrees with me and cleans it up. Much faster feedback loop and much better final result.
I feel like people that look at a poor result and stop there and conclude it's useless have made up their mind and don't want to see the better results that are right in front of them if they just spend an extra 5 seconds trying.
How do you know whether the tests it spits out are bad if you don't read the tests?
We’re not dealing with AGI here. Tests aren’t strictly necessary for humans. They are for AI. AI requires guardrails to keep it from spinning out. That’s essentially the entire premise of the agentic workflow.
I’m pretty sure they just meant they do testing, not that they read the tests, and that’s how everyone else who responded interpreted it as well.
You can get Claude to write good tests, but based on what I’m seeing at work, that’s not what’s happening. They always look plausible even when they’re wrong, so people either don’t read them, skim them very quickly, or read the first few, assume the rest work, and commit.
I think Claude is great for testing because setting up test data and infrastructure is such a boring slog. But it almost always takes a lot of back and forth and careful handholding to get it right.
I read the tests. It is also really, really good to have Claude verify that removing the changes in question breaks the tests. This brings the quality way, way up for me.
I'd understand not reading the code of the system under test, but you don't even read the tests? I'd do that if my architecture and design were very precise, but at this point I'd have spent too much time designing rather than implementing (and possibly uncovering unknown unknowns in the process).
> Me (and my friends similarly) inspect code indirectly now - telling agents to write reports about certain aspects of the code and architecture etc.
Doesn't this take longer than reading the code?
I can see how some of this is part of the future (I remember an article about Python modules having a big docstring at the top fully describing the public functions, with the author describing how they just update this doc and then regenerate the code fully, never reading it - I find that quite convincing), but in the end I just want the most concise language for what I'm trying to express. If I need an edge case covered, I'd rather have a very simple test making that explicit than more verbose forms. Until we have formal specifications everywhere, I guess.
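For what it's worth, that docstring-as-spec idea might look something like this - the module and both functions are made up, and only the docstring would ever be hand-edited:

```python
"""inventory.py - spec-first module; everything below the docstring is regenerated.

Public functions:
    reserve(sku, qty) -> bool
        Reserve qty units of sku; return False if stock is insufficient.
    release(sku, qty) -> None
        Return previously reserved units to stock.
"""

_stock = {"WIDGET": 5}  # toy in-memory stock table

def reserve(sku: str, qty: int) -> bool:
    if _stock.get(sku, 0) < qty:
        return False
    _stock[sku] -= qty
    return True

def release(sku: str, qty: int) -> None:
    _stock[sku] = _stock.get(sku, 0) + qty
```

The appeal is that the spec stays short and human-owned while the body is disposable; the worry above still applies, since nothing here pins down edge cases the way an explicit test would.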
But maybe I'm just not picturing what you mean exactly by "reports".
I've seen the code these models produce without a human programmer going over the results with care. It's still slop. Better slop than in the past, but slop none the less. If you aren't at minimum reading the code yourself and you're shipping a significant amount of it, you're either effectively the first person to figure out the magic prompt to get the models to produce better code, or you're shipping slop. Personally, I wouldn't bet on the former.
Yeah, these models have definitely become more useful in the last few months, but statements like "I don't need to read the code any more" still say more about the person writing them than about agents.
Have you traveled in Europe? Even without a crisis, gas stations are often way busier on the cheaper country's side of the border than on the more expensive one.
My friends living in Switzerland (near the border) always go to Germany to fuel up. And, even without a crisis, gas stations on the cheaper sides of borders are often way more crowded than on the other side.
Also, keep in mind that Slovenia is roughly the size of Los Angeles, or not much wider than Long Island. If fuel were 30% cheaper on one side of Long Island than on the other, I'm sure plenty of people wouldn't think twice about driving across.
Aside from the cost? It's also managing an actual human being and making sure they have enough work. If the place gets 5-10 calls a day, then it's pointless to hire a receptionist who will do nothing for an hour and then have a two-minute chat. It used to be pointless to build software to do that too, but since Claude Code it's cheap enough to make sense.
Receptionist-as-a-service has been a thing for, like... forever. You are never going to solve the problem of accurately estimating and quoting with AI or an answering service, so pay someone to answer the phone and take down the details; have a mechanic or trained service rep review and estimate. Cheap code that doesn't solve the problem is not cheap.
Yes, of course. The bot can request information and the customer can provide it if they feel like it, and then someone qualified can call them back when they have their hands free.
But there's no bot, per se, needed at all. An answering machine from 1993 can do this same information-gathering job. :)
So update the device from 1993's new-fangled digital answering machine to 2009's Google Voice, and have it do the transcription from voicemail to text.
Someone will still have to call Bill back about his Honda (which is actually the Kia he bought for his daughter -- Bill is not a very technical guy these days[1] and he confuses such concepts regularly) in order to get any trading of money for services done.
It doesn't take an LLM to get there, and Bill would probably prefer to avoid being frustrated by the bot's insistent nature.
Look, you're kicking at an open door.
I think LLMs applied like this are just a layer of complexity that is mostly replacing lower-level programming solutions that could do the same thing.
The transcription + callback loop is honestly underrated. Most of the value here is just capturing intent accurately ("Honda" vs "Kia" aside) so the mechanic can prioritize callbacks. A dumb voicemail-to-text pipeline handles that fine. The LLM layer adds complexity without solving the actual bottleneck, which is someone qualified picking up the phone.
But I'm not sure that a bot can be trusted to make good decisions about priority, either. Even if it can make good decisions based on context (which it increasingly often can, but does not always do), it lacks the context that is necessary to form the basis of those decisions.
Suppose a message comes into the box with this form: "This is Wendy, can you call me? My car is making that noise again."
The bot might deprioritize that call because it lacks actionable contextual information. "My job as a bot is to get more jobs into the shop. This call does not have enough data to do that, so I'll shove to the bottom of list of callbacks behind more-actionable jobs."
But the mechanic? The mechanic knows Wendy's Ford very well, and he also knows Wendy. She's been a good customer for over a decade. The mechanic also knows the noise, and that Wendy has 3 little kids and that she's vacationing 900 miles away on a road trip with those kids in that Ford. The context is all there inside of the mechanic's brain to combine and mean that this might be the highest-priority call he gets all week.
Wendy may not have actively relayed any urgency in her message, but the urgency is real, and she needs to be called back right away. She needs answers about what to do (keep driving and look into it when she gets back? pull over immediately and get a tow to a decent local shop? maybe she even needs help finding such a shop?) pretty much immediately. Not because it means more business today, but because it means more business for years.
The mechanic can spot this from a list of transcripts in an instant and give her a ring back Right Now. The bot is NFG at this.
The addition of the bot only adds noise to the process, and that noise only works to Wendy's detriment. When the bot adds detrimental noise to Wendy's situation, it also adds detriment to the shop's longevity.
The presence of the bot -- even as a prioritizing sorting mechanism -- asymptotically shifts the state from an excellent shop that knows their customers very well to a bot-driven customer-averse hellscape.
(And no, the answer isn't to make the bot into an all-knowing oracle that actively gets fed all context. The documentation burden would be more expensive, time-wise (and thus money-wise) than hiring a competent human receptionist who answers the phone, handles the front door traffic, and absorbs context from their surroundings. A person who chatted with Wendy last Thursday right before she left for her trip is always going to be superior to a bot.)
If someone put on their website and voicemail that they were available for calls only from 8-10am (for example), or that they would return my call at that time, I'd make a point to call them then. It's reasonable that people are busy too.
Instead of asking "what's next?", a good question to ask is "what jobs are now feasible that were previously constrained by the cost of producing software?"
I liked the Apple II and the TRS-80, as I rather liked BASIC. And then I didn't hate DOS, and then I actively hated the graphical shell of Windows 3, but could not afford a Macintosh - so I suffered through it where I had to, but mainly used DOS. Then I discovered UNIX and did almost all of my work on a timeshare - in the early 90s!
Then Windows 95 came out and I actively hated it, but did think it was amazingly pretty - somehow this was the impetus for me to get a pc again, which I put Windows NT on. Which was profitable for freelance gigs in college. Soon after that, I dual booted it to Linux and spent most of my time in Slackware.
After that, I graduated and had enough money to buy a second rig, which I installed OS/2 Warp on - which was good for side gigs. And which I really liked. A lot. But my day job required that I have a Windows NT box to shell into the Solaris servers that we ran. Then I got a better class of employer, and the next several let me run a Linux box to connect to our Solaris (or AIX) servers.
Next, my girlfriend at the time got a PowerBook G4 and installed OS X on it. It was obviously amazing. Windows XP came out, and it was once again so much worse than Windows NT - and crashed so much more - which was odd, as it was based on Windows NT. (Yes, 98 came before this, but it was really bad.) Anyhow, right about then the Linux box I was running at home died. It was obvious that I was not going to buy an XP box, so I bought my first Mac.
And it’s been the same for the last 25 years - every time I look at a Windows box it’s horrible. I pretty much always have a Linux box headless somewhere in the house, and one rented in the cloud, and a Mac for interacting with the world.
And like the parent I actively dislike windows. And that’s interesting because I’ve liked most other operating systems I’ve used in my life, including MS-DOS. Modern windows is uniquely bad.
I use Windows and absolutely hate the Mac UI. Having the current application's menu bar always at the top of the screen doesn't make any sense when you have a very big monitor. It only made sense with the tiny monitors available when the Mac UI was originally created.
Yeah, that is an annoyance for me too, but for a different reason. I have set the menu bar to appear only on the internal display (to avoid issues with my OLED external monitor), so when I have a window on the external monitor, I have to move the mouse over to the internal monitor's screen space whenever I want to open something in the app's menu bar.
On the other hand, it is actually useful that there is mostly one specific place to find settings etc., whereas on Windows/Linux it tends to vary by app (is there a bar on top of the window? Is there a button to expand a menu somewhere? Something else? Who knows).