Hacker News | kqr's comments

It was never that great, it seems. For all of 2025 there was virtually no improvement in the rate at which models produced quality code. They only got better at passing automated tests.

https://entropicthoughts.com/no-swe-bench-improvement


I have worked on similar problems. See e.g. [1].

The LLMs I have tested have terrible world models and intuitions for how actions change the environment. They're also not great at discerning and pursuing the right goals. They're like an infinitely patient five-year-old with an amazing vocabulary.

[1]: https://entropicthoughts.com/updated-llm-benchmark

(more descriptions available in earlier evaluations referenced from there)


Inside information is not the same as causing the outcome.

I knew my previous employer was getting acquired before the markets did -- I had inside information -- but I had no way to make the deal go through.

Revealing inside information through prediction markets is mostly fine. You may think people shouldn't need a financial incentive to do so, but clearly they do, and clearly other people are willing to put up that money for it.


If only playing board games didn't require colocating several friends for a non-trivial span of time... Everyone around me (including myself!) is busy with work, children, partner, running their household, and exercising.

How do people do it?


I play mostly with the aforementioned wife and kids!

We were shocked by how early our kids could pick up board games, including many of the ones mentioned in this article. Our 2 oldest kids were playing Ticket to Ride and Carcassonne well enough to beat us from time to time at 3 and 4 years old. Now that they're a little older, slightly more complicated games like Catan and Flamecraft are on the table!


Ah. My wife isn't very interested and my children are just at the age where they can accept a slightly more complicated snakes and ladders[1] but they would not be able to do anything meaningful in Ticket to Ride yet. Looking forward to the day!

[1]: https://entropicthoughts.com/snakes-and-ladders


You didn't ask, but I would offer a few suggestions for games to introduce your children to soon.

Dinosaur Escape [0], if you can find it. It's basically a cooperative memory game, but it introduces them to some very gentle strategy. (Incidentally, there's a similarly themed game [1] on BGA that is also good.)

King of Tokyo is very fun, and it's easy to slim down the rules for someone who cannot read. It still allows them to understand the mechanics and make decisions about which dice to pick and which to re-roll.

Similarly, we started Kingdomino with simplified rules (no multiplication, just a tile matching game) and it was easy to graduate into full game play later.

Outfoxed is a logic / deduction board game that doesn't involve too much advanced strategy. Since it's cooperative it's easy to work with them so they can begin reasoning out the clues.

[0] https://boardgamegeek.com/boardgame/175497/dinosaur-escape

[1] https://boardgamearena.com/gamepanel?game=babydinosaurrescue


I didn't explicitly ask but interpreting my comment as a question was the right thing to do. Thanks, I'll see what my local second hand market says about your suggestions.

I play a lot of games solo; it's how I got into board games. I was looking for an alternative to video games when I was having RSI issues.

There are a lot of solo-only games, and most cooperative games allow for solo play. During the pandemic it became pretty popular.

I do a mix of solo, in person and online with boardgamearena.


I play D&D once a week with local friends.

We all met, and picked a day that was likely to work for us regularly, going forward - for us, it's a Tuesday. That way we know, and can plan ahead for the foreseeable future, that Tuesdays will be D&D nights. People with kids can get babysitters, or get spouses/grandparents to take care of them. People with other obligations can keep that night clear. Etc., etc.

I used to prefer the whole "let's schedule the next session at the end of the night", but that has 100% led to campaigns falling apart. Consistency is key.

(Also, it helps to have a big enough group - either for D&D or boardgames - that the absence of any one or even two people doesn't tank the night.)

Doing things virtually is also a good suggestion, but I'm pretty burnt out on staring at people's faces on a screen, so I hate playing D&D or other games over a screen - but your mileage may vary.


Board Game Arena in turn-based mode.

I have been playing multiple games daily non-stop for 6 years now with a consistent but sometimes shifting group of local friends.

We still play live either virtually or in-person at times, but the async games never end. Playing live on BGA still reduces the game time by as much as 50% for more complex games since it handles setup, teardown, and scoring.


We did it before we had kids. And now we do it virtually. Or very occasionally.

That's the ultimate board game, I guess.

This is such a sad comment. You can call people and make a plan. It's called socializing.

The author uses hexyl as an example of trying, but not doing it right.

As a child and adolescent I always imagined that something would click when I became an adult and I would become good at things and understand the world. That never happened, and then I realised it never happens for anyone. We're all just large children walking around figuring things out. Some of us figure things out faster, some of us stop trying to figure things out, but we're all just as clueless in the grand scheme of things. It's a miracle and a testament to our perseverance and ambition that things still work as well as they do.

On the other hand, I've contacted several of my heroes (not been able to meet as many of them in person) and that's always been an exhilarating, formative experience. I strongly recommend it if you can think of a good reason. (I have a list of heroes I have yet to reach out to because I haven't yet encountered an interesting enough problem to offer them. Several of them unfortunately have an actuarial deadline not too far into the future.)


Could this be from adults not being honest with children when they don’t know something? I’ve personally seen this happen a lot. Many adults try to save face about not knowing things with other adults, let alone with children. So it might be a cultural issue that could be fixed.

This reminds me of this comic[1], which also works well for things such as MANAGE MENT or PARENT HOOD or FOUNDER SHIP.

[1] https://eelhips.tumblr.com/post/7035963689/early-life-crisis


In the 1970s computer systems spanned fewer orders of magnitude. Operations generally took somewhere between maybe 1 and 10^8 CPU cycles. Today, the range is closer to 10^-1 to 10^13.
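To put numbers on that, here is a rough sketch that just does the arithmetic on the figures quoted above. The function name is made up for illustration, and the cycle counts are the comment's own ballpark estimates, not measurements.

```python
import math

def orders_of_magnitude(fastest_cycles: float, slowest_cycles: float) -> float:
    """Number of decades between the cheapest and the most expensive operation."""
    return math.log10(slowest_cycles) - math.log10(fastest_cycles)

# Ballpark ranges taken from the comment, in CPU cycles.
span_1970s = orders_of_magnitude(1, 1e8)
span_today = orders_of_magnitude(1e-1, 1e13)

print(f"1970s: about {span_1970s:.0f} orders of magnitude")
print(f"today: about {span_today:.0f} orders of magnitude")
```

That works out to roughly 8 orders of magnitude then versus roughly 14 now.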

The conspiratorial reason would be that copy-paste errors give plausible deniability of ill intent.

On the other hand, any time a hypothesis appears significant, the first reaction should be to verify that all the data going into the calculation is correct, rather than just assume it is. In my day-to-day industry experience, significant results come far more often from incorrect data than from an actual discovery.

I actually did that recently, but only at low sample rate so I don't qualify for bonus points. (My purpose was filtering to get at underlying patterns. Found what I expected but not much more.)
