
For what it's worth, there's a new book, The Science of Music, by Mark Newman, who is also the author of the popular book on Computational Physics [1].

[1] Mark Newman's new book: The Science of Music (2023):

https://lsa.umich.edu/cscs/news-events/all-news/search-news/...


Do PostgreSQL 18's performance improvements (the new asynchronous I/O and smarter query planning with improved parallelism) offset this performance hit? [1]

"Enhanced and smarter parallelisation; initial benchmarks indicate up to 40% faster analytical queries".

[1] PostgreSQL 18 released: Key features & upgrade tips:

https://www.baremon.eu/postgresql-18-released-key-features-u...
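
If you want to sanity-check that kind of claim on your own workload, a rough sketch might look like the following (hypothetical table and connection string; assumes psycopg 3 and PostgreSQL 18's new io_method setting):

    # Hypothetical sketch: confirm PG 18's async I/O setting and time one query.
    # Connection string and "orders" table are placeholders; assumes psycopg 3.
    import time
    import psycopg

    with psycopg.connect("dbname=mydb") as conn, conn.cursor() as cur:
        cur.execute("SHOW io_method")  # 'worker' (default), 'io_uring' or 'sync'
        print("io_method:", cur.fetchone()[0])

        start = time.perf_counter()
        cur.execute("SELECT count(*), avg(amount) FROM orders "
                    "WHERE created_at > now() - interval '90 days'")
        print(cur.fetchone(), f"({time.perf_counter() - start:.2f}s)")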


>Why domain specific LLMs won’t exist: an intuition

>We would have a healthcare model, economics model, mathematics model, coding model and so on.

It's not a question of whether there will ever be specialized models, rather a matter of when.

This will democratize almost all work and professions, including those of programmers, architects, lawyers, engineers, medical doctors, etc.

The glass-half-empty people will say this is a catastrophe of machines replacing humans. The glass-half-full people, on the other hand, will say it is good for society and humanity by making work more efficient, faster and much cheaper.

Imagine that instead of waiting a few months for your CVD diagnostic procedure due to the worldwide shortage of cardiologists (a fact), the diagnosis, with the help of AI/LLM and an expert cardiologist in the loop, takes only a few days, provided the sensitivity is high enough.

It's a win-win situation for patients, medical doctors and hospitals. It would lead to earlier detection of CVDs and hence fewer complications and less suffering, whether the CVD is acute or chronic.

Foundation models are generic by nature, trained on HPC clusters of GPUs/TPUs inside AI data centers.

The other extreme is RAG with vector databases and file systems for context prompting, as the sibling comments mentioned.

The best trade-off, or Goldilocks zone, is model fine-tuning. Specifically, the promising self-distillation fine-tuning (SDFT) recently proposed by MIT and ETH Zurich [1],[2]. Unlike conventional supervised fine-tuning (SFT), which suffers from forgetting, SDFT does not forget, which makes fine-tuning practical rather than wasteful. SDFT used only 4 x H200 GPUs for the fine-tuning process.

Apple is also reporting the same with their Simple Self-Distillation (SSD) for LLM coding specialization [3],[4]. They used 8 x B200 GPUs for model fine-tuning, which any company can afford for local fine-tuning based on the open-weight LLM models available from Google, Meta, Nvidia, OpenAI, DeepSeek, etc.
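
For those curious what self-distillation looks like mechanically, here's a minimal sketch of the general idea only, not either paper's exact recipe (the loss weighting, temperature and model names are illustrative): a frozen copy of the model acts as its own teacher, and the student is pulled toward both the task labels and its own pre-fine-tuning distribution, which is what counters forgetting:

    # Minimal sketch of the general self-distillation idea, NOT the papers'
    # exact recipe. Assumes a HuggingFace-style causal LM whose forward pass
    # returns .logits; alpha and temperature T are illustrative choices.
    import copy
    import torch
    import torch.nn.functional as F

    def make_teacher(student):
        teacher = copy.deepcopy(student)   # frozen snapshot of the model itself
        teacher.eval()
        for p in teacher.parameters():
            p.requires_grad_(False)
        return teacher

    def self_distill_step(student, teacher, input_ids, optimizer, alpha=0.5, T=2.0):
        with torch.no_grad():
            t_logits = teacher(input_ids).logits
        s_logits = student(input_ids).logits

        # Distillation term: stay close to the model's own prior distribution
        kl = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                      F.softmax(t_logits / T, dim=-1),
                      reduction="batchmean") * (T * T)

        # Standard next-token cross-entropy on the new task data
        ce = F.cross_entropy(s_logits[:, :-1].reshape(-1, s_logits.size(-1)),
                             input_ids[:, 1:].reshape(-1))

        loss = alpha * kl + (1 - alpha) * ce
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()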

[1] Self-Distillation Enables Continual Learning:

https://arxiv.org/abs/2601.19897

[2] Self-Distillation Enables Continual Learning:

https://self-distillation.github.io/SDFT.html

[3] Embarrassingly simple self-distillation improves code generation:

https://arxiv.org/abs/2604.01193

[4] Embarrassingly simple self-distillation improves code generation (185 comments):

https://news.ycombinator.com/item?id=47637757


It seems that self-distillation is the way to go for LLMs.

Self-distillation was shown to be very efficient and effective back in January this year by an MIT and ETH team in their Self-Distillation Fine-Tuning (SDFT) LLM system [1],[2].

That earlier work also appears as this paper's closest competitor, listed as On-Policy Self-Distillation in the comparison table.

I hope they keep the original work's real name, Self-Distillation Fine-Tuning or SDFT. Imagine a later paper citing this very paper as "cross-entropy self-distillation" instead of its own given name, Simple Self-Distillation or SSD. Although, admittedly, it's a lousy name that collides with the common SSD nomenclature for solid-state drives, as others have rightly pointed out.

I think they should have given proper credit to this earlier seminal work on SDFT, but apparently they just list it as one of the systems in their benchmark without explaining much of the connection and lineage, which is a big thing in research publication.

[1] Self-Distillation Enables Continual Learning:

https://arxiv.org/abs/2601.19897

[2] Self-Distillation Enables Continual Learning:

https://self-distillation.github.io/SDFT.html


>Nobody sane would predict Israel and the US would start this war.

Israel predicted and started the war, unless you consider them insane [1].

[1] Iran is a distraction [video]:

https://news.ycombinator.com/item?id=47640560


>The first of these is its locked-in ecosystem, which keeps its users buying Apple.

Personally, this is why I wouldn't touch Apple's products with a ten-foot pole.

Anyway, kudos to them: their vision (read: Steve's vision) and tenacity have put them in the upper echelon of consumer tech companies. Making products that people desire is very hard. That's probably why, according to the article, the second most important thing about Apple is:

>Another major factor is its marketing, which has made it the only luxury technology brand.


>I don't know what the point of not writing your own comments is, other than spam.

I think it will reach the point of "dogfooding".

Those who don't craft their own comments will be treated like those who don't use the software they've written.


Fun fact: the original developer of TK Solver (originally TK!Solver) is Milos Konopasek, a textile engineer from Czechoslovakia.

TK Solver is a software cousin of the famous VisiCalc, developed by the same company, Software Arts.

VisiCalc has been discontinued, but TK Solver is still being sold today by Universal Technical Systems (UTS) [1].

Milos also developed the Question Answering System (QAS), which ran on a PDP-10. It operates on equations relating input yarn, cloth area, fiber strengths, etc. For a desired cloth strength you could solve for fiber strength, or, given fiber strength, you could solve for the cloth strength. The same operations can still be performed in TK Solver today.
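
As a rough illustration of that declare-once, solve-either-way behaviour (the relation below is a made-up placeholder, not the actual textile formula QAS used), the same idea can be sketched with sympy:

    # Illustrative only: one declared relation solved in either direction,
    # in the spirit of QAS/TK Solver. The formula is a placeholder.
    from sympy import symbols, Eq, solve

    fiber_strength, yarn_count, cloth_strength = symbols(
        "fiber_strength yarn_count cloth_strength", positive=True)
    relation = Eq(cloth_strength, fiber_strength * yarn_count)

    # Given fiber strength and yarn count, solve forward for cloth strength...
    print(solve(relation.subs({fiber_strength: 4.0, yarn_count: 30}), cloth_strength))

    # ...or, given a desired cloth strength, solve backward for fiber strength.
    print(solve(relation.subs({cloth_strength: 120.0, yarn_count: 30}), fiber_strength))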

[1] Comprehensive Mathematical Software Tool for Engineers:

https://www.uts.com/Products/TKSolver


The System Programming in Linux ebook alone is $80.

If you already have that book, go for the 15-item bundle at $36, but bear in mind that two of the items are just early-access previews.


Someone needs to write a new book on Linux routers.

The old one is getting really dated now, published nearly 25 years ago [1].

[1] Book Review: Linux Routers - A Primer for Network Administrators, 2nd Ed:

https://www.linuxjournal.com/article/6314

