
Something I've noticed playing around with Llama 7b/13b on my Macbook is that it clearly points out just how little RAM 16GB really is these days. I've had a lot of trouble running both inference and a web UI together locally when browser tabs take up 5GB alone. Hopefully we will see a resurgence of lightweight native UIs for these things that don't hog resources from the model.
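To see why 16GB feels tight, a back-of-envelope estimate of weight memory alone is enough: a 13B model in fp16 already exceeds the whole machine's RAM before the OS, browser, or KV cache are counted. A minimal sketch (the function name is hypothetical; figures ignore runtime overhead and activations):

```python
def model_ram_gb(n_params_billion: float, bytes_per_weight: float) -> float:
    """Approximate weight memory in GiB: params * bytes per weight."""
    return n_params_billion * 1e9 * bytes_per_weight / 2**30

# fp16 weights are 2 bytes each; 4-bit quantized weights are ~0.5 bytes each
print(f"7B  fp16: {model_ram_gb(7, 2):.1f} GiB")    # ~13 GiB
print(f"13B fp16: {model_ram_gb(13, 2):.1f} GiB")   # ~24 GiB
print(f"7B  q4:   {model_ram_gb(7, 0.5):.1f} GiB")  # ~3 GiB
print(f"13B q4:   {model_ram_gb(13, 0.5):.1f} GiB") # ~6 GiB
```

This is why quantized builds (as in llama.cpp) are what make 7B/13B workable on a 16GB laptop at all: 4-bit weights leave headroom that fp16 simply doesn't.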


FWIW I've also had browser RAM consumption issues, but they've been mitigated by extensions like OneTab: https://chrome.google.com/webstore/detail/onetab/chphlpgkkbo...

For now, local LLMs take up an egregious amount of RAM, totally agreed. But we trust the ecosystem will keep improving, and we'll be able to make optimizations over time. They'll probably become efficient enough to run on phones, which will unlock some cool scope for Khoj to integrate with on-device, offline assistance.


The new Chrome "memory saver" feature that discards the contents of old tabs saves a lot of memory for me. Tabs get reloaded from the server if you revisit them.


Or hopefully we will see an end of the LLM hype.

Or at least models that don’t hog so much RAM.


>Or at least models that don’t hog so much RAM

The RAM usage is kind of the point, though; we're trading space for time. It's not a problem that the model is using it; it's just that with web-based UIs now the default, the unnecessary memory usage of browsers is starting to be a real pain point.
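The space-for-time trade shows up concretely in the KV cache: inference keeps the attention keys and values for every past token in memory so it never has to recompute them. A minimal sketch of its size, with dimensions assumed for a LLaMA-7B-like model (32 layers, hidden size 4096, fp16):

```python
def kv_cache_gib(n_layers: int, seq_len: int, hidden_dim: int,
                 bytes_per_elem: int = 2) -> float:
    """Keys + values (factor of 2) for every layer and every cached position."""
    return 2 * n_layers * seq_len * hidden_dim * bytes_per_elem / 2**30

# A full 2048-token context on a 7B-class model costs about a GiB on its own,
# on top of the weights.
print(f"{kv_cache_gib(32, 2048, 4096):.2f} GiB")
```

So the model's memory use buys real speed; the browser's doesn't buy the user anything, which is the asymmetry being complained about here.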


1. I hear you on going back to lightweight native apps. Unfortunately the Python ecosystem is not great for this. We use pyinstaller to create the native desktop app but it's a pain to manage.
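One recurring pyinstaller pain point is that a frozen one-file build unpacks itself to a temporary directory at runtime, so resource paths that work in development break in the bundle. A minimal sketch of the usual workaround, using PyInstaller's documented `sys.frozen` / `sys._MEIPASS` runtime attributes (the helper name is hypothetical):

```python
import sys
from pathlib import Path

def bundle_dir() -> Path:
    """Resolve the app's resource directory, frozen or not."""
    if getattr(sys, "frozen", False):
        # One-file builds unpack to a temp dir exposed as sys._MEIPASS;
        # fall back to the executable's directory for one-folder builds.
        return Path(getattr(sys, "_MEIPASS", Path(sys.executable).parent))
    # Running from source: resources live next to this file.
    return Path(__file__).resolve().parent

print(bundle_dir())
```

Every data file the app loads then goes through `bundle_dir() / "relative/path"`, which is exactly the kind of bookkeeping that makes packaging Python desktop apps a pain.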

2. The web UI isn't required if you use Obsidian or Emacs. That's just a convenient, generic interface that everyone can use.



