Hahah, thanks! It was a marathon to develop this, and I'm glad it reached the front page.
The name was proposed by ChatGPT :) It claims it doesn't recognise this approach, so there's a chance it's really a new thing.
I want to reach out to llama.cpp and the others - I hope it gets implemented. I considered just writing a patch for llama myself, but C++ and the scale of that project were beyond me.
As for CPU inference - it should speed that up just as well. And since it can load just a fraction of the weights (e.g. 70%, skipping the least important ones), it should now be possible to run models on less VRAM than before (Q8 still needs to be implemented, though).
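To make the memory-saving part concrete, here's a minimal sketch of static magnitude pruning (Python/NumPy, names purely illustrative - this is not the actual Effort code, just the simplest demonstration of why skipping the smallest weights saves RAM):

    import numpy as np

    def prune_weights(W: np.ndarray, fraction: float = 0.70) -> np.ndarray:
        """Keep only the top `fraction` of entries by absolute magnitude.

        Illustrative only: entries zeroed here would simply never be
        loaded in practice, which is where the VRAM saving comes from.
        """
        k = int(W.size * fraction)                        # entries to keep
        cutoff = np.partition(np.abs(W).ravel(), W.size - k)[W.size - k]
        return np.where(np.abs(W) >= cutoff, W, 0.0)

    rng = np.random.default_rng(0)
    W = rng.standard_normal((4096, 4096)).astype(np.float32)  # one toy layer
    x = rng.standard_normal(4096).astype(np.float32)

    approx = prune_weights(W, 0.70) @ x   # mat-vec with 70% of the weights
    exact = W @ x
    print("relative error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))

In the real thing you'd store the kept weights compactly instead of as zeros (and pick how much to skip at inference time rather than once at load), but the basic arithmetic is the same.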
Funnily enough - when I tried comparing benchmarks with llama.cpp, I couldn't find speeds for 7B/FP16 on a 16GB MacBook Air, because it's impossible to run with the regular methods. It is possible with Effort.
Ditto - I was running a full-resolution, but cropped, Mixtral on my 96GB M2, even though it usually takes 114GB of RAM. I just loaded 75% of the weights, and it was working smoothly. (Before I messed something up in the implementation and now it produces crap output - needs a fix.)
Implementing this approach could significantly boost the adoption of LLMs on mobile phones and other compact devices. I highly recommend opening an improvement issue for llama.cpp.