llama2 is pretty old. ollama also defaults to rather poor quantizations when using just the base model name like that - I believe that translates to llama2:Q_4_M which is a fairly weak quantization(fast, but you lose some smarts)
Picking one where the size is < your VRAM(or, memory if without a dedicated GPU) is a good rule of thumb. But you can always do more with less if you get into the settings for Ollama(or other tools like it).
My suggestion would be one of the gemma3 models:
https://ollama.com/library/gemma3/tags
Picking one where the size is < your VRAM(or, memory if without a dedicated GPU) is a good rule of thumb. But you can always do more with less if you get into the settings for Ollama(or other tools like it).