Hacker News

For production use of open-weight models I'd use something like Amazon Bedrock, Google Vertex AI (which uses vLLM), or on-prem vLLM/SGLang. But for a quick assessment of a model as a developer, Ollama Turbo looks appealing. I find GCP incredibly user-hostile and a nightmare to navigate, especially around quotas.


