Hacker News

For production use of open-weight models I'd use something like Amazon Bedrock, Google Vertex AI (which uses vLLM), or on-prem vLLM/SGLang. But for a quick assessment of a model as a developer, Ollama Turbo looks appealing. I find GCP incredibly user-hostile and a nightmare to navigate, especially around quotas.


