
I'm one of the Ray developers, thanks for the shoutout :)

If you're curious about how Ray is used for LLMs, here are some interesting examples of LLM projects using Ray!

- Alpa does training and serving with 175B parameter models https://github.com/alpa-projects/alpa

- GPT-J https://github.com/kingoflolz/mesh-transformer-jax

- Another HN thread on training LLMs with Ray (on TPUs in this case) https://news.ycombinator.com/item?id=27731168

- OpenAI fireside chat on the evolution of their infrastructure and usage of Ray for training https://www.youtube.com/watch?v=CqiL5QQnN64

- Cohere on their architecture for training LLMs https://www.youtube.com/watch?v=For8yLkZP5w&t=3s

Some other thoughts:

1. There is a lot more we want to do to make Ray better for working with large language models and for making training, serving, and batch inference work well out of the box.

2. The original post is about training, but we actually see even more interest in fine-tuning and serving with LLMs, in part because there are good pre-trained models.

3. For LLMs, we see a lot of interest in Ray + Jax or Ray + TPUs relative to what we see in other use cases.



Do you see any convergence on wire (Arrow?) and storage (pandas?) formats?


And we can make Ray more efficient by optimizing GPU hardware utilization https://centml.ai/


Will it work with a PC that has 7 AMD Vega GPUs?


Yes, but this will largely come down to whether the deep learning framework you're using (PyTorch, TensorFlow, Jax, etc.) works well in that setting. Ray is pretty framework- and hardware-agnostic: it can schedule and scale different ML frameworks across different types of devices (CPUs, GPUs, TPUs, etc.), but the actual logic for running code on the accelerators lives in the deep learning framework.



