Based on the reverse engineering done by Parth Thakkar [1], the model used by Co...

obastani · on Dec 23, 2022

Just out of curiosity, in what sense is Codex better trained than CodeGen?

moyix · on Dec 23, 2022

OpenAI hasn't said exactly how they trained code-davinci-002, so this is speculative, but I'm reasonably sure it was trained on more data and more languages than CodeGen, and for longer. It was also trained using fill-in-the-middle [1].

[1] https://arxiv.org/abs/2207.14255
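
For anyone unfamiliar with fill-in-the-middle, here is a rough sketch of the data transformation described in [1]: a document is cut into prefix, middle, and suffix, and the middle is moved to the end so an ordinary left-to-right model learns to infill. The sentinel strings and the function names `fim_transform` / `fim_restore` are made up for illustration, and the cut points are chosen at the character level for simplicity. This is not how code-davinci-002 was actually preprocessed, which OpenAI has not published.

```python
import random

# Hypothetical sentinel strings; real tokenizers define their own special tokens.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"

def fim_transform(doc: str, rng: random.Random) -> str:
    """Cut doc at two random points and emit prefix, suffix, then middle."""
    # randint is inclusive on both ends, so empty spans are possible.
    i, j = sorted(rng.randint(0, len(doc)) for _ in range(2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"

def fim_restore(example: str) -> str:
    """Invert fim_transform (assumes doc itself contains no sentinel strings)."""
    body = example[len(PRE):]
    prefix, _, rest = body.partition(SUF)
    suffix, _, middle = rest.partition(MID)
    return prefix + middle + suffix

if __name__ == "__main__":
    rng = random.Random(0)
    src = "def add(a, b):\n    return a + b\n"
    print(fim_transform(src, rng))
```

At inference time you would supply everything up to and including the middle sentinel and let the model generate the missing span.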