To be fair, they also kind of share that opinion, which is why MLIR came to be: first only for AI, nowadays for everything. Even C is going to get its own MLIR (ongoing effort).
I did my thesis porting my supervisor's project from NeXTSTEP to Windows, and was an OpenGL fanboy up to the whole Longs Peak disaster.
Additionally, Vulkan has proven to be yet another extension mess (to the point that there are now efforts to steer it back on track). Khronos is like the C++ of API design, while expecting vendors to come up with the tools.
However, as great as CUDA, Metal and DirectX are to play around with, we might be stuck with Khronos APIs if geopolitics keeps going as badly as it has thus far, or worse.
Vulkan backends work just fine, provided one wants to be constrained by the Vulkan developer experience, without first-class support for C++, Fortran and Python JIT kernels, IDE integration, graphical debugging, or libraries.
Another point: CUDA is polyglot, and some people do care about writing their kernels in something other than C++, C or Fortran, without going through code generation.
NVIDIA is acknowledging Python adoption, with cuTile and MLIR support for Python, allowing the same flexibility as C++ and using Python directly even for kernels.
They seem to be supportive of having similar capabilities for Julia as well.
Then there is the IDE and graphical debugger integration, and the library ecosystem, which now also has Python variants.
As someone who only follows GPGPU on the side, due to my interest in graphics programming, I find it hard to understand how AMD and Intel keep failing to understand what CUDA, the whole ecosystem, is actually about.
Like, just take the schedule of a random GTC conference: how much of it can I reproduce on oneAPI or ROCm as of today?