It is not going to take off unless it is significantly better and has browser support. WebP took off thanks to Chrome, while JPEG2000 floundered. If not native browser support, maybe the codec could be shipped via WASM or something?
The interesting diagram to me is the last one, for computational cost, which shows the 10x penalty of the ML-based codecs.
The thing about ML models is the penalty is a function of parameters and precision. It sounds like the researchers cranked them to max to try to get the very best compression. Maybe later they will take that same model, flatten layers and quantize the weights to get it running 100x faster, and see how well it still compresses. I feel like neural networks have a lot of potential in compression. Their whole job is finding patterns.
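To make the "quantize the weights" idea concrete, here is a minimal sketch of symmetric 8-bit quantization on a toy weight list. The function names and the sample weights are made up for illustration and have nothing to do with the codecs in the article; real toolchains quantize per layer or per channel and often retrain afterwards.

```python
# Hypothetical illustration: symmetric int8 quantization of a list of weights.
# Names and sample values are assumptions, not taken from any real codec.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in quantized]


def max_error(weights):
    """Largest absolute difference between original and restored weights."""
    quantized, scale = quantize_int8(weights)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(weights, restored))


weights = [0.82, -0.41, 0.05, -1.27, 0.33]
q, scale = quantize_int8(weights)
print(q, scale, max_error(weights))
```

Each weight now fits in one byte instead of four, and the rounding error per weight is at most half the scale. Whether a codec still compresses well after that loss of precision is exactly the open question.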
Did JPEG2000 really flounder? If your concept of it is a consumer-facing product meant as a direct replacement for JPEG, then I could see it being unsuccessful in that respect. However, JPEG2000 has found its place on the professional side of things.
Yes, I do mean broad- rather than niche adoption. I myself used J2K to archive film scans.
One problem is that without broad adoption, support even in niche cases is precarious; the ecosystem is smaller. That makes the codec not safe for archiving, only for distribution.
The strongest use case I see for this is streaming video, where the demand for compression is highest.
> That makes the codec not safe for archiving, only for distribution.
Could you explain what you mean by "not safe for archiving"? The standard is published and there are multiple implementations, some of which are open-source. There is no danger of it being a proprietary format with no publicly available specification.
Not the GP, but for archiving, you want to know that you'll be able to decode the files well into the future. If you adopt a format that's not well accepted and the code base gets dropped and not maintained so that in the future it is no longer able to be run on modern gear, your archive is worthless.
As a counter, J2K has been well established by the professional market even if your mom doesn't know anything about what it is. It has been standardized by the ISO, so it's not something that will be forgotten about. It's a good tool for the right job. It's also true that not all jobs will be the right ones for that tool.
I was not thinking of J2K as being problematic for archiving but these new neural codecs. My point being that performance is only one of the criteria used to evaluate a codec.
For archiving, I'd recommend having a wasm decompressor along with some reference output. Could also ship an image viewer as an html file with all the code embedded.
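A rough sketch of the "decoder plus reference output" idea, with `zlib` standing in for the archived codec (an assumption purely for illustration; the manifest layout and function names are also hypothetical). The archive keeps a checksum of known-good decoded output, so any future build or port of the decoder can be checked against it.

```python
import hashlib
import zlib

# Hypothetical illustration: zlib stands in for whatever decoder ships
# alongside the archive. The manifest layout is an assumption.

def make_manifest(compressed, decoder):
    """Record the checksum and length of a known-good decode."""
    decoded = decoder(compressed)
    return {
        "sha256": hashlib.sha256(decoded).hexdigest(),
        "length": len(decoded),
    }


def verify_decoder(compressed, decoder, manifest):
    """Check that a decoder still reproduces the reference output."""
    decoded = decoder(compressed)
    return (
        len(decoded) == manifest["length"]
        and hashlib.sha256(decoded).hexdigest() == manifest["sha256"]
    )


original = b"film scan pixels " * 64
archived = zlib.compress(original)
manifest = make_manifest(archived, zlib.decompress)
print(verify_decoder(archived, zlib.decompress, manifest))
```

An exact checksum only works if the decoder is bit-exact. A lossy codec whose output can differ slightly between platforms would need a tolerance-based comparison against the reference image instead.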
Why the need for all things to be browser based? Why introduce the performance hit for something that brings no compelling justification? What problem is this solution solving? Why can't things just be native workflows and not be shoveled into a browser?
Not the parent but one imagines that WASM could be a good target for decompressing or otherwise decoding less-adopted formats/protocols because WASM is fairly broadly-adopted and seems to be at least holding steady if not growing as an executable format: it seems unlikely that WASM disappears in the foreseeable future.
Truly standard ANSI C, along with a number of other implementation strategies (LLVM IR seems unlikely to be going anywhere), seems just as durable as WASM if not more, but there are applications where you might not want to need a C toolchain and WASM can be a fit there.
One example is IIUC some of the blockchain folks use WASM to do simultaneous rollout of iterations to consensus logic in distributed systems: everyone has to upgrade at the same time to stay part of the network.
Wasm is simple, well-defined, small enough that one person can implement the whole thing in a few weeks, and (unlike the JVM) is usable without its standard library (WASI).
LLVM isn't as simple: there's not really any such thing as target-independent LLVM IR, there are lots of very specific keywords with subtle behavioural effects on the code, and it's hard to read. I think LLVM is the only full implementation of LLVM IR. (PNaCl was a partial reimplementation, but it's dead now.)
ANSI C is a very complicated language and very hard to implement correctly. Once Linux switches to another language or we stop using Linux, C will go the way of Fortran.
Part of archiving information has always been format shifting. Never think you can store information, forget about it for a thousand years (or even five), and have it available later.
I think we probably agree about most things but I’ll nitpick here and there.
ANSI C is among the simpler languages to have serious adoption. It’s a bit tricky to use correctly because much of its simplicity derives from leaving a lot of the complexity burden on the author or maintainer, but the language specification is small enough in bytes to fit on a 3.5” floppy disk, and I think there are conforming implementations smaller than that!
You seem to be alluding to C getting replaced by Rust as that’s the only other language with so much as a device driver to its name in the Linux kernel. Linus is on the record recently saying that it will be decades before Rust has a serious share of the core: not being an active kernel contributor, I’m inclined to trust his forecast more than anyone else’s.
But Rust started at a complexity level comparable to where the C/C++ ecosystem ended up after 40 years of maintaining substantial backwards compatibility, and shows no signs of getting simpler. The few bright spots (like syntax for the Either monad) seem to be getting less rather than more popular, the bad habits it learned from C++ (forcing too much into the trait system and the macro mechanism) seem to have all the same appeal that template madness does to C++ hackers who don’t know e.g. Haskell well. And in spite of the fact that like 80% of my user land is written in Rust, I’m unaware of even a single project that folks can’t live without that’s married to Rust.
Rust is very cool, does some things very well, and it wouldn’t be hard to do a version of it without net-negative levels of opinionatedness about memory management, but speaking for myself I’m still watching Nim and V and Zig and Jai and a bunch of other things, because Rust takes after its C++ heritage more than its Haskell heritage, and it’s not entrenched enough in real industry to justify its swagger in places like HN.
The game is still on for what comes after C: Rust is in the lead, but it’s not the successor C deserves.
it's well past the considering stage. J2K is used more than people think even if we're not using it to spread cat memes across the interwebs. J2K is used in DCPs sent to movie theaters for digital projection. J2K is used for lossless masters of films. the Library of Congress uses it as well. this isn't even attempting to be an exhaustive list of uses, but it's not something being looked into. it's being used every day
But that's like saying it's difficult to drive your Formula 1 car to work every day. It's not meant for that, so it's not the car's fault. It's a niche thing built to satisfy the requirements of a niche need. I would suggest this is a "you're holding it wrong" type of situation that isn't laughable.
I think it is an interesting discussion and a learning experience (no pun intended). I think this is more of a stop on a research project than a proposal; I could be wrong.