Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Clever, but slightly scary. They tag their object/string/integer/floats by storing them all in 64 bits, top 32 bits is tag, noting that there are vast swathes of values in the IEEE float format that are all NaN and the hardware and libraries only use one.

But woe to the poor beggar that uses the wrong NaN.

And fear the nasty man that slips the wrong NaN in through a tricky mechanism and turns it to a pointer.

(For similar problems, I was always fond of the mechanism in the Mindy Dylan interpreter. There all pointers had their low bit set, and integers had a low bit of zero. The nifty part being that integer math still worked without shifting and for pointers you are generally indexing to an offset anyway so take the one off there and it is free. Mercifully I've forgotten why I had to read enough of their source code to discover that.)



This representation reminded me a lot of the one used in Gwydion Dylan. That also used a two-word representation - mostly, if the compiler could statically infer the type it'd leave off the tag - with one word for the type tag and one word for the value.

IIRC, they ran into some very serious problems with threading. If you use a two-word representation, you can't atomically set a value - there's always the chance that the thread will be pre-empted in between setting the value and the tag and some other thread will be able to run. This of course is disaster, because you'll get a value interpreted as a type that it's not.

I wonder how Mozilla plans to handle this. Is their JS engine inherently single-threaded only? Fine-grained locking seems impractical, because you'd need locks around every assignment. Would they do something like Python's GIL?

What's the reason behind this representation anyway? It was an interesting experiment when Gwydion did it, but I don't really see the practical benefit.


If you use a two-word representation, you can't atomically set a value.

At least on 32-bit x86 this isn't true. CMPXCHG8B can atomically write 64 bits of data and has been supported by CPUs for decades. The x86-64 equivalent, CMPXCHG16B, for atomic 128-bit writes is more problematic, as early Athlon64/Opteron CPUs did not support it. But even on 64-bit systems, a 64-bit encoding should be good enough. Technically, the address space can comprise 48 bits, but even if you steal a couple more than 16 for the NaN coding you should have more than enough for a few decades.

PowerPC, ARM, etc. have different atomicity guarantees though, which make this sort of thing more problematic. However, as far as I know, even Web Worker threads don't share JS context state with the main thread, so this is probably a non-issue.

I don't really see the practical benefit.

Probably to avoid cache misses.


> If you use a two-word representation, you can't atomically set a value

On 32-bit architectures. 64-bits architectures have atomic XCHG and CMPXCHG operations.

That being said, the plan for Jaegermonkey is to upgrade jsvals from 64 to 128 bits so that's not the final solution. However...

> Is their JS engine inherently single-threaded only?

Yes, for certain values of 'yes'. Each script runs in its own context. There can be many scripts and many contexts but each context is only executed by a single thread.


Most modern 32-bit architectures support 64-bit compare-and-swap operations: http://en.wikipedia.org/wiki/Compare_and_swap


JavaScript will never support threads in the usual sense, as long as /be has his say: http://weblogs.mozillazine.org/roadmap/archives/2007/02/thre...

So JS won't need fine-grained locking in the foreseeable future.


"A requirement for JS3 (along with hygienic macros)"

Did anything ever come of hygienic macros for JavaScript? (first time I heard that)


The VM controls the creation and manipulation of doubles, so beggars and nasties do not have the ability to create a non-canonical NaN.


There is a foreign function interface floating around in Mozilla land. js-ctypes. I have no idea how it relates to this javascript engine, but it tantalizes.


js-ctypes goes through the JSAPI, so the JS engine can still guard against rogue NaNs.


You have to shift on multiplication and division.


Another way is to use bibop memory organization (Big Bag Of Pages).

http://www.google.com/search?q=bibop+memory+organization+(Bi...


But BiBOP (aside from being hard to implement on modern platforms) does not get you direct fixnums (or flonums in this case).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: