Hacker News

It should also be noted that Clojure sacrificed quite a bit to make this as efficient as possible.

“Persistent vectors” are certainly an interesting data structure: they strike a compromise between fast indexing and the ability to relatively quickly create a copy in which only one element changes. But it is a compromise, and indexing is made slower to allow for the latter. They also take up more memory on their own, though they can share structure with their copies.
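For intuition, here is a minimal sketch of the path-copying idea, in Python with a branch factor of 2 instead of Clojure's 32 (all names and the layout are illustrative, not Clojure's actual implementation): updating one element copies only the nodes on the path to it, and the new vector shares every untouched subtree with the old one.

```python
# Toy persistent vector: a complete trie with BRANCH-wide nodes.
BRANCH = 2          # Clojure uses 32; 2 keeps the trie easy to see
BITS = 1            # log2(BRANCH): bits of the index consumed per level

def build(items, depth):
    """Build a complete trie of the given depth (len(items) == BRANCH**depth)."""
    if depth == 0:
        return items[0]
    size = BRANCH ** (depth - 1)
    return [build(items[k*size:(k+1)*size], depth - 1) for k in range(BRANCH)]

def get(node, i, depth):
    """Walk one node per level: O(log n) pointer chases per lookup."""
    for level in range(depth - 1, -1, -1):
        node = node[(i >> (level * BITS)) & (BRANCH - 1)]
    return node

def assoc(node, i, val, depth):
    """Return a new trie with index i set to val, copying only the path."""
    if depth == 0:
        return val
    copy = list(node)                       # one shallow node copy per level
    slot = (i >> ((depth - 1) * BITS)) & (BRANCH - 1)
    copy[slot] = assoc(node[slot], i, val, depth - 1)
    return copy

v1 = build(list(range(8)), 3)   # the vector [0 1 2 3 4 5 6 7]
v2 = assoc(v1, 5, 99, 3)        # "change" one element
print(get(v2, 5, 3))            # 99
print(get(v1, 5, 3))            # 5    -- the original is untouched
print(v1[0] is v2[0])           # True -- the untouched half is shared
```

The point of the sketch is the cost model: an update copies only O(log n) nodes, but every read now pays O(log n) pointer hops instead of one.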

I will say that my ideal language contains them in the standard library alongside standard vectors that index in constant time.

Further, it should be noted that much of the performance talk assumes that accessing memory is truly random access; with the existence of CPU caches that assumption is not entirely accurate. Accessing contiguous rather than scattered memory is considerably cheaper in practice, so one also pays a price for the elements being scattered across memory.



Random access into a Clojure vector is going to need more memory lookups than a conventional contiguous array (Clojure's implementation branches 32 ways per node, so a lookup walks roughly log32(n) nodes; a handful even for large vectors).

But when you're indexing into the vector sequentially, the memory layout plays rather well with memory caching behavior, and most lookups are going to be in L1 cache, just like they would be in a conventional array.

So lookups are a bit more expensive, but not as much more expensive as one might imagine.


How? I don't see how that's possible.

The actual data of a PersistentVector is not in contiguous memory but scattered wherever the JVM wills it, and on top of that, finding the address to retrieve requires an algorithm that runs in logarithmic time with respect to the length of the vector, as opposed to a constant-time one.
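Concretely, the lookup walks a 32-wide trie by peeling off 5 bits of the index per level, so the number of hops grows as log32(n). A quick sketch of the slot arithmetic (the helper name is mine):

```python
def path(i, depth):
    """Slots visited at each level of a 32-wide trie for index i (root first)."""
    return [(i >> (5 * level)) & 0x1F for level in range(depth - 1, -1, -1)]

# A million-element vector needs a trie only 4 levels deep:
print(path(1_000_000, 4))   # [30, 16, 18, 0]
```

So "logarithmic" here is a very flat logarithm: four node hops cover a million elements, six cover a billion.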

How can most lookups end up in L1 cache if an element that is 32 indices removed is statistically likely to be arbitrarily far removed in memory?

Of course, all of that is not that material to begin with, given that most elements will be pointers anyway, so the actual objects to which they point will already be arbitrarily scattered and this simply adds one more pointer indirection; but for unboxed types such as integers it does play a factor.


So I don't recall what size of chunks Clojure's implementation uses, but I'll assume 64-bit indices with 16-word chunks, because I want concrete numbers to work with.

    (loop [i 0]
      (when (< i (count my-vector))
        (println (get my-vector i))
        (recur (inc i))))
Assuming no part of my-vector is cached at first, the first iteration needs to make a full 16 round trips to main memory, which is quite bad. But on the next iteration all of that is cached and we don't hit main memory at all; the same holds until i=16, which requires one round trip. Then at i=16² we need to hit main memory twice, and so on.

No doubt this is quite a bit worse than having everything nicely laid out sequentially in memory, but it's not as bad as you're describing.

> Of course, all of that is not that material to begin with given that most elements will be pointers to begin with

I guess this is sort of true. If you're doing random lookups, then using a persistent vector instead of an array list means 17 trips to main memory instead of just 1, so it's not totally inconsequential.

But I think (hope?) that modern JVMs can optimize collections of small immutable objects so that they're not represented as pointers into the heap. Ideally ArrayList<Integer> x would get represented as int x[], and not as an array of pointers to boxed Integers, at least at the most optimizing JIT level; as far as I know, current JVMs don't actually do this yet, though.


I think you are talking about arbitrary stride sequential access, and the comment you were responding to meant stride-1 access. But conventional arrays also fare poorly with arbitrary stride access, though probably they have an advantage with small strides close to the Clojure persistent vector chunk size.



