
What do you think won't show up in a profiler with GC? With the right profiler I can see the entire heap as well as where every object was allocated, and all those deallocations are basically free. With a modern collector like ZGC, max pause times for collections are now sub-millisecond. It's also easy to measure the total GC cost in the profiler. The modern GC story is extremely good. The biggest issue with GC, in my opinion, is that it uses more memory than you strictly need.


It won't show what we're talking about in this thread: the deallocation of a large number of objects when they all lose their last root reference at once.
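
To make the scenario concrete, here is a minimal sketch of that cascade under reference counting. It assumes CPython, whose primary memory manager is reference counting, and the names `Node` and `release_and_count` are made up for illustration. Dropping the one root reference frees every object right there, in the caller's stack frame, which is the kind of cost a profiler can attribute to a call site.

```python
class Node:
    """Hypothetical object type; counts how many instances have been freed."""
    freed = 0

    def __del__(self):
        # Under CPython this runs as soon as the refcount reaches zero.
        Node.freed += 1


def release_and_count(n):
    """Build n objects under one root, drop the root, report frees before/after."""
    Node.freed = 0
    root = [Node() for _ in range(n)]
    before = Node.freed
    # Last reference to the list: CPython walks it and frees each Node now,
    # so this single statement does O(n) work.
    del root
    return before, Node.freed


if __name__ == "__main__":
    print(release_and_count(100_000))
```

On other Python implementations with a tracing collector, the finalizers may run later or not at all before exit, so the counts are only meaningful on CPython.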


It absolutely will show them, because they cost nothing. The profiler will show 0, which is how much deallocations cost with GC.


Magic.

In rc/arc you can achieve the same by making release a no-op.


I don't know what you mean by "making release a no-op", but with reference counting it inherently takes O(N) time to deallocate an object containing N references, and O(M) time to deallocate M objects containing no references. With a generational copying collector, it takes O(1) time to deallocate M unreferenced objects each containing N references in the nursery — or rather it takes O(P+Q) time, where P is the number of objects that don't get deallocated and Q is the number of root references that need to be scanned, the ones in the stack plus the cards marked by your write barrier or whatever.
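
A toy sketch of the nursery argument, assuming a Cheney-style copying pass; `Obj`, `minor_collect`, and `demo` are hypothetical names, and the model leaves out write barriers, card marking, and real memory layout. It only counts how many objects the collector touches, which depends on the survivors and roots rather than on the garbage.

```python
class Obj:
    """Toy heap object: a list of references plus a forwarding pointer."""

    def __init__(self, refs=()):
        self.refs = list(refs)
        self.forward = None  # set once the object has been copied to to-space


def minor_collect(roots):
    """Copy everything reachable from roots; return (new_roots, objects_copied)."""
    to_space = []

    def copy(obj):
        if obj.forward is None:
            # The copy's refs still point into from-space; the scan loop fixes them.
            obj.forward = Obj(obj.refs)
            to_space.append(obj.forward)
        return obj.forward

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):  # Cheney scan: to_space doubles as the work queue
        survivor = to_space[scan]
        survivor.refs = [copy(child) for child in survivor.refs]
        scan += 1
    return new_roots, len(to_space)


def demo(m_garbage, n_refs, p_live):
    """Build m_garbage dead objects and a live chain of p_live; return objects copied."""
    garbage = [Obj() for _ in range(m_garbage)]
    for g in garbage:
        g.refs = garbage[:n_refs]  # garbage referencing garbage, never reached
    head = Obj()
    for _ in range(p_live - 1):
        head = Obj([head])
    _, copied = minor_collect([head])
    return copied


if __name__ == "__main__":
    print(demo(m_garbage=10_000, n_refs=8, p_live=3))
```

In a real collector the from-space would then be reclaimed by resetting an allocation pointer. Here Python's own memory management cleans up the toy objects, so the sketch measures only the tracing work, not actual reclamation cost.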

You can't achieve that with reference counting, with malloc/free, or with a non-copying tracing collector. You can achieve it with regions.

On the other hand, you may not need to. Allocating and initializing those objects took O(MN) time, so your program can't get an asymptotic speedup by deallocating them in O(1) time instead of O(MN) time. So it really depends on the constant factors, which nowadays depend strongly on things like cache miss rates and cache line sharing between cores.


Allocation in a region is just a pointer update, which is quite cheap, and deallocation is constant time. You will get a speedup if you can replace a large number of allocations with an arena. In GC languages you don't have the luxury of specifying custom allocators for selected parts of the program. But the main argument was that a tracing GC hides all of this from you, so you can't profile your program and explicitly see which call stacks contribute to large deallocations. Just because the GC hides the cost, defers it, and spreads it over time doesn't mean it's zero. On the contrary, tracing will have more overhead. In the vast majority of programs the power consumption overhead is not relevant though; if it's traded for programming ergonomics, it's a win.
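
For readers who haven't seen one, a minimal sketch of a bump-pointer arena, modeled over a `bytearray`. The class name `Arena` and the 8-byte default alignment are assumptions for illustration; a real arena would live in a language with manual memory control. Allocation is an offset update and freeing everything is a single assignment.

```python
class Arena:
    """Hypothetical bump allocator over a fixed-size buffer."""

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.top = 0  # offset of the next free byte

    def alloc(self, size, align=8):
        """Reserve size bytes; align must be a power of two."""
        start = (self.top + align - 1) & ~(align - 1)  # round up to alignment
        end = start + size
        if end > len(self.buf):
            raise MemoryError("arena exhausted")
        self.top = end  # the whole allocation cost: one pointer update
        return memoryview(self.buf)[start:end]

    def reset(self):
        """Release every allocation at once in O(1)."""
        self.top = 0


if __name__ == "__main__":
    arena = Arena(1024)
    a = arena.alloc(5)
    b = arena.alloc(5)
    print(arena.top)  # second block starts at the next 8-byte boundary
    arena.reset()
    print(arena.top)
```

Views handed out before `reset` alias whatever is allocated afterwards, which mirrors the use-after-free hazard of real arenas.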



