
I qualified my statement with an "often" for a reason ;)

But looking back at the article, it seems like the author is complaining that the compiler used an unaligned move instead of an aligned one, but based on the code they wrote, this seems necessary. The compiler has no knowledge of the data's alignment (how would it know that the data is always aligned without being told?), so it emits conservative instructions.
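For what it's worth, here is a minimal sketch of what "telling it" could look like, assuming GCC or Clang. `__builtin_assume_aligned` is a real builtin in both; the function name `sum_aligned16` and the 16-byte figure are made up for illustration and are not from the article.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums n 32-bit words.
   The caller promises that `data` is 16-byte aligned. If that promise
   is broken, the behavior is undefined, so this is a contract with the
   compiler, not a runtime check. */
uint32_t sum_aligned16(const uint32_t *data, size_t n)
{
    /* Returns the same pointer, annotated with the alignment promise. */
    const uint32_t *p = __builtin_assume_aligned(data, 16);
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}
```

With the promise in place the compiler is allowed (not required) to pick aligned vector moves. Without it, it has to stay conservative.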



Not really - the initial implementation (without memcpy) resulted in an aligned move, and hence the crash. The memcpy implementation results in an SSE-based unaligned instruction that does the right thing, but is slower than not using SSE in the specific case the author cares about, i.e. where the loop will rarely trigger and will basically never go through more than three iterations. There's no real way to hint to the compiler that this is the case, so the author chooses to disable the use of SSE in that function on architectures that support it. That way they can use memcpy() (for correctness) and still get the hoped-for code generation.
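To make the memcpy point concrete, a rough sketch of the pattern (not the article's actual code; `load_u64` and `store_u64` are names invented here). Dereferencing a cast pointer asserts to the compiler that the address is suitably aligned, while memcpy makes no such assertion:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a 64-bit value from any byte address.
   memcpy carries no alignment assumption, so the compiler has to emit
   a load that tolerates misalignment. */
uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: writes a 64-bit value to any byte address. */
void store_u64(unsigned char *p, uint64_t v)
{
    memcpy(p, &v, sizeof v);
}

/* The risky shape, for contrast (undefined behavior if p is misaligned):
       uint64_t v = *(const uint64_t *)p;
   Here the compiler may assume alignment and use an aligned move. */
```

Compilers typically inline a fixed-size memcpy like this into a single unaligned move, so correctness doesn't cost a function call. As for the SSE part, GCC does have a per-function `target` attribute that can switch instruction sets on or off, but whether that is the exact mechanism the author used isn't something I can confirm from this thread.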



