I'm not sure anyone's 100% sure, but from what I've read the leading hypothesis is that white light spreads in the retina and is more sensitive to lens distortions (and the pupil is larger when looking at a black screen, so there's even more distortion). So white letters essentially smudge into each other in the eye and take more work to decipher, while black letters don't. (White light surrounding black letters could make them "shrink" through the same effect, but this doesn't seem to harm legibility.)
That's also why this matters more for body text than for display text, which is larger and thus less susceptible to the problem.