The thing I'd be really interested in is how you deal with UI changes - I've never found a satisfactory way to test "is this ugly/confusing" other than letting a few users bang on it on a staging server.
The ultimate solution is to have business metrics drive your UI changes, usually in the form of an A/B test. Then you have a clear winner. This A/B test would be run separately from the rollout structure (and indeed, we do LOTS of A/B tests).
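To make "a clear winner" a bit more concrete, here is a minimal sketch of one common way to compare two variants on a conversion-style metric: a pooled two-proportion z-test. This is an illustration, not a description of our actual tooling; the function name and the sample numbers are made up for the example, and a real setup would also need to decide on sample size and stopping rules up front.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B (hypothetical helper).

    conv_* is the number of users who converted, n_* the number of users
    who saw that variant. Returns (z, two_sided_p_value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one user")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers: 100/1000 conversions on A, 150/1000 on B.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z means B converted better than A; whether the p-value is small enough to call it is a threshold you would pick before running the test.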
Sometimes that's not possible, either for a new feature or for content without a clear business metric to evaluate against. Either way, we often have someone manually test new UI so that we're not exposing users to something fundamentally broken. We usually do this by using the existing deploy system, but turning the frontend on only for QA users.
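Turning the frontend on only for QA users can be sketched roughly as a flag check like the one below. Everything here is hypothetical: the `User` type, the `is_qa` field, and the stage names are assumptions for illustration, not our real deploy system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user record; is_qa marks internal QA accounts.
    name: str
    is_qa: bool = False


# Hypothetical rollout stages for a single feature.
STAGE_OFF = "off"
STAGE_QA_ONLY = "qa_only"
STAGE_EVERYONE = "everyone"


def new_frontend_enabled(user, stage):
    """Decide whether this user should see the new frontend."""
    if stage == STAGE_OFF:
        return False
    if stage == STAGE_QA_ONLY:
        # Code is deployed, but only QA accounts see the new UI.
        return user.is_qa
    if stage == STAGE_EVERYONE:
        return True
    raise ValueError(f"unknown rollout stage: {stage!r}")


if __name__ == "__main__":
    tester = User("qa-tester", is_qa=True)
    customer = User("customer")
    print(new_frontend_enabled(tester, STAGE_QA_ONLY))    # True
    print(new_frontend_enabled(customer, STAGE_QA_ONLY))  # False
```

The point of the design is that the backend and the code ship through the normal deploy path, and only the visibility of the UI changes per user.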
In the end, you do what works and is cheap, and that's usually something slightly different for every project.