They're relying on the fact that most browsers lack per-domain cookie controls to force Google Drive users to allow third-party cookies, knowing full well the majority won't remember (or bother) to disable them afterward.
The explanation in the other comment is also correct. When you go from drive.google.com to googleusercontent.com to download a file, this historically worked by using third-party cookies to verify that you were authorized to download the file. When Safari dropped support for third-party cookies, they added a new flow which uses link decoration instead, but they only use this flow when they think the browser doesn't support third-party cookies. Their "does this browser do third-party cookies" logic isn't very good, and doesn't handle Chrome configured to block third-party cookies.
I'm not sure why they don't use the new flow for everyone. My guess is that it's less secure? Maybe it's that if the link they generate is shared, it gives access beyond what the original owner chose to share?
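For the curious, here's a rough sketch of what a link-decoration flow like this might look like. The URL shape, parameter names, and signing scheme are all made up for illustration; this is not Google's actual implementation:

```ts
// Hypothetical link-decoration flow: instead of relying on a third-party
// cookie at download time, the first-party site mints a short-lived signed
// token and embeds it in the download URL itself.
import { createHmac } from "node:crypto";

const SECRET = process.env.LINK_SIGNING_SECRET ?? "dev-only-secret";

function decorateDownloadUrl(fileId: string, userId: string, ttlSeconds = 300): string {
  const expires = Math.floor(Date.now() / 1000) + ttlSeconds;
  const payload = `${fileId}:${userId}:${expires}`;
  const sig = createHmac("sha256", SECRET).update(payload).digest("hex");
  // The credential rides along in the URL, so no cookie has to cross domains.
  return `https://content.example.com/download?id=${fileId}&user=${userId}` +
         `&expires=${expires}&sig=${sig}`;
}

// The content domain recomputes the signature and checks the expiry.
function verifyDecoratedUrl(params: URLSearchParams): boolean {
  const payload = `${params.get("id")}:${params.get("user")}:${params.get("expires")}`;
  const expected = createHmac("sha256", SECRET).update(payload).digest("hex");
  return params.get("sig") === expected &&
         Number(params.get("expires")) > Math.floor(Date.now() / 1000);
}
```

Note that a decorated URL is a bearer credential: anyone holding it before it expires can download the file, which is exactly the over-sharing concern raised above.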
>I'm not sure why they don't use the new flow for everyone.
It's less secure, slower (more round trips), and more server-side intensive; it was likely considered a hack. Effectively it does the same thing a cookie would. Third-party cookies are not a bad thing per se; it's just that they have been abused to hell and back, which is what gives them their reputation.
I don't think it has to be. They could make the request as an ajax request with an Authorization header. Of course, that makes the frontend more complex, as it has to do some gymnastics to treat the response as a download.
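A sketch of those gymnastics (the endpoint, token, and CORS setup are assumptions, not Google's actual API):

```ts
// Fetch the file with a first-party Authorization header (no third-party
// cookie involved), then hand the bytes to the browser as a download.
// Assumes the content endpoint allows CORS for this origin.
async function downloadWithAuthHeader(fileId: string, accessToken: string): Promise<void> {
  const res = await fetch(`https://content.example.com/files/${fileId}`, {
    headers: { Authorization: `Bearer ${accessToken}` },
  });
  if (!res.ok) throw new Error(`download failed: ${res.status}`);

  // Materialize the body as a Blob and trigger a client-side "Save as".
  const blob = await res.blob();
  const url = URL.createObjectURL(blob);
  const a = document.createElement("a");
  a.href = url;
  a.download = fileId; // real code would parse the filename from Content-Disposition
  a.click();
  URL.revokeObjectURL(url);
}
```

The cost is that the whole file gets buffered in memory before the save dialog appears, which is part of why a plain navigation to a cookie- or token-authorized URL is the more common design.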
I don't fully understand why they need to use a separate domain for this at all. There is infinite URL space available on drive.google.com; Google could just use a proxy behind the scenes to route those requests to whatever load balancer normally services googleusercontent.com. That would solve the issue with third party cookies entirely... as well as several other issues, like potentially confusing users with their own files coming from a domain that isn't drive.google.com.
It's not about url space or load balancing, but security. You do not want to serve user content from your primary domain:
* Even if you serve it with the correct content type and no-sniff headers, some browsers can be tricked into running JS, and then you have XSS (see the header sketch after this list).
* Even in modern browsers it's defense in depth, in case you mess up your configuration or they have a bug.
* If malware gets past your scanners then your primary domain can get flagged.
* It looks like it's coming from a trusted domain: a PDF that claims to be from Google Drive and where the URL bar says drive.google.com looks legit in a way that one where the bar says googleusercontent.com does not.
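As referenced in the first bullet, the hardening usually amounts to a handful of response headers on the user-content domain. A minimal sketch; the path and port are illustrative, and these are the generic best-practice headers, not Google's exact configuration:

```ts
import { createServer } from "node:http";

createServer((req, res) => {
  res.setHeader("Content-Type", "application/octet-stream"); // never text/html
  res.setHeader("X-Content-Type-Options", "nosniff");        // forbid MIME sniffing
  res.setHeader("Content-Disposition", "attachment");        // download, don't render
  res.setHeader("Content-Security-Policy", "sandbox");       // no scripts even if rendered
  res.end("file bytes would be streamed here");
}).listen(8080);
```

The separate domain is the backstop for the day one of these headers is dropped or a browser mishandles one.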
I guess that’s all fair, but to be clear, I’m not proposing to host public-facing content. Only private content that can be viewed by authorized users who have the right first party cookie to allow it.
Public facing content could easily be hosted on the other domain for all of the reasons you listed, and third party cookies won’t matter then.
I appreciate you outlining the arguments. I know some other sites like Dropbox do the exact same thing with a user content domain.
It would still say “drive.google.com”, not “google.com”, and if that isn’t enough of a hint for the target, googleusercontent.com won’t be either. In fact, people have heard of Google Drive; they know it hosts other users' files, so a Drive link doesn't mean the content is from Google. "googleusercontent" could be "Google content intended for users" for all someone knows.
So, I disagree here. The well-known name of Google Drive as a user file sharing service is much more meaningful as a warning at a glance.
There are also mitigations that could be put in place for file sharing, like requiring the user to have accepted a file sharing request from that account before (via Google sent notification email) for a direct link to actually work. This would be a great thing to have in place regardless of domain, for defense in depth. Unsolicited links to private files arguably should not work.
Obviously people may have different opinions on this stuff.
> There are also mitigations that could be put in place for file sharing, like requiring the user to have accepted a file sharing request from that account before (via Google sent notification email) for a direct link to actually work.
That sounds pretty annoying? I upload something, give access to coder543, and ping you a link in Slack or whatever tool we use. But you can't open it until you go into your email and click through?
Maybe my phrasing was awkward, but I said you would only have to do this once for a given account. So, if I've never accepted a share from you before, your links won't work. When you share something with me for the first time, I would have to accept it via a Google-sent email containing a link that only Google knows (not something that can be sent via Slack), and then all your future share links would work for me on Slack. The error page denying access could even indicate that the user should check their email for additional verification.
You can think of it as the equivalent of a friend request. "This person tried to share a file with you. Do you know this person? Are you sure you want to receive files from them?"
This is not some outlandish solution. This should not be "pretty annoying". Based on my own experience, most people would go months or years between seeing these emails, since people tend to share files with (and receive files from) the same people over and over.
Moreover, in a work context, you would probably be sharing links to files that are on a shared google drive that I have equal access to already, so that would not require additional verification. It's not an unsolicited link to someone else's Google Drive... it's a link to a drive that I already have read/write access to.
Do people want to have friend requests in Google? If I wanted to share a file to your Google account, would you like to trust the future shares automatically as well? It doesn't seem like the superior alternative to just using 3rd party cookies—other than that it works if 3rd party cookies are disabled.
It also provides a new attack vector (your friends) if such people are able to create more credible documents (e.g. due to an attack, not due to a deliberate intent to mislead you).
The alternative is trusting all shared links, which is currently what Google does. Third-party cookies have nothing to do with it. Having some form of revocable authorization to be able to open links from a person is superior to "all sketchy links working instantly."
If you get a Google Drive link from someone claiming to be a friend you know, you could download malware right now, because Google trusts all of these links equally. With this mitigation in place, you would be stopped: “hey, this isn’t someone you’ve ever received files from before.” Because they aren’t actually your friend, and they aren’t using your friend’s account, the one you’ve received files from before. It would add a serious obstacle to a lot of these impersonation attacks, and I see impersonation attacks all the time.
My comment a while ago said that this mitigation would be nice regardless of whether Google kept using their separate domain or not.
It absolutely doesn’t provide a new attack vector. It strictly serves to reduce the attack surface, not to increase it.
Back in the day, you could upload, for example, a specially-crafted HTML file with your own malicious JS code to, for example, an image hosting service and basically use them to serve your attack upload. You could more or less abuse any website upload form to host any file that you wanted. It was bad.
Browsers have drastically improved but why risk it? Using a separate domain makes a lot of scary scenarios completely impossible.
Using a separate domain for user generated content is usually done for security reasons. For example, if a user-generated chunk of JavaScript was executed from drive.google.com, then it could potentially gain access to your drive.google.com, or maybe even *.google.com, authentication cookies. Scripts running on an unrelated domain have no such access.
This usually isn't the only thing protecting against this, and is instead used as an additional safeguard.
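A toy model of the cookie-scoping rule at work here (just the domain-match part, not a full RFC 6265 matcher):

```ts
// A cookie set with Domain=.google.com is attached to requests to any
// *.google.com host, but never to googleusercontent.com.
function cookieSentTo(cookieDomain: string, requestHost: string): boolean {
  const d = cookieDomain.replace(/^\./, "");
  return requestHost === d || requestHost.endsWith("." + d);
}

cookieSentTo(".google.com", "drive.google.com");      // true
cookieSentTo(".google.com", "mail.google.com");       // true
cookieSentTo(".google.com", "googleusercontent.com"); // false: no auth to steal
```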
I believe Google's use of this practice also predates widespread support of Content Security Policy, which isn't to say that this is a useless practice, but perhaps it isn't as important as it used to be.
Browsers tend to tag any files they download with information about the domain the file came from, so it's also relevant in that case. Windows and OS X will pop up a warning when opening untrusted files, so whether the user sees 'google.com' or not could be important.
> I believe Google's use of this practice also predates widespread support of Content Security Policy, which isn't to say that this is a useless practice, but perhaps it isn't as important as it used to be.
Perhaps not, but I still think it's quite worthwhile to defend against CSP-related browser bugs, or even a botched infra change on Google's side that accidentally drops the CSP header.
Yes, that's exactly what I mean by it not being useless. If everything is working perfectly, then it perhaps ends up not doing anything, but it's good to have another line of defense for when things go wrong. It's the safety net for when someone messes up CSP.
> I believe Google's use of this practice also predates widespread support of Content Security Policy, which isn't to say that this is a useless practice, but perhaps it isn't as important as it used to be.
I'm no fan of Google, but I have an inkling it was set up like this before Safari decided to block 3rd party cookies. As for why they didn't immediately consolidate into one domain: Google operates at a scale you probably can't even comprehend.
Embedding the authentication in the link is both a security risk and more complex than simply relying on the cookie.
They could have opted to do what Twitter does: leave everything accessible wide open, even if the file was created in a private context such as Twitter DMs.
Locking down access to static files that you would ideally like to serve and cache straight from storage is a tricky thing with regard to performance, security, and maintenance complexity.
I don't really agree (and I'm happy to bash on Google).
This is basically the poster child for a case when someone should be using 3rd party cookies: A single entity manages multiple domains and shares cookie auth across them.
It's not like the other flow is somehow making you less identifiable - they're literally just passing the same information in a more roundabout, less usable manner.
I genuinely think the current approach - blacklisting everything, with essentially no recourse to whitelist cookies going to a specific alternate domain at a fine-grained level - is fundamentally web-hostile.
The web worked because you could link to 3rd parties. We're currently throwing the baby out with the bath water because our government is dysfunctional and unable to regulate tech privacy.
> This is basically the poster child for a case when someone should be using 3rd party cookies: A single entity manages multiple domains and shares cookie auth across them.
If everyone used 3rd party cookies the way you're describing, there'd be no issue with users enabling them. Instead, they're frequently used to track users across domains, so the alternate flow used for Safari should be the pragmatic option used for everyone.
You're right to complain about how we're basically unable to use an otherwise-useful feature because of bad actors. It's a signal that core web technologies need to be designed with potential abuses in mind, first and foremost.
> It's a signal that core web technologies need to be designed with potential abuses in mind, first and foremost.
No. This is how absolutely everyone ends up with the shittiest version of everything.
We need recourse and a general legal expectation that you DON'T abuse your users.
Honestly - that attitude is exactly the problem: You're letting bad actors literally ruin the web, because the US government is unable to pull its fucking mouth out of the feed trough (or honestly do much of anything at all, right now).
We don't take that stance for literally ANY other industry: You can buy a gun, but guns can kill people. You can buy a car, but cars can crash. You can get a dog, and that dog can bite people.
The answer is not "Ban it because it might be bad". The answer is to properly set expectations that abuse will be met with heavy penalties.
This is not fucking Minority Report, and we shouldn't be trying to "precognition" all the bad out of the world. We should address it head on, and fucking burn the bad actors to the ground.
It is possible the US government lacks the reach to do what you've described, given how much organized crime is centered in other nations.
But I agree with you overall... Much of the web's concept of privacy and security is baked in with the assumption that it must be technologically enforced because it can't be legally enforced. Change that math and you change the model.
While I agree that we direly need privacy legislation to stop openly chartered surveillance companies from tracking us through whatever means, your position doesn't work for computer security in general. The only way "accountability" works for computer security is if every node on the network carried an inescapable real world identity that is responsible for its network traffic, which would be much more of a draconian regime than you are arguing against.
> A single entity manages multiple domains and shares cookie auth across them.
The issue is we (the users) really want a more nuanced concept of "third party": something like "different domain that's controlled by the first party."
Unfortunately, any declaration that relies on the first party will immediately be abused to hell ("All these tracking domains are controlled by me, so plz allow them!"), and we'd be right back here.
It feels like a problem that needs something like DNS (query & response), but probably just needs a fundamental rethink of what a cookie is.
> The issue is we (the users) really want a more nuanced concept of "third party": something like "different domain that's controlled by the first party."
Neat! It feels like cryptographic attestation by the child/secondary site would be less subject to abuse.
I.e. proving they have access to the same private key used to sign the parent, which would by definition not be something the parent would willingly share with random third parties
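Something like the following, perhaps. This is purely a sketch of the idea, since no such browser mechanism exists today, and the well-known path is invented:

```ts
// The secondary domain publishes a statement signed with the first party's
// private key (say, at /.well-known/first-party.sig), and the browser checks
// it against a public key it already associates with the first party.
import { generateKeyPairSync, sign, verify } from "node:crypto";

const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const claim = Buffer.from("googleusercontent.com is controlled by google.com");
const signature = sign(null, claim, privateKey); // key held only by the first party

// Browser side: verify before treating the two domains as one party.
const sameParty = verify(null, claim, publicKey, signature);
console.log(sameParty); // true -> cookies could be treated as first-party
```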
Some things are not solved in the appropriate manner through a technological solution.
They are misuses (and abuses) of a perfectly acceptable system. Don't undo the system, address the misuse.
Take your example:
>Unfortunately, any declaration that relies on the first party will immediately be abused to hell ("All these tracking domains are controlled by me, so plz allow them!"), and we'd be right back here.
The only reason this is the case is because this misuse has zero consequences.
Make them declare their domains. If they choose to include tracking domains, fine the ever-loving shit out of them. Not the ".05% of yearly profit" bullshit - I'm talking 200% of daily revenue for the top controlling company for every day that domain was on the list after it was declared a bad actor. If the company can't pay? Fucking nationalize them, remove the tracking domain, sell it to the highest bidder.*
Watch how fucking fast these companies will scramble to fix the problem when the stakes are real.
When the stakes are trivial - it doesn't matter what technology you try to put in place to block this, they will just work around it.
* I understand this is ridiculously extreme, but I'm done playing with these fucks. We've had the gloves on for the last 20 years, it's time they come off.
What is the definition of what is "really" "my" domain?
If I put a custom domain on an S3/cloudfront that's part of my system, so it appears as `storage.mysystem.com`, is there something nefarious going on?
Who decides what is allowable declaration of a domain to be mine? And who enforces this with fines? Is there currently any way to fine someone on the internet for violating a rule? What would you imagine this looking like, an organization that has the ability to fine people globally, and enforce the payment of those fines (by... taking domains back I guess?), and who would control it? (and who would pay for it, how?) It's a lot of global legal infrastructure we don't really have now, I think. It would be a pretty huge step.
> Who decides what is allowable declaration of a domain to be mine?
Basically, there is a list included in all browsers: https://wiki.mozilla.org/Public_Suffix_List. That's why you.github.io can't read other github.io cookies, but if you make your own domain, you can share cookies between a.example.com and b.example.com. (Also why example.com can't read .com cookies.)
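You can poke at this with the `psl` npm package, which bundles that same list (assuming it behaves as documented):

```ts
import psl from "psl";

// The registrable domain ("eTLD+1") decides the cookie-sharing boundary.
psl.get("a.example.com"); // "example.com" -> a. and b.example.com share cookies
psl.get("you.github.io"); // "you.github.io" -> github.io is on the list,
                          //    so every user gets an isolated cookie scope
psl.get("example.com");   // "example.com" -> and nobody can set .com cookies
```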
> Is there currently any way to fine someone on the internet for violating a rule?
I understood that the conversation was about attesting that, for instance, googleusercontent.com was owned by the same entity as google.com, so the two could share cookies.
A) I don't see any way that the list of public suffixes included in browsers makes it possible to decide that google.com really owns googleusercontent.com. If it did, we would already be there and wouldn't be discussing this.
B) Who do you think makes the public suffix list in the first place, where do you think it comes from exactly?
> This is basically the poster child for a case when someone should be using 3rd party cookies
It is kinda funny that Google, among others, is the reason why we can't have 3rd party cookies. Now they have a service with a legitimate use case, can't rely on 3rd party cookies being available, and have to resort to a workaround.
The reason he's thinking of is that they want to annoy people into enabling 3rd party cookies for tracking purposes, with security/performance/etc. as the excuse.
I've seen the efficiency/performance claim a bunch of times now. How is using a cookie supposedly more efficient than, say, the same data embedded in the requested URL or transmitted as form data? The server still has to check the auth, no matter what part of the request it extracts the auth data from. Or am I missing something here?
As for security, yeah, there are some good reasons for not embedding auth info in the link (though one could still POST the same data instead without a third-party cookie, etc.), as well as for having a dedicated domain for user content.
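To make the efficiency point concrete: server side, the credential is just a string to be extracted and verified, wherever it rides. A sketch, where the param names and the `validateToken` stub are hypothetical:

```ts
import { createServer } from "node:http";

const validateToken = (token: string): boolean => token.length > 0; // stub: same work either way

createServer((req, res) => {
  const url = new URL(req.url ?? "/", "http://localhost");
  // Credential from a Cookie header...
  const fromCookie = /(?:^|;\s*)auth=([^;]+)/.exec(req.headers.cookie ?? "")?.[1];
  // ...or from the decorated URL: either way, one string, one auth check.
  const token = fromCookie ?? url.searchParams.get("auth");
  res.statusCode = token && validateToken(token) ? 200 : 403;
  res.end();
}).listen(8080);
```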