I don't understand enough about the ad business to answer this myself. If there's a legitimate reason to allow 3p scripts to run code - it would seem like creating a domain specific language that Google safely translates into JS would be so much better. Allowing 3Ps to run arbitrary JS just seems so shockingly wrong.
No amount of manual auditing can catch malicious code. It's way too complex for a human to parse.
Is there a legitimate business need that anyone's aware of to have code run in Ads? If so, why not use a DSL?
I used to work for https://www.interpolls.com/ in 2007, so the tech may have changed quite a bit, but here's how the business worked then:
We were allowed 30kb of JS file to load which could (depending on the ad network) serve ~300kb of a Flash SWF file. We ran Cold Fusion hooks in the SWF to radio home to our JS file to trigger 1x1 pixels for 3rd party trackers. We scraped our raw Akamai HTTP request logs after the fact on a CRON job to create our reporting system. There was a small cluster of FreeBSD servers that crunched the HTTP request logs. Every mouse-over / click was registered via these pixel HTTP GETs. We had timers too that would trigger every few seconds. The reporting system probably had a 2-3 hour delay due to the immense amount of traffic we received.
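For anyone curious what "scrape the raw request logs for pixel hits" looks like in practice, here is a minimal Python sketch. The log format (Common Log Format), the `/px.gif` path and the `ad` / `ev` query parameters are all assumptions made up for the example, not what the real system used.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed Common Log Format request field; real CDN logs will differ.
REQUEST_RE = re.compile(r'"GET (?P<url>\S+) HTTP/[\d.]+" (?P<status>\d{3})')


def count_pixel_events(lines, pixel_path="/px.gif"):
    """Count (ad_id, event) pairs from successful GETs of the tracking pixel."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None or match.group("status") != "200":
            continue  # not a GET, or the pixel was not served
        url = urlsplit(match.group("url"))
        if url.path != pixel_path:
            continue  # some other asset, not a tracking hit
        params = parse_qs(url.query)
        ad_id = params.get("ad", ["unknown"])[0]   # hypothetical parameter
        event = params.get("ev", ["unknown"])[0]   # hypothetical parameter
        counts[(ad_id, event)] += 1
    return counts


sample_log = [
    '1.2.3.4 - - [01/Jan/2007:10:00:00 +0000] "GET /px.gif?ad=poll42&ev=mouseover HTTP/1.1" 200 43',
    '1.2.3.4 - - [01/Jan/2007:10:00:02 +0000] "GET /px.gif?ad=poll42&ev=click HTTP/1.1" 200 43',
    '5.6.7.8 - - [01/Jan/2007:10:00:03 +0000] "GET /px.gif?ad=poll42&ev=mouseover HTTP/1.1" 200 43',
    '5.6.7.8 - - [01/Jan/2007:10:00:04 +0000] "GET /index.html HTTP/1.1" 200 5120',
    '9.9.9.9 - - [01/Jan/2007:10:00:05 +0000] "GET /px.gif?ad=poll42&ev=click HTTP/1.1" 404 0',
]
pixel_counts = count_pixel_events(sample_log)
```

Run from a cron job over each new log file, something like this gives per-ad mouse-over and click totals, which is roughly why reporting built this way lags behind real time.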
We specialized in "polls" which were plain old HTML radio buttons overtop of the SWFs which after you answered gave you a quick answer and sometimes had digital takeaways in the popup answer window (Icons, Wallpapers, etc..)
At the time all of our ads were handmade: we had a design team and a programming team that would create these together and code them specifically to the client's request. By the time I left we had started to automate it into a drag+drop system for clients.
Sidenote: Biggest job screw up that I've ever done was not putting the correct 3rd party tracking pixel into a 300x250 that ran for 1 day on the AOL.com homepage. It ended up being fine since we got the results back for the typo from the raw logs, but it could have been a $200k mistake!
I don't know anything about JS-based cryptomining, but I wonder if you can't stop such ads without breaking 90% of legit ads.
I mean, it all probably boils down to number-crunching? So the DSL you are envisioning would have to block really basic language parts, like loops and math operations.
If I'm wrong and mining actually could be easily blocked on language level using some DSL, I'm all ears.
It would be nice if things could be blocked by CPU usage... even if you’re not mining cryptocurrency, if your ad uses more than 5% of my CPU it should be killed.
Interesting, just noticed that watching a Udemy course uses 98% CPU (in Activity Monitor on a MBP). This happens even if playback is paused. Wonder if they're doing something similar or it's just a lame implementation?
I've seen some software video decoders do that at times, though if it's happening even while paused it's a little unusual (maybe decoding buffered frames?).
Once a widget has run say 100 million instructions, suspend it if it comes from a different domain than the main page, mark it visually and provide a button on it to enable high CPU usage.
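A rough Python sketch of that budget idea, purely as an illustration. Each widget is modelled as a generator where one `next()` stands in for one instruction; the `Widget` class, `run_slice` and the origin names are hypothetical, and a real browser would have to count work inside the JS engine instead.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Widget:
    origin: str
    program: Iterator[None]         # each next() stands in for one instruction
    executed: int = 0
    suspended: bool = False
    high_cpu_allowed: bool = False  # set when the user clicks the button


def run_slice(widget, page_origin, budget, slice_size):
    """Run up to slice_size steps; suspend third-party widgets past the budget."""
    third_party = widget.origin != page_origin
    for _ in range(slice_size):
        over_budget = widget.executed >= budget
        if third_party and over_budget and not widget.high_cpu_allowed:
            widget.suspended = True  # the UI would mark it and show a button
            return
        try:
            next(widget.program)
        except StopIteration:
            return                   # the widget finished on its own
        widget.executed += 1


def busy_loop():
    """Stand-in for a script that never stops computing."""
    while True:
        yield


# A third-party widget gets cut off at the budget.
miner = Widget(origin="ads.example", program=busy_loop())
run_slice(miner, page_origin="site.example", budget=1000, slice_size=5000)
```

First-party code is never suspended in this sketch, and flipping `high_cpu_allowed` is the equivalent of the user pressing the "enable high CPU usage" button.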
We used to do something like that with Flash: make the user click it if they want it to run.
That's exactly the reasoning that is the root cause of all these problems. You designed a website and you don't want it to be broken. Fair enough. But as a user, I don't really care about your website - what matters for me is if I can prevent it from taking over my CPU or not. This option should be there and it should be configurable. The browser makers already figured this is a problem and have some rudimentary mechanisms preventing total abuse ("this window/tab became unresponsive...") but if users have more control over it, it completely changes the rules of the game. Having a configurable option "if a script consumes more than N% of CPU, turn it off" would save many people the time spent on looking for the culprit, sometimes hidden between tabs. Fortunately many people have an auditory clue when the JS is abused: the fan noise.
Designers and developers need to understand that allowing them to run their code on my computer is a privilege, not an absolute right. Like every privilege, it must not be abused. If it's abused, it will be terminated. Google finding and disabling these Coinhive miners on YT is just treating the symptoms, not the root cause.
I've come to decide that the only ads I consider "legit" are where the site owner strikes a deal with another business that is interested in advertising on their site, the site owner hosts the ad on their own server, as a picture banner or text or perhaps a nice block in a side column that says "sponsored content" or whatever, and just links to the other business.
Site owner controls all the content. Any tracking will be done mainly via server logs; if the site owner wants to, they can use a bit of script to quickly shove in a redirect onmousedown, in order to track exactly when the user clicked what link. But frankly I've found even that technique a privacy insult ever since I noticed Google doing this in their own search results.
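To make the redirect trick concrete, here is a minimal Python sketch of the server side of it. The `/out` endpoint, the `to` parameter and the host allowlist are assumptions for the example; the allowlist is there so the endpoint can't be used as an open redirect.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def tracking_url(target, redirect_endpoint="/out"):
    """Rewrite an outbound ad link so the click passes through our own server."""
    return f"{redirect_endpoint}?{urlencode({'to': target})}"


def handle_click(request_url, click_log, allowed_hosts):
    """Log the click, then answer with a redirect to the advertiser."""
    params = parse_qs(urlsplit(request_url).query)
    target = params.get("to", [""])[0]
    if urlsplit(target).hostname not in allowed_hosts:
        return 400, None         # refuse to redirect to unknown hosts
    click_log.append(target)     # first-party record of the click
    return 302, target


click_log = []
sponsors = {"sponsor.example"}   # hypothetical advertiser host
link = tracking_url("https://sponsor.example/landing?x=1")
status, location = handle_click(link, click_log, sponsors)
```

The only thing the site learns this way is which link was clicked and when, and it learns it on its own server.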
This is analogous to how paper newspapers used to manage their ad space. No third party shit, and if the magazine was proud of itself it would curate the ads to only deal with advertisers that wouldn't annoy their reader base (too much).
A bit of a hassle maybe, but it shows your readers that you actually care about what content is displayed on your site (let alone what code is run). But most importantly, no adblocker will block these kinds of ads. Because they're just image links, after all. Adblocker can't see if that's an ad banner or just a thumbnail linking to an external domain. And I would maybe even bother to whitelist those if they did (right until one shows me crap I don't want to see, like being confronted with nudity or sex when I'm not in the mood for it).
>I've come to decide that the only ads I consider "legit" are where the site owner strikes a deal with another business that is interested in advertising on their site, the site owner hosts the ad on their own server, as a picture banner or text or perhaps a nice block in a side column that says "sponsored content" or whatever, and just links to the other business.
I agree, at least when it comes to ads for niche content (blogs, forums, etc.). The ad industry sells online ads like they're TV commercials, but the companies buying the ads should be looking at them like partially sponsoring a race team in exchange for your logo showing up in front of people who are interested in your type of products/services.
However, mining is useless without a way to send it back out to the network. I doubt ads need networking capabilities - so just prevent that bit. That should do it, as far as I can tell.
Ads absolutely need networking capabilities, for tracking stuff like "viewability", or "anti-fraud, brand safety and independent measurement" by some third-party provider. In fact, you can't get a serious marketing budget from a reputable brand without having your, err... their ad wrapped inside some JS which does networking calls. Brands want to audit each impression you, as an ad-tech firm, will serve on their behalf.
Part of what I find frustrating as a user is that I don't like ANY of those features :)
While I'd prefer a model without any advertising (and am willing to pay for it), I can put up with unobtrusive ads, without tracking, like Daring Fireball uses.
I've worked in adtech before, and I know that these techniques make money, and are important to advertisers. But as a user, I find them intrusive, and they are why I run adblockers.
Yeah, race to the bottom. Fraud in online advertising is estimated to be tens of billions USD yearly, so brands require more and more "brand safety" and "measurements", each ad calls, like, 4 different vendors calculating some metrics, and this, in turn, fuels adblocker growth.
Interestingly enough, "walled gardens" like Facebook are big and important enough to bully advertisers into playing by Facebook's rules, accepting FB measurement standards, without calls to 3rd party vendors.
It's only the open web that gets polluted more and more each year.
Well, it's a complicated question, and I'm not that well educated.
To my best knowledge...
1. You can't make a single cent if your bot visited facebook.com 100 million times. You can make some serious money if your bot visited some-exciting-domain.com, which belongs to you, and there were 5 ads displayed on each visit.
With this incentive you have all the reasons to make your bot very human-like (think headless chrome, realistic mouse movements, having old cookies, etc) so fighting fraud gets extremely hard.
It's easier to serve the ads and let the advertiser figure out anti-fraud measures by himself. Being responsible for measurements and lack of fraud on the open web is a huge PITA without a clear path to a huge uplift in revenue.
2. Facebook optimizes UX (or claims to), and calls to other servers make a site slower, especially on mobile, lowering user engagement. This argument obviously does not work for some-exciting-domain.com. So, you can call whatever you want from your ad on some-exciting-domain.com, but on facebook.com you play by Facebook's rules.
In fact, some-exciting-domain.com can probably ban ads which call other domains, but that will just kill its revenue (programmatic systems will label it as "non-performing", because nothing is properly measured, and stop buying ads there).
I do not think that metaphor holds at all, if "water" is the open web and "chugging" means spending advertising dollars there.
I don't have a link on hand, but GOOG and FB captured something like 95% of digital advertising growth in 2017. In other words, out of each new $1 shifted to digital from TV and print, 95 cents went to the duopoly.
And the advertisers who are still "chugging" the open web are installing more and more "filters" and "purifiers" (different anti-fraud and measurement providers).
People from digital media are talking about a digital media crash. [0] Buzzfeed missed its revenue goal and fired 100 employees. [1] Mashable was sold for peanuts. [2] Business Insider, granddaddy of ad-monetized clickbait, is pushing more and more articles under "BI Prime", which means paid access.
A lot of people were clamoring for the death of ad-supported publishing on the open web. Well, the future is almost here.
Chugging means, the general public is not at all concerned about what it's being subjected to - be it privacy / tracking or the visual pollution of ads.
If the market is the decider then I think the market is saying - quite loudly - the water is great to drink.
As a user I love all of those features. Every single one of them makes my adblockers more effective, not to mention they drive more people to use adblockers. :)
The audience is clawing back their rights using ad blockers, and the browser vendors are already limiting tracking by eliminating APIs used for tracking and limiting cookies.
Ultimately the browser vendors and the users make the rules, not advertisers.
But one of the biggest browsers is owned by one of the biggest advertisers in the world. You'd expect a conflict of interest at best, and anticompetitive behaviour at worst, from them.
It was only last week or so when I read about some security researcher pulling some tricks to bypass part of Caja's sandbox (while looking for something else, even). Sure this was a whitehat researcher and they got a (very) nice bug bounty from Google Project Zero. But if they use this to secure the 3rd party scripts that are apparently allowed these days on Google Ads, they're being hugely irresponsible.
I never heard of Caja until a few weeks ago, but apparently it started in 2007, could be that I forgot when I heard about it though. Back in 2007 though I was still a frontend web developer with a very keen eye on JS security and all the XSS/CSRF problems of those days. Back then JS/ECMAScript did not have sufficiently advanced features to properly sandbox code. This was a bit of a fool's errand back then. By now it's gotten a lot more of these features, mainly revolving around protecting the super-flexible fluid JS objects from modification and abuse. But I really kind of wonder if that locks everything really watertight? Because browsers are going to have unofficial/proprietary features, and you need just one to accidentally get this slightly wrong, get a reference to a non-sandboxed Window object via-via-via, and it all falls apart.
You don't let untrusted people run code on stuff you serve. Code is just too slippery, Turing complete etc, and ECMAScript perhaps even more so than many other languages. Can't we just take that as a given by now, instead of trying to be cleverer than the previous smart person that failed at it?
Sort of: I don't think it's meant to restrict time/space usage, just access to capabilities. If you don't give the ad-code access to the network, it'd have no way to access the blockchain, but it could still chew up your CPU cycles.
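A toy Python sketch of that distinction, only to illustrate the capability-passing idea. Python gives no real isolation here, and `Canvas`, `run_ad` and `mining_ad` are made-up names. The host hands the ad only the objects it chooses to grant, so the ad can still spin the CPU but has nothing to send its result with.

```python
class Canvas:
    """The one capability the host is willing to grant: drawing text."""

    def __init__(self):
        self.drawn = []

    def draw_text(self, text):
        self.drawn.append(text)


def run_ad(ad_code, capabilities):
    """Call the ad with only the capabilities the host chose to grant."""
    return ad_code(**capabilities)


def mining_ad(canvas, network=None):
    # Nothing here limits time: the loop burns CPU whether or not
    # the result can ever leave the page.
    total = 0
    for nonce in range(100_000):
        total = (total * 31 + nonce) % 1_000_003
    canvas.draw_text("Buy now!")
    if network is None:
        return "no network: result discarded"
    network.send(total)  # only reachable if the host granted a network object
    return "result sent"


canvas = Canvas()
outcome = run_ad(mining_ad, {"canvas": canvas})
```

Restricting capabilities answers "can it phone home?", while the CPU question needs a separate mechanism such as a time or instruction budget.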
Ultimately, it's because the advertisers don't trust Google or other middlemen and insist on running code from third party vendors that grabs more information and promises better metrics, or that tries to determine whether the site or user is in some way fraudulent.