
thrmsforbfast · on Nov 11, 2018

This comment ignores the problem that parent identified:

>> But it has the net effect of pushing the tax burden down onto smaller companies who are competing with the Amazons of the world.

I.e., even if we need a solution to "sclerotic tax increases", this is not a good one, because the solution only works for massive companies.

thrmsforbfast · on Nov 6, 2018

I agree with parent that this whole thread comes off as extremely hyperbolic.

I've known this information about my friends and neighbors for the better part of the last decade. In most states this is public info and you can even look it up from a web form using name, zip, and dob.

But now it's in an app so the world is ending.

thrmsforbfast · on Nov 4, 2018

Companies will have a Chief Privacy Officer whose job is basically to provide oversight and, of course, absorb the risk. That person will probably be paid well.

I'm actually OK with that. We're always complaining that companies don't take security/privacy seriously because there's no incentive to do so. See e.g. the Equifax HN threads. Having a person in the C suite who'll end up in jail if the company seriously fucks up is, IMO, a net positive for the world.

weberc2 · on Nov 4, 2018

I just hope it’s crafted such that it won’t inhibit small businesses or hobbyists.

tehlike · on Nov 4, 2018

That's exactly my hope. Only large companies benefit from such laws (including, potentially GDPR), other smaller ones get slowed down. With gdpr, many newspaper outlets stopped access from outside of the US.

thrmsforbfast · on Nov 4, 2018

Startups aren't even covered by this bill until they've gained lots of traction (1 million users and 50MM+ gross receipts). At which point, and again only IFF their business is data hoarding, they will need to hire approx. one additional employee.

I'm sure startups/consultants will step in to provide regulator compliance as a service as well, so maybe not even that.

thrmsforbfast · on Nov 4, 2018

It is. Read Sec 5.(d).

It's not like people will be thrown in prison because their DB wasn't patched quickly enough. They have to knowingly and intentionally lie to the federal government in an annual report.

rhizome · on Nov 4, 2018

> They have to knowingly and intentionally lie to the federal government in an annual report

So it's a nonstarter, since the people being prosecuted for these things will have lawyers adept at whittling down intent to only the most brazen and malicious behavior. Not only that, but Sarbanes-Oxley showed us how effective "annual report" red lines are.

thrmsforbfast · on Nov 4, 2018

Perhaps. But that's not a compelling argument against this bill. Perfect enemy of better and all that.

rhizome · on Nov 4, 2018

I agree that the nirvana fallacy is implicated, but there were zero convictions under SarbOx, zero prosecutions even.

thrmsforbfast · on Nov 4, 2018

Nope.

The bill defines covered entities in Sec. 2.(5)(A) and 2.(5)(B). In particular, companies with less than $50,000,000 in gross receipts and information on fewer than 1,000,000 customers are not covered by this legislation.

And even if those apply to your local coffee shop or whatever, Sec. 2(5)(B)(iii) further limits the definition of covered entity so that businesses that do not provide 3rd party access to information are not covered.

So Starbucks and other huge coffee chains/retail shops are the only organizations that would have to re-evaluate data collection from their public Wifi hotspots, and even then might be exempt depending on what they are collecting and how they are using that information. And, I should point out, these companies will need privacy experts on staff anyways, so this provision is highly unlikely to cause them to shutter their in-store Wifi networks...

Additionally, some of the more onerous requirements only apply to a subset of covered entities with yet larger gross receipts and yet larger numbers of tracked consumers.

But, unequivocally, your locally owned mom & pop coffee shop is excluded from consideration under this provision multiple times over.
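To make the exclusions concrete, here is a minimal sketch of the coverage test as paraphrased in this comment. It is not the bill's text: the `Business` fields and the `is_covered` function are hypothetical names, and the two checks only encode the reading given above (under both thresholds means not covered; no third-party access means not covered).

```python
from dataclasses import dataclass

# Thresholds as paraphrased in the comment above, not quoted from the bill.
GROSS_RECEIPTS_THRESHOLD = 50_000_000  # dollars of gross receipts
CONSUMER_THRESHOLD = 1_000_000         # consumers with stored information


@dataclass
class Business:
    # Hypothetical field names, chosen only for this illustration.
    gross_receipts: int
    consumers_tracked: int
    shares_with_third_parties: bool


def is_covered(b: Business) -> bool:
    """Rough reading of the covered-entity exclusions described above."""
    # Exclusion 1: under both the receipts and the consumer thresholds.
    if (b.gross_receipts < GROSS_RECEIPTS_THRESHOLD
            and b.consumers_tracked < CONSUMER_THRESHOLD):
        return False
    # Exclusion 2: no third-party access to the collected information.
    if not b.shares_with_third_parties:
        return False
    return True


# A local coffee shop with made-up numbers fails both tests for coverage.
coffee_shop = Business(gross_receipts=400_000, consumers_tracked=3_000,
                       shares_with_third_parties=False)
print(is_covered(coffee_shop))
```

Under this reading the coffee shop is excluded twice: once by size and once because it gives no third party access to the data.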

blululu · on Nov 4, 2018

Unless inflation happens... This only applies to big businesses now, but in 25 years it will start affecting medium sized firms and it will eventually hit small businesses. This will create a morass of bureaucratic regulations stifling entrepreneurship...

ebullientocelot · on Nov 4, 2018

Granting the assumption of monotonically increasing inflation at a wild rate, this is still only true ceteris paribus. I can't imagine inflation that would make a small coffee shop chain into 50m/year revenue (customer floor requirement notwithstanding) would happen in a vacuum.

anticensor · on Nov 5, 2018

What he means is regulatory inflation, not the monetary kind. A few years later, the customer threshold would be amended to a few thousand; ten years later, the thresholds would be abolished.

thrmsforbfast · on Nov 2, 2018

Open source doesn't ensure quality code.

Ideally the code would be part of the peer review process, but code review is really expensive, so who knows how that would play out.

ChrisFoster · on Nov 2, 2018

True, but it does provide at least some measure of reproducibility. Quality of implementation and reproducibility are orthogonal and both very valuable in their own right.

shkkmo · on Nov 2, 2018

> Open source doesn't ensure quality code.

Yes, but closed source helps ensure that low quality code is hidden from sight. It also means that people who distrust or doubt the conclusions have no chance to identify any bug(s) and disprove the results or conclusions.

fifnir · on Nov 2, 2018

It's simple:

We stop publishing in papers, and instead adopt smaller chunks of our work as the core publishing units.

Each figure should be an individually published entity which contains the entire computational pipeline.

Figures are our observations on which we apply logic/philosophy/whatyouwannacallit. Publishing them alongside their relevant code makes the process transparent, reproducible and individually reviewable, as it should be.

We can then "publish" comments, observations, conclusions etc on those Figures as a separate thing. Now the logic of the conclusions can be reviewed separately from the statistics and code of the figure.
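A rough sketch of what one such figure unit could look like, assuming a content-hashed manifest. Everything here is hypothetical (the `build_figure_manifest` name, the manifest keys, the choice of SHA-256); it only illustrates bundling a figure's inputs and pipeline code so that a reviewer can check they are reviewing exactly what was published.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_figure_manifest(figure_id: str, inputs: dict, pipeline_source: str) -> dict:
    """Describe one figure by the hashes of its input data and pipeline code.

    inputs maps a file name to that file's raw bytes.
    """
    manifest = {
        "figure_id": figure_id,
        "inputs": {name: sha256_hex(blob) for name, blob in sorted(inputs.items())},
        "pipeline_sha256": sha256_hex(pipeline_source.encode("utf-8")),
    }
    # Hash the canonical JSON form so any change to data or code changes the ID.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["manifest_sha256"] = sha256_hex(canonical)
    return manifest


manifest = build_figure_manifest(
    "fig-1",
    {"strain.csv": b"load,strain\n10,0.002\n"},
    "plot(read_csv('strain.csv'))",
)
print(json.dumps(manifest, indent=2))
```

Comments and conclusions could then cite `manifest_sha256`, which keeps the review of the logic separate from the review of the statistics and code.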

chiefalchemist · on Nov 3, 2018

A comparable solution would be for all involved to value all research, not just the ground breaking, earth shattering type.

As it is, research that yields a "failure" is buried. That means wheels are being reinvented and re-failed. That means there's no opportunity to compare similar "failures", be inspired, and come up with the magic that others overlooked.

Unfortunately, I would imagine, even if you can get researchers to agree to this the lawyers are going to have a shit fit. Imagine Google using an IBM "failure" for something truly innovative.

tokai · on Nov 3, 2018

What you are proposing sounds a lot like the concept of the least publishable unit.

https://en.wikipedia.org/wiki/Least_publishable_unit

jpeloquin · on Nov 3, 2018

> Each figure should be an individually published entity which contains the entire computational pipeline.

I agree in principle. But, for the experimental sciences, we need better publication infrastructure to make this practically possible.

For example, consider a figure that compares, between several groups, the mechanical strain of tensile test specimens for a given load. Strain is measured from digital image correlation of video of the test. Some pain points:

1. There are a few hundred GB of test video underlying the figure. Where should the author put this where it will remain publicly accessible for the useful lifetime of the paper? How long should it remain accessible, anyway? The scientific record is ostensibly permanent, but relying on authors to personally maintain cloud hosting accounts for data distribution will seldom provide more than a couple of years of data availability.

2. Open data hosts that aim for permanent archival of scientific data do exist (e.g., the Open Science Framework), but their infrastructure is a poor match with reproducible practices. I haven't found an open data host that both accepts uploads via git + git annex or git + git LFS and has permissive repository size limits. Often the provided file upload tool can't even handle folders, requiring all files to be uploaded individually. Publishing open data usually requires reorganizing it according to the data host's worldview or publishing a subset of the data, which breaks the existing computational analysis pipeline.

3. Proprietary software was used in the analysis pipeline. The particular version of the software that was used is no longer sold. It's unclear how someone without the software license would reproduce the analysis.

Finally, there's the issue of computational literacy of scientists. In most cases, the "computational pipeline" is a grad student clicking through a GUI a couple hundred times, and occasionally copying the results into an MS Office document for publication. No version control. Generally, an interactive analysis session cannot be stored and reproduced later. How do we change this? Can we make version control (including of large binary files) user-friendly enough that non-programmers will use it? And make it easy to update Word / PowerPoint documents from the data analysis pipeline instead of relying on copy & paste?

If any of these pain points are in fact solved and my information is out of date, I would be thrilled to hear it.

fifnir · on Nov 4, 2018

1 and 2: I like IPFS for this, check it out

3: analysis that uses proprietary software is marked appropriately as second class

> computational literacy of scientists

Welp...

no_identd · on Nov 2, 2018

I have two words for you: Ted. Nelson.

j88439h84 · on Nov 2, 2018

Can you expand on this?

lvh · on Nov 3, 2018

I can’t speak for GP, but Nelson invented hypermedia/hyperlinks and had a vision for the future that included documents including other documents. All of that seems pretty compatible.

agumonkey · on Nov 3, 2018

similar to reproducible builds or nix

research just jumped onto jupyter notebooks, it's halfway there, someone just needs to help with the remaining step

d0mine · on Nov 3, 2018

www was created to publish information in CERN but we can use it in other contexts too ;) http://info.cern.ch/Proposal.html

lvh · on Nov 3, 2018

Of course it won’t ensure anything, but currently being completely unable to reproduce results, even as the author but just a year from now, is par for the course.

darpa_escapee · on Nov 2, 2018

It's not about code quality, it's about transparency and ease of reproduction.

BurningFrog · on Nov 3, 2018

Code review is cheap. I do it for fun. But it doesn't prove anything.

Science should prove things...

m_mueller · on Nov 3, 2018

science can never prove anything as a matter of principle. it can only disprove all the alternatives. math and logic can prove, but only within the model it has built up, which has been shown to contain unprovable axioms that one must simply accept.

BurningFrog · on Nov 3, 2018

Yeah, I'm aware of the strict theory.

Fomite · on Nov 3, 2018

Little of what I do, even with the most rigorous methods available and the best practices from both software development and computational science, proves anything.

BurningFrog · on Nov 3, 2018

I know. And I think it's a problem for science...

Logical proofs will never happen for software development, but surely standards for scientific programming can be tightened up a few levels!

I think I heard of some reform proposals from the Reproducibility Crisis reformers.

Fomite · on Nov 3, 2018

I more mean there are whole aspects of science that aren't provable without being able to actually obtain counterfactuals, and that means time machines

thrmsforbfast · on Nov 2, 2018

A lot of your news is not just "not-depressing". It's actively uplifting. In fact, most of the stories on the front page are "feel good" stories.

It'd be cool/useful if you could provide news on the important issues of the day that's simply "not-depressing". I.e., factual and detached and doesn't elicit emotion, but not explicitly feel-good.

(e: forgot to say: nice work!)

armatav · on Nov 2, 2018

Thanks! Yeah I'm thinking of adding a sentiment slider after I get a large amount of news sources going, as well as a "which sources I want to see" selection.

thrmsforbfast · on Nov 2, 2018

> As in most descriptions of it are annoyingly ambiguous.

My very first interviewer (intentionally) didn't spec fizzbuzz correctly. The real test was whether the candidate listened to the customer's/lead engineer's spec instead of jumping to conclusions.

Fortunately, I was just entering college and hadn't heard of fizzbuzz before. I passed the "test" but for the wrong reason.

segh · on Nov 2, 2018

Being purposely misleading then penalising those who are misled doesn't seem like a good hiring strategy, but then again, I've never tried to hire someone.

thrmsforbfast · on Nov 1, 2018

Well, no, generally they can't shove it off on their creditor and say "there we're even", if that's what you mean.

There's a whole field of law about bankruptcy. In particular, questions about how to value assets are obviously a big component of bankruptcy law.

I imagine "it's complicated" when it comes to intangible/intellectual property like patents, brands, etc.