
Why is this? Insurance providers actually provide a valuable service (not necessarily implying that much of the enormous UK financial industry does not...) and are profitable and sustainable; I am surprised to hear the sector described as something to be "gouged". How would such a gouging occur?


Market making dealers in govt bonds provide liquidity to different types of fund managers: for instance, hedge funds, insurance funds and pension funds. All fund managers must preserve and/or grow funds. Pension and insurance funds are referred to as "real money" because, unlike hedge funds, they don't use broker dealer credit to leverage positions. They are also highly regulated compared to hedge funds, so they can't use derivatives, go short or generally pursue riskier strategies. Regulation prioritises capital preservation over growth, for obvious reasons.

Real money funds also tend to be far less technically sophisticated than HFT or systematic trading hedge funds, so they get charged more by dealers, in the form of wider spreads, as they go in and out of positions. Why might they go in and out of positions? One reason would be tracking a govt bond index monthly. The bonds in the index may change, so to keep tracking it, a real money trader has to buy some bonds and sell others once a month. That real money trader will get a worse price from the dealer than a hedge fund trader, who has far more live market data and pricing tech with which to assess the prices offered by dealers.
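To put rough numbers on that last point, here is a toy sketch (all figures made up, just to show the mechanics) of what a couple of extra basis points of spread costs a fund that rebalances monthly:

    # Toy sketch: annual cost of crossing the dealer's spread on a monthly
    # index rebalance. All numbers are illustrative, not real quotes.
    notional_per_rebalance = 50_000_000   # bonds bought + sold each month to track the index
    real_money_half_spread_bps = 4.0      # wider spread quoted to the real money fund
    hedge_fund_half_spread_bps = 1.5      # tighter spread quoted to the hedge fund

    def annual_spread_cost(notional, half_spread_bps, rebalances_per_year=12):
        return notional * (half_spread_bps / 10_000) * rebalances_per_year

    print(annual_spread_cost(notional_per_rebalance, real_money_half_spread_bps))   # 240000.0
    print(annual_spread_cost(notional_per_rebalance, hedge_fund_half_spread_bps))   # 90000.0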


It's worth noting that traditionally, actual insurance runs at a loss or breaks even; most of the profits tend to come from investment.

This has changed recently as investment returns have dropped, which may also be driving the adoption of better statistical techniques for modelling risk (since insurers can't rely on investment income to make up for insurance losses).

Also, having read this report, I'm very very sceptical about whether or not companies are using "ML". I suspect that most of them are just doing linear/logistic regression on larger data sizes, which isn't really the same.


If the linear / logistic regression stuff is improved through Bayesian methods and hierarchical modeling, and if it grows large enough that there is a need to use MCMC sampling techniques for approximate posterior inference, then even though the model specification itself would be very simple, I’d absolutely say this is “ML”.
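For concreteness, a minimal sketch of that kind of model (a hierarchical logistic regression fitted by MCMC), written here with PyMC purely as an example library (the commenter didn't name a tool) and with simulated data:

    import numpy as np
    import pymc as pm

    # Simulated claims data: binary outcome per policy, grouped by region.
    rng = np.random.default_rng(0)
    n_regions, n_per = 5, 200
    region = np.repeat(np.arange(n_regions), n_per)
    x = rng.normal(size=n_regions * n_per)              # e.g. a standardized rating factor
    true_alpha = rng.normal(0.0, 0.5, size=n_regions)
    y = rng.binomial(1, 1 / (1 + np.exp(-(true_alpha[region] + 0.8 * x))))

    with pm.Model():
        # Partially pooled (hierarchical) intercepts per region
        mu = pm.Normal("mu", 0.0, 1.0)
        sigma = pm.HalfNormal("sigma", 1.0)
        alpha = pm.Normal("alpha", mu, sigma, shape=n_regions)
        beta = pm.Normal("beta", 0.0, 1.0)

        pm.Bernoulli("y", p=pm.math.sigmoid(alpha[region] + beta * x), observed=y)

        # Approximate posterior inference via MCMC (NUTS)
        idata = pm.sample(1000, tune=1000, chains=2)

The model specification is just a logistic regression with group-level intercepts, but the fitting is fully Bayesian via sampling.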


I really, really wouldn't. That's Andrew Gelman's gig, and he's been doing that since the '80s, when ML/AI was all about expert systems. I think of that as a statistical technique, not an ML one (and for some weird reason, many quantitative people in insurance don't like Bayesian techniques).

But then, our disagreement here just highlights the difficulty of reaching consensus on what ML vs statistics actually means.


I think I have settled on ML = gets better with more examples ("learns"). So I'd say Bayesian models of all flavours definitely fall under the ML umbrella. AI is super fuzzy, though: I think of it as doing something you'd think a computer shouldn't be able to do but a human can, which makes it a forever-moving goalpost.


I probably wouldn't use that definition, as all statistical techniques improve as the number of examples increases (i.e. the estimates become more precise). If you mean that ML normally estimates more parameters, and as such improves more with more examples, I would agree, but it's very difficult to draw a dividing line then (what about splines - loads of parameters, very flexible, but not normally classified as an ML technique?).
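For what it's worth, the "loads of parameters, very flexible" point is easy to see with a smoothing spline; here is a quick illustration using scipy (my choice of library, purely for illustration):

    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(1)
    x = np.sort(rng.uniform(0, 10, 300))
    y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

    # A smoothing spline: flexibility is tuned by the smoothing factor s;
    # s=0 interpolates every point, larger s means fewer effective parameters.
    spline = UnivariateSpline(x, y, s=len(x) * 0.3**2)
    print(len(spline.get_coeffs()), "basis coefficients")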


Depending on how you look at "improvement", that's not strictly true. Often, large data sources are slightly biased relative to the population we want to generalize to (say, a single regional company's customers, trying to generalize to a national population). Working with larger and larger such data sets, that bias can become large relative to standard errors/credible interval width.

Xiao-Li Meng has had some interesting talks/papers about this and related ideas: https://dash.harvard.edu/handle/1/10886849
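A quick simulation of the effect (my own toy numbers, not from the paper): a fixed selection bias stays put while the standard error shrinks, so bigger samples give tighter but increasingly wrong intervals.

    import numpy as np

    # Toy example: the target population mean is 0.0, but we can only sample a
    # subpopulation whose mean is 0.1 (a fixed bias). As n grows, the 95% CI
    # tightens around the wrong value and stops covering the truth.
    rng = np.random.default_rng(0)
    for n in (100, 10_000, 1_000_000):
        sample = rng.normal(loc=0.1, scale=1.0, size=n)
        se = sample.std(ddof=1) / np.sqrt(n)
        lo, hi = sample.mean() - 1.96 * se, sample.mean() + 1.96 * se
        print(f"n={n:>9,}  CI=({lo:+.3f}, {hi:+.3f})  covers 0.0: {lo <= 0.0 <= hi}")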


I was never convinced by that paper, to be honest. I almost always use MLE, so that's not really an issue. It just struck me as a pedantic distinction without a difference.

But if you use GEE, it's probably great to know.


That applies to ML techniques also, mutatis mutandis.


Splines are totally ML if Gaussian processes or Markov random fields are ML... Also, I'm not saying that I agree with the definition, or with any sensible cut between statistics and ML; it's just my mental model for interpreting what people are talking about when they talk about ML.


> I think of that as a statistical technique, not an ML one

Many ML techniques are statistical techniques, not ML, if we go down this road.


I completely agree, and have been banging this drum for years :)


There is a real risk of venturing into the narcissism of small differences with this.


Like, personally, I regard statistics and machine learning as the same subject from different perspectives (mathematics vs computer science). Their differences are primarily driven by the context of the time of their development. Back when we had very little data, we needed strong assumptions to make inferences. As compute increased, this became less necessary and we could just bootstrap instead of needing normal theory confidence intervals.

But apparently, this is a controversial view.
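To make the bootstrap point concrete, a tiny example (illustrative only): the normal-theory interval leans on distributional assumptions, while the bootstrap just spends compute.

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.exponential(scale=2.0, size=200)   # skewed data, modest sample

    # Normal-theory 95% CI for the mean (leans on the CLT)
    se = data.std(ddof=1) / np.sqrt(len(data))
    normal_ci = (data.mean() - 1.96 * se, data.mean() + 1.96 * se)

    # Bootstrap percentile 95% CI (just resampling and compute)
    boot_means = [rng.choice(data, size=len(data), replace=True).mean()
                  for _ in range(5000)]
    boot_ci = tuple(np.percentile(boot_means, [2.5, 97.5]))

    print("normal-theory CI:", normal_ci)
    print("bootstrap CI:    ", boot_ci)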


> I think of that as a statistical technique, not an ML one

Perhaps not all, but most of the distinctions between "statistical" techniques and "ML" techniques are fairly arbitrary.


And do we care if they do the job?

Especially if they share the same trait of being so complex and dependent on Byzantine treatment of data that they need to be regulated and governed in a different way from something that can be grokked via a single ppt slide?

edited for the second clause.


Machine learning is a superset of all of statistics.


I would probably have said the opposite. Can you clarify why you think the scope of ML is greater than that of statistics?


Rule-based systems.


OK, but I guess I could argue that asymptotic theory and design of experiments are proof that statistics is a superset of ML (for the record, I'm not sure either of those claims is true).


> It's worth noting that traditionally, actual insurance runs at a loss or breaks even; most of the profits tend to come from investment.

I hadn't considered that. It sounds like another reason why insurance is broken. If you have two options, $100 in premiums with $100 in costs versus $1000 in premiums with $1000 in costs, the net gain on premiums vs costs is the same, but the investment income over the interim will be higher when the total amount of money in the system is higher.

I would be interested to see the average premium price compared to total investment funds across different geographies and times.
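A crude sketch of the float effect (assuming a made-up 4% yield on premiums held before claims are paid):

    # Toy illustration: underwriting breaks even in both cases, but the larger
    # pool of premiums earns more investment income while it's held as float.
    annual_yield = 0.04   # assumed return on premiums held before claims are paid

    for premiums in (100, 1_000):
        claims = premiums                      # net gain on premiums vs cost is zero either way
        float_income = premiums * annual_yield
        print(f"premiums={premiums:>5}  claims={claims:>5}  investment income={float_income:.2f}")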


That's been one of the supposed consequences of the 80/20 rule that Obamacare introduced, although I can't find any numbers, just speculation that insurers are doing it. The 80/20 rule says that insurance companies must spend at least 80% of their revenue on actual health care, or else they have to return money to the insured. So if you're a health insurance company and your stockholders want to see increases in profit, the only way is to spend more money, whether it's necessary or not, because profit is capped at a fraction of premiums and can only grow if premiums grow. You can't make more by increasing efficiency.

Searching for this turns up articles about how the insurance companies have actually returned a fair bit of money, and none with hard numbers about this actually driving healthcare costs up. But there could be a lot of reasons other than "it hasn't driven costs up" for why it's not showing up in a quick search.
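The incentive is easy to see with stylized numbers (assuming, say, 15% overhead on top of the mandated 80% of premiums spent on care; both figures are illustrative):

    # Toy illustration of the 80% minimum medical loss ratio: profit is capped at a
    # share of premiums, so absolute profit only grows if premiums (and therefore
    # total care spending) grow -- not by cutting costs.
    MIN_MLR = 0.80

    def max_profit(premiums, admin_share=0.15):   # admin_share is an assumed overhead figure
        care_spend = premiums * MIN_MLR           # must spend at least this on care
        overhead = premiums * admin_share
        return premiums - care_spend - overhead

    print(max_profit(1_000_000))   # 50000.0
    print(max_profit(2_000_000))   # 100000.0, doubling premiums doubles the profit ceiling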



