
From the other side of the table, the machine learning candidate pool is also a clown show right now.

I did some hiring for a very real machine learning (AI if you want to call it that) initiative that started even before the LLM explosion. The number of candidates applying with claimed ML/AI experience who haven’t done anything more than follow online tutorials is wild. This was at a company that had a good reputation among tech people and paid above average, so we got a lot of candidates hoping to talk their way into ML jobs after completing some basic courses online.

The weirdest trend was all of the people who had done large AI projects on things that didn’t need AI at all. We had people bragging about spending a year or more trying to get an AI model to do simple tasks that were easily solved deterministically with simple math, for example. There was a lot of AI-ification for the sake of using AI.
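
For instance (a hypothetical of the kind of task in question, not one from the original comment): if the relationship is linear, a closed-form least-squares fit does in two lines what a year of model-tuning was spent on. A minimal sketch in Python:

    import numpy as np

    # Hypothetical task: predict shipping cost from package weight.
    # A linear least-squares fit solves it deterministically; no training
    # loop, hyperparameters, or GPUs involved.
    weights = np.array([1.0, 2.0, 3.0, 4.0, 5.0])     # kg
    costs = np.array([5.5, 8.0, 10.5, 13.0, 15.5])    # dollars

    slope, intercept = np.polyfit(weights, costs, deg=1)
    print(f"cost ~= {slope:.2f} * weight + {intercept:.2f}")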

It feels similar to when everyone with a Raspberry Pi started claiming embedded expertise or when people who worked with analytics started branding themselves as Big Data experts.



> The weirdest trend was all of the people who had done large AI projects on things that didn’t need AI at all.

This is how people get experience with ML though. I don’t think that’s a bad thing.

It sounds like you’re looking for a candidate with current ML experience. But I’ve seen so many people go from zero knowledge to capable devs that this seems like a mistake. You’ll end up overpaying.

Just try to find someone with a burning ambition to learn. That seems like the key to get someone capable in the long run. If they point out something beyond Kaggle that makes you think, pay attention to that feeling — it means they’re in it for more than the money.


This reminds me of when I started learning Spark (back in the dinosaur days). It was considered this cutting-edge 'advanced' technology that only the top tier of 10x engineers knew how to implement. The documentation was crap and there were not many tutorials, so it took forever to learn.

These days people can get an excellent introductory class on Spark and be just as good as I've ever been at it. I wouldn't call them 'charlatans' like the poster above did. It's just that the libraries used to work with Spark have been abstracted, and people learn it faster.

That's just how it goes in tech. Anyone who wants to learn is treated like a poser. We over-index on academic credentials, which are really not indicative of actual hands-on engineering ability.

PS. There are no AI/ML experts. There are LLM experts, prediction model experts, regression experts, image recognition experts.... If you are hiring an 'AI/ML expert', you have no idea what you are hiring.


If you can make do with generalist techies who can ramp up in a few weeks, you probably don’t need to be paying them $500k-$1M TCO. They’re just a new kind of technician.

But that doesn’t mean people with actual research depth aren’t essential, and hard to find amongst the noise.

The person you responded to is talking about would-be technicians applying for researcher roles. That happens in tech booms and opens amazing doors for lucky smart people, but it’s also a huge PITA for hiring managers to deal with.


I have to agree. Especially given the very real possibility that your ML project won't be cutting-edge, research-grade work. At that point, someone without preconceptions who is willing to search for a reasonable-looking approximation to the problem and try a canned solution may actually be the optimal candidate.


Considering the number of problems that could be plugged into a random forest with good results, data proficiency seems more important than strong ML experience.
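
To make that concrete: assuming reasonably clean tabular data (which is usually where the actual work goes), a defensible baseline is a few lines of scikit-learn. A minimal sketch:

    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    # Any clean tabular dataset slots in here; the modeling step is
    # nearly boilerplate once the data is in shape.
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    print(f"held-out accuracy: {accuracy_score(y_test, clf.predict(X_test)):.3f}")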


Depends heavily on the application once you get to more specialized domains.

I wish there was an easier way to label roles differently based on when you just need to throw X or Y model at some chunk of data and when more specialized modeling is required. Previously it was roughly delineated by "data science" vs "ML" roles but the recent AI thing has really messed with this.


What you say is true but in terms of hiring and screening candidates it really depends.

One important aspect of all skills is to know the limitations and boundaries of said skill. It’s probably fine if somebody implemented ML on a trivial problem to learn and practice, but if they didn’t realize there could be better solutions in the first place and that ML isn’t a solution to everything, then it’s a big red flag for me.

Finding a good problem for a solution is also a handy skill; if one can’t figure out how to apply their skills to a real problem, that does leave a slightly negative impression.


>Just try to find someone with a burning ambition to learn. That seems like the key to get someone capable in the long run. If they point out something beyond Kaggle that makes you think, pay attention to that feeling — it means they’re in it for more than the money.

If you're teaching them, you shouldn't be paying them at the AI expert rate.


But we do this in software engineering all of the time, why is AI different?


Corporations love people with experience but they don't want to actually invest in creating those people. If nobody is supposed to hire people who have only taken classes or done tutorials, how do you actually get people who have that experience? Or are these guys expecting us to bootstrap our own PhD before they deign to speak to us?


A former colleague of mine (SW guy) took Andrew Ng's Coursera course, downloaded some Kaggle sets, fiddled with them, and put his Jupyter notebooks online. He learned the lingo of deep learning (no real experience with it, though). Then he hit the interview circuit.

Got a senior ML position in a well known Fortune 500 company. Senior enough that he sets his goals - no one gives him work to do. He just goes around asking for data and does analyses. When he left our team he told me "Now that I have this opportunity, I can actually really learn ML instead of faking it."

If you think that's bad, you should hear the stories he tells about that company. Since senior leadership knows nothing about ML, practices are sloppy in order to get impressive numbers: things like reporting quality based on performance on training data, or boasting about "doubling the performance" when going from a 3% to a 6% prediction success rate.
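
To spell out why "performance on training data" is a meaningless quality number: an overfit model can score near-perfectly on the data it memorized and still fail on anything new. A minimal sketch of the gap:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # Noisy labels plus an unconstrained tree: a model that memorizes.
    X, y = make_classification(n_samples=500, n_features=20, flip_y=0.3,
                               random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
    print("train accuracy:", model.score(X_train, y_train))  # ~1.0, looks great
    print("test accuracy: ", model.score(X_test, y_test))    # much lower; the honest number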

He eventually left for another company because it was harder to compete against bigger charlatans than he was.


> "took Andrew's Coursera course"

If he really did take those and did all the assignments himself and understood all the concepts, that still puts him at least in the 95th percentile among ML job seekers.


I don't disagree. Still, would you give such a person a senior ML role and let him do whatever work he wants?


No. Absolutely not.

But it shows intent for junior roles.

People have no idea how many people just put "AI enthusiast" on their LinkedIn profile and start seeking ML roles.

I have had people with only Excel skills apply to ML roles.

And the truth is, every company’s ML tooling/stack/process is wildly different. One has to learn on the job.


> If you think that's bad

(Shrug) I don't. Hustle gets rewarded, as usual. Sounds like he contributed at least as much value as he captured.


> Sounds like he contributed at least as much value as he captured.

What in my comment gave you that idea?


I don't have any ML experience but I don't see what is wrong with it. To me it seems like the equivalent of someone self-teaching software development. As long as they are interested and doing a good job, their background shouldn't matter much.


Say your company hired a SW engineer who had merely taken an introductory programming course on Coursera and otherwise had no experience, then immediately made him a senior person and let him define the role he would play in your company.

Oh, and he didn't have to write any code during the interview.

You don't see anything wrong with that?

I think it's fine to hire a person who just took Coursera ML courses and passes the interview, but you would normally position the person to be a junior with senior folks overseeing his work.


Ah got it, ty for the explanation.


What's hard about AI that requires special expertise? In many ways it is much simpler than regular software engineering because the conceptual landscape in AI is much simpler. Every AI framework offers the same conceptual primitives and even deployment targets, whereas most web frameworks have entirely different conceptions of something as simple as MVC, so knowing one web framework isn't very useful for learning and understanding another. But if you know how to use PyTorch, you can very easily transfer that knowledge to another framework like TensorFlow or JAX.
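
For what "same conceptual primitives" means in practice: model, loss, optimizer, gradient step, in every framework. A minimal PyTorch sketch; the TensorFlow or JAX version rearranges the same four pieces:

    import torch
    from torch import nn

    model = nn.Linear(10, 1)                             # model
    loss_fn = nn.MSELoss()                               # loss
    opt = torch.optim.SGD(model.parameters(), lr=0.01)   # optimizer

    X, y = torch.randn(64, 10), torch.randn(64, 1)
    for _ in range(100):                                 # gradient-step loop
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()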

It should be possible for a competent software engineer to get up to speed in AI in less than 6 months and much of that time can be on the job itself.


> What's hard about AI that requires special expertise?

AI is ill-defined, so the premise of your comment makes it difficult to answer. For small, well-known tasks (image classification, object detection, sentiment detection) that are train-once on a single dataset and deploy-once, what you are saying is true. But for more complex products there is a lot of arcane knowledge that goes into training, deploying, and maintaining a model.

On the training side, you need to be able to define the correct metrics, identify bottlenecks in your dataloader, scale to multiple nodes (which is itself a sub-field, because distributing a model is not simple), and run evaluation. Throughout the whole thing you have to implement proper dataset versioning (otherwise your evaluation results won't be comparable) and store the data in a way that has enough throughput not to bottleneck your training, without bankrupting the company (images and videos are not small).
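
As one concrete example of that arcane knowledge: before scaling out, you check whether the GPU is being starved by the dataloader. A crude timing sketch (assuming you already have a PyTorch DataLoader called loader and a model on the GPU; both names are placeholders):

    import time
    import torch

    # If data-loading time dominates step time, more GPUs won't help;
    # fix the input pipeline (workers, prefetching, storage) first.
    t_data = t_compute = 0.0
    end = time.perf_counter()
    for batch, _ in loader:                       # your existing DataLoader
        t_data += time.perf_counter() - end
        start = time.perf_counter()
        out = model(batch.to("cuda", non_blocking=True))
        torch.cuda.synchronize()                  # make the GPU timing honest
        t_compute += time.perf_counter() - start
        end = time.perf_counter()
    print(f"data: {t_data:.1f}s, compute: {t_compute:.1f}s")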

Finally, you have a trained model that needs to be deployed. GPU time is expensive, so you need to know about compilation techniques, operator fusion, and quantization, and you need to be able to scale. The requirements are complex because the input data is not always just text.
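
For a taste of that deployment side, dynamic quantization in PyTorch is about the gentlest entry point; real services layer compilation and operator fusion on top of something like this:

    import torch
    from torch import nn

    model = nn.Sequential(
        nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)
    ).eval()

    # Convert Linear layers to int8 for inference: smaller, often faster
    # on CPU, at an accuracy cost that has to be measured, not assumed.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )
    # `quantized` can now be saved and served like any other module.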

So yes, all the above (and a lot more) require specific expertise.


How long would it take for someone to learn all that?


2-3 years of full time or near full time study.

I know cause I did it.

And I knew the math beforehand. I was a Physics major in college with a CS minor.


As with most topics in software engineering, I'd say you will have to keep learning as you go. They keep coming out with larger models that require fancier parallelism and faster data pipelines. Nvidia comes out with a new thing to accelerate inference every year. Want to use something other than Nvidia? Now you need to learn TPUs, Trainium, Meta's accelerator (whatever its name is).


IMO you can only learn all that by doing a few successful ML projects end-to-end. So, a few years?


Is this a catch-22, then, or is there a rational course of self-study into the field for those who are competent?


Well, these were senior-level skills: a person who can drive and complete a project. I don't know how you could become senior via self-study, without practical hands-on experience on a project (working with and learning from somebody with experience).


Not that long then, especially if someone was motivated enough to complete the projects as quickly as possible.


There's no way even the smartest hard working expert engineer will be competent in AI in 6 months.

I've been in industry and now I do research at a top university. I hand pick the best people from all over the world to be part of my group. They need years under expert guidance, with a lot of reading that's largely unproductive, while being surrounded by others doing the same, in order to become competent.

Writing code is easy. You can learn to use any API in a weekend. That's not what is hard.

What's hard is, what do you do when things don't work. Fine, you tried the top 5 models. They're all ok, but your business requirements need much higher reliability. What do you do now?

This isn't research. But you need a huge amount of experience to understand what you can and cannot do, how to define a problem in a way that is tractable, what problems to avoid and how to avoid them, what approaches cannot possibly work, how to tweak an endless list of parameters, how to know if your model could work if you spent another 100k of compute on it or 100k of data collection, etc.

This is like saying you can learn to give people medical advice in 6 months. Sure, when things are going well, you could handle many patients with a Google search. But the problem is what happens when things go badly.


Becoming an ML engineer is about 6 months of work for a competent backend engineer.

But becoming an X-Scientist (Data/Applied/Applied Research) is a whole different skill set. Now, this kind of role only exists in a proper ML company. Just acquiring Statistics & Linear Algebra 201-level intuition is about 6 months of full-time study in its own right. You also need deep skills in one of the Tabular/Vision/NLP/Robotics areas and to get hired into a role accordingly. Usually a 1-year intensive master's-level effort is good enough to get your foot in the door, with the more prestigious roles needing about 2 years of intensive work and at least one state-of-the-art result to your name.

Then you have proper researchers, and that might be the hardest field to break into right now. I know kids who have done hardcore ML since high school and are entering the industry after their master's or PhD. I would not want to be an entry-level researcher right now. You need undergrad math-CS dual-major-level skills just to get started, and candidates are expected to have delivered state-of-the-art results a few times just to be called for an interview. I'd say you need at least 3 years of full-time effort if you want to pivot into this field from SWE.


Good to know.


AI is much harder if you need competitive results, and if you don't need competitive results you don't need to hire a dedicated AI person. Just feed data into some library, which is typical software engineering and doesn't have anything to do with AI.


The only metric that matters for a business is whatever helps its bottom line. No one really cares about competitive results if they can just fine-tune some open-source model on their own dataset and get good business outcomes. So if there is good data and access to compute infrastructure to train and fine-tune an open-source model, then the only obstacle to figuring out whether AI works for the business is setting up the training and deployment pipeline. That requires some expertise, but it can be learned on the job or from any number of freely available tutorials.

I don't think AI is hard to learn. The fundamentals are extremely simple, and a competent software engineer can learn all the required concepts in a few months. It's easier if you already have a background in mathematics, but that's not required. If you can write software then you can learn how to write differentiable tensor programs with any of the AI frameworks.
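
"Differentiable tensor program" sounds grander than it is; in any modern framework it's a few lines. A sketch in PyTorch:

    import torch

    # Ordinary code over tensors, with gradients of the output with
    # respect to the inputs computed automatically.
    x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
    y = (x ** 2).sum()      # y = x1^2 + x2^2 + x3^2
    y.backward()
    print(x.grad)           # tensor([2., 4., 6.]) = dy/dx = 2x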


Yes, and those businesses don't need to hire an AI person. This topic is AI research jobs, not people who have to call an ML library once in a while in their normal software job.

Edit: You asked what it is about these jobs that requires expertise. I answered: it requires expertise to create competitive models. So companies that need competitive models require that expertise.


Do you build competitive AI models?


I worked on AI at Google, some would say Google isn't competitive in the space but at least they try to be and their business model depends on it.

Edit: Why do you ask? I don't see why it is relevant for the discussion.


HN is often full of abstract argumentation so it helps to know if someone has actual experience doing something instead of just pontificating about it on an internet forum.


I thought what I said was common knowledge on HN; it was the last time I was in one of these discussions, a few years ago. But something seems to have changed. I guess the "use an ML library" jobs have drowned out the others by now, and that has colored these discussions.


People come and go, so I don't know how much can be assumed to be common knowledge. What changed is that big enterprises figured out that ML/AI can now be applied in their business contexts at a low enough cost to justify the investment to shareholders, without anyone getting fired if things don't work out as expected. Every business has data that can be turned into profits, and investing in AI is perceived to be a good way to do that now.


Those jobs have been on the rise for over a decade now; they were the majority a few years ago as well, but at least there was more awareness of the different kinds of jobs out there.


> "What's hard about AI that requires special expertise?"

Several years ago on HN there was a blog post which (attempted to) answer this question in detail, and I have been unsuccessfully trying to find it for a long time. The extra facts I can remember about it are:

* It was by a fairly well known academic or industry researcher

* It had reddish graphics showing slices of the problem domain stacking up like slices of bread

* It was on HN, either as a submission or in the comments, between 2016 and 2018.

If anybody knows the URL to this post, I would be stoked!


This is the type of thing ChatGPT is really good at. You might have some luck if it doesn't pop up here.

--

I decided to have a crack myself and see what came back.

Here's a few names / blogs that might be useful:

Chris Olah: He's written extensively about deep learning and AI. His blog, colah.github.io, has a unique graphical style that helps explain complex topics.

Distill.pub: This online journal publishes clear and visually engaging articles on machine learning topics. Some of the articles have been discussed on HN.

Andrej Karpathy: Director of AI at Tesla and previously a researcher at OpenAI and Stanford. He's known for his blog, karpathy.github.io, where he delves into various AI topics.

Ian Goodfellow: Known for inventing Generative Adversarial Networks (GANs) and for his deep learning textbook. He might have some writings that match your description.

Ben Recht: A professor at Berkeley who writes about the challenges and misunderstandings in machine learning on his blog, www.argmin.net.

Sebastian Ruder: He has written many articles about NLP and machine learning at ruder.io.

You can also try searching via https://hn.algolia.com/


I'd at least debate whether it's much harder to learn a new web framework and its concepts, or whatever is required to solve the ML tasks at a company. If you know how database+frontend+backend work (and are already used to HTML/CSS/SQL/JS+another language), you can also learn a new framework on the job.

Knowing the library is the least hard part about ML work just like knowing the web framework is the least hard part about webdev (both imo). It's much more important to understand the actual problem domain and data and get a smooth data pipeline up and running.

Then there's scaling, optimizing inference, squeezing out better performance, and the annoyance of labeling. There's a pretty solid gap between applying some framework to a preexisting, never-changing dataset and curating said dataset in a changing environment. And if we're talking about RL and not just supervised/unsupervised learning, then building a suitable training environment etc. also becomes quite interesting.

If someone asked me "what's so hard about webdev", my answer would be similar, btw: it's fairly easy to set up a reasonably complicated "hello world" project in any given framework, but it gets a lot harder when real-world issues like different auth workflows, security, scaling, and handling database migrations enter the picture.


These are good points to consider.


If your job is only calling an API's .fit() method, then that is not a job at all.

If something is already done, i.e. a model is available for your exact use case (which never happens), then using and deploying it can be done by a good SWE, and no ML/AI specialist is needed at all.

To solve any real problem that is novel, you need to know a lot of things. You need to be on top of the progress being made by reading papers, and be a good enough engineer to implement the ideas that you are going to have iff you are creative/a good problem solver.

And to read those papers you need solid college-level calculus and stats.

If this is so easy, then why don't you do it, and get a job at OpenAI/Tesla/etc?


It's a matter of opportunity cost. I don't think working in AI would be the best use of my time so I don't work at OpenAI/Tesla/etc.


I'm genuinely curious, what is your expectation of candidates looking to get into ML at the entry level?

You seem to look down on those who have

1) learned from online courses

or

2) used AI on tasks that don't require it

Isn't this a bit contradictory? Or do you expect candidates to have found a completely novel use case for AI on their own?

I understand that most ML roles prefer a master's degree or PhD, but from my experience most of the master's degrees in ML being offered right now were spawned from all the "AI hype". That is to say, they may not include a lot of core ML courses and probably are not a significantly better signal of a candidate's qualifications than some of the good online courses out there.

So what does that leave, only those with a PhD? I think it's unreasonable that someone should need that many years of formal education to get an entry level position. Maybe I'm missing something, but I'm really wondering, what do you expect from candidates? I think a few years of professional software engineering experience with some demonstrated interest in AI via online courses and personal projects should be enough.


It sounds like Aurornis was not, in fact, trying to hire people at the entry level.

Most companies doing regular, non-ML development hire a mix of junior and experienced engineers, with the latter providing code reviews, mentorship and architectural advice alongside normal programming duties.

It's understandable that someone kicking off a new ML project would hope to get the experienced hires on board first.

But there are a lot more junior people on the market than senior people right now - as is the nature of a fast growing market.


Ok, that makes sense.

I agree, it's problematic that there are so many more juniors than seniors in the industry right now. I feel like many juniors are being left without mentorship, and then it becomes much harder for them to grow and eventually become qualified for senior roles. So that could help explain why many candidates seem so weak, along with all the recent hype.

I guess eventually the market will cool off and the hype will die down since this stuff seems to be cyclical, and the junior engineers who are determined enough to stick it out and seek out mentorship will be able to grow and become seniors.

But it definitely seems like the number of seniors is a bottleneck for talent across the industry.


"The weirdest trend was all of the people who had done large AI projects on things that didn’t need AI at all. We had people bragging about spending a year or more trying to get an AI model to do simple tasks that were easily solved deterministically with simple math, for example. There was a lot of AI-ification for the sake of using AI."

I've seen two variants of this:

1) People that have worked for traditional (as in non-tech) companies, where there's been a huge push for digitalization and "AI". These things come from the very top, and you don't really have much say. I've been there myself.

The upper echelon wants "AI" so that they can tick boxes for the board of directors. With these folks, it's all about managing expectations - but frankly, they don't care if you implement a simple regression model or spend a fortune on overkill models. The most important part is that you've brought "AI" to the company.

2) The people that want to pad their resumes. There's no need, no push, but no-one is stopping you. You can add "designed and implemented AI products to the business operation blablabla" to your CV.

These days, I've seen and experienced 1) an awful lot. It's all about keeping up with the Joneses.


> We had people bragging about spending a year or more trying to get an AI model to do simple tasks that were easily solved deterministically with simple math, for example.

Fad-chasing often leads to silly technical decisions. Same thing happened with blockchains when they were at the peak of the famous hype cycle. [0]

[0] https://en.wikipedia.org/wiki/Gartner_hype_cycle


It is getting doubly weird with the LLM/Diffusion explosion over the last year.

The applied-research ML role has evolved from being a computational math role to a PyTorch role to an 'informed throw-things-at-the-wall' role.

I went from reading textbooks (Murphy, Ian Goodfellow, Bishop) to watching curated NIPS talks to reading arXiv papers to literally trawling random Discord channels and subreddits to get a few months' leg up on anyone in research. Recently, a paper formally cited /r/localllama for its core idea.

> follow online tutorials

The open-source movement moves so quickly that running someone's Colab notebook is the way to be at the cutting edge of research. The entire agents, task-planning, and meta-prompting field was invented in random forums.

________________

This is mostly relevant to the NLP/Vision world... but take a break for 1-2 years, and your entire skill set is obsolete.


This comment describes a real problem for senior engineers who want to explore a new domain. It is impractical for someone with 10+ years of experience to work as an entry-level engineer. What other options exist besides completing online courses to get experience in that domain?

This is not specific to ML/AI roles. The same problem applies to anyone who wants to explore any of these domains - SRE, DataEng, Backend, Frontend.

Personally, I am a backend engineer who wants to get into ML infra roles. My current plan is to do these online courses and hopefully transfer internally to a team working in this area. Only after getting real industry experience there would I look for opportunities elsewhere.

I am genuinely curious if anyone has better ideas for people in my situation.


> The number of candidates applying with claimed ML/AI experience who haven’t done anything more than follow online tutorials is wild.

Sure, I get this, but I suspect that the number of people who have actual ML/AI experience is pretty small given that the field is nascent. If you really want to hire people to do this kind of work you're going to need to go with people who have done the online tutorials, read the papers, have an interest, etc. Yes, once in a while you're going to find someone who has actual solid ML experience, but you're also going to have to pay them a lot. That's just how things work in a field like this that's growing rapidly.


> The weirdest trend was all of the people who had done large AI projects on things that didn’t need AI at all.

Yeah this is a major phenomenon. Everybody's putting "ai" stickers on everything. So the job market screams "we need ai experts!" in numbers far exceeding the supply of ai experts, because it was a tiny niche until a couple years ago. Industry asks for garbage, industry gets garbage.


Reminds me of the software hiring market during the dotcom boom.

I think the hype on the field and the shitty candidate pool go hand in hand. The shitty candidate pool will groupthink / cargo cult the space without much critical thinking. The groupthink / hype will cause people to jump into the field who don't have any business being in the field.


I’ve done heavy infra, serious model and feature engineering, or both on all of FB Ads and IG organic before 2019, did a startup on attention/transformer models at the beginning of 2019, and worked in extreme multi-class settings in bioinformatics this year.

And out of all the nightmare “we have so many qualified candidates we can’t even do price discovery” conversations in 2023, the ML ones have been the worst.

If you’re running a serious shop that isn’t screwing around and you’re having trouble finding tenured pros who aren’t screwing around, email me! :)


So having no experience is bad, but going out of their way to get experience is also bad? Isn't this presenting a bit of a no-win scenario?

Then again, that seems to be common with the job market.


Corporations want trained people but don't want to invest in training them.


This isn't unique to AI. Post any programming job and something like 50-80% of applicants with seemingly perfect resumes won't be able to pass a FizzBuzz test.


100% this. ML has been heavily hyped for a while now, and having it was seen as a badge of honor for a company. To be fair, ML is also not something that was historically central to a degree, so many people wanting to get into AI, even good engineers, did not have a background in it. This is changing, but the hype and the lack of an experienced pool don't help.


I don’t think the background is really that important tbh.

From physics I have a good theoretical grounding in how ML works (optimizing a cost function over a high-dimensional manifold to reconstruct a probability distribution, then using the distribution for some task), but I personally find actually 'doing ML' to be rather dull.


I had two successful hires who had just graduated from college with no machine learning experience (majors in accounting and civil engineering). With 3 months of training on real-world projects, they became quite fine machine learning engineers.

You probably do not need AI experts if you just need good machine learning engineers to build models that solve problems.


Hasn't that just been the tech market ever since software dev appeared on the lists of best-paid, low-stress jobs?


I think it’s not even about low stress, but low barrier to entry. There are plenty of things I’d rather be doing than software development (in fact I never planned on going into this field professionally), but I just can’t.

I’m also not surprised by the “The number of candidates applying with claimed ML/AI experience who haven’t done anything more than follow online tutorials is wild”. Just go look at any Ask HN thread about “how do I get into ML/AI”; this is pretty typical advice. Hell, it’s pretty typical advice given to people asking how to get into any domain. Not sure how well it works outside of bog-standard web development, though.


>We had people bragging about spending a year or more trying to get an AI model to do simple tasks that were easily solved deterministically with simple math, for example.

TBF there are whole companies doing this. It's a good way to learn too, as you have existing solutions to compare yourself to.


> The weirdest trend was all of the people who had done large AI projects on things that didn’t need AI at all.

I can relate to this a lot. In my company many things you can sell as "AI" can really be solved with traditional data processing.


I have a background in computational linguistics from a good university, and then I got sidetracked by life for the last decade. What real experience did you look for that was a good signal?


What motivates you and other ML researchers to do this work? What's the end goal and why do you want it?



