Great work! I love your videos; they've taught me so much. Any plans for a Mixture of Experts (MoE) video? My understanding is that, starting with GPT-4, most advanced models use MoE to some extent. For example, can I take the model from your GPT-2 video and just change the feed-forward layer to an MoE layer like the one found here (1)? I guess I can just try it myself, but I enjoy the expert guidance you provide in your videos. Please don't stop! Great content!
1. technical track (all the GPT repro series)
2. general audience track
For (2), I had a 1hr video from 1 year ago, but I didn't actually expect that video to be some kind of authoritative introduction to LLMs. The history is that I was invited to give an LLM talk (to a general audience), prepared some random slides for a day, gave the talk, and then re-recorded the talk in my hotel room later in a single take, and that became the video. It was quite random and haphazard. So I wanted to loop back around more formally and do a more comprehensive intro to LLMs for a general audience: something I could, for example, give to my parents, or a friend who uses ChatGPT all the time and is interested in it, but doesn't have the technical background to go through my videos in (1). That's this video.