| | Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com) |
| 1 point by ibobev 4 months ago | past |
|
| | Writing an LLM from scratch, part 21 – perplexed by perplexity (gilesthomas.com) |
| 1 point by gpjt 4 months ago | past |
|
| | Writing an LLM from scratch, part 20 – starting training, and cross entropy loss (gilesthomas.com) |
| 41 points by gpjt 4 months ago | past | 3 comments |
|
| | How Do LLMs Work? (gilesthomas.com) |
| 2 points by gpjt 5 months ago | past | 1 comment |
|
| | How Do LLMs Work? (gilesthomas.com) |
| 1 point by ibobev 5 months ago | past |
|
| | The maths you need to start understanding LLMs (gilesthomas.com) |
| 616 points by gpjt 5 months ago | past | 120 comments |
|
| | What AI chatbots are doing under the hood (gilesthomas.com) |
| 2 points by gpjt 5 months ago | past |
|
| | LLM from scratch, part 18 – residuals, shortcut connections, and the Talmud (gilesthomas.com) |
| 2 points by gpjt 6 months ago | past |
|
| | The fixed length bottleneck and the feed forward network (gilesthomas.com) |
| 1 point by gpjt 6 months ago | past |
|
| | Writing an LLM from scratch, part 17 – the feed-forward network (gilesthomas.com) |
| 8 points by gpjt 6 months ago | past |
|
| | Writing an LLM from scratch, part 16 – layer normalisation (gilesthomas.com) |
| 1 point by gpjt 7 months ago | past |
|
| | Leaving PythonAnywhere (gilesthomas.com) |
| 3 points by gpjt 8 months ago | past |
|
| | Writing an LLM from scratch, part 15 – from context vectors to logits (gilesthomas.com) |
| 7 points by gpjt 8 months ago | past |
|
| | Writing an LLM from scratch, part 14 – the complexity of self-attention at scale (gilesthomas.com) |
| 1 point by gpjt 9 months ago | past |
|
| | Writing an LLM from scratch, part 13 – attention heads are dumb (gilesthomas.com) |
| 351 points by gpjt 9 months ago | past | 67 comments |
|
| | Writing an LLM from scratch, part 12 – multi-head attention (gilesthomas.com) |
| 3 points by gpjt 9 months ago | past |
|
| | Writing an LLM from scratch, part 11 – batches (gilesthomas.com) |
| 2 points by gpjt 10 months ago | past |
|
| | Writing an LLM from scratch, part 10 – dropout (gilesthomas.com) |
| 90 points by gpjt 11 months ago | past | 8 comments |
|
| | Adding /llms.txt (gilesthomas.com) |
| 1 point by gpjt 11 months ago | past |
|
| | Writing an LLM from scratch, part 9 – causal attention (gilesthomas.com) |
| 4 points by gpjt 11 months ago | past |
|
| | Writing an LLM from scratch, part 8 – trainable self-attention (gilesthomas.com) |
| 380 points by gpjt 11 months ago | past | 31 comments |
|
| | It’s still worth blogging in the age of AI (gilesthomas.com) |
| 333 points by gpjt 11 months ago | past | 222 comments |
|
| | The benefits of learning in public (gilesthomas.com) |
| 311 points by gpjt 11 months ago | past | 97 comments |
|
| | Getting MathML to render properly in Chrome-based browsers (gilesthomas.com) |
| 3 points by LorenDB 12 months ago | past |
|
| | Do reasoning LLMs need their own Philosophical Language? (gilesthomas.com) |
| 1 point by gpjt on Jan 17, 2025 | past | 1 comment |
|
| | Messing around with fine-tuning LLMs, detailed memory usage for an 8B model (gilesthomas.com) |
| 1 point by vednig on Aug 21, 2024 | past |
|
| | LLM Quantisation Weirdness (gilesthomas.com) |
| 2 points by gpjt on Feb 28, 2024 | past |
|
| | Pam-unshare: a PAM module that switches into a PID namespace (gilesthomas.com) |
| 5 points by gpjt on April 15, 2016 | past |
|
| | Does #EUVAT make charging Bitcoin impossible for EU digital services businesses? (gilesthomas.com) |
| 3 points by gpjt on Dec 20, 2014 | past |
|
| | How many python programmers are there in the World today? (gilesthomas.com) |
| 1 point by lifeisstillgood on May 8, 2014 | past | 2 comments |
|