> Similarly, consider a series like A Song of Ice and Fire. A human reader is still consciously aware of (and waiting for) the answers to questions raised in the very first book.
Some of them, some of the time. This is best compared to ChatGPT having those books in its training dataset.
The context window is more like short-term memory. GPT-4 can fit[0] ~1.5 chapters of Game of Thrones; GPT-4-32k almost six. Leaving room for the prompt, questions and replies, call it one chapter for GPT-4 and five chapters for GPT-4-32k.
Can you imagine having a whole chapter in your working memory at once? Being simultaneously aware of every word, every space, every comma, every turn of phrase, every character and every plot line mentioned in it - and then being able to take it all into account when answering questions? Humans can do it for a paragraph, a stanza, maybe half a page. Not a whole chapter in a novel. Definitely not five. Not simultaneously at every level.
I feel that in this sense, LLMs have already surpassed our low-level capacity - though the comparison is a bit flawed, since our short-term memory also keeps track of sights, sounds, smells, time, emotions, etc. My point here isn't really to compare who has more space for short-term recall - it's to point out that answering questions about immediately read text is another narrow, focused task which machines can now do better than us.
----
[0] - 298000 words in the book (via [1]), over 72 chapters (via [2]), gives us 4139 words per chapter. Multiplying by 4/3, we get 5519 tokens per chapter. GPT-4-8k can fit 1.45x that; GPT-4-32k can fit 5.8x that.
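For anyone who wants to check the arithmetic in [0], here is a minimal sketch. It assumes the context windows are rounded to 8000 and 32000 tokens (which is what the 1.45x and 5.8x ratios imply) and uses the same 4/3 tokens-per-word rule of thumb; the `reserved_tokens` parameter is a hypothetical knob for the space set aside for prompt, questions and replies.

```python
# Figures from footnote [0]; the window sizes are rounded assumptions.
WORDS_IN_BOOK = 298_000
CHAPTERS = 72
TOKENS_PER_WORD = 4 / 3  # rough rule of thumb, not an exact tokenizer count


def chapters_that_fit(context_tokens, reserved_tokens=0):
    """Estimate how many average-length chapters fit in a context window."""
    tokens_per_chapter = WORDS_IN_BOOK / CHAPTERS * TOKENS_PER_WORD
    return (context_tokens - reserved_tokens) / tokens_per_chapter


if __name__ == "__main__":
    print(f"GPT-4-8k:  {chapters_that_fit(8_000):.2f} chapters")   # 1.45
    print(f"GPT-4-32k: {chapters_that_fit(32_000):.2f} chapters")  # 5.80
```

Passing a nonzero `reserved_tokens` shrinks the estimate accordingly, which is where the "one chapter" and "five chapters" figures above come from.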
Just thinking about this, I realized that as a musician I do it all the time. I can recall lyrics, chords, instrumental parts and phrasing to hundreds if not thousands of pieces of music and "play them back" in my head. Unlike a training set, though, I can usually do that after listening to a piece only a few times, and also recall what I thought of each part of each piece, and how I preferred to treat each note or phrase each time I played it, which gives me more of a catalog of possible phrasings the next time I perform it. This is much easier for me than remembering exact words I've read in prose. I suspect the relationships between all those different dimensions are what make the memory more durable. I must also be creating intermediary dimensions and vectors to do that processing, because one side effect of it is that I associate colors with pitches.
If we are trying to at least match human level, then all we have to do is summarize and store information for retrieval in the context window. Emphasis on summarize.
We pull out the key points explicitly so they are kept verbatim rather than summarized, and for the rest (the less important parts) we summarize and save it.
That would very likely fit and it would probably yield equal to or better recall and understanding than humans.
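A rough sketch of that idea, under heavy assumptions: `naive_summary` is a hypothetical stand-in that just truncates (a real setup would presumably ask a model to summarize), `estimate_tokens` reuses the 4/3 tokens-per-word rule of thumb from [0], and `build_context` is an illustrative name, not anything from an actual library.

```python
def estimate_tokens(text):
    """Rough token estimate: 4/3 tokens per word (rule of thumb)."""
    return round(len(text.split()) * 4 / 3)


def naive_summary(text, max_words=30):
    """Hypothetical placeholder summarizer: keep only the first max_words words."""
    return " ".join(text.split()[:max_words])


def build_context(passages, budget_tokens):
    """Pack passages into a token budget.

    passages: list of (text, is_key_point) pairs.
    Key points are kept verbatim and considered first; everything else
    is summarized. Entries that would exceed the budget are skipped.
    Note that this reorders passages (key points come first).
    """
    kept, used = [], 0
    # sorted() is stable, so original order is preserved within each group.
    for text, is_key in sorted(passages, key=lambda p: not p[1]):
        entry = text if is_key else naive_summary(text)
        cost = estimate_tokens(entry)
        if used + cost > budget_tokens:
            continue
        kept.append(entry)
        used += cost
    return kept, used
```

Whether this matches human recall depends almost entirely on how good the summarizer and the key-point selection are, which this sketch does not address.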
[1] - https://blog.fostergrant.co.uk/2017/08/03/word-counts-popula...
[2] - https://awoiaf.westeros.org/index.php/Chapters_Table_of_cont...