Sufficiently large LLMs can show emergent characteristics like long-term planning or agentic behavior. While GPT-4 doesn't exhibit these behaviors right now, it is expected that bigger models will begin to show intent, self-preservation, and purpose.
The GPT-4 paper has this paragraph: "... Agentic in this context
does not intend to humanize language models or refer to sentience but rather refers to systems characterized by ability to, e.g., accomplish goals which may not have been concretely specified and which have not appeared in training; focus on achieving specific, quantifiable objectives; and do long-term planning. "
GPT-4 has 32k tokens of context. I'm sure someone out there is implementing the pipework for it to use some of that as a scratchpad under its own control, in addition to its input.
In the biological metaphor, that would be individual memory, in addition to the species-level evolution through fine-tuning.
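The scratchpad idea can be sketched in a few lines. This is a hypothetical toy, not any real API: `call_model` is a stand-in for whatever chat-completion endpoint you use, and the `WRITE:` line protocol is an invented convention for letting the model save its own notes.

```python
# Toy sketch of a model-controlled scratchpad. Assumption: the model is told
# it can prefix a line with "WRITE: " to persist a note for future turns.

def call_model(prompt: str) -> str:
    # Stub standing in for a real LLM API call; here we pretend the model
    # chose to save a note before answering.
    return "WRITE: user prefers metric units\nOn to your question..."

def run_turn(user_msg: str, scratchpad: list[str]) -> str:
    prompt = (
        "SCRATCHPAD (your private notes from earlier turns):\n"
        + "\n".join(scratchpad)
        + "\n\nUser: " + user_msg
        + "\n(Prefix any line with WRITE: to save a note.)"
    )
    reply_lines = []
    for line in call_model(prompt).splitlines():
        if line.startswith("WRITE: "):
            # The model, not the pipeline, decides what gets remembered.
            scratchpad.append(line[len("WRITE: "):])
        else:
            reply_lines.append(line)
    return "\n".join(reply_lines)

pad: list[str] = []
answer = run_turn("How far is 10 km in miles?", pad)
```

The notes in `pad` would be prepended to every later prompt, giving the model a crude read/write memory that outlives a single context window.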
Yeah, I’m doing that to get GPT-3.5 to remember historical events from other conversations. It never occurred to me to let it write its own memory, but that’s a pretty interesting idea.
ChatGPT changes when we train or fine-tune it. It also has access to local context within a conversation, and those conversations can be fed back as more training data. This resembles a hard divide between short-term and long-term learning.
And I don’t understand how one assumes that can be known.
I see this argument all the time: it’s just a stochastic parrot, etc.
How can you be sure we’re not as well and that there isn’t at least some level of agency in these models?
I think we need some epistemic humility. We don’t know how our brains work, and we made something that mimics parts of their behavior remarkably well.
Let’s take the time and effort to analyze it deeply, that’s what paradigmatic shifts require.