"More or less" ... I can see that it's hard after decades of "data-centric" perspectives to think in terms of "computers" rather than "data", and about semantics rather than pragmatics. It's not "data contains procedures" but that objects are (a) semantically computers, and are impervious to attack from the outside (a computer has to let an attack happen via its own internal programming), and (b) what's inside can be anything that is able to deal with messages in a reasonable way. In Smalltalk, these are more objects (there are only objects in Smalltalk). The way the internals of typical Smalltalk objects are organized could be done better (we used several schemes, but all of them had variables and methods (which were also objects).
So "2" is not "data" in Smalltalk (unfortunately, it is in Java, etc.)
We had planned that the interior of objects should be an "address space of objects" because this is a better and more recursive way to do modularization, and to allow a different inter-viewing scheme (we eventually did some of this for the Etoys system for children about 20 years ago). But the physical computers at Parc were tiny, and the code we could run was tiny (the whole system in the Ted Nelson demo video was a little over 10,000 lines of code for everything). So we stayed with our top priority: to make a real and highly interactive system that was very comprehensive about what it and the user could do together.
Taking your feedback into consideration, I had the thought that it would be more accurate to talk about things like "2" in an OOP context as symbols with semantics (which provide meaning to them), not data, since "data" connotes more a collection of inputs/quantities, where we may be able to attach a meaning to it, or not, and that wasn't what I was going after. I was going after a relationship between information and semantics that can be associated with it, but trying to provide a transition point from the idea of data structures to the idea of objects, for someone just learning about OOP. Doing a sleight of hand may not do the trick.
My starting point was to use a very interesting concept, when I encountered it, in SICP, where it discussed using procedures to emulate data, and everything that can be done with it. It seemed to help explain for the first time what "code is data" meant. It illustrated the inversion I was talking about:
"In 2.1.3 it questions our notion of “data”, though it starts from a higher level than most people would. It asserts that data can be thought of as this constructor and set of selectors, so long as a governing principle is applied which says a relationship must exist between the selectors such that the abstraction operates the same way as the actual concept would." It went on to illustrate how in Lisp/Scheme you could use functions to emulate operations like "cons", "car", and "cdr", completely in procedural space, without using data structures at all.
This is what I illustrate with "2 + 2", and such, that code is doing everything in this operation, in OOP. It's not a procedure applied to two operands, even though that's how it looks on the surface.
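A rough sketch of the SICP idea referred to above, written in Python rather than Scheme purely for illustration. The pair is nothing but a closure, and `car`/`cdr` work only because they honor the governing relationship with `cons`. The name `dispatch` and the 0/1 protocol follow the spirit of SICP 2.1.3, but this is a paraphrase, not the book's code.

```python
def cons(x, y):
    # The "pair" is just a procedure that remembers x and y.
    def dispatch(m):
        if m == 0:
            return x
        if m == 1:
            return y
        raise ValueError("argument not 0 or 1")
    return dispatch


def car(z):
    # Select the first element by asking the procedure for it.
    return z(0)


def cdr(z):
    # Select the second element the same way.
    return z(1)


# A two-element list built without any data structure at all.
pair = cons(1, cons(2, None))
```

Here `car(pair)` gives back `1` and `car(cdr(pair))` gives back `2`, even though no record, array, or tuple was ever allocated by the program itself.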
Yes, the SICP follows "simulate data" ideas much further back in the past, including the B5000 computer, and especially the OOP systems I did. But the big realization is that there are very few things that are helpful when they are passive, and the non-passive parts are the unique gift of computing. The question is not whether ideas from the past can be simulated (easy to see they can be if the building blocks are whole computers) but what do we "mean by 'meaning' "?
Good answers to this are out of the scope of HN, but we should be able to imagine -processes- moving forward in both our time and in simulated time that can answer our questions in a consistent way, and can be set to accomplish goals in a reasonable way.
I was using "data" in the spirit of a saying I heard many years ago in CS, that, "Data is code, and code is data." It seems that people in CS are still familiar with this phrase. I was focusing on the latter part of that phrase. I was trying to answer the question that I think is often implied once you start talking to people about real OOP, "What about data?" I almost don't like the term "data" when talking about this, because as you say, it gets one away from the focus on semantics, but whenever you're talking to people in the computing field, such as it is, I think this question is unavoidable, because people are used to thinking of code and information as separate, hence the notion of data structures. People need a way to translate in their minds between what they've done with information before, and what it can be. So, I used the term "data" to talk about "literal objects" (like "2", or other kinds of input), but I was using the description of "processors" (i.e., computers), "containing procedures," which can also be thought of as "operators."
I think the idea of an "inversion" is quite apt, because as you've said before, the idea of data structures is that you have procedures acting on data. With real objects, you still have the same essential elements in programming, the same stuff to deal with, but the kinds of things programmers typically think about as "data" are objects/computers in OOP, with intrinsic semantics. So, you're still dealing with things like "2", just as procedures acting on data do, but instead of it being just a "dead" symbol, that can't do anything, "2" has semantics associated with an interface. It's a computer.
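To make the "2 is a computer" point a little more concrete, here is a toy sketch in Python. Everything in it is hypothetical: the class `Num`, the `receive` entry point, and the `"value"` message are invented for illustration and are not how any Smalltalk implements numbers. The only point is that "2 + 2" becomes a message sent to an object that decides for itself what "+" means.

```python
class Num:
    """A toy 'number as computer': the outside world can only send it messages."""

    def __init__(self, value):
        # Internal state, touched only by this object's own handlers.
        self._value = value

    def receive(self, selector, *args):
        # The object looks at the message and decides how to respond.
        handlers = {
            "+": self._plus,
            "value": self._raw_value,
            "printString": self._print_string,
        }
        handler = handlers.get(selector)
        if handler is None:
            raise LookupError(f"message not understood: {selector}")
        return handler(*args)

    def _plus(self, other):
        # Ask the other object via a message instead of reaching inside it.
        return Num(self._value + other.receive("value"))

    def _raw_value(self):
        return self._value

    def _print_string(self):
        return str(self._value)


two = Num(2)
four = two.receive("+", Num(2))
```

Sending `"printString"` to `four` yields the string `"4"`. The `"value"` message is a simplification; a fuller design would more likely use some form of double dispatch.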
> We had planned that the interior of objects should be an "address space of objects" because this is a better and more recursive way to do modularization
Something that nags me in the back of my mind is that messages are not just any object, they always have the selector attached. Why not let objects handle any other object as a message? Is this what you mean by the above?
Thinking about the biological analogy (maybe taking it too far...): the system of cells is distinct from the system of proteins inside the cells and going up the layers we have the systems of creatures. So the way proteins interact is different from how cells interact, etc. but each system derives its distinct behaviors from the lower ones. Also, the messages are typically not the entities themselves but other lower level stuff (cells communicate using signals that are not cells). So in a large scale OO system we might see layers of objects emerge. Or maybe we need a new model here, not sure.
Take a look at the first implemented Smalltalk (-72). It implemented objects internally as a "receive the message" mechanism -- a kind of quick parser -- and didn't have dedicated selectors. (You can find "The Early History of Smalltalk" via Google to see more.)
This made the first Smalltalk "automatically extensible" in the dimensions of form, meaning, and pragmatics.
When Xerox didn't come through with a replacement for the Alto we (and others at Parc) had to optimize for the next phases, and this led to the compromise of Smalltalk-76 (and the succeeding Smalltalks). Dan Ingalls chose the most common patterns that had proved useful and made a fixed syntax that still allowed some extension via keywords. This also eliminated an ambiguity problem, and the whole thing on the same machine was about 180 times faster.
I like your biological thinking. As a former molecular biologist I was aware of the many orders of magnitude of difference in scale between biology and computing. (A typical mammalian cell will have billions of molecules, etc. A typical human will have 10 trillion cells with their own DNA and many more in terms of microbes, etc.) What I chose was the "Cambrian Revolution Recursively": from biology, that cells could work together in larger architectures, and that in computing you can give the interiors of things the same organization as the wholes because of references -- you don't have to copy. So just "everything made from cells, including cells", and messages made from cells, etc.
Some ideas you might find interesting are in an article I wrote in 1984 -- called "Computer Software" -- for a special issue of Scientific American on "Software". This talks about the subject in general, and looks to the possibility of "tissue programming" etc.
I should have mentioned a few other things for the later Smalltalks. First, selectors are just objects. Second, you could use the automatic "message not understood" mechanism to field an unrecognized object. I think I'd do this by adding a method called "any" and letting it take care of arbitrary unknown objects ...
A selector is an object -- so that is pure -- and its use is a convention of the messaging, and the message itself is one object, that is an instance of Class message.
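A loose Python analogue of the idea above, offered only as a sketch: Python's `__getattr__` hook plays the role of the "message not understood" mechanism, and it forwards the whole message object to a catch-all method. The names `Message`, `Greeter`, and `any` are hypothetical and chosen to mirror the wording of the comment; none of this is Smalltalk's actual API.

```python
class Message:
    """The message itself is one object carrying a selector and arguments."""

    def __init__(self, selector, arguments):
        self.selector = selector
        self.arguments = arguments


class Greeter:
    def greet(self, name):
        # An ordinary, recognized message.
        return "hello, " + name

    def any(self, message):
        # Hypothetical catch-all: the receiver decides what to do
        # with a message it has no dedicated method for.
        return "not understood: " + message.selector

    def __getattr__(self, selector):
        # Python calls this only when normal lookup fails, which is
        # roughly where "message not understood" would be triggered.
        def forward(*arguments):
            return self.any(Message(selector, arguments))
        return forward


greeter = Greeter()
known = greeter.greet("world")
unknown = greeter.frobnicate(1, 2)
```

Here `known` is `"hello, world"` and `unknown` is `"not understood: frobnicate"`, so the unrecognized send reaches the receiver as a single message object instead of failing outright.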
What's fun is that every Smalltalk contained the tools to make their successors while still running themselves. In other words, we can modify pretty much anything in Smalltalk on the fly if we choose to dip into the "meta" parts of it, which are also running. In Smalltalk-72, a message send was just a "notify" to the receiver that there was a message, plus a reference to the whole message. The receiver did the actual work of looking at it, interpreting it, etc.
This is quite possible to make happen in the more modern Smalltalks, and would even be an interesting exercise for deep Smalltalkers.
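A toy sketch of the Smalltalk-72 style of send described above, in Python. The assumption here is that a message is just a sequence of tokens and that the send does nothing beyond handing the receiver a reference to it. The `Counter` class and its tiny "add N / reset" vocabulary are invented for illustration and bear no relation to real ST-72 syntax.

```python
class Counter:
    def __init__(self):
        self.total = 0

    def receive(self, message):
        # The receiver does all the work: it walks the raw message
        # and interprets it with its own little parser.
        tokens = iter(message)
        for token in tokens:
            if token == "add":
                self.total += next(tokens)  # consume the following token
            elif token == "reset":
                self.total = 0
            else:
                raise ValueError(f"cannot parse token: {token!r}")
        return self.total


def send(receiver, message):
    # The "send" only notifies the receiver and passes the whole message along.
    return receiver.receive(message)


counter = Counter()
result = send(counter, ["add", 5, "add", 2])
```

After this runs, `result` is `7`. Because there is no fixed selector, the receiver is free to accept any message shape it knows how to parse, which is also why the logic can become hard to follow as the grammar grows.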
> A selector is an object -- so that is pure -- and its use is a convention of the messaging
The selector 'convention' is hard coded in the syntax - this appears to elevate selector-based messaging over other kinds. But now I'm rethinking this differently - i.e. selectors aren't part of the essence, but a specific choice that could be replaced (if we find something better).
I can't remember if I've brought this up already in this thread, but if you want to "kick the tires" on ST-72, Dan Ingalls has an implementation of it up on the web. It's running off of a real ST-72 image. I wrote about it at https://tekkie.wordpress.com/2014/02/19/encountering-smallta...
I include a link to it, and described how you can use it (to the best of my knowledge), though my description was only current to the time that I wrote it. Looking at it again, Ingalls has obviously updated the emulation.
The nice thing about this version is it includes the original tutorial documentation, written by Kay and Adele Goldberg, so you can download that, and learn how to use it. I found that I couldn't do everything described in the documentation. Some parts of the implementation seemed broken, particularly the class editor, which was unfortunate, and some attempts to use code that detected events from the mouse didn't work. However, you can write classes from the command line (ST-72 was largely a command-line environment, on a graphical display, so it was possible to draw graphics).
If you take a look at it, you will see a strong resemblance to Lisp, if you're familiar with that, in terms of the concepts and conventions they were using. As Kay said in "The Early History of Smalltalk," he was trying to improve on Lisp's use of special forms. I found through using it that his notion of classes, from a Lisp perspective, existed in a nether world between functions and macros. A class could look just like a Lisp function, but if you add parsing behavior, it starts behaving more like a macro, parsing through its arguments, and generating new code to be executed by other classes.
The idea of selectors is still kind of there, informally. It's just that it takes a form that's more like a COND construct in Lisp. So, rather than each selector having its own scope, as in later versions, all of them exist in an environment that exists in the scope of the class/instance.
After using it for a while, I could see why they went to a selector model of message receipt, because the iconic language used in ST-72 allowed you to express a lot in a very small space, but I found that you could make the logic so complex it was hard to keep track of what was going on, especially when it got recursive.
> existed in a nether world between functions and macros
Macros are just functions that operate on functions at 'read-time', from my POV. So if you eliminate the distinction between read-time and run-time, they're the same.
> It's just that it takes a form that's more like a COND construct in Lisp.
And even COND isn't special, it's just represented as messaging in Smalltalk, right?
> you could make the logic so complex it was hard to keep track of what was going on
"And even COND isn't special, it's just represented as messaging in Smalltalk, right?"
Right. What I meant was that the parsing would begin with "eyeball" (ST-72 was an iconic language, so you would get a character that looked like an eye viewed sideways), and then everything after that in the line was a message to "eyeball," talking about how you wanted to parse the stream--what patterns you were looking for--and if the patterns matched, what messages you wanted to pass to other objects. That was your "selector" and method. What felt weird about it, after working in Squeak for a while, is these two concepts were combined together into "blobs" of symbolic code. You would have a series of these "messages to eyeball" inside a class. Those were your methods.
The reason I said it was similar to COND was it had a similar format: A series of expressions saying, "Conditions I'm looking for," and "actions to take if conditions are met." It was also similar in the sense that often that's all that would be in a class, in the same way that in Lisp, a function is often just made up of a COND (unless you end up using a PROG instead, which I consider rather like an abomination in the language).
In ST-72, there's one form of conditional that uses a symbol like "implies" in math (can't represent it here, I don't think), and another where you can be verbose, saying in code, "if a = b then do some stuff." But what actually happens is "if" is a class, and everything else ("a = b then do some stuff") is a message to it. Of course, you could create a conditional in any form you want.
In ST-80, they got rid of the "if" keyword altogether (at least in a "standard" system), and just started with a boolean expression, sending it a message.
a = b ifTrue: [<do-one-thing>] ifFalse: [<do-something-else>].
They introduced lambdas (the parts in []'s) as objects, which brought some of the semantics "outside of the class" (when viewed from an ST-72 perspective). It seems to me that presents some problems to its OOP concept, because the receiver is not able to have complete control over the meaning of the message. Some of that meaning is determined by partitioned "blocks" (lambdas) that the receiver can't parse (at least I don't think so). My understanding is all it can do with them is either pass parameter(s) to the blocks, executing them, or ignore them.
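The conditional-as-message idea can be sketched in Python roughly like this. The classes `TrueObject` and `FalseObject`, the method name `if_true_if_false`, and the helper `equals` are all made up for illustration; Smalltalk-80's real classes and selectors differ. Python lambdas stand in for blocks.

```python
class TrueObject:
    def if_true_if_false(self, true_block, false_block):
        # The "true" object runs the first block and ignores the second.
        return true_block()


class FalseObject:
    def if_true_if_false(self, true_block, false_block):
        # The "false" object runs the second block and ignores the first.
        return false_block()


def equals(a, b):
    # Stand-in for sending "=": the answer is an object, not a flag.
    return TrueObject() if a == b else FalseObject()


same = equals(1, 1).if_true_if_false(lambda: "same", lambda: "different")
different = equals(1, 2).if_true_if_false(lambda: "same", lambda: "different")
```

Here `same` is `"same"` and `different` is `"different"`. The sketch also shows the limitation raised above: the receiver can run a block or ignore it, but it has no way to look inside the block and reinterpret it.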
One of the big a-ha moments I had in Smalltalk was that you can create whatever control structures you want. The same goes for Lisp. This is something you don't get in most other languages. So a temptation for me, working in Lisp, has been to spend time using that ability to make code more expressive, rather than verbose. A positive aspect of that has been that it's gotten me to think about "meanings of meaning" in small doses. It creates the appearance to outsiders, though, that I am progressing on a problem very slowly. Rather than just accepting what's there and using it to solve some end goal, which I could easily do, I try to build up from the base that's there to what I want, in terms of expression. What I have only barely scratched the surface of is that I also need to do this in terms of structure--what you have been talking about here.
It's an extensible language with a meta system so you can make each and every level of it do what you want. And, as I mentioned, the first version of Smalltalk (-72) did not have a convention to use a selector. The later Smalltalks wound up with the convention because "keywords" were used a lot in Smalltalk-72 to make the messages more readable for humans.