Actually ... only the first version of Smalltalk was done in terms of the NOVA (and not using BCPL). The subsequent versions (Smalltalk-76 on) were done by making a custom virtual machine in the Alto's microcode that could run Smalltalk's byte codes efficiently.
The basic idea is that you can win if the microcode cycles are enough faster than the main memory cycles so that the emulations are always waiting on main memory. This was generally the case on the Alto and Dorado. Intel could have made the "Harvard" 1st level caches large enough to accommodate an emulator -- that would have made a big difference. (This was a moot point in the 80s)
I know this is getting nit-picky, but I think people might be interested in getting some of the details in the history of how Smalltalk developed. Dan Ingalls said in "Smalltalk-80: Bits of History":
"The very first Smalltalk evaluator was a thousand-line BASIC program which first evaluated 3 + 4 in October 1972. It was followed in two months by a Nova assembly code implementation which became known as the Smalltalk-72 system."
The first Altos were produced, if I have this right, in 1973.
I was surprised when I first encountered Ingalls's implementation of an Alto on the web, running Smalltalk-72, because the first thing I was presented with was, "Lively Web Nova emulator", and I had to hit a button labeled "Show Smalltalk" to see the environment. He said what I saw was Nova machine code from a genuine ST-72 image, from an original disk platter.
I take it from your comment that you're saying by the time ST-76 was developed, the Alto hardware had become fast enough that you were able to significantly reduce your use of machine code, and run bytecode directly at the hardware level.
I could've sworn Ingalls said something about using BCPL for earlier versions of Smalltalk, but quoting out of "Bits of History" again, Ingalls, when writing about the Dorado and Smalltalk-80, said of BCPL that the compiler you were using compiled to Alto code, but ...
"As it turned out, we only used Bcpl for initialization, since it could not generate our extended Alto instructions and since its subroutine calling sequence is less efficient than a hand-coded one by a factor of about 3."
The Alto didn't get any faster, and there was not a lot of superfast microcode RAM (if we'd had more it would have made a huge difference). In the beginning we just got Smalltalk-72 going in the NOVA emulator. Then we used the microcode for a variety of real-time graphics and music (2.5 D halftone animation, and several kinds of polytimbral timbre synthesis including 2 keyboards and pedal organ). These were separate demos because both wouldn't fit in the microcode. Then Dan did "Bitblt" in microcode which was a universal screen painting primitive (the ancestor of all others). Then we finally did the byte-code emulator for Smalltalk-76. The last two fit in microcode, but the music and the 2.5 D real-time graphics didn't.
The Notetaker Smalltalk (-78) was a kind of sweet spot in that it was completely in Smalltalk except for 6K bytes of machine code. This was the one we brought to life for the Ted Nelson tribute.
The basic idea is that you can win if the microcode cycles are enough faster than the main memory cycles so that the emulations are always waiting on main memory. This was generally the case on the Alto and Dorado. Intel could have made the "Harvard" 1st level caches large enough to accommodate an emulator -- that would have made a big difference. (This was a moot point in the 80s)