How did we get from two-chip modules to multi-chip modules? Was the upper address decoding logic its own chip, part of one of the chips, or replicated in each chip?
It depends on the module, but for example a 1 gigabyte DRAM module can be built from sixteen 64 meg x 8 chips. Each chip is 8 bits wide, so 8 chips side by side give you the 64-bit data bus, and that set of 8 is duplicated to make two banks. The row and column addressing is repeated inside each chip, while the circuitry that selects the right module and bank is external to the chips. I'm sure someone else can provide more details on modern modules.
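To make the arithmetic in that example concrete, here is a rough sketch in Python. The function and parameter names are made up for illustration, and it assumes "64 meg" means 64 x 2^20 locations per chip and that the module has a 64-bit data bus.

```python
def module_layout(chip_depth, chip_width_bits, bus_width_bits, banks):
    """Return (chips per bank, total chips, capacity in bytes).

    Hypothetical helper: assumes every chip is identical and that each
    bank fills the whole data bus.
    """
    chips_per_bank = bus_width_bits // chip_width_bits
    total_chips = chips_per_bank * banks
    # Each location in a bank holds one bus-width word.
    bytes_per_bank = chip_depth * bus_width_bits // 8
    return chips_per_bank, total_chips, bytes_per_bank * banks


# 64 meg x 8 chips, 64-bit bus, two banks (the example above).
chips_per_bank, total_chips, capacity = module_layout(64 * 2**20, 8, 64, 2)
print(chips_per_bank, total_chips, capacity // 2**20, "MiB")
```

With those numbers each bank is 8 chips and 512 MiB, so two banks come to 16 chips and 1 GiB.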
Yes, what Ken said. There was another way to do it if the memory chips were already wide enough. Some static RAMs had multiple enables of different polarities. For example, the 6264 had an E1* (active low) and an E2 (active high). Wire the same address bit to E1* on one chip and to E2 on the other, and you get the decoding done without adding a third chip.
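Here is a small Python model of that trick, as a sketch only. It assumes two 6264s (8K x 8, so address lines A0 to A12), with A13 as the shared address bit, and it assumes the unused enable on each chip is tied to its active level (E2 high on the first chip, E1* low on the second). The function names are invented for the example.

```python
CHIP_ADDRESS_BITS = 13  # a 6264 is 8K x 8, so A0..A12 go to both chips


def chip_enabled(e1_n, e2):
    """A chip responds only when E1* is low AND E2 is high."""
    return e1_n == 0 and e2 == 1


def select(address):
    """Return (chip index, offset within chip) for a 14-bit address."""
    a13 = (address >> CHIP_ADDRESS_BITS) & 1
    # Chip 0: A13 drives E1*, E2 tied high (assumed) -> on when A13 = 0.
    chip0 = chip_enabled(e1_n=a13, e2=1)
    # Chip 1: A13 drives E2, E1* tied low (assumed) -> on when A13 = 1.
    chip1 = chip_enabled(e1_n=0, e2=a13)
    assert chip0 != chip1  # exactly one chip responds
    offset = address & ((1 << CHIP_ADDRESS_BITS) - 1)
    return (0 if chip0 else 1), offset


print(select(0x1FFF))  # last location of the first chip
print(select(0x2000))  # first location of the second chip
```

Because the two enables have opposite polarity, the one address bit turns on exactly one chip for every address, which is the job a separate decoder or inverter would otherwise do.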