On the BBC Master, three things control which parts of the 320KB total address space are mapped into the CPU address space:
1: The value of the ROMSEL latch at $FE30. The low four bits select the "sideways" slot mapped to the $8000-BFFF window (four of these slots are half of the stock RAM fitted, and a further seven are sections of the stock ROM), and the high bit overlays a 4KB segment of the Shadow RAM onto $8000-8FFF for use by the MOS.
2: The value of the ACCCON latch at $FE34. Three bits of this select respectively:
- Whether VDU driver accesses (see below) to screen memory (defined as the 20KB in $3000-7FFF) go to main or shadow RAM;
- Whether non-VDU-driver accesses to screen memory go to main or shadow RAM;
- Whether the remaining 8KB segment of Shadow RAM is mapped over the bottom end of the MOS ROM at $C000-DFFF. This is normally used as temporary storage by the floppy disk filesystem.
3: Whether the most recent SYNC cycle (opcode fetch) was in the $C000-DFFF range or not. If so, the access is assumed to be from the VDU drivers for the purpose of $3000-7FFF mapping. If not, it is treated as a non-VDU-driver access.
And yes, this means you can access shadow display memory from code located in the 8KB shadow RAM block, effectively overriding the stock VDU drivers, provided you have finished with the floppy disk system. Probably few people do that, but for maximum compatibility you should expect it to occur.
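As a rough illustration, the three controls above can be modelled as a small decode function. This is only a sketch: the ACCCON bit positions (named E, X and Y here), the region names and the segment-relative offsets are my assumptions for illustration and should be checked against the Master's documentation.

```python
from dataclasses import dataclass

# Assumed bit positions; verify against the Master's documentation.
ROMSEL_SLOT = 0x0F   # low four bits: sideways slot number
ROMSEL_RAM = 0x80    # high bit: 4KB shadow segment over $8000-8FFF
ACCCON_E = 0x02      # VDU-driver screen accesses go to shadow RAM
ACCCON_X = 0x04      # non-VDU-driver screen accesses go to shadow RAM
ACCCON_Y = 0x08      # 8KB shadow segment over $C000-DFFF


@dataclass
class MapState:
    romsel: int = 0
    acccon: int = 0
    vdu_flag: bool = False   # True if the last SYNC was in $C000-DFFF


def on_sync(state, addr):
    """Update the VDU driver flag on every opcode fetch."""
    state.vdu_flag = 0xC000 <= addr <= 0xDFFF


def decode(state, addr):
    """Return (region, offset) for a CPU address; region names are hypothetical."""
    if 0xFC00 <= addr <= 0xFEFF:
        return ("io", addr)                      # FRED, JIM, SHEILA
    if addr < 0x3000:
        return ("main", addr)
    if addr < 0x8000:                            # screen memory window
        bit = ACCCON_E if state.vdu_flag else ACCCON_X
        if state.acccon & bit:
            return ("shadow_screen", addr - 0x3000)
        return ("main", addr)
    if addr < 0xC000:                            # sideways window
        if addr < 0x9000 and state.romsel & ROMSEL_RAM:
            return ("shadow_4k", addr - 0x8000)
        return ("slot%X" % (state.romsel & ROMSEL_SLOT), addr - 0x8000)
    if addr < 0xE000 and state.acccon & ACCCON_Y:
        return ("shadow_8k", addr - 0xC000)
    return ("mos", addr - 0xC000)
```

Note that with the Y bit set, code fetched from $C000-DFFF comes out of the 8KB shadow block yet still sets the VDU driver flag, which is exactly the case of overriding the stock VDU drivers described above.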
I/O space proper consists entirely of the FRED, JIM and SHEILA pages ($FCxx, $FDxx, $FExx). FRED and JIM both map to the 1MHz Expansion Port; SHEILA is where all internal hardware is mapped, including the memory mapping latches.
To be compatible with the above, you will need to track the state of $FE3x writes (which I assume are incompletely decoded, so blocks of four addresses map to the same latch) and the VDU driver flag. This is nine bits of relevant state in all (five from ROMSEL, three from ACCCON, and the flag), so it can theoretically be kept in a single Spartan "byte". Writes to the display memory area (whether shadow or main) must be write-through cached so that the CRTC will fetch the updated values; the rest of main and shadow RAM can be kept internally for speed. When the code jumps into or out of the VDU driver region, at least one SYNC cycle must be generated to the new region (in-order relative to pending writes) so as to update the external hardware.
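A minimal sketch of that state tracking follows. It assumes $FE30-FE33 alias ROMSEL and $FE34-FE37 alias ACCCON (the incomplete decoding guessed at above), and that the three relevant ACCCON bits are bits 1 to 3; the class and method names are hypothetical.

```python
class LatchTracker:
    """Shadow copy of the mapping latches plus the VDU driver flag."""

    def __init__(self):
        self.romsel = 0
        self.acccon = 0
        self.vdu_flag = False

    def on_write(self, addr, value):
        # Assumed aliasing: each latch answers to a block of four addresses.
        if 0xFE30 <= addr <= 0xFE33:
            self.romsel = value & 0xFF
        elif 0xFE34 <= addr <= 0xFE37:
            self.acccon = value & 0xFF

    def on_sync(self, addr):
        """Update the flag; return True if a SYNC must reach the real bus."""
        in_vdu = 0xC000 <= addr <= 0xDFFF
        changed = in_vdu != self.vdu_flag
        self.vdu_flag = in_vdu
        return changed

    def packed(self):
        """Pack the nine relevant bits into one 9-bit value."""
        slot = self.romsel & 0x0F            # 4 bits: sideways slot
        ram = (self.romsel >> 7) & 1         # 1 bit: 4KB shadow overlay
        acc = (self.acccon >> 1) & 0x07      # 3 bits: assumed ACCCON bits 1-3
        return slot | (ram << 4) | (acc << 5) | (int(self.vdu_flag) << 8)


def needs_write_through(addr):
    """Writes to the screen window must also be sent to the external RAM."""
    return 0x3000 <= addr <= 0x7FFF
```

The `on_sync` return value marks the moments when the accelerator has to issue a real SYNC cycle to the host, after flushing any pending writes, so that the external hardware's own flag stays in step.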
The above will correctly handle all 64KB of main and shadow RAM, the 16KB MOS ROM area, and hardware I/O. With the Spartan-7, you might as well assign a permanent 80KB mapping of internal RAM to those, allowing full-speed reads once the MOS ROM has been read in; this is particularly important for zero page and will also greatly accelerate graphics operations, even with write-through. Then we can consider how to handle the sixteen-way, 16KB window of Sideways RAM and ROM slots. Collectively, these are 256KB and thus too large to permanently map to internal RAM - but usually only a subset of this total area is used during a particular session.
Because a Sideways slot might contain either RAM or ROM, a naive write-through or write-back caching scheme might not be appropriate. Some of the slots may map to the cartridge sockets which are accessible to the user, and can be fitted with arbitrary memory-mapped hardware. However, you might assume that the machine has not been extensively modified in this respect, and thus make the following assumptions:
1: Slots 9-F always map to the stock ROM. Writes can therefore be discarded, and reads can be LRU-cached. You might consider extending this assumption to slot 8, which maps to a physical ROM socket.
2: Slots 4-7 always map to Sideways RAM. Write-back caching can therefore be employed. Or, if this is too complicated, assign 64KB of internal RAM permanently to these slots.
3: Slots 0-3, which map to cartridge sockets, are probably best handled as uncached accesses; the same goes for slot 8 if you do not want to assume it holds a ROM. This accommodates the full range of hardware which could be plugged into them, including non-memory devices and devices which include their own internal mapping systems.
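The three assumptions above amount to a small lookup from slot number to caching policy, which could be sketched like this. The policy names and the `slot8_is_rom` switch are hypothetical, introduced only for illustration.

```python
from enum import Enum


class Policy(Enum):
    ROM_CACHED = "discard writes, LRU-cache reads"
    RAM_INTERNAL = "write-back cache or permanent internal RAM"
    UNCACHED = "pass every access through to the host"


def slot_policy(slot, slot8_is_rom=False):
    """Map a sideways slot number (0-15) to a caching policy."""
    if not 0 <= slot <= 15:
        raise ValueError("slot must be in 0..15")
    if slot >= 9:
        return Policy.ROM_CACHED        # stock ROM
    if slot == 8:
        # Physical ROM socket: cached only if you accept that assumption.
        return Policy.ROM_CACHED if slot8_is_rom else Policy.UNCACHED
    if slot >= 4:
        return Policy.RAM_INTERNAL      # Sideways RAM
    return Policy.UNCACHED              # cartridge sockets
```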
With 80KB assigned to main and shadow RAM and the MOS ROM, plus 64KB assigned to the Sideways RAM slots, about 36KB of internal RAM remains for caching the stock ROM slots. That should be enough for full-speed handling of BASIC, DFS, and the extended graphics routines which are scattered more-or-less randomly through spare areas of other ROM slots.
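For the ROM slots, a read-only LRU cache filling that remaining 36KB might behave roughly as follows. This is a behavioural sketch only: the 256-byte line size and the `fetch_line` callback (standing in for a read burst from the host) are assumptions, and a hardware implementation would likely approximate LRU rather than track it exactly.

```python
from collections import OrderedDict

LINE_BYTES = 256                          # assumed cache line size
CACHE_LINES = 36 * 1024 // LINE_BYTES     # 144 lines fit in 36KB


class RomLineCache:
    """Read-only LRU cache keyed by (slot, line number)."""

    def __init__(self, fetch_line, capacity=CACHE_LINES):
        self.fetch_line = fetch_line      # hypothetical: (slot, line) -> bytes
        self.capacity = capacity
        self.lines = OrderedDict()        # oldest entry first
        self.misses = 0

    def read(self, slot, offset):
        key = (slot, offset // LINE_BYTES)
        if key in self.lines:
            self.lines.move_to_end(key)   # mark as most recently used
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)   # evict least recently used
            self.lines[key] = self.fetch_line(*key)
        return self.lines[key][offset % LINE_BYTES]
```

Since writes to these slots are discarded, no dirty-line handling is needed, which keeps the cache considerably simpler than the write-back case.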
Of course, implementing the Second Processor is much easier. There's no memory mapping, just a small I/O window and a tiny boot ROM.