Throwing out a couple of thoughts:
As you're using an 8-bit data bus, I'd imagine the fastest way to communicate with an LCD would be via that 8-bit bus. I know I've seen LCD displays that use a parallel interface, but it's not something I've ever looked into, so they might be quite expensive or just not match the '816's architecture nicely.
I bring up speed because even 320x240x16bpp @ 60fps is a lot of data for an '816 to handle unassisted.
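To put a rough number on "a lot of data", here's a quick back-of-envelope sketch. The function name is just something I made up for illustration, and the 14MHz clock is an assumed figure for comparison, not a measurement of any real build.

```python
# Rough bandwidth estimate for an uncompressed frame buffer.
# All names here are illustrative, not from any real library.

def bytes_per_second(width, height, bits_per_pixel, fps):
    """Raw pixel data that has to be moved every second."""
    return width * height * (bits_per_pixel // 8) * fps

rate = bytes_per_second(320, 240, 16, 60)
print(rate)  # 9216000 bytes per second

# Assumed CPU clock, purely for comparison.
CLOCK_HZ = 14_000_000
print(CLOCK_HZ / rate)  # CPU cycles available per byte, roughly 1.5
```

With only about one and a half cycles available per byte at that assumed clock, the CPU can't realistically push the pixels on its own.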
Let's say you're running at 14MHz with 320x200x8bpp. That's 64,000 bytes per frame, so the '816, using block move instructions, will just barely be able to draw an entire frame 30 times a second. Going beyond 320x200 (say 320x240x8, which is 76,800 bytes) means the video memory no longer fits in a single 64KB bank and is going to be split over two banks. And that's adding more complicated calculations to a CPU that's already only barely keeping up.
All is not lost though. The '816's block move instructions take 7 cycles per 8-bit pixel, whereas a dedicated hardware blitter could get that down to 2 (or even 1) cycles per pixel. You can also use a colour palette if you want 16-bit or even 24-bit colour while still keeping a pixel width of only 8 bits.
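The arithmetic behind that, as a minimal sketch. It assumes the whole frame is copied with block moves at 7 cycles per byte and ignores any setup overhead, so treat the result as a best case. The function names are mine, purely for illustration.

```python
# Best-case frame rate for a block-move screen copy, plus how many
# 64KB banks the frame buffer spans. Setup overhead is ignored.

CYCLES_PER_BYTE = 7      # block move cost per byte
BANK_SIZE = 0x10000      # 65,536 bytes per '816 bank

def frame_bytes(width, height, bytes_per_pixel=1):
    return width * height * bytes_per_pixel

def max_fps(clock_hz, width, height, bytes_per_pixel=1):
    """Upper bound on full-frame copies per second."""
    cycles_per_frame = frame_bytes(width, height, bytes_per_pixel) * CYCLES_PER_BYTE
    return clock_hz / cycles_per_frame

def banks_needed(width, height, bytes_per_pixel=1):
    """Banks the frame buffer occupies (ceiling division)."""
    return -(-frame_bytes(width, height, bytes_per_pixel) // BANK_SIZE)

print(max_fps(14_000_000, 320, 200))  # 31.25
print(banks_needed(320, 200))         # 1
print(banks_needed(320, 240))         # 2
```

So 320x200x8 squeaks in at just over 30fps with nothing left for anything else, and 320x240x8 both drops below 30fps and spills into a second bank.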
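For a feel of what those cycle counts buy you, here's a small comparison. It assumes the blitter runs off the same 14MHz clock as the CPU, which is my assumption for the sake of the sum and not a given for any particular design.

```python
# Compare best-case full-frame rates at different cycles-per-pixel costs.
# Assumes the blitter shares the CPU clock (an assumption, see above).

def frames_per_second(clock_hz, width, height, cycles_per_pixel):
    return clock_hz / (width * height * cycles_per_pixel)

CLOCK_HZ = 14_000_000

for label, cycles in (("block move", 7), ("blitter", 2), ("fast blitter", 1)):
    print(label, frames_per_second(CLOCK_HZ, 320, 200, cycles))
# block move 31.25
# blitter 109.375
# fast blitter 218.75
```

Even the 2-cycle case leaves plenty of headroom at 320x200x8, which is the point of offloading the copy.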
For a preliminary build I'd suggest 320x200x8 is a really good starting point.
Makes the full 24-bit address bus of the 65C816 available to software developers (currently using the de-multiplex setup suggested in the 65C816 datasheet, but it might not be suitable for higher CPU speeds)
I wouldn't worry about the time it takes to capture the '816's bank address. Plenty of logic families have latches with sub-10ns propagation times. I use LVC in a 3.3V setup, but you could use AC or similar if you're running a 5V setup. AHC is probably what you're looking for if HC seems too slow.
Personally I fully intend to take Plasmo's 40MHz crown with a 50MHz '816 with the complete 24-bit bus available. But that's just smack talk for now as I keep running into problems :lol: