OK, I will try to explain that a little further. DP-RAMs are in fact very tricky to use, so don't be too careless with them.
Normally a Dual-Port RAM has an internal arbitration circuit that compares the addresses of the two ports. Whenever both sides access the DP-RAM (/CE=Low), either side is doing a write (/WE=Low), and both addresses are the same, the circuit elects a winner on a first-come, first-served basis (the first side pulling /CE Low is the winner). If the writing side wins, the write is executed. The data read on the other side is then undefined: the bits are not all written at exactly the same time, so each bit read can have the previous, the new, or a transitory value. If the reader wins, the write is not executed until the reader releases its port (/CE=High). For the data to be written successfully, the writer must then keep its /CE and /WE asserted for at least tacc (the access time) after the winning port's /CE has been de-asserted. Most DP-RAMs provide a /BUSY signal that tells the loser that it has lost.
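To make the contention rule above concrete, here is a minimal C model of it. It is only a sketch of the behaviour as described in this post, not taken from any datasheet: the type and function names (port_t, arbitrate, busy_left, busy_right) are made up for illustration, signals are stored as "asserted" flags instead of active-low levels, and a simultaneous /CE is resolved in favour of the left port purely as an assumption.

```c
#include <stdbool.h>

/* Hypothetical snapshot of one port. The real pins are active-low;
   here "true" simply means the signal is asserted. */
typedef struct {
    bool     ce;       /* /CE asserted (Low) */
    bool     we;       /* /WE asserted (Low) */
    unsigned addr;     /* address presented on this port */
    unsigned ce_time;  /* moment /CE went Low, in arbitrary time units */
} port_t;

typedef enum { NO_CONTENTION, LEFT_WINS, RIGHT_WINS } arb_t;

/* Contention exists only if both ports are selected, at least one of
   them writes, and both present the same address. The port that
   asserted /CE first wins. A tie goes to the left port (assumption). */
arb_t arbitrate(port_t left, port_t right)
{
    bool both_selected = left.ce && right.ce;
    bool any_write     = left.we || right.we;
    bool same_address  = left.addr == right.addr;

    if (!(both_selected && any_write && same_address))
        return NO_CONTENTION;
    return (left.ce_time <= right.ce_time) ? LEFT_WINS : RIGHT_WINS;
}

/* /BUSY is signaled to the losing side only. */
bool busy_left(port_t left, port_t right)
{
    return arbitrate(left, right) == RIGHT_WINS;
}

bool busy_right(port_t left, port_t right)
{
    return arbitrate(left, right) == LEFT_WINS;
}
```

With a writer on the left that asserts /CE later than a reader on the right at the same address, busy_left() returns true, which is the case where the writer has to stretch its /CE and /WE.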
In our case, the Video Controller must not lose, because of the VGA signal timing constraints. So you cannot use /BUSY to make the Video Controller wait, and you must instead make sure that a /BUSY would never have to be signaled to it. That means doing the arbitration by "hand", that is, with dedicated external circuitry. If you do not pay attention to this, the data read by the Video Controller might be wrong (though chances are high that the correct data is read on the next screen refresh), so you might see some sort of flicker. If you don't care about the flicker you can ignore this.
However, you cannot allow data written by the CPU to be stored incorrectly. Here I took the lazy approach: I'm using an IDT7134. This device has no internal arbitration logic, so writes are always executed, and simultaneous reads may return wrong values. I'm currently investigating a circuit that delays and extends write accesses to times when the video controller is not using the DP-RAM (i.e. /CEL is de-asserted). Such a circuit would also allow the use of an IDT7132 with arbitration logic or an IDT7007 (I plan to use this one so I can create a 640*400 pixel bitmap graphics mode). It should be fairly easy using the outputs of the 74LS161A and the RDY input of a CMOS CPU.
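The write-delay idea can be sketched in C as a behavioural model. This is not the actual circuit, which is still being worked out; the names cpu_rdy and wait_states and the per-cycle trace are assumptions for illustration. The rule modeled is simply: pull RDY low while the CPU tries to write to the DP-RAM and the video side has /CEL asserted. As far as I know, a CMOS 65C02 honours RDY during write cycles while the NMOS 6502 does not, which is presumably why a CMOS CPU is needed here.

```c
#include <stdbool.h>

/* Level to drive on the CPU's RDY input: true = High (run),
   false = Low (stall). The CPU is stalled only while it wants to
   write to the DP-RAM and the video controller is using it. */
bool cpu_rdy(bool cpu_selects_dpram, bool cpu_is_writing,
             bool video_cel_asserted)
{
    bool conflict = cpu_selects_dpram && cpu_is_writing &&
                    video_cel_asserted;
    return !conflict;
}

/* Given a per-cycle trace of video activity (true = /CEL asserted),
   return how many cycles a write issued at cycle `start` is stalled
   before it can complete, or -1 if the trace ends first. */
int wait_states(const bool *video_busy, int n, int start)
{
    for (int i = start; i < n; i++) {
        if (cpu_rdy(true, true, video_busy[i]))
            return i - start;
    }
    return -1;
}
```

In hardware the video_busy information would come from the counter outputs (the 74LS161A) that already determine when the video controller fetches from the DP-RAM.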
For dual-processor systems with shared memory you need to provide symmetric logic. This can be tricky. That's where /BUSY helps a lot, as it is only triggered if both addresses are the same. For multi-processor systems, there are also 4-port RAMs with full arbitration logic. But for shared-memory designs, many DP-RAMs provide semaphores in hardware. You had better use these. Then you can avoid any additional synchronization circuit and implement a software access protocol instead. E.g. you can define data structures that are logically protected by a semaphore, which gives you a mutex. Mostly you will implement a control block queue: when you enqueue or dequeue a control block, you lock the semaphore before manipulating the pointers, and after the pointers are updated you release the semaphore. Then you define an access protocol that says who "owns" a control block. One way is to say that control blocks in the queue must not be accessed or modified. So before using a control block you need to dequeue it, and you must finish updating it before you enqueue it again. With this logic you make sure that only one side is using a given address at any time, and so you don't need to care about arbitration.
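A minimal sketch of such a protocol in C follows. The hardware semaphore is replaced by a small software stand-in so the example can run on a PC; on a real DP-RAM the lock and unlock would be accesses to the semaphore locations as described in the datasheet of the part you use. All names (hw_sem_t, cb_queue_t, cb_enqueue, cb_dequeue) and the queue size are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

enum { SIDE_NONE = 0, SIDE_LEFT = 1, SIDE_RIGHT = 2 };

/* Software stand-in for one hardware semaphore (assumption: the real
   part grants the semaphore to exactly one side at a time). */
typedef struct { int owner; } hw_sem_t;

/* Try to take the semaphore; returns true if `side` owns it now. */
bool sem_try_lock(hw_sem_t *s, int side)
{
    if (s->owner == SIDE_NONE)
        s->owner = side;
    return s->owner == side;
}

/* Release the semaphore, but only if `side` is the owner. */
void sem_unlock(hw_sem_t *s, int side)
{
    if (s->owner == side)
        s->owner = SIDE_NONE;
}

#define QUEUE_SLOTS 4

typedef struct { unsigned char payload[8]; } ctrl_block_t;

/* Queue of control blocks living in the shared DP-RAM. Only the
   pointers (head, tail, count, slot) are protected by the semaphore. */
typedef struct {
    hw_sem_t      sem;
    unsigned      head, tail, count;
    ctrl_block_t *slot[QUEUE_SLOTS];
} cb_queue_t;

/* Hand a finished control block over to the queue. Returns false if
   the semaphore is held by the other side or the queue is full. */
bool cb_enqueue(cb_queue_t *q, int side, ctrl_block_t *cb)
{
    if (!sem_try_lock(&q->sem, side))
        return false;               /* caller retries later */
    bool ok = q->count < QUEUE_SLOTS;
    if (ok) {
        q->slot[q->tail] = cb;
        q->tail = (q->tail + 1) % QUEUE_SLOTS;
        q->count++;
    }
    sem_unlock(&q->sem, side);
    return ok;
}

/* Take ownership of the oldest control block. Returns NULL if the
   semaphore is held by the other side or the queue is empty. */
ctrl_block_t *cb_dequeue(cb_queue_t *q, int side)
{
    if (!sem_try_lock(&q->sem, side))
        return NULL;
    ctrl_block_t *cb = NULL;
    if (q->count > 0) {
        cb = q->slot[q->head];
        q->head = (q->head + 1) % QUEUE_SLOTS;
        q->count--;
    }
    sem_unlock(&q->sem, side);
    return cb;
}
```

The semaphore is held only while the pointers are manipulated. The payload of a control block is touched only by the side that has dequeued it, so both sides never access the same payload address at the same time.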
As for the video controller, there is not much CPU power left in the MCU, so it is not a good idea to let the video controller do a lot of processing on the video RAM, especially time-critical tasks like character attributes. This is better done in hardware. You just use a second DP-RAM and an attribute register (something like a 74AC646 or 74AC652, which are bidirectional). Whenever you read the video RAM, the attribute register is loaded with the value from the second DP-RAM, and before a write you set the attribute register to the desired value. That's how granati implemented attributes. The attributes are then applied to the video signal in hardware.
Here again I will take the lazy approach. I will do blinking for just 64 characters as in the Apple IIplus.
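For reference, the Apple II style selection of the 64 blinking characters can be written down as a small C sketch. It assumes the Apple II text encoding, where codes $40-$7F are the flashing ones. The function names and the flash_phase input (a slow square wave, e.g. derived from the frame counter) are made up for illustration, and the real thing would of course be a couple of gates in the video path, not code.

```c
#include <stdbool.h>

/* Assumption: Apple II encoding, where character codes 0x40..0x7F
   (top two bits = 01) are the 64 flashing characters. */
bool is_flashing(unsigned char code)
{
    return (code & 0xC0) == 0x40;
}

/* True if the pixels of this character should be inverted right now.
   flash_phase toggles slowly, so flashing characters alternate
   between normal and inverted display. */
bool show_inverted(unsigned char code, bool flash_phase)
{
    return is_flashing(code) && flash_phase;
}
```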