this is my first attempt at this.
Looks like you've got a good grasp of the subject, yvibod.

But it may be that your circuit is more complex than necessary.
The goal is to divide the address space into eight chunks of 8K, and to split the first and last chunks into four 2K pieces each. Your circuit maps the six 8K chunks in the middle to nothing. But is it an actual requirement to have nothing there (perhaps so there's plenty of room for future expansion)? If not, there's an alternative arrangement (below) which omits the '139 decoder and uses just a '138. Apologies, Daryl, for hacking your diagram!
It's A13, A12 and A11 that drive the '138 inputs, and that means each of the '138 outputs will span a 2K range. But A15 and A14 are "no connect" -- they are ignored. The resulting memory map looks roughly like this. (I didn't draw the whole thing.)

[Attachment: map mod-A.png]
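For anyone who prefers to see the decoding as arithmetic, here is a minimal Python sketch of which '138 output goes active for a given address. It assumes A13, A12 and A11 feed the '138 select inputs C, B and A in that order, and that the '138 enables are always active; neither detail is spelled out above, and the function name is just for illustration.

```python
def select_138_output(address: int) -> int:
    """Return which '138 output (0-7) is selected for a 16-bit address.

    Assumption: A13 -> C, A12 -> B, A11 -> A, enables always active.
    A15 and A14 are not connected, so they never affect the result.
    """
    # Shift A11 down to bit 0, then keep only the three select bits.
    return (address >> 11) & 0b111


for addr in (0x0000, 0x07FF, 0x0800, 0x1800, 0x3FFF, 0x4000, 0xC000):
    print(f"${addr:04X} -> Y{select_138_output(addr)}")
```

Each output covers 2K because the low 11 address bits (A10-A0) are not part of the selection.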
Notice that the $0000-$3FFF region is identical to the $4000-$7FFF region. Although I didn't draw the whole thing, $8000-$BFFF is also the same, and $C000-$FFFF too. It's the result of ignoring A15 and A14.
From a coding perspective, LDA $0000 (for example) has the same effect as LDA $4000 or LDA $8000 or LDA $C000. And LDA $0001 is the same as LDA $4001 and LDA $8001 and LDA $C001, and so on.
It's harmless to have all those aliases. The only drawback arises if you want to add more devices in future: there's only one 2K space left available (at $1800, aka $5800, aka $9800, aka $D800).
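The aliasing can be sketched the same way. This is only an illustration (the `aliases` helper is a made-up name): with A15 and A14 ignored, any address has four equivalents, one in each 16K quarter of the map.

```python
def aliases(address: int) -> list:
    """List every 16-bit address that decodes the same when A15 and A14 are ignored."""
    low14 = address & 0x3FFF  # the bits the hardware actually looks at
    # Try all four combinations of the ignored bits A15 and A14.
    return [(high << 14) | low14 for high in range(4)]


print([f"${a:04X}" for a in aliases(0x1800)])
# ['$1800', '$5800', '$9800', '$D800']
```

That matches the free 2K slot mentioned above, and the same call with $0001 gives $0001, $4001, $8001 and $C001.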
Edit:
Here are the schematics on the chips:
Re: the 6522, you'll need to connect the '138 output to the 6522's active-low chip select (/CS2, pin 23), and tie the active-high chip select input (CS1, pin 24) high.
cheers,
Jeff
ps- here at 6502.org you can include images as attachments to your post. You'll find that more convenient than putting the images on a third-party site, and it also ensures the images will remain permanently available.