If I understand what you're after, I think your best bet is the Cross-32 (C32) assembler. See
http://www.datasynceng.com/c32doc.htm for details. It's not free, but it's not terribly expensive either ($99). Buying the one assembler gets you the tables for assembling for lots of different processors, and it also provides a way to write your own tables for a processor of your own design, with no programming required. They say that setting up the tables for an entirely new processor should take about 40 hours, but in your case it should take a lot less: yours is a variant of the 6502, which is already provided for, so you will only need to edit a copy of that table.
There's no need to find a minimal assembler to do what you want. This is a full-featured macro assembler. I've used it a lot and I really like it.
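
To give a rough idea of what "table-driven" means here, the following is a small Python sketch of the concept only. It is not the Cross-32 table format (see their documentation for that), and the table layout, the names `BASE_6502`, `MY_VARIANT` and `assemble`, and the extra `XYZ` instruction with opcode 0x02 are all made up for illustration. The 6502 opcodes shown are the standard ones.

```python
# Hypothetical illustration of a table-driven assembler.
# This is NOT the Cross-32 table format; the layout is an assumption.
# Each entry maps (mnemonic, addressing mode) -> (opcode byte, operand size in bytes).
BASE_6502 = {
    ("LDA", "imm"): (0xA9, 1),
    ("LDA", "abs"): (0xAD, 2),
    ("STA", "abs"): (0x8D, 2),
    ("NOP", "impl"): (0xEA, 0),
    ("RTS", "impl"): (0x60, 0),
}

# A variant processor: copy the base table and edit only what differs.
# The "XYZ" mnemonic and its opcode 0x02 are invented for this example.
MY_VARIANT = dict(BASE_6502)
MY_VARIANT[("XYZ", "impl")] = (0x02, 0)


def assemble(table, mnemonic, mode, operand=0):
    """Look up one instruction in the table and return its machine-code bytes."""
    opcode, size = table[(mnemonic, mode)]
    # 6502 operands are stored low byte first (little-endian).
    return bytes([opcode]) + operand.to_bytes(size, "little")


if __name__ == "__main__":
    print(assemble(MY_VARIANT, "LDA", "abs", 0x1234).hex())
    print(assemble(MY_VARIANT, "XYZ", "impl").hex())
```

The point is that the assembler's code never changes; supporting the variant only means editing a copy of the existing table.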