Don't the other methods also need to do the "AND $0F"?
Also, if we do both nybbles in a byte, the cost of the table is only an additional 8 bytes per instance.
And don't forget the efficient nybble swap at http://6502.org/source/general/SWN.html: only 8 bytes and 12 clock cycles, with no variables, no stack usage, no look-up table, and no X or Y usage. It uses only the accumulator and status register.
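
For anyone who wants to convince themselves that it works, here is a small Python model of the sequence as I recall it from that page: ASL A / ADC #$80 / ROL A, done twice. Treat the listing on the linked page as authoritative; the helper names below are my own, and the model assumes binary mode (D flag clear). Each pass rotates the accumulator left by two bits without going through carry, so two passes swap the nybbles.

```python
def asl(a, c):
    """ASL A: shift left; old bit 7 goes to carry, bit 0 becomes 0."""
    return (a << 1) & 0xFF, (a >> 7) & 1


def adc(a, c, operand):
    """ADC #operand in binary mode: add operand plus carry, carry out of bit 7."""
    total = a + operand + c
    return total & 0xFF, total >> 8


def rol(a, c):
    """ROL A: rotate left through carry."""
    return ((a << 1) | c) & 0xFF, (a >> 7) & 1


def swap_nybbles(a, c=0):
    """Model of the assumed sequence; the incoming carry does not matter."""
    for _ in range(2):
        a, c = asl(a, c)        # 1 byte, 2 cycles
        a, c = adc(a, c, 0x80)  # 2 bytes, 2 cycles
        a, c = rol(a, c)        # 1 byte, 2 cycles
    return a


if __name__ == "__main__":
    print(f"{swap_nybbles(0x12):02X}")
```

The byte and cycle counts in the comments add up to the 8 bytes and 12 cycles quoted above. The trick is that the ADC #$80 drops the bit that ASL shifted out into bit 0 and, at the same time, moves the next bit into carry for the ROL.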