I am going to use the definition that "human-friendly" means that I can program it without having to struggle or think too hard.
I am the odd duck who voted for X86. Partly because I have done it so much that it is burned into my brain and partly because if you program it in 32-bit protected mode with a flat memory model (popularly called flat mode), it is free of many of the warts and restrictions of 8086 programming as most know it. You do not have to worry about segment registers or a 64 KB limit on the size of data structures. Most of the register usage special cases go away. The addressing modes are plentiful and powerful.
Some legacy features can come in handy, such as the ability to use some of the registers as two independent 8-bit registers. On the 486 and higher, you can actually use one 32-bit register as four independent 8-bit registers with the use of the BSWAP instruction, albeit with a slight performance penalty because only two of them can be accessed at one time - think of this feature as analogous to the Z80 alternate register set.
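To make the BSWAP trick concrete, here is a minimal sketch in Python that models a 32-bit register as a plain integer. It is not x86 code; the helper names (bswap, get_low, set_low and so on) are made up for illustration, with the "low" and "high" helpers standing in for AL and AH.

```python
MASK32 = 0xFFFFFFFF

def bswap(reg):
    """Reverse the byte order of a 32-bit value, as the x86 BSWAP instruction does."""
    return int.from_bytes((reg & MASK32).to_bytes(4, "little"), "big")

def get_low(reg):
    """Read bits 0-7 (the part addressable as AL when reg models EAX)."""
    return reg & 0xFF

def get_high(reg):
    """Read bits 8-15 (the part addressable as AH when reg models EAX)."""
    return (reg >> 8) & 0xFF

def set_low(reg, value):
    """Write bits 0-7, leaving the other 24 bits alone."""
    return (reg & 0xFFFFFF00) | (value & 0xFF)

def set_high(reg, value):
    """Write bits 8-15, leaving the other 24 bits alone."""
    return (reg & 0xFFFF00FF) | ((value & 0xFF) << 8)

eax = 0
eax = set_low(eax, 0x11)    # first pair of 8-bit values
eax = set_high(eax, 0x22)
eax = bswap(eax)            # park the first pair in the upper 16 bits
eax = set_low(eax, 0x33)    # second pair now lives in the low 16 bits
eax = set_high(eax, 0x44)
print(hex(eax))
eax = bswap(eax)            # swap back to reach the first pair again
print(hex(get_low(eax)), hex(get_high(eax)))
```

Each BSWAP flips which pair is reachable, which is why only two of the four values are usable at any moment.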
Starting with the 486, the processor effectively executes most "simple" instructions in a single clock cycle if they are in the pipeline and there are no contentions with the effects of preceding instructions. Starting with the Pentium, there are actually two execution units operating in parallel; one can execute any instruction and the second a subset if there is no contention. Much of the code in my processor simulators looks like two streams of instructions interdigitated together because that is exactly what it is. Up to twice as much can be done at the same time if it is structured correctly. Newer processors can do much of this automatically within one thread. The 486 and the Pentium are fun to optimize code for.
The 6800 is generally enjoyable to program though there are several nagging irritants. Foremost is that it has only one index register and that the index register cannot easily be placed onto or retrieved from the stack; retrieval is not too bad but pushing it is painful. Another is that not all instructions support the direct addressing mode (equivalent to the zero page on the 6502) so some code is larger and slower. A rather surprising wart is that storing a register to memory affects the condition codes. Overall, the 6800 makes good use of two almost equal accumulators.
Another nice feature of the 6800 is the full set of conditional branches, signed and unsigned =, <>, >, >=, <, <=.
The 6809 solves some of the 6800 limitations in spades with four registers (including the stack pointer) usable for indexing along with a number of new indexed addressing modes.
The price you pay is that some of the small and fast 6800 instructions have been replaced with larger and slower instructions or even sequences of instructions. A particular loss is that INX/DEX have been replaced by LEAX 1,X/LEAX -1,X, which are two bytes instead of one. Some relative branches "on the edge" go out of range as a result. Replacing such branches with their long forms can trigger a chain reaction with other branches.
The Hitachi 6309 is to the Motorola 6809 much as the 65816 is to the 6502. It adds more registers and a faster "native" mode.
The 6502 boasts many faster instructions than the 6800. The fact that index registers are limited to 8 bits is offset by the fact that there are two of them instead of one and by the powerful addressing modes provided, the (addr),Y mode to be specific; I do not know of any other processor which has this little gem. The BIT instruction with a memory operand is another gem.
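For anyone who has not met the (addr),Y mode, here is a minimal sketch in Python of how the effective address is formed. The function name and the sample memory contents are made up for illustration; memory is just a 64 KB bytearray.

```python
def indirect_indexed(memory, zp_addr, y):
    """Effective address for the 6502 (addr),Y mode.

    A 16-bit pointer is read from two consecutive zero-page bytes
    (low byte first), then the 8-bit Y register is added to it.
    """
    lo = memory[zp_addr & 0xFF]
    hi = memory[(zp_addr + 1) & 0xFF]   # the pointer fetch wraps within the zero page
    base = lo | (hi << 8)
    return (base + (y & 0xFF)) & 0xFFFF

memory = bytearray(65536)
memory[0x20] = 0x00          # pointer at $20/$21 holds $3000
memory[0x21] = 0x30
memory[0x3005] = 0x42        # the byte we want to reach

ea = indirect_indexed(memory, 0x20, 5)   # models LDA ($20),Y with Y = 5
value = memory[ea]
print(hex(ea), hex(value))
```

In effect the zero page gives you a pile of 16-bit pointers, and Y walks up to 256 bytes from whichever one you pick.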
Some of the irritants of the 6502 have been eliminated with the 65C02 - perhaps it should have been another choice in the poll.
I prefer a decimal adjust instruction instead of a decimal mode.
The 8080/8085/Z80 sports many more registers than the other 8-bitters. However, special usage cases (restrictions) abound. If the processor suits your algorithm, you can do magic. Otherwise, be prepared to fight the code. Unlike the 680x and 650x, the 80s do not set the condition codes with register loads. Sometimes that is a good thing but many times it is not.
Many of the "new" Z80 instructions suffer from the time needed to fetch an extra byte of machine code. Only a few of them make up for it in functionality. Much of the time, the Z80 index registers are too slow to be of benefit unless the algorithm really needs them.
Most of the 8-bit programmers drooled over the design of the 68000, me included. It was many years before I actually got to program on one. The early generations of the 680x0 are relatively slow because of their microcoded architecture.
INC/DEC has been replaced by ADDQ/SUBQ. The advantage is that it is good for bumping a value by small amounts besides just one; the drawback is that the carry flag is affected, making multiple-precision calculations difficult unless the loop can fit the requirements of the DBcc instruction.
I do not really understand the need for the X (extend) flag beyond what is provided by the carry flag.
The AVR is a rather nice architecture to program. Most instructions execute in a single cycle. You get 32 8-bit registers. These can be viewed as sixteen 16-bit registers for register-to-register moves. Six of them can be used as three index registers. Sixteen of them can be used in operations with an immediate operand. The biggest limitation is that most devices do not have much RAM. The AVR does not have a decimal adjust instruction, but it has a half-carry flag with which to roll your own not-so-efficient equivalent.
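As a rough idea of what rolling your own decimal adjust involves, here is a minimal sketch in Python. It is not AVR code: add8 models an 8-bit add that reports the carry and half-carry flags, and decimal_adjust applies the usual after-addition correction found on processors that do have the instruction. All function names are made up for illustration.

```python
def add8(a, b, carry_in=0):
    """8-bit binary add; returns (result, carry, half_carry)."""
    total = a + b + carry_in
    half = ((a & 0x0F) + (b & 0x0F) + carry_in) > 0x0F   # carry out of bit 3
    return total & 0xFF, total > 0xFF, half

def decimal_adjust(result, carry, half):
    """Correct the binary sum of two packed BCD bytes; returns (result, carry)."""
    adjust = 0
    if half or (result & 0x0F) > 9:
        adjust |= 0x06           # low digit overflowed past 9
    if carry or result > 0x99:
        adjust |= 0x60           # high digit overflowed past 9
        carry = True
    return (result + adjust) & 0xFF, carry

def bcd_add(a, b, carry_in=0):
    """Add two packed BCD bytes (two decimal digits each)."""
    result, carry, half = add8(a, b, carry_in)
    return decimal_adjust(result, carry, half)

result, carry = bcd_add(0x38, 0x45)
print(hex(result), carry)
result, carry = bcd_add(0x99, 0x01)
print(hex(result), carry)
```

Without the half-carry flag, the case where the low digits sum past 15 (such as 9 + 8) could not be told apart from a sum that needs no correction.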
I am learning ARM assembly language using a Raspberry Pi. I find the Raspberry Pi OS Assembly Language Hands-On-Guide by Bruce Smith to be a good way to learn. So far, it seems pleasant enough. A nice feature is the ability to specify whether an instruction affects the condition codes. My biggest gripe so far is no support for decimal arithmetic.
If I appear to put too much emphasis on decimal math, you are right. I am a heavy user of the Double Dabble method of converting a number from binary to ASCII decimal. If you look at the algorithm, you may notice that a bunch of the testing and adjusting is just a version of decimal adjust for architectures lacking that instruction.
https://en.wikipedia.org/wiki/Double_dabble
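Here is a minimal sketch of Double Dabble in Python, to show where the decimal-adjust flavor comes from. It keeps the BCD digits packed four bits apiece in one integer; the function name, the default width of 16 bits and the leading-zero output are my choices for illustration. In assembly language you would shift through registers or memory bytes instead.

```python
def double_dabble(value, bits=16):
    """Convert an unsigned binary number to ASCII decimal, with leading zeros."""
    if not 0 <= value < (1 << bits):
        raise ValueError("value does not fit in the given number of bits")
    ndigits = len(str((1 << bits) - 1))   # decimal digits needed for the largest value
    bcd = 0                               # packed BCD scratch area, 4 bits per digit
    for i in range(bits - 1, -1, -1):
        # "Dabble": any digit of 5 or more would exceed 9 once doubled,
        # so add 3 now; the shift turns that into the +6 of a decimal adjust.
        for d in range(ndigits):
            if ((bcd >> (4 * d)) & 0xF) >= 5:
                bcd += 3 << (4 * d)
        # "Double": shift left one bit, bringing in the next binary bit.
        bcd = (bcd << 1) | ((value >> i) & 1)
    # Unpack the digits, most significant first, into ASCII characters.
    return "".join(chr(0x30 + ((bcd >> (4 * d)) & 0xF))
                   for d in range(ndigits - 1, -1, -1))

print(double_dabble(65535))
print(double_dabble(243, bits=8))
```

The test for 5 or more followed by adding 3 is the part that a decimal adjust instruction (or a decimal mode add) replaces on processors that have one.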