The Amiga used the "Paula" chip to control the FDD.
This is a common misconception; it's not really true.
The Amiga used the CIA chips to actually control the drive (drive select, motor on/off, step direction, step movement, is the disk in the drive?, etc.). The Paula chip is nothing more than a serial shift register which reads raw bits off the data line, shoves them into 16-bit words, and then tells the Agnus chip when to stuff a word into RAM (or fetch a word from RAM if you happen to be writing data). Its simplicity is stupefying.
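To make the "serial shift register" idea concrete, here is a minimal sketch in Python. It only models the bit packing, not the real hardware; the function names and the choice to put the first bit in the most significant position are assumptions made for illustration.

```python
def shift_into_words(bits):
    """Pack a stream of 0/1 bits into 16-bit words (assumed MSB first)."""
    words = []
    shift_reg = 0
    count = 0
    for bit in bits:
        # Shift left by one and bring the new bit in at the bottom.
        shift_reg = ((shift_reg << 1) | (bit & 1)) & 0xFFFF
        count += 1
        if count == 16:
            # A full word is ready; in the text this is the point where
            # the word gets handed off to be stored in RAM.
            words.append(shift_reg)
            shift_reg = 0
            count = 0
    return words  # any trailing partial word is simply dropped


def words_to_bits(words):
    """The write direction: unpack 16-bit words into bits, MSB first."""
    return [(w >> (15 - i)) & 1 for w in words for i in range(16)]
```

Feeding `words_to_bits([0x1234, 0xABCD])` back through `shift_into_words` returns the same two words, which is all a read path of this kind has to guarantee.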
The Amiga uses its blitter (the same feature responsible for lightning-fast graphics updates) to perform MFM encoding and decoding.
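For anyone who hasn't met MFM before, the rule itself is small enough to sketch. This is plain Python working one bit at a time, not blitter code, and it shows generic MFM rather than the exact on-disk layout trackdisk.device uses; the function names are made up for the example.

```python
def mfm_encode(data_bits, prev_bit=0):
    """Encode data bits as MFM: each data bit is preceded by a clock bit.

    The clock bit is 1 only when both the previous data bit and the
    current data bit are 0. prev_bit is the last data bit of whatever
    came before this run (assumed 0 here).
    """
    out = []
    for d in data_bits:
        clock = 1 if (prev_bit == 0 and d == 0) else 0
        out.extend([clock, d])
        prev_bit = d
    return out


def mfm_decode(mfm_bits):
    """Decode MFM by discarding the clock bits (every other bit)."""
    return mfm_bits[1::2]


def is_valid_mfm(mfm_bits):
    """Check the MFM run-length rule: no adjacent 1s, at most three 0s in a row."""
    text = "".join(str(b) for b in mfm_bits)
    return "11" not in text and "0000" not in text
```

Because encoding and decoding boil down to shifts, masks and simple logic on neighbouring bits, it is the kind of job that maps naturally onto hardware built for bitwise operations over blocks of memory.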

This permits the computer to process up to three tracks of data at once (which is why trackdisk.device always had a minimum of 3 disk buffers allocated). One is being DMA-ed into RAM by the Paula/Agnus combo, one is being decoded by the blitter in Agnus, and the third is being processed by the 680x0 processor for filesystem operations. So, if you think the Amiga's floppy interface was slow compared to other computers, keep in mind that most of that sluggishness was in the filesystem driver, not in trackdisk.device. Anyone who has played "Space Ace" on the Amiga knows exactly how fast the floppy could be (30fps, full-screen, 384x240 resolution, HAM video mode animation, off of a floppy disk!).
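The three-buffer arrangement described above is an ordinary three-stage pipeline. A rough sketch of the overlap, assuming (unrealistically) that every stage takes the same amount of time per track; the stage names and the function are invented for the illustration and are not trackdisk.device internals:

```python
def pipeline_schedule(num_tracks, stages=("dma", "decode", "filesystem")):
    """For each time step, report which track each stage is working on."""
    schedule = []
    for step in range(num_tracks + len(stages) - 1):
        active = {}
        for depth, stage in enumerate(stages):
            # A stage at the given depth lags the first stage by 'depth' steps.
            track = step - depth
            if 0 <= track < num_tracks:
                active[stage] = track
        schedule.append(active)
    return schedule


if __name__ == "__main__":
    for step, active in enumerate(pipeline_schedule(4)):
        print(step, active)
```

Once the pipeline is full, three different tracks are in flight in the same step, one per buffer, which is where the minimum of three buffers comes from.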
The Commodore machines that used higher densities had to spin high-density disks at half the speed to get them to work with the chipset.
This saved the cost of having to re-engineer the Paula to have a 1Mbps shift register instead of a 500kbps shift register (500kbps being the rate every floppy used up until HD was introduced; 250kbps, used for 360K media, could be effortlessly emulated by just doubling up on the bits when you shifted them out).
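The arithmetic behind the half-speed trick and the bit-doubling trick is simple enough to check. This sketch takes the description above at face value and treats the stream as a plain sequence of bit cells; the function names are invented for the example.

```python
def bit_cell_us(bits_per_second):
    """Duration of one bit cell in microseconds at a given bit rate."""
    return 1_000_000 / bits_per_second


def rate_seen_by_chipset(rate_at_full_speed, speed_factor):
    """Bit rate coming off the disk when it spins at speed_factor of normal."""
    return rate_at_full_speed * speed_factor


def double_bits(bits):
    """Emit every bit twice, so a 250kbps stream fits a 500kbps shifter."""
    return [b for bit in bits for b in (bit, bit)]


if __name__ == "__main__":
    # A 1Mbps disk spun at half speed looks like 500kbps to the chipset.
    print(rate_seen_by_chipset(1_000_000, 0.5))
    # Doubled bits at 500kbps take as long as the originals at 250kbps.
    stream = [1, 0, 1]
    print(len(double_bits(stream)) * bit_cell_us(500_000),
          len(stream) * bit_cell_us(250_000))
```

Either way the shift register keeps running at the one rate it was built for; only the disk speed or the bit pattern changes.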
In theory a 1.44MB floppy can be made to hold 2MB of data.
1.8MB seems more reasonable (the Amiga's full-track format gives 1.76MB using the OFS filesystem). The reason you won't be able to pull a solid 2MB is encoding and synchronization overhead. For example, because the Paula essentially used a single monostable multivibrator for its read path, you would lose about 4 to 8 bits' worth of MFM every time the head stepped or the motor started. As a result, you need at least twice that for a gap, both in front of and behind a special invalid MFM sequence which signals, "Hey, the start of data is here."
Although, if you used GCR instead of MFM, or even RLL encoding, you could probably pull off 3MB to 4MB. But you'd probably be dealing with motor speed control tricks again to get the precision timing right for the interface chip.
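The 1.44MB, 1.76MB and 2MB figures above can be reproduced with a little arithmetic. The geometry used here (80 cylinders, 2 heads, 512-byte sectors, 18 sectors per track for the PC format, 22 for the Amiga HD format, 300 rpm, 1Mbps of MFM cells) is the usual set of values for these formats and is not stated in the text, so treat it as an assumption; the function names are invented for the example.

```python
def formatted_bytes(cylinders, heads, sectors_per_track, bytes_per_sector=512):
    """Usable capacity of a sector-based format."""
    return cylinders * heads * sectors_per_track * bytes_per_sector


def raw_bytes(cell_rate_bps, rpm, cylinders, heads):
    """Unformatted capacity if every pair of MFM cells carried one data bit."""
    cells_per_track = cell_rate_bps * 60 // rpm   # cells passing the head per revolution
    data_bytes_per_track = cells_per_track // 2 // 8  # 2 MFM cells per bit, 8 bits per byte
    return data_bytes_per_track * cylinders * heads


if __name__ == "__main__":
    print(formatted_bytes(80, 2, 18))        # PC "1.44MB" format
    print(formatted_bytes(80, 2, 22))        # Amiga HD "1.76MB" format
    print(raw_bytes(1_000_000, 300, 80, 2))  # theoretical ceiling with no gaps
```

Under those assumptions the three results are 1,474,560, 1,802,240 and 2,000,000 bytes. The difference between the last two is what goes to gaps, sync marks and headers.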