Most 32-bit x86 processors from the Pentium onward had a 64-bit wide memory bus. I wouldn't be surprised if Power, Sparc, etc., do or have done the same. Basically, once you add a cache, memory traffic happens in whole cache lines, so you can effectively decouple the bus width from your register width.
I wish somebody would start producing these:
http://www.venraytechnology.com/Implementations.htm
It's hard to tell the details on the newer stuff, but the original design is for a 4096-bit wide bus right in the DRAM chips themselves, for single-cycle access to an entire cache line.
Intel's Haswell systems with the 128 MB of L4 cache get into somewhat similar territory, though that's not actually the external data bus to main memory.
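To put rough numbers on the bus widths mentioned above, here's a quick back-of-the-envelope sketch. The helper name `bus_transfers` and the 32- and 64-byte cache line sizes are my own assumptions for illustration, and it ignores real-world details like burst ordering and DRAM latency.

```python
def bus_transfers(line_bytes: int, bus_bits: int) -> int:
    """Number of bus transfers (beats) needed to move one cache line.

    Hypothetical helper for illustration only; it models nothing but
    the ratio between cache line size and bus width.
    """
    bus_bytes = bus_bits // 8
    # Round up: a partially used transfer still costs a whole beat.
    return -(-line_bytes // bus_bytes)


# Assumed line sizes: 32 and 64 bytes. Bus widths: the 64-bit bus and
# the 4096-bit in-DRAM bus discussed in the post.
for line_bytes, bus_bits in [(32, 64), (64, 64), (64, 4096)]:
    beats = bus_transfers(line_bytes, bus_bits)
    print(f"{line_bytes}-byte line over {bus_bits}-bit bus: {beats} transfer(s)")
```

Note that the register width never shows up in the calculation, which is the decoupling point: only the line size and the bus width matter for a line fill.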