I think it's common for general-purpose machine emulators to run some number of instructions and then hand over to the emulation of peripherals such as video. That turns out not to be a good match for highly detailed emulations, which need accurate cycle-by-cycle accesses.
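To make the difference concrete, here is a toy sketch in Python (not taken from any real emulator; the `Video` class, the single "colour" register and the cycle counts are all made up for illustration). It compares a loop that runs the CPU work first and lets video catch up afterwards with a loop that ticks video on every cycle.

```python
# Hypothetical toy machine: names and timings are illustrative only.
class Video:
    """Samples one 'colour' register once per clock cycle."""
    def __init__(self):
        self.samples = []

    def tick(self, colour):
        self.samples.append(colour)


def run_batched(writes, cycles_per_write):
    """Run all the CPU work first, then let video catch up afterwards."""
    video = Video()
    colour = 0
    elapsed = 0
    for value in writes:
        colour = value                # CPU write takes effect immediately
        elapsed += cycles_per_write
    for _ in range(elapsed):          # video only ever sees the final value
        video.tick(colour)
    return video.samples


def run_interleaved(writes, cycles_per_write):
    """Tick video on every cycle, so each write is seen when it happens."""
    video = Video()
    colour = 0
    for value in writes:
        for cycle in range(cycles_per_write):
            if cycle == cycles_per_write - 1:
                colour = value        # assume the write lands on the last cycle
            video.tick(colour)
    return video.samples
```

With two writes of two cycles each, the batched loop hands video the same final value for all four cycles, while the interleaved loop lets video see the register change mid-stream. That mid-stream visibility is what the detailed emulations depend on.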
Possibly a good start would be to take the leaf routines from lib65816 and rework them into a different framework. But those routines won't be taking care to perform cycle-by-cycle activity, so they might need a fair bit of work.
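As a rough idea of what such a reworked leaf routine might look like, here is a minimal Python sketch in which each routine is a generator that yields once per bus cycle, so the outer loop can tick peripherals between accesses. This is not lib65816's API: `Bus`, `Cpu`, `lda_absolute` and `run_one_instruction` are hypothetical names, and the only instruction modelled is a 6502-style LDA absolute (opcode 0xAD, four bus cycles).

```python
class Bus:
    """Hypothetical bus: logs every access so the timing can be inspected."""
    def __init__(self, memory):
        self.memory = memory
        self.log = []

    def read(self, addr):
        self.log.append(addr)
        return self.memory[addr]


class Cpu:
    def __init__(self, bus, pc=0):
        self.bus = bus
        self.pc = pc
        self.a = 0

    def fetch(self):
        value = self.bus.read(self.pc)
        self.pc = (self.pc + 1) & 0xFFFF
        return value

    def lda_absolute(self):
        """Leaf routine as a generator: one yield per bus cycle."""
        lo = self.fetch()                        # cycle 2: address low byte
        yield
        hi = self.fetch()                        # cycle 3: address high byte
        yield
        self.a = self.bus.read((hi << 8) | lo)   # cycle 4: data read
        yield

    def step(self):
        """Yield once per bus cycle of a single instruction."""
        opcode = self.fetch()                    # cycle 1: opcode fetch
        yield
        if opcode == 0xAD:                       # LDA absolute
            yield from self.lda_absolute()
        else:
            raise NotImplementedError(hex(opcode))


def run_one_instruction(cpu, tick_peripherals):
    """Drive one instruction, ticking peripherals after every bus cycle."""
    cycles = 0
    for _ in cpu.step():
        tick_peripherals()
        cycles += 1
    return cycles
```

The point of the generator form is that the routine body still reads top to bottom like the original leaf routine, but the scheduler regains control after every bus access. A real core would also need writes, internal (non-bus) cycles, interrupts and the 65816's 16-bit modes, none of which are shown here.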
For hints on approaches, I do recommend the blog posts on the construction of JSBeeb: it's a different language, but a very accurate simulator of a 6502 system.
https://xania.org/201405/jsbeeb-emulati ... javascript