Speaking more generally, troubleshooting can take two possible approaches. Both can be useful, and you may even wish to alternate.
One way is speculative. For example, you could try replacing the 'hct245 with an 'ahct245, or try rerouting the clock signal for less propagation delay. I'm not saying you should -- I'm just inventing examples where you're acting on Best Practices and maybe a hunch, not on specific evidence.
The other way is analytical, where you first try to observe the problem in detail... then later ask "why" you're seeing what you're seeing (so you can fix the problem).
Your scope traces are analytical, of course.
For these and for ALL your tests I encourage you to simplify as much as possible. I can't teach you this -- it's a philosophy!
Speaking of loops, this one is as simple as it gets -- just one instruction! In a previous post I explained how to use pullup and pulldown resistors to ensure your machine executes an endless series of JSR(abs,X) instructions.
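To see why a single instruction can loop forever, here is a rough sketch in Python. It is a toy model under stated assumptions, not the exact setup from the earlier post: it assumes a 65C816 (where JSR (abs,X) is opcode $FC), assumes the resistors make every read return $FC, and assumes emulation mode so the stack stays in page 1. The names read_bus and run are made up for illustration.

```python
# Toy model only: not a real 65C816 emulator.
JSR_ABS_X_INDIRECT = 0xFC  # 65C816 opcode for JSR (abs,X)


def read_bus(address):
    """Assumption: the resistors force $FC onto the data bus for every read."""
    return JSR_ABS_X_INDIRECT


def run(steps, pc=0xFCFC, x=0x0005, sp=0x01FF):
    """Step the toy CPU; return the (pc, sp) seen before each instruction.

    The default pc is $FCFC because the reset vector would also read as $FC $FC.
    """
    trace = []
    for _ in range(steps):
        trace.append((pc, sp))
        opcode = read_bus(pc)
        if opcode != JSR_ABS_X_INDIRECT:
            raise ValueError("toy model only understands JSR (abs,X)")
        # Both operand bytes read as $FC, so the operand is $FCFC.
        operand = read_bus(pc + 1) | (read_bus(pc + 2) << 8)
        # Two return-address bytes are pushed; the writes themselves are ignored here.
        sp = 0x0100 | ((sp - 2) & 0xFF)  # assumed: stack confined to page 1
        # The indirect vector at operand + X also reads as $FCFC, whatever X is.
        pointer = (operand + x) & 0xFFFF
        pc = read_bus(pointer) | (read_bus((pointer + 1) & 0xFFFF) << 8)
    return trace


if __name__ == "__main__":
    for pc, sp in run(4):
        print(f"PC=${pc:04X} SP=${sp:04X}")
```

In this model the program counter lands on $FCFC every time and only the stack pointer moves, which is the kind of simple, repetitive behavior that makes a test easy to observe.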
-- Jeff