dclxvi wrote:
By using a 16-bit counter instead of an 8-bit counter you can extend the range, but the minimum number of cycles increases. With a 24-bit counter, the minimum will still be far less than 1283, but you can hit anything up to well over a million cycles exactly, which will be sufficient for most applications (I've never needed more than a 24-bit counter). So with an assembler macro that covers a handful of cases (including cases like using a pair of NOPs to delay 4 cycles, PHA PLA or PHP PLP to delay 7 cycles, etc.), you can easily hit any number of cycles you need. You can add register and/or flag preservation too, as that can be useful.
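For reference, the padding idioms being described boil down to short instruction sequences with fixed, documented costs. A quick sketch (standard 6502 cycle counts; nothing Kestrel-specific here):

Code:
        nop             ; 2 cycles
        nop             ; 2 cycles -- 4 total

        pha             ; 3 cycles
        pla             ; 4 cycles -- 7 total, A preserved (though PLA touches N and Z)

        php             ; 3 cycles
        plp             ; 4 cycles -- 7 total, flags preserved

The catch, for my purposes, is that those costs are fixed in cycles, not in wall-clock time.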
The Kestrel is intended to be a general-purpose pseudo-replacement for my desktop computer. As such, the CPU may start out at 4MHz because that's what every piece of logic around it can handle, or later on, when I move to FPGAs, it may go up to 16MHz or 20MHz. The problem is, if *you* write a program for the Kestrel and I upgrade, why would (or should) I have to come back to you to have some binary I received from you recompiled?
Or, why should I have to recompile literally everything that I've ever compiled for it from day one?
It doesn't make sense from a usability point of view.
I stand by my comments.
Quote:
million cycles) this single subroutine might be all you ever need. For very short delays, say 5-10 cycles, you'll probably have to use self-modifying code that pre-modifies the code.
Again, it's all you need *if and only if* you're running precisely one program on the CPU at any given time. Otherwise, while burning cycles in an idle loop, you're denying other programs the right to run. In a multitasking environment (even an event-driven and cooperative one!), this is performance suicide.
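To make that concrete, here is the shape of the cycle-burning delay under discussion (a sketch with made-up labels; entry overhead and page-crossing penalties ignored):

Code:
        ; caller loads X with the iteration count beforehand
delay:  dex             ; 2 cycles
        bne delay       ; 3 cycles while looping, 2 on the final pass
        rts             ; 6 cycles back to the caller (the JSR cost 6 more)

Every one of those ~5 cycles per pass is time the CPU could have spent running somebody else's code.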
Quote:
So it IS possible to add and delete routines on the fly without disturbing your timing.
Yes, and the program logic to make this magic happen will dwarf the time spent by most of the subroutines that interface with this mechanism. So, to mitigate that, you end up with time slices allocated in the hundreds of milliseconds, thus limiting your multitasking capability to 10 tasks or so at best. Most multitasking engines are capable of handling more than 100 task switches per second before you even begin to notice sluggishness. Note that the limit of human perception is about 100ms, so with 10 task switches per second, you're going to notice it being dog slow anyway.
Quote:
An interesting point to note is that once you've built the central "time manager" you really don't have to rewrite any extant subroutines, just add the cycle count as a value returned.
Unless said extant subroutine already returns something else in its registers, that is. Or am I misunderstanding your definition of an extant subroutine?
Quote:
Now all of this may be far more exotic than anyone (me included) needs, but I find this sort of approach absolutely fascinating. There is a vast amount of unexplored territory because many people are of the mindset that interrupts are the only way to go. I should mention that one reason it hasn't been explored is because there are usually very few places where exact timing is a necessity.
Don't tell the folks at QNX, Inc. that; their *entire business* is built around QNX, a *hard real-time*, interrupt-driven microkernel that supports the POSIX API (in fact, it is the world's first microkernel to do so). I'm sure they had a very good rationale for not going the cycle-counting route, and that rationale is explained above -- software overhead, pure and simple.
Quote:
Even languages like C and Forth that are closely related to low level code tend to obscure cycle counts.
This is false. They never even considered cycle counts to begin with, but if they had, managing cycle counts would have been trivial in either language.
Quote:
Alas, it seems like posting unconventional approaches accomplishes little but filling message boards with debate.
But this type of banter is precisely how people learn new things. However, you need to recognize that the learning goes in both directions.
What you're describing is essentially how pure event-driven architectures work: you have a centralized main loop that is responsible for scheduling units of functionality (I'm going to use the term *callback*, since that is what they really are). However, instead of the message target being the primary scheduling parameter, scheduling is based on execution-delay heuristics, with whatever messages happen to be waiting for a callback handed to it at that time.
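In concrete terms, such a main loop is little more than a vector table walked by a dispatcher. A sketch (labels and addresses are made up, and the actual delay bookkeeping is left out):

Code:
NUMCBS = 3                  ; number of callbacks in the table
vector = $FB                ; two spare zero-page bytes for the indirect jump

cbtab:  .word task_a, task_b, task_c   ; callback vectors

mainloop:
        ldx #0              ; X indexes the table, two bytes per entry
next:   lda cbtab,x         ; copy the next vector into zero page
        sta vector
        lda cbtab+1,x
        sta vector+1
        txa
        pha                 ; preserve the index across the callback
        jsr dispatch        ; run the callback; this is where its reported
        pla                 ;   cycle count / delay would be folded into
        tax                 ;   the schedule
        inx
        inx
        cpx #NUMCBS*2
        bne next
        jmp mainloop        ; and around again, forever

dispatch:
        jmp (vector)        ; the callback's RTS lands back in the main loop

task_a: rts                 ; stub callbacks; real ones would do their work and
task_b: rts                 ;   hand back whatever the scheduling heuristic
task_c: rts                 ;   needs (cycle counts, pending messages, etc.)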
Doable? You bet! I'm not saying it's not. And it's something I've thought about.
But the overhead involved with using interrupts is minimal compared to the scheduling overhead of a pure delay-based model. Interrupts work so well because they exploit a *truly* parallel environment (you have an external timing source that is operating independently of the CPU itself). Indeed, interrupts are the core of inter-processor communications. You cannot get true inter-processor communications with a delay-based approach -- at least, not without very expensive polling operations.
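"Expensive" meaning something like this hypothetical status-register spin (register address and bit mask are made up), which monopolizes the CPU for as long as it has to wait:

Code:
STATUS = $D010          ; hypothetical device status register
READY  = %00000001      ; hypothetical "data ready" bit

poll:   lda STATUS      ; 4 cycles: read the status register
        and #READY      ; 2 cycles: isolate the ready bit
        beq poll        ; 3 cycles: not ready yet, so spin some more

An interrupt line does the same job without involving the CPU at all until the event actually happens.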