Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 2:27 pm
by Dr Jefyll
But he used NOP instructions to make the 6502 generate increasing address values, and I wanted to find another instruction that was 2 bytes instead of one byte, so the address increments would go twice as fast. As it turns out, I couldn't think of any instruction that doesn't have any sort of side effect and is 2 bytes, and 2 clock cycles.
Jac, it sounds like you should consider using one of the 32 identical "nop" instructions that fill columns _3 and _B on the 'C02 opcode map. These are one-byte, one-cycle nop instructions that don't affect the flags. They can be used for all sorts of tricks! (Example: Ultra-fast output port) And they're ideal for generating increasing address values (for boot loading, or for a DMA-like "cheap video" video interface that uses minimal hardware).
(Because STP and WAI opcodes reside in the columns mentioned, the WDC 'C02 has 30 of these instructions, not 32. But you only need one!)
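To illustrate why an instruction whose byte count equals its cycle count makes the address bus count up once per clock, here is a rough Python sketch. The function name `bus_addresses` and the bus model are assumptions made for illustration, not taken from a datasheet: the model assumes one consecutive byte is read per cycle, and that any extra cycle is a dummy read of the following byte, as happens with the standard two-cycle NOP ($EA).

```python
def bus_addresses(start, size, cycles, count):
    """Return the addresses driven on the bus, one entry per clock cycle.

    Simplified, hypothetical model: valid only for instructions that read
    consecutive bytes and touch no other memory (NOP-like instructions).
    """
    addresses = []
    pc = start
    for _ in range(count):
        for cycle in range(cycles):
            # The first `size` cycles fetch the instruction bytes; any extra
            # cycle re-reads the byte that follows the instruction.
            addresses.append((pc + min(cycle, size)) & 0xFFFF)
        pc = (pc + size) & 0xFFFF
    return addresses


# One-byte, one-cycle NOP: a new address on every clock.
one_cycle_nop = bus_addresses(0x0000, size=1, cycles=1, count=4)
# Two-byte, two-cycle instruction: also a new address on every clock.
two_byte_two_cycle = bus_addresses(0x0000, size=2, cycles=2, count=2)
# Standard NOP ($EA), one byte and two cycles: each address appears twice in total.
standard_nop = bus_addresses(0x0000, size=1, cycles=2, count=2)

print(one_cycle_nop, two_byte_two_cycle, standard_nop)
```

In this model the first two cases produce the same address sequence, while the standard NOP advances only once every two clocks.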
cheers
Jeff
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 3:51 pm
by jac_goudsmit
Jac, it sounds like you should consider using one of the 32 identical "nop" instructions that fill columns _3 and _B on the 'C02 opcode map. These are one-byte, one-cycle nop instructions that don't affect the flags.
I don't remember seeing in the WDC docs that they are one-byte one-cycle; I assumed they would be two cycles; that's interesting and I guess I should review...
I wouldn't want to use an instruction that's not present on all types of 6502, though; I want to make it compatible with as many different types of 6502 as possible, so that someone could eventually even make e.g. a 1541 drive emulator out of my kit (the firmware of the 1541 uses an undocumented opcode on the MOS 6502 if I'm not mistaken). For my purpose, one-byte one-cycle is just as efficient as two-byte two-cycle.
Thanks for the tip!
===Jac
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 4:03 pm
by Michael
Hey, Jeff. That really is genius! Thank you.
Regards, Mike
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 5:37 pm
by Dr Jefyll
Glad I was able to share an item of interest with you guys. My own experience is with the Rockwell 'C02. (I had to document all the undefined NOP's before building my Kimklone years ago.) As for the WDC '02, I expect its undefined NOP's would behave identically to the Rockwell's. It's my understanding the two chips share a common origin -- they are not independent designs.
For my purpose, one-byte one-cycle is just as efficient as two-byte two-cycle.
Yes, and your choice of CMP Immediate runs on all versions of the 6502, as you say. For your needs, the only (slight) advantage offered by the one-cycle NOP's is preserving the flags.
-- Jeff
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 6:04 pm
by jac_goudsmit
I read the Rockwell 65C02 datasheets. One of the nice things is that, even though they presumably still have the undocumented instructions as the NMOS 6502, it specifies explicitly that you can hold the chip indefinitely while the clock is HIGH. This inspired me to start and end my main loop with CLK0=HIGH too: when the main loop isn't running, the 6502 is held in a state where at least the Rockwell 65C02 can be held indefinitely (and an NMOS 6502 can be held for some time with CLK0=high).
Later on in the development, I realized that I could use this in combination with the fact that the outputs on the Propeller are OR'ed from all cogs. There's not enough time in a 1 microsecond cycle to check a flag in hub memory, in order to see whether another cog wants to stop the main loop (e.g. for debugging). But it IS possible for the main control cog to drop CLK0 low and then check if another cog is still pulling it high, and break out of the loop if so. It's a pretty nifty idea even if I do say so myself
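The stop check described above can be sketched in Python as a small simulation. This is only an illustration under stated assumptions: the names `CLK0`, `pin_state` and `run_main_loop`, the pin number, and the way a stop request is injected are all hypothetical, and the real code would be Propeller assembler running in the control cog.

```python
CLK0 = 1 << 0  # hypothetical pin mask for the 6502 clock line


def pin_state(cog_outputs):
    """The physical pin state is the OR of every cog's output register."""
    state = 0
    for out in cog_outputs:
        state |= out
    return state


def run_main_loop(cog_outputs, control=0, requester=1,
                  max_cycles=10, stop_request_at=None):
    """Clock the 6502 until another cog holds CLK0 high; return cycles run."""
    cycles = 0
    while cycles < max_cycles:
        if stop_request_at == cycles:
            # Simulated: another cog asks for a stop by driving CLK0 high.
            cog_outputs[requester] |= CLK0
        cog_outputs[control] |= CLK0    # clock high
        cog_outputs[control] &= ~CLK0   # control cog drops its CLK0 output
        if pin_state(cog_outputs) & CLK0:
            # The pin is still high, so some other cog is holding it:
            # leave the loop with the 6502 parked while CLK0 is high.
            break
        cycles += 1
    return cycles


print(run_main_loop([0, 0], stop_request_at=3))
print(run_main_loop([0, 0]))
```

The point of the sketch is that the check costs only a pin read, with no hub memory access inside the loop.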
Which reminds me, I really should finish that "Theory of Operations" article on my website
===Jac
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 7:22 pm
by BigDumbDinosaur
I don't remember seeing in the WDC docs that they are one-byte one-cycle; I assumed they would be two cycles; that's interesting and I guess I should review...
There are no single cycle instructions in the W65C02S and W65C816S. Illegal instructions in the 65C02 are NOPs that can be of varying length and execution time. As WDC has recently changed foundries, it's conceivable that the behavior of illegal instructions could change as well.
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 8:08 pm
by Michael
... I really should finish that "Theory of Operations" article on my website

Hi Jac,
I would love to see that, especially if it describes how you've allocated the various cogs to various tasks (with a brief description of the task). I've never looked at your source code because I wasn't familiar with Propeller Spin or Assembler (which I hope to remedy soon) so I can only guess at how you're doing things.
Cheerful regards, Mike
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 8:25 pm
by Michael
Later on in the development, I realized that I could use this in combination with the fact that the outputs on the Propeller are OR'ed from all cogs. There's not enough time in a 1 microsecond cycle to check a flag in hub memory, in order to see whether another cog wants to stop the main loop (e.g. for debugging). But it IS possible for the main control cog to drop CLK0 low and then check if another cog is still pulling it high, and break out of the loop if so. It's a pretty nifty idea even if I do say so myself

I assume you're talking about stopping the clock after you've started running the 65C02 at full speed? If so, then under what conditions would a cog want to stop the main loop?
Mike
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 8:42 pm
by Dr Jefyll
There are no single cycle instructions in the W65C02S and W65C816S.
May I ask what makes you so certain of this? I'll grant the datasheet says the NOPs can be of varying length and execution time, but, to my reading, that hardly excludes an execution time of one cycle. I hope you can cite a source other than your own surmise.
My own information comes from direct observation of the Rockwell '02 actually running these one-cycle NOP's. I suspect the WDC '02 will show the same behavior, but I have not confirmed this.
I read the Rockwell 65C02 datasheets. One of the nice things is that, even though they presumably still have the undocumented instructions as the NMOS 6502
Jac, I have a feeling you wrote that in a hurry.

I'm sure you know the undefined opcodes on the NMOS 6502 are not NOP's.
cheers
Jeff
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 8:43 pm
by jac_goudsmit
I assume you're talking about stopping the clock after you've started running the 65C02 at full speed? If so, then under what conditions would a cog want to stop the main loop?
The answer is pretty much the same as to the question you asked earlier: "Why would I want to access the SRAM chip after I've already started the 6502?"
I want to think of Propeddle not only as a 6502 computer (either a newly developed system or emulating an existing system), but also as a 6502 development system. Making it possible to stop the 6502, to debug the code or do things such as reading the contents of the RAM to make a backup is a no-brainer for a development system. It really didn't even come up in my mind why it might NOT be needed.
===Jac
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 8:59 pm
by jac_goudsmit
There are no single cycle instructions in the W65C02S and W65C816S.
May I ask what makes you so certain of this? I'll grant the datasheet says the NOPs can be of varying length and execution time, but, to my reading, that hardly excludes an execution time of one cycle. I hope you can cite a source other than your own surmise.
I haven't looked closely enough at places like visual6502.org but I think because of the way the 6502 loads and executes instructions, the minimum number of cycles per instruction is always 2. I may be wrong (I was hoping I am -- one-cycle NOPs would definitely come in handy e.g. for timing adjustments).
I read the Rockwell 65C02 datasheets. One of the nice things is that, even though they presumably still have the undocumented instructions as the NMOS 6502
Jac, I have a feeling you wrote that in a hurry.

I'm sure you know the undefined opcodes on the NMOS 6502 are not NOP's.
What I was trying to say there is that my project should (and will) at least be compatible with the Rockwell 65C02 as well as the WDC 65C02S, because someone might want to use undocumented opcodes as part of some hardware emulation project. And if the original hardware that they want to emulate contained an NMOS 6502, they can probably emulate the hardware by using a CMOS 65C02 with my project (obviously with resistors in the circuit to convert 5V to 3.3V) because the 65C02 allows you to stop it by stopping the clock, which is what I have to do for short periods of time, e.g. while the main cog is switching between normal operation and "DMA mode". I wasn't talking about using those undocumented opcodes in my own code: we already established that it's better if I don't.
===Jac
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 9:16 pm
by BigEd
Looking at http://visual6502.org/wiki/index.php?ti ... 56_Opcodes, it seems that opcode 0x80 (among others) is a 2-byte and 2-cycle NOP: that would presumably increment the address in each cycle, so it kind of works like a pair of 1-cycle NOPs.
Of course, the C02 designs might well work differently.
Re: My 3-Chip Design Is Working...
Posted: Wed Feb 19, 2014 9:28 pm
by jac_goudsmit
... I really should finish that "Theory of Operations" article on my website

Hi Jac,
I would love to see that, especially if it describes how you've allocated the various cogs to various tasks
In short:
- The control cog runs an assembler program that loops around, getting instructions from the main program. Instructions are things such as "download bytes to SRAM", "upload bytes from SRAM", "run" and some other ones that are mostly for debugging. When it runs the main "run" loop, it clocks the 6502, generates signals (RESET/IRQ/NMI/RDY/SO) from a hub storage location, and activates the SRAM chip according to the R/W line, and it checks with the previously mentioned algorithm if another cog wants to stop it. It also optionally keeps track of the number of cycles executed so you can tell it to e.g. run 6 clock cycles after a reset. Normally exactly one of these cogs is needed.
- A memory control cog builds a lookup table from a memory map, and overrides the RAM control from the control cog, so that the SRAM chip doesn't get activated for any areas that I want to designate as ROM or external hardware. In most cases, exactly one of these is needed but it's optional if no memory areas need to be read-only and if all hardware is emulated by an I/O cog, see below.
- A hub mapping cog maps any contiguous 6502 memory area to hub memory, and disables the SRAM chip for that memory area. For some functionality such as a video buffer, this cog can map the same area for input as well as output. But for most I/O emulation (e.g. a 6850 ACIA where writing to a location means something other than reading from the same location) two or more cogs are required that run this code, with different configurations.
- Optionally, a trace cog can be used to log which locations the 6502 reads from and writes to.
The reason why I had to divide the work this way is that with all the glue logic, I need about 17 instructions per cycle to turn the chips on and off and clock data through the D-flipflops. I had to use a spreadsheet to figure out how to get the timing right, because I also have to take propagation delays into account. I can possibly make it work a little faster but then I'd need even more cogs: a system with some I/O chips and a video buffer, such as the PET or the OSI Challenger/Superboard/UK101, will require up to 5 cogs to control the 6502 and SRAM chip, plus 2 cogs for the video and keyboard driver, plus one cog for initialization and I/O emulation.
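The timing budget and cog count above can be worked through with a few lines of Python. The 80 MHz system clock and 4 clocks per cog instruction are assumptions about a typical Propeller setup and are not stated in this thread; the 1 microsecond cycle, the 17 instructions, and the cog counts come from the posts above.

```python
CLOCK_HZ = 80_000_000        # assumption: typical Propeller system clock
CLOCKS_PER_INSTRUCTION = 4   # assumption: typical cog instruction time
CPU_CYCLE_US = 1             # 6502 cycle time mentioned in the thread
GLUE_INSTRUCTIONS = 17       # instructions needed per 6502 cycle


def instruction_slots(cycle_us=CPU_CYCLE_US):
    """Number of cog instructions that fit in one 6502 cycle."""
    clocks_per_cycle = CLOCK_HZ * cycle_us // 1_000_000
    return clocks_per_cycle // CLOCKS_PER_INSTRUCTION


slots = instruction_slots()
spare = slots - GLUE_INSTRUCTIONS

# Cog tally for a PET-like system, using the numbers from the post.
cogs = {
    "6502 and SRAM control": 5,
    "video and keyboard drivers": 2,
    "initialization and I/O emulation": 1,
}
total_cogs = sum(cogs.values())

print(f"{slots} instruction slots per cycle, {spare} spare")
print(f"{total_cogs} cogs needed")
```

Under these assumptions only a handful of instruction slots remain per cycle, and the tally uses all eight cogs of a single Propeller.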
For more complicated systems, it's possible to connect multiple Propellers "in parallel". This is an idea for later, but one that I definitely want to prepare for when I design the new PCB.
===Jac
Re: My 3-Chip Design Is Working...
Posted: Thu Feb 20, 2014 5:42 am
by Dr Jefyll
There are no single cycle instructions in the W65C02S and W65C816S.
BDD, we seem to have contradicted one another in regard to the W65C02S. Probably you meant no discourtesy, but you do seem to have a penchant for presenting your inferences as fact.
Of course, even the suspicion I presented ought to be supported, and I apologize. I've now located the reference that eluded my memory earlier. It is WDC's own data sheet for the W65C02S, and the second row of Table 7-1 describes Execution of invalid OpCodes.
- listed as 2 byte, 2 cycle are 02,22,42,62,82,C2,E2
- listed as 1 byte, 1 cycle are X3,0B-BB,EB,FB
- listed as 2 byte, 3 cycle is 44
- listed as 2 byte, 4 cycle are 54,D4,F4
- listed as 3 byte, 8 cycle is 5C
- listed as 3 byte, 4 cycle are DC,FC
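For anyone who wants the list above in machine-readable form, here is a hedged Python sketch that expands it into a lookup table. The function name `build_invalid_opcode_table` is made up for illustration, and the expansion rests on two assumptions about the notation: "X3" means every opcode whose low nibble is 3, and "0B-BB" means 0B, 1B, ... BB in steps of $10.

```python
def build_invalid_opcode_table():
    """Map each listed invalid opcode to a (bytes, cycles) tuple."""
    table = {}
    for opcode in (0x02, 0x22, 0x42, 0x62, 0x82, 0xC2, 0xE2):
        table[opcode] = (2, 2)
    # Assumption: "X3" covers 03, 13, ... F3.
    for high in range(0x10):
        table[(high << 4) | 0x3] = (1, 1)
    # Assumption: "0B-BB" covers 0B, 1B, ... BB; EB and FB are listed separately.
    for high in range(0x0, 0xC):
        table[(high << 4) | 0xB] = (1, 1)
    table[0xEB] = (1, 1)
    table[0xFB] = (1, 1)
    table[0x44] = (2, 3)
    for opcode in (0x54, 0xD4, 0xF4):
        table[opcode] = (2, 4)
    table[0x5C] = (3, 8)
    for opcode in (0xDC, 0xFC):
        table[opcode] = (3, 4)
    return table


table = build_invalid_opcode_table()
one_cycle = sorted(op for op, timing in table.items() if timing == (1, 1))

print(len(one_cycle), "one-byte, one-cycle opcodes")
print("CB listed:", 0xCB in table, "DB listed:", 0xDB in table)
```

With that reading of the notation, the one-cycle group has 30 members and skips CB and DB, which matches the earlier remark that the WDC part has 30 of these NOPs rather than 32.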
Folks who find such minutiae interesting may wish to visit this page of my site. I've documented not just the bytes and cycles, but also what the undefined NOP's actually do (on the Rockwell 'C02). Edit: later tests show that WDC's W65C02S cpu behaves identically except that it features WAI and STP, thus reducing the number of undefined NOP's by two.
Something I'd like to know is whether the one byte, one cycle NOP's delay interrupt acceptance. Edit: they do. Interrupts will not be recognized while a one-cycle NOP (or a string of such NOP's) is executing. Interrupts are recognized on the first non-one-cycle instruction that follows. More info here.
-- Jeff
[Edit]: Rockwell 'C02 -- not '02.
[Edit]: fix link. WDC cpu. Interrupt acceptance update.
Re: My 3-Chip Design Is Working...
Posted: Thu Feb 20, 2014 6:08 am
by Dr Jefyll
I think because of the way the 6502 loads and executes instructions, the minimum number of cycles per instruction is always 2. I may be wrong (I was hoping I am -- one-cycle NOPs would definitely come in handy e.g. for timing adjustments).
I agree that what I've reported seems strange in light of the way the 6502 loads and executes instructions. It occurs to me that, to achieve the one-cycle NOP's, maybe they just jimmied the logic somehow so that the opcode fetch is artificially lengthened, discarding the first byte fetched (the one-cycle NOP). IOW, it doesn't really "execute." This is a case where that sort of cheap kludge is quite acceptable. After all, the one-cycle NOP isn't required to do anything (except advance the PC).