
All times are UTC
PostPosted: Sat Aug 20, 2016 5:16 pm 
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
BigEd wrote:
it makes it harder to make subsequent improvements
Good point. Sometimes documentation has unexpected subtlety!

BigEd wrote:
I'm not sure if it's even interesting to keep arguing about what we call something. Much more interesting to talk about how things work and how to program them.
Agreed. I wish my post had included a comment like this, as I wasn't especially keen to add fuel to the fire.

White Flame wrote:
I don't quite understand why even/odd comes into play
It needn't, necessarily. But the dispatch routine in litwr's post here uses bit0 of the index to choose between odd and even tables.

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


PostPosted: Sun Aug 21, 2016 1:50 am 
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
Even in kernels, how often is a table of more than 128 addresses even used?  (C64?  Others?  I really don't know.)  128 fits in 256 bytes.  Then you could either:

  1. ignore the msb of the starting number and then start with a left shift, meaning you have to start in A, do ASL A, then TAX before the JMP (<table_addr>,X), or
  2. allow up to $FE but require the number to be even.  Then you can load the number into X directly, and only do a JMP (<table_addr>, X), a single instruction, without disturbing A.

If more than 128 are needed, as in a kernel, they might be put in separate tables, separated by class, and accessed by something like JMP (class1_table,X) or JMP (class2_table,X) or JMP (class3_table,X), etc.  Then there would not be a limit of 256 items.
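
For what it's worth, the two indexing options can be modelled in a few lines of Python (a stand-in for the 65C02 here, not real assembly). This is only a sketch: the function names and the handler addresses are invented for illustration, and the table is assumed to hold 128 little-endian addresses.

```python
# Illustrative model only: the names and addresses below are made up.
TABLE_ENTRIES = 128  # 128 two-byte addresses fit in one 256-byte table


def build_table(handlers):
    """Pack 16-bit handler addresses little-endian, as JMP (table,X) expects."""
    table = bytearray()
    for addr in handlers:
        table += bytes((addr & 0xFF, addr >> 8))
    return table


def offset_option1(a):
    """Option 1: ASL A then TAX. The shift discards the msb of the number."""
    return (a << 1) & 0xFF


def offset_option2(x):
    """Option 2: the caller supplies an even offset ($00..$FE) directly in X."""
    if x & 1 or not 0 <= x <= 0xFE:
        raise ValueError("offset must be even and at most $FE")
    return x


def fetch_target(table, offset):
    """Read the little-endian address stored at table+offset."""
    return table[offset] | (table[offset + 1] << 8)


handlers = [0x8000 + 4 * n for n in range(TABLE_ENTRIES)]  # hypothetical addresses
table = build_table(handlers)

print(hex(fetch_target(table, offset_option1(5))))     # routine number 5
print(hex(fetch_target(table, offset_option1(0x85))))  # msb ignored: same routine
print(hex(fetch_target(table, offset_option2(10))))    # even offset 10: same routine
```

With option 2 the highest offset is $FE, so the high byte of the last entry sits at offset $FF and the fetch never leaves the 256-byte table.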

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Sun Aug 21, 2016 7:41 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I imagine in the case under discussion, the 256-way branch is part of an extended-precision arithmetic function - it's dealing with one byte of operand, in such a way that each code branch deals with a constant operand. Fast and furious.

It does feel to me that Klaus has provided the vital insight which makes the original difficulty disappear - taking the high bit instead of the low bit. litwr (or anyone) can almost certainly now avoid crossing a page boundary and therefore write portable code.


PostPosted: Sun Aug 21, 2016 9:12 am 
Joined: Sat Jul 09, 2016 6:01 pm
Posts: 180
I have to express my admiration for the Hitachi 6309 chip. It can even divide! It is a bit sad that, due to a kind of grey power, 8-bit systems never reached their full potential. :( We can only dream about them. IMHO the MOS Technology team might have created something much better than the 6309 if they had had the opportunity.
BigDumbDinosaur wrote:
If using the 65C02 or 65C816, there is the JMP (<addr>,X) instruction, which requires no preparation of any jump vectors. You use it with a table containing 16 bit addresses in little endian order. The following code does the work:

This code requires a 16-bit X register, so it is impossible on the 65C02. It can be used on the 65816, but only in the rarely used 16-bit index-register mode.
Klaus2m5 wrote:
That's simply because it doesn't work.

It is because the correct working example was just ignored. :( So I had to make a "quick and dirty" example which shows only the right idea without being 100% correct. Anyway, thank you very much for the analysis of the example. However, I repeat, this example is artificial. The natural example was the first one. It is taken from the code of what is still the fastest 6502 division, for http://forum.6502.org/viewtopic.php?f=2&t=4185. So I have to add more explanation to it. The spigot algorithm uses only odd divisors, and so does the specialized division written for it. So only one 256-byte table for the odd divisors is required. The code used is very simple and fast:
Code:
        ldx divisor
        stx mjmp+1
mjmp    jmp (divjmp)

The divjmp table for the CMOS 6502 has to have an ugly 1 byte displacement. It may be considered a bug. :shock: This example also shows that bit 7 cannot be used instead of bit 0.
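
To make the difference concrete, here is a small Python model of how the two CPUs fetch the pointer for JMP (addr). It is a sketch under stated assumptions, not a cycle-accurate emulation: the table address $2000 and the byte values are hypothetical, and divjmp is assumed to be page-aligned, since STX mjmp+1 patches only the low byte of the operand.

```python
def jmp_indirect(mem, ptr, cmos):
    """Return the target of JMP (ptr) on an NMOS 6502 (cmos=False) or a 65C02."""
    lo = mem[ptr]
    if cmos:
        # 65C02: the pointer is incremented as a full 16-bit value.
        hi_addr = (ptr + 1) & 0xFFFF
    else:
        # NMOS 6502: only the low byte is incremented, so it wraps within the page.
        hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0xFF)
    return lo | (mem[hi_addr] << 8)


DIVJMP = 0x2000  # hypothetical page-aligned table address
mem = bytearray(0x10000)

# Entry for divisor $03: both bytes lie inside the page.
mem[0x2003], mem[0x2004] = 0x78, 0x56
# Entry for divisor $FF: low byte at $20FF, high byte location depends on the CPU.
mem[0x20FF] = 0x34
mem[0x2000] = 0x12  # read by the NMOS part
mem[0x2100] = 0xAB  # read by the CMOS part

for divisor in (0x03, 0xFF):
    ptr = DIVJMP | divisor  # what STX mjmp+1 does to the operand
    nmos = jmp_indirect(mem, ptr, cmos=False)
    c02 = jmp_indirect(mem, ptr, cmos=True)
    print(f"divisor ${divisor:02X}: NMOS -> ${nmos:04X}, CMOS -> ${c02:04X}")
```

Only the entry for divisor $FF behaves differently: the NMOS part takes its high byte from offset 0 of the table, the CMOS part from offset $100.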
GARTHWILSON wrote:
JMP (<addr>) is not a ZP instruction.

Nor is it any kind of 6502 absolute addressing. JMP (addr) is a special addressing mode used by this instruction only. It is perfectly documented in the original 6502 manuals:
MOS Technology 6500 Series Hardware Manual
MOS Technology 6500 Series Programming Manual
The NMOS 6502 works fine according to its specification. So it is the CMOS 6502 which has a JMP (addr) bug. IMHO the CMOS 6502 is a step in the wrong direction, the direction of the 6502's decline. :( Almost all of its additional instructions have very little importance. They may make code slightly faster and smaller, but only on a tiny scale: I would estimate less than 2% smaller size and less than 1% higher speed. Of its instructions I want only BIT #imm, DEA and INA. Even the useful BIT #imm can't set the N and V flags. :( Sometimes the 65C02 is even slower than the 6502. :( The major evil of the CMOS 6502 is the occupation of valuable opcodes by these unimportant instructions. This halted the natural development of the 6502. The 65816 and even the 4510 had to follow this heavy and bad inheritance. :( Instead of these "occupants", much more powerful instructions could have been placed: 16-bit arithmetic, POP XY, PUSH XY, work with two (or more) segment registers (which might give short and fast operations with a 20- or 24-bit address bus and relocation of code), a 16-bit accumulator, maybe another 16-bit accumulator, etc.
I also have to note that we have tens of thousands (or even more) ML programs for the NMOS 6502...
GARTHWILSON wrote:
That's one reason I'm intrigued by Jonathan Halliday's preemptive multitasking GUI OS for Atari 6502 computers.

Thank you. I missed this great project. It looks much better than GEOS but requires a lot of fine applications to match it.

_________________
my blog about processors


Last edited by litwr on Sun Aug 21, 2016 10:16 am, edited 1 time in total.

PostPosted: Sun Aug 21, 2016 9:34 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
I think the C02 was an incremental development, with the main aim being to move to CMOS technology. I suspect the limiting factor for the people designing and implementing it would not be 'have we used all the opcodes yet' but 'how much can we add without blowing the chip size budget' - it's natural for the programmer to be thinking about how to fit opcodes into a table, but for a chip designer it's almost always chip size, and design effort, which is paramount.


PostPosted: Sun Aug 21, 2016 10:44 am 
Joined: Sat Mar 27, 2010 7:50 pm
Posts: 149
Location: Chexbres, VD, Switzerland
Quote:
It is perfectly documented in the original 6502 manuals:
MOS Technology 6500 Series Hardware Manual
MOS Technology 6500 Series Programming Manual

Perhaps it would help if you could quote exactly how the JMP () wraparound is documented, in order to identify whether this is really a feature and not a bug.


PostPosted: Sun Aug 21, 2016 11:19 am 
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
Bregalad wrote:
Perhaps it would help if you could quote exactly how the JMP () wraparound is documented, in order to identify whether this is really a feature and not a bug.

The detailed description I've quoted above shows the wrap, and the hardware manual has the same description again. At the very least, this indicates that the wraparound was a known and documented issue at the time. Whether they considered this a 'feature' or a 'known bug' is impossible to tell.

According to the description quoted by Dr Jefyll, the NMOS 6502 uses the PC to increment the address, like I do in my core, so I would think that it would be rather simple to increment the MSB as well. The hardware is already there.


Attachment: 6502-hw.png
PostPosted: Sun Aug 21, 2016 11:36 am 
Joined: Sat Jul 28, 2012 11:41 am
Posts: 442
Location: Wiesbaden, Germany
litwr wrote:
Klaus2m5 wrote:
That's simply because it doesn't work.

It is because the correct working example was just ignored. :( So I had to make a "quick and dirty" example which shows only the right idea without being 100% correct. Anyway, thank you very much for the analysis of the example. However, I repeat, this example is artificial. The natural example was the first one. It is taken from the code of what is still the fastest 6502 division, for http://forum.6502.org/viewtopic.php?f=2&t=4185. So I have to add more explanation to it. The spigot algorithm uses only odd divisors, and so does the specialized division written for it. So only one 256-byte table for the odd divisors is required. The code used is very simple and fast:
Code:
        ldx divisor
        stx mjmp+1
mjmp    jmp (divjmp)

The divjmp table for the CMOS 6502 has to have an ugly 1 byte displacement. It may be considered a bug. :shock: This example also shows that bit 7 cannot be used instead of bit 0.
O.K., so now I understand! I think it was impossible to anticipate what you were trying to do from the 3 lines of code in your first post alone. The explanation now helps more than your artificial example.

However, it is bold to tie the assumption of a bug in the 65C02 to the very special case that you are talking about. It is like calling the missing undocumented opcodes in the 65C02 a bug. I am sure that many more coders have been bitten by the lack of a carry into the upper byte of the indirect address than face the same problem as you.

The original programming manual did not warn of that fact, but rather boldly stated, in the details where it comes to incrementing the indirect address: IAH,IAL+1 - fetch ADH. And yes, the manual does not mention the word absolute in conjunction with JMP indirect.

The changes versus the NMOS CPU are well documented in the 65C02 datasheet, so according to your logic they are features and not bugs of the CMOS CPU.

_________________
6502 sources on GitHub: https://github.com/Klaus2m5


PostPosted: Sun Aug 21, 2016 11:43 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Arlet: not sure about what you say, maybe I'm misreading or you mistyped. But the 6502 (the NMOS part) can be seen in visual6502 not to be using the PC:
http://www.visual6502.org/JSSim/expert. ... loglevel=5
(I don't think the 6502's PC is capable of wrapping around within a page - it has a 16-bit incrementer)


PostPosted: Sun Aug 21, 2016 11:54 am 
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
I was referring to the text in the programming manual: "In the JMP Indirect instruction, the second and third bytes of the instruction represent the indirect low and high bytes respectively of the memory location containing ADL. Once ADL is fetched, the program counter is incremented with the next location containing ADH."

Maybe the original design meant to use the PC, and get a free 16 bit increment, but then they changed the design to use the ALU instead.


PostPosted: Sun Aug 21, 2016 11:58 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Thanks, yes, I see it now. That's a good example of a manual trying to explain too much, perhaps - it doesn't reflect the implementation! In fact, it almost looks like a classic case of confusing the PC with the address bus.


PostPosted: Sun Aug 21, 2016 12:11 pm 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Quote:
The spigot algorithm uses only odd divisors, and so does the specialized division written for it. So only one 256-byte table for the odd divisors is required. The code used is very simple and fast:
Code:
        ldx divisor
        stx mjmp+1
mjmp    jmp (divjmp)

The divjmp table for CMOS 6502 has to have an ugly 1 byte displacement.

I see that it's inconvenient to have to account for the difference between the 02 and the C02. But, I think it may still be possible to use a 257 byte table which will suit both CPUs?


PostPosted: Sun Aug 21, 2016 5:30 pm 
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
litwr wrote:
It is because the right working example was just ignored. :(
Apologies, litwr -- you must feel somewhat frustrated! But the artificial example turned out not to be helpful, and it might've been better just to stick to the actual code and keep trying to explain what it does and why you're dissatisfied.

BTW and FWIW the JMP (addr) bug/feature is a common source of confusion, and some of us probably jumped to the conclusion you didn't understand its behavior (not true!). Anyway I'm glad we've finally reached an understanding of what it is we're talking about! :D

BigEd wrote:
But, I think it may still be possible to use a 257 byte table which will suit both CPUs?
Sounds right to me. Bit0 of the index is always =1, which means every 16-bit value in the table begins at an odd address. When the index=$FF, an NMOS cpu will fetch the 16-bit value from offset $FF (low byte) and offset 0 (high byte). A CMOS cpu will fetch from offset $FF and offset $100 -- hence the 257-byte table. I guess this is what's meant by an "ugly 1 byte displacement," and I agree it's not elegant having to store the same value both at offset 0 and at offset $100. At least it'll work properly with either cpu.
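
Here is a rough Python sketch of such a 257-byte table, with the last entry's high byte stored at both offset 0 and offset $100. The handler addresses and function names are invented for illustration, and the table is assumed to start on a page boundary; the loop at the end checks that both fetch rules land on the same routine for every odd divisor.

```python
def build_divjmp(handler_for):
    """Build a 257-byte jump table indexed by odd divisors 1, 3, ..., $FF."""
    table = bytearray(257)
    for d in range(1, 256, 2):
        addr = handler_for(d)
        table[d] = addr & 0xFF  # low byte at the odd offset
        hi = addr >> 8
        if d == 0xFF:
            table[0x000] = hi  # where an NMOS 6502 reads the high byte
            table[0x100] = hi  # where a 65C02 reads the high byte
        else:
            table[d + 1] = hi
    return table


def fetch(table, d, cmos):
    """Model JMP (divjmp+d) for a page-aligned table on either CPU."""
    hi_offset = d + 1 if cmos else (d + 1) & 0xFF
    return table[d] | (table[hi_offset] << 8)


def handler_for(d):
    """Hypothetical handler address for divisor d."""
    return 0x9000 + 3 * d


divjmp = build_divjmp(handler_for)
for d in range(1, 256, 2):
    assert fetch(divjmp, d, cmos=False) == fetch(divjmp, d, cmos=True) == handler_for(d)
print(len(divjmp), "bytes; both CPUs agree on every odd divisor")
```

The even offsets 2 through $FE hold the high bytes of the other entries, so offset 0 is free to carry the duplicate.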

Regarding the doc, I'd say the conflicting indications are both too insubstantial or untrustworthy to read much into. Arlet suggested MOS might've eventually changed the design, which'd explain the puzzling mention of the Program Counter in the text of the programming manual. That's possible, but to me it seems more likely the author was simply writing in a hurry and chose to include a slight inaccuracy rather than being conscientious about internal details that don't affect the user anyway.

Attachment: 6502-hw.png
As for this cycle-by-cycle description above, again I suspect that haste has won out over precise accuracy. I don't believe they meant anything by omitting mention of a carry into IAH -- I think they either didn't notice or didn't care/deemed it too awkward to express. IOW it's just coincidence that a slight omission of detail accidentally confirms an actual bug/feature. I could be wrong. But IMO that entire section -- the cycle-by-cycle stuff -- needs to be taken with a grain of salt, given that at least one rather drastic error slipped in (see below; details here).
Attachment: Table A-5-8 correction.png



PostPosted: Sun Aug 21, 2016 6:20 pm 
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
Quote:
As for this cycle-by-cycle description above, again I suspect that haste has won out over precise accuracy. I don't believe they meant anything by omitting mention of a carry into IAH -- I think they either didn't notice or didn't care/deemed it too awkward to express

They didn't omit anything. The description is exactly how the NMOS 6502 behaves.


PostPosted: Sun Aug 21, 2016 6:42 pm 
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
Arlet wrote:
They didn't omit anything. The description is exactly how the NMOS 6502 behaves.
I don't have a strongly-held position on this -- I'm just indulging in some idle speculation. :) Yes that's how the NMOS chip behaves -- the lack of mention (in the cycle-by-cycle section) of a carry into IAH turns out to be accurate. You feel the lack of mention was deliberate, which is a plausible outlook -- but an alternative explanation is implied, IMO, by an error that to me says the cycle-by-cycle section was hastily done and lacked proper proof-reading.

Supposing the lack of mention in the cycle-by-cycle section is deliberate, can you suggest why the text portion isn't similarly candid and forthright? (Maybe it is their way of being subtle!)


