6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sun Nov 24, 2024 3:51 am

All times are UTC




Post new topic Reply to topic  [ 136 posts ]  Go to page Previous  1 ... 4, 5, 6, 7, 8, 9, 10  Next
Author Message
PostPosted: Thu Oct 19, 2023 11:40 pm 
Offline
User avatar

Joined: Fri Sep 08, 2017 9:41 pm
Posts: 43
Location: UK Expat living in Washington State, US
Ok Chromatix, if I hold my breath any longer I'll burst!

Any update?


Top
 Profile  
Reply with quote  
PostPosted: Sat Oct 21, 2023 4:35 pm 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Well, there have been some distractions…

As of tonight, I've implemented most of the expression parser. I still need to write routines for 32x32b multiplication and signed division, and add at least a PRINT statement so that it can all be tested conveniently. So far all output has been using VDU, which was much simpler to implement but is less convenient to use as a debugging aid. I expect testing and debugging to take a while in itself, since there's a lot of very flexible functionality just been added.

The code seems to have turned out reasonably clean, so there shouldn't be any glaring inefficiencies, aside from my (space efficient) design choice to base half the comparison operators on logically negating the other half, and to implement subtraction using arithmetic negation and addition; this reduces the number of operator execution paths at the expense of some slight complication in the front-end of the parser. Interestingly, this implementation of subtraction is natural for floating-point arithmetic, but comparison operators on floating-point would not quite be so amenable to this kind of optimisation, since comparisons involving NaNs should always return false. (Even "not equal" is written as "greater than or less than" in BASIC, so is not strictly the logical complement of "equal".)

One very useful side-effect of separating the operator and operand contexts is that I only need to handle the distinction between keyword-based operators and variable names in one of those places, not both. I think this contextualisation is a very important insight which is missing from most discussions of the shunting-yard algorithm.

As an aside, I found a (rare) natural use for pre-indexed indirect mode. The other uses of it in this work are just emulations of non-indexed indirect mode.


Top
 Profile  
Reply with quote  
PostPosted: Sat Oct 21, 2023 6:48 pm 
Offline
User avatar

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Chromatix wrote:
Well, there have been some distractions…

As of tonight, I've implemented most of the expression parser. I still need to write routines for 32x32b multiplication and signed division, and add at least a PRINT statement so that it can all be tested conveniently. So far all output has been using VDU, which was much simpler to implement but is less convenient to use as a debugging aid. I expect testing and debugging to take a while in itself, since there's a lot of very flexible functionality just been added.


Good progress update, thanks.

Feel free to steal the code for mul & div from my stuff if you like - not the fastest, nor smallest, but as a "get you going" sort of thing it might help. My strategy has always been to check the signs of the numbers, make positive if negative (and set a flag), do the mul or div, then adjust the resulting sign based on the signs of the starting numbers. It's not as space efficient as it could be, but it's a solution I used in my BCPL system which seems to work well. (and the div code returns mod for free - could be optimised not to, but...)



Looking forward to trying it!

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 2:28 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Interestingly, when the output of a multiply is the same size as both of its inputs, the bit pattern of the output doesn't depend on whether the values are treated as signed or unsigned - just like addition and subtraction. Only when you calculate the more-significant bits does the signedness become important, so for example the ARM has a single MUL instruction but separate UMULL and SMULL. That will simplify things for that routine. I don't think I can use that assumption to simplify division, however, so I'll need to extract and remember the sign of the result, rectify the inputs, and conditionally negate the output. I've already got the negate code in a subroutine to save space with that in mind.

I'm also going to see if I can use algorithms that are efficient while avoiding the ROR instruction, just as an extra challenge point. That means the operands will need to move in a particular way, one shifting left past another instead of the latter being shifted right past the former. I'm not sure whether early 6502 experimenters did this, rather than using algorithms that required shifting right and just implementing clumsy direct workarounds for ROR.


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 5:27 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 988
Location: Potsdam, DE
Chromatix wrote:
Interestingly, when the output of a multiply is the same size as both of its inputs, the bit pattern of the output doesn't depend on whether the values are treated as signed or unsigned - just like addition and subtraction. Only when you calculate the more-significant bits does the signedness become important, so for example the ARM has a single MUL instruction but separate UMULL and SMULL.


That's true, because the product of the two sign bits ends up in bits 14 and 15 (for an 8*8) or bits 30 and 31 (for a 16*16). But you don't get the sign of the result correct until you complete the multiplication, no? And is an 8*8=8 (or 16*16=16) multiplication telling you anything useful beyond a partial product? Or is there an optimisation I can't immediately see?

(I note that AVR parts have partial product multiplications similar to the ARM but on 8*8=16 basis).

Neil


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 5:36 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Technically what I'm going for here is 16x16b -> 32b, but potentially repeatedly without losing range or precision until the final assignment. The least complicated implementation is to have all values on the operand stack be the same type - ie. signed 32 bits - even though variables only store 16 bits. When a variable or literal is referenced in the expression, it gets sign-extended to 32b at that time (well, a negative literal becomes a zero-extended unsigned literal which is then negated, because the minus sign is treated like any other unary negation operator).

The sign of the result will be correct, provided the magnitude of the result doesn't overflow for a signed 32-bit representation. That's also true for the ARM. It means you can do stuff like accumulating a 24-bit partial result then multiplying it by an 8-bit further operand, then dividing it down into the 16-bit range - provided it's all in the same expression. Very useful when you want to do fixed-point arithmetic to compensate for the lack of floating-point support.


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 5:45 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 988
Location: Potsdam, DE
Gotcha, thanks. That makes sense.

Neil

edit: and if bits 15 and 14 aren't the same, you got an overflow...


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 6:01 am 
Online
User avatar

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1949
Location: Sacramento, CA, USA
barnacle wrote:
and if bits 15 and 14 aren't the same, you got an overflow...

Well, I haven't thought it through in detail, but I think an overflow would be indicated if bit 15 is different from one or more of bits 16 to 31. For unsigned, overflow detection is a bit simpler ... one or more of bits 16 to 31 are non-zero.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 6:47 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
It doesn't overflow until the magnitude requires more than 31 bits to represent, or until the assignment would need to truncate meaningful bits to fit it into 16. These are things that are not checked at runtime in BASIC, any more than they are in C. (I also don't intend to explicitly check for division by zero, and just let the result come out as whatever it happens to be, probably ±1 depending on the sign of the dividend.)

To put it another way, the sum of the magnitude bit widths is what determines the meaningful bit width of the result. You can easily see that by considering multiplication by adding in the most-significant position first, and noting that carry from subsequent additions can propagate to at most one bit to the left of that.


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 7:21 am 
Offline

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 988
Location: Potsdam, DE
It gets messy :mrgreen:

Consider an 8 * 8 _signed_ multiplication. You are actually multiplying two seven bit values - for a 14 bit product - and two sign bits - for a two bit resulting sign. The two sign bits will always be the same as each other, and so the highest bit is a valid representation of the resultant sign.

For an 8 * 8 _unsigned_ multiplication, the resulting sign is implicitly positive and the top two bits represent value (and may be different) not sign.

8 _signed_ * 8 _unsigned_ of course gives a fifteen bit + sign result; bit 14 is significant and may be different from the sign.

So if you're doing a 16 * 16 using 8-bit partial products, you need an unsigned 8 * 8 of the lower bits, two signed 7 * unsigned 8 of each upper and the opposite lower, and two signed 7 * 7 of the two uppers before you add it all up. Which is probably worth the effort if your instruction set has those instructions natively, but if it doesn't the standard shift and add until there's nothing more to do is probably easiest.

As ever with binary arithmetic, the result means what you want it to mean.

Neil


Top
 Profile  
Reply with quote  
PostPosted: Sun Oct 22, 2023 9:31 am 
Offline

Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
For my Mandelbrot plotter I did as Gordon said above - check the signs first, invert the numbers so they're positive, do the multiplication, and then invert again if the result was meant to be negative. I felt the cost of checking signs and inverting numbers was so cheap compared to the multiplication that it was worthwhile as it allows us to easily early-out of the multiplication sooner for smaller numbers. I didn't check if that was actually a saving in practice.

The code is here for reference: https://github.com/gfoot/mandelbrot6502 ... el2.s#L109


Top
 Profile  
Reply with quote  
PostPosted: Mon Oct 23, 2023 7:07 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
With the multiply and divide routines in place, the total code size is now very close to 2KB. Hopefully I can fit an adequate PRINT statement into the remaining 51 bytes, but after that I would have to go byte-hunting with a vengeance. This also means that FOR-NEXT loops might be a luxury that we can't afford with this code size limitation.

One thing I had to decide with the divide routine was which remainder-sign convention to adopt, which effectively determines the rounding mode of the quotient. It turned out to be natural to have the remainder adopt the sign of the dividend, which is the C99 convention, because the magnitudes of the quotient and remainder are both unaffected by the signs of the dividend and divisor. With any other convention, they would need to be conditionally adjusted. This also means that the quotient always rounds towards zero, which is the natural result for a shift-subtract unsigned division routine.

The other common conventions are to have the remainder adopt the sign of the divisor, which for a positive divisor causes the quotient to round towards negative infinity (but for a negative divisor would round towards positive infinity), or to have the remainder always be positive, which causes the quotient to always round towards negative infinity. Both of these require relatively complex handling to resolve the signs.

At present the (unsigned) remainder is generated (which is itself free) and the sign it should adopt is extracted, but nothing actually uses it (ie. there is no MOD operator). A few bytes could be saved by removing that part of the sign extraction logic. Or I could probably save more bytes (at the expense of speed) by using the general division routine to extract digits for the integer printing routine, and removing the specialised 32b/constant division routine that's currently used for that. There are definite possibilities here. But first I need to test and debug, and there may be other places to save bytes that don't affect functionality.


Top
 Profile  
Reply with quote  
PostPosted: Mon Oct 23, 2023 7:15 am 
Offline
User avatar

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Squeezing the mod operator in (given you have code that generates it as a side-effect anyway) might please the prime number benchmark folks ...

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Top
 Profile  
Reply with quote  
PostPosted: Mon Oct 23, 2023 7:26 am 
Offline

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
Sure, but the decision-tree I use to parse expression operators is very expensive to add keyword-based operators to, since it has to look for 'M' then 'O' then 'D' and decide what, specifically, to do if there isn't a match in each position. I would have to see whether a table-based solution would be smaller, but at the moment I think it's a wash.

What I do already have in place are byte and word "peek" operators, using the BBC BASIC style indirection operators, ? and !. The converse "pokes" would be added to the statement table as the more conventional keywords, assuming I can find space for the associated routines. Probably the word-oriented versions could be trimmed out, leaving only the byte-oriented ones, to gain some space - they would be a more esoteric feature, not worth keeping at the expense of a more standard feature.


Top
 Profile  
Reply with quote  
PostPosted: Mon Oct 23, 2023 8:05 am 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Do you use % at all? That could perhaps stand for MOD - would that work?


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 136 posts ]  Go to page Previous  1 ... 4, 5, 6, 7, 8, 9, 10  Next

All times are UTC


Who is online

Users browsing this forum: No registered users and 53 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: