6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Post new topic Reply to topic  [ 20 posts ]  Go to page 1, 2  Next
Posted: Wed Jan 17, 2024 11:31 am

Joined: Tue Sep 26, 2023 11:09 am
Posts: 51
By far my most common mistake writing assembly is neglecting a # in front of a literal (lda ff) or named constant (lda MAGIC_NUMBER), so I end up with a memory reference instead of the intended immediate value.

Any tricks for getting the assembler or linker to help me detect these problems? For example when I write a literal value I almost never mean to refer to a memory location (I would always give it a symbolic name). Or if I could generate a list of all referenced memory locations (vs a named symbol map) I might see that MAGIC_NUMBER and ff shouldn't appear there.

Thanks!


Last edited by pdragon on Wed Jan 17, 2024 3:23 pm, edited 1 time in total.

Posted: Wed Jan 17, 2024 12:10 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
A list of labels is a good idea, if your assembler can produce it. (I suppose you're using ca65 and have typo'd in the topic title?)

But for myself I don't have any tricks - I'd proofread what I'd written, and I might notice the wrong opcode, because after some experience you will recognise the most common opcodes. And then, when debugging, it might become clear that all is not well. In some ways this is more a question about debugging tactics than about tactics in getting the source correct.

Interested to hear other suggestions, of course.


Posted: Wed Jan 17, 2024 12:37 pm

Joined: Thu Mar 12, 2020 10:04 pm
Posts: 690
Location: North Tejas
In addition to the 6502, I also program the 6800 and 6809, both of which have the same problem.

There is no easy solution to this problem except that the more code you write, the less often you will tend to make this error.

Stepping through your code in a debugger when it does not do what you intended makes this kind of error stand out.

Some relocatable assemblers will put an indication in the object code field such as an "r" next to relocatable addresses and you can scan for those.

<rant>
This problem is a direct result of instruction set designers who strive to make their architecture "appear more orthogonal" while ignoring the unintended consequences.
</rant>

If Motorola had used LDIA and LDIB instead of LDAA # and LDAB # in the original 6800 design, the problem solves itself. MOS might have chosen to use LDI instead of LDA for the immediate form.

Zilog made the same mistake with the Z80.

Instead of
Code:
    lda     Label
    mvi     A,Value


they used
Code:
    ld      A,(Label)
    ld      A,Value


though you will get few Z80 fans to admit this is a problem. It is not as bad, because () is more visible than a single #.


Posted: Wed Jan 17, 2024 3:03 pm

Joined: Mon Jan 19, 2004 12:49 pm
Posts: 660
Location: Potsdam, DE
Though they probably rewrote the whole assembler syntax to avoid the copyright lawyers at Intel.


Posted: Wed Jan 17, 2024 3:37 pm

Joined: Tue Sep 26, 2023 11:09 am
Posts: 51
yes, typo sorry, corrected in the initial post. and yes to debugging, that's ultimately how I usually catch them along with lots of source re-reading.

having different opcodes would be nice for sure. i suppose you could use macros to write code that uses different mnemonics for different addressing modes and adds the required '#' or not? not sure i'd get used to it, and might make it hard for others to read later.

i guess it's really a linting problem? are there any 6502 lint tools?

I notice ca65 allows you to distinguish label vs numeric constants (SOMETHING = 2 vs SOMEWHERE := 200) though it doesn't seem to treat them differently while assembling.
maybe i'll try making a little python linter script that looks for bare literals or numeric constants without # (e.g. lda 3 or lda SOMETHING).

could you catch other common mistakes like that, e.g. mismatched ph[x|y|a] / pl[x|y|a] within the same scope? unlabeled (unreachable) code after bra? obviously none of these are hard and fast but it might save a few debugging cycles...
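
A rough sketch of the push/pull check, in Python. Everything here is an assumption for illustration: the function name check_stack_balance is made up, a "scope" is taken to be whatever sits between two rts instructions, and the scan is purely top to bottom, so it ignores branches. Pushing one register and pulling into another is sometimes deliberate, so treat the output as hints rather than errors.

```python
import re

# Optional label, then one of the register push/pull mnemonics or rts.
STACK_OP = re.compile(r"^\s*(?:\w+:)?\s*(ph[axy]|pl[axy]|rts)\b", re.IGNORECASE)

def check_stack_balance(source):
    """Return (line number, message) pairs for suspicious push/pull pairs."""
    warnings = []
    pushed = []  # (line number, register) for pushes not yet pulled
    for number, line in enumerate(source.splitlines(), start=1):
        code = line.partition(";")[0]  # drop the comment, if any
        m = STACK_OP.match(code)
        if not m:
            continue
        op = m.group(1).lower()
        if op == "rts":
            # Anything still pushed at rts would be taken as a return address.
            for push_line, reg in pushed:
                warnings.append((push_line, f"ph{reg} never pulled before rts"))
            pushed = []
        elif op.startswith("ph"):
            pushed.append((number, op[2]))
        else:
            if not pushed:
                warnings.append((number, f"{op} with nothing pushed"))
                continue
            push_line, reg = pushed.pop()
            if reg != op[2]:
                warnings.append(
                    (number, f"{op} pulls value pushed by ph{reg} on line {push_line}")
                )
    return warnings
```

Feeding it a routine that does pha, phx, pla, plx, rts flags both pulls, since they come off the stack in the wrong order.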


Posted: Wed Jan 17, 2024 3:47 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
One of the assemblers used by Acorn back in the day had a more regular syntax, with longer opcodes to distinguish addressing modes. I think it was MASM: you'll see LDAIM, ANDIM, and similar. Also LDAIY, and so on. In fact, I see there's a Bush & Holmes assembler for the C64 which did similarly.
https://archive.org/details/Commodore_6 ... rogramming


Posted: Wed Jan 17, 2024 4:04 pm

Joined: Sun Nov 08, 2009 1:56 am
Posts: 387
Location: Minnesota
Quote:
If Motorola had used LDIA and LDIB instead of LDAA # and LDAB # in the original 6800 design, the problem solves itself. MOS might have chosen to use LDI instead of LDA for the immediate form.


This is an assembler problem, not a hardware problem. The CPU doesn't care what lays down the object code, just that it's correct.

If the assembler you're using has a macro capability, you can make up whatever mnemonics you find useful.

Details vary depending on which assembler you're using, but as a simple example:

Code:
.macro LDIA, ?immediate
    LDAA #?immediate
.endmacro


Posted: Wed Jan 17, 2024 9:00 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
pdragon wrote:
By far my most common mistake writing assembly is neglecting a # in front of a literal (lda ff) or named constant (lda MAGIC_NUMBER), so I end up with a memory reference instead of the intended immediate value.

Been there innumerable times, done that innumerable times, used nasty words innumerable times when I found my mistake.  :D  It’s even more fun with the 65C816, in which an immediate-mode operand can be an 8- or 16-bit value.  Getting that mixed up makes for some really “entertaining” results when the program is run.  :twisted:

Quote:
Any tricks for getting the assembler or linker to help me detect these problems? For example when I write a literal value I almost never mean to refer to a memory location (I would always give it a symbolic name). Or if I could generate a list of all referenced memory locations (vs a named symbol map) I might see that MAGIC_NUMBER and ff shouldn't appear there.

There is no magic bullet for preventing this class of error.  One thing I will note is, except in a few very specific situations, I never embed magic numbers into code.  Even if a constant will only be used once in 30,000 lines of code, it will be assigned to a symbol.  That right there eliminates one class of error, which is mistyping the magic number.  If the name of a magic number’s symbol is mistyped, the assembler will belch an error instead of assembling wrong code—unless, of course, the mistyped name is that of another symbol.  :shock:

I’ve been writing 65xx assembly language programs for some 48 years and can tell you there is no shortcut to writing error-free assembly language programs.  Your best defense against bugs caused by typos is being meticulous, using macros where possible, and carefully reading the listing produced by your assembler to spot questionable code.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Wed Jan 17, 2024 9:38 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
pdragon wrote:
I notice ca65 allows you to distinguish label vs numeric constants (SOMETHING = 2 vs SOMEWHERE := 200) though it doesn't seem to treat them differently while assembling.

It’s mostly a distinction without a difference.  Explicitly assigning a value to a symbol (SOMETHING = 2 is a symbolic assignment, not a label assignment...see next) automatically makes that symbol a constant.

Technically speaking, labels and symbols are two different entities.  In ye olden days of programming when everything was written in assembly language (that is, in the pre-FORTRAN era :D), a label referred to a location in memory, whereas a symbol referred to a constant, the latter either explicitly stated by assignment or computed during assembly.  That distinction was intended to prevent an immediate-mode load of a label, the thinking being that only constants should be operands for immediate-mode instructions.

The fallacy of that thinking soon became evident when programmers wanted to load registers with pointers known at assembly-time, but could not due to the assembler not allowing an immediate-mode reference to memory.  No assembler that I have used since I started writing software has ever enforced that distinction and, indeed, it would be tough to write an efficient program without being able to use an address as an immediate-mode operand.



Posted: Wed Jan 17, 2024 10:50 pm

Joined: Tue Sep 26, 2023 11:09 am
Posts: 51
yup, tho if I adopt that = vs := convention then it could be leveraged for linting, i.e. it's usually wrong when I use a constant symbol as an address (without a #), and it's sometimes wrong if I use a label as an immediate value (tho not as often, and at least I had to explicitly add a # there).

so i might try a tiny lint script that looked at least for the first case, with some ;NOLINT comment to opt out when I really mean it
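
Roughly what that first check could look like in Python, working straight from the source text. This is only a sketch under assumptions: the names lint, CONST_DEF, INSTR and NUMBER are invented here, only a handful of mnemonics that have an immediate form are covered, and splitting on the first ; will misfire on a line that has a ; inside a character literal.

```python
import re

# NAME = value marks a constant; NAME := value (a label) does not match.
CONST_DEF = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*\S")
# Optional label, a mnemonic that has an immediate form, then the operand.
INSTR = re.compile(
    r"^\s*(?:[A-Za-z_@][A-Za-z0-9_]*:)?\s*"
    r"(lda|ldx|ldy|cmp|cpx|cpy|adc|sbc|and|ora|eor)\s+([^;\s,]+)",
    re.IGNORECASE,
)
# Bare numeric literal: $hex, %binary or decimal.
NUMBER = re.compile(r"\$[0-9A-Fa-f]+|%[01]+|\d+")

def lint(source):
    """Return (line number, operand) for constants or literals used as addresses."""
    lines = source.splitlines()
    constants = set()
    for line in lines:
        m = CONST_DEF.match(line)
        if m:
            constants.add(m.group(1))
    warnings = []
    for number, line in enumerate(lines, start=1):
        code, _, comment = line.partition(";")
        if "NOLINT" in comment:  # explicit opt-out for this line
            continue
        m = INSTR.match(code)
        if not m:
            continue
        operand = m.group(2)
        if operand.startswith("#"):
            continue
        if NUMBER.fullmatch(operand) or operand in constants:
            warnings.append((number, operand))
    return warnings
```

With LCD_WIDTH = 20 defined, lda LCD_WIDTH and cpx 3 get reported, while lda #LCD_WIDTH, a symbol defined with :=, or a line ending in ; NOLINT pass quietly.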

i've seen a few other threads about 6502 lint but they mostly seem concerned with code formatting which I don't care nearly as much about as pointing out places where I might have a bug


Posted: Thu Jan 18, 2024 3:52 am

Joined: Tue Sep 26, 2023 11:09 am
Posts: 51
proof of concept illustrating the idea, where both of these references are missing a leading #

However, if LCD_WIDTH were defined with :=, indicating a label, it knows not to warn, since using a label as an address is fine.

Code:
rom.asm:
...
LCD_WIDTH = 20
...

% cl65 -g ... -l rom.lst -o rom.bin rom.asm
% python lint65.py rom.lst
Line 318: symbol or constant 3 used as address
    0004B7r 2  E4 03                cpx 3
Line 455: symbol or constant LCD_WIDTH used as address
    00055Er 2  A5 14        loop:   lda LCD_WIDTH
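
The opcode bytes in that listing already give the mistake away: E4 and A5 are the zero-page forms of cpx and lda, where the immediate forms would be E0 and A9. A small Python sketch of reading the addressing mode back out of a listing line follows. It is not the lint65.py script above; the line pattern is an assumption inferred only from the two lines shown, the table covers just a few opcodes, and the names OPCODES, LISTING_LINE and addressing_mode are made up for this example.

```python
import re

# Opcode byte -> (mnemonic, addressing mode) for a few 6502 loads and compares.
OPCODES = {
    0xA9: ("lda", "immediate"), 0xA5: ("lda", "zero page"), 0xAD: ("lda", "absolute"),
    0xA2: ("ldx", "immediate"), 0xA6: ("ldx", "zero page"), 0xAE: ("ldx", "absolute"),
    0xA0: ("ldy", "immediate"), 0xA4: ("ldy", "zero page"), 0xAC: ("ldy", "absolute"),
    0xE0: ("cpx", "immediate"), 0xE4: ("cpx", "zero page"), 0xEC: ("cpx", "absolute"),
}

# Assumed shape: 6 hex digits of address, optional "r", an integer column,
# the assembled bytes as space-separated hex pairs, then the source text.
LISTING_LINE = re.compile(
    r"^\s*[0-9A-Fa-f]{6}r?\s+\d+\s+((?:[0-9A-Fa-f]{2} )+)\s*(.*)$"
)

def addressing_mode(listing_line):
    """Return (mnemonic, mode) for the first byte on the line, or None."""
    m = LISTING_LINE.match(listing_line)
    if not m:
        return None
    opcode = int(m.group(1).split()[0], 16)
    return OPCODES.get(opcode)
```

Both lines from the output above come back as zero page, which is the cue that the # went missing.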


Posted: Thu Jan 18, 2024 4:15 pm

Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
pdragon wrote:
yup, tho if I adopt that = vs := convention then it could be leveraged for linting, i.e. it's usually wrong when I use a constant symbol as an address (without a #), and it's sometimes wrong if I use a label as an immediate value (tho not as often, and at least I had to explicitly add a # there).

I’ve not seen any software that can perform a “lint” operation on 6502 assembly language source.  Your best defenses against forgetting to use # are to adequately comment everything and to carefully proofread your code before assembly.  No software can possibly know your intentions when you are writing your code.  Even C’s lint doesn’t catch many dubious constructs.  I consider tools such as lint to be crutches that should be avoided.  If you are dependent on checkers to weed out bad code, you will tend to write bad code.  :D



Posted: Sat Jan 20, 2024 2:35 am

Joined: Tue Sep 26, 2023 11:09 am
Posts: 51
not sure I agree with that sentiment. the more help I can get finding dumb mistakes through lint, static type checking and the like, the more time I can spend thinking about the actual difficult problems. bugs like this seem more like avoidable friction: "do what I mean, not what I say", so why not have a crutch to help avoid them?


Posted: Thu Jan 25, 2024 8:21 pm

Joined: Fri May 05, 2017 9:27 pm
Posts: 851

Because that crutch will slow you down in the long run.


Posted: Fri Jan 26, 2024 6:13 am

Joined: Fri Jan 26, 2024 5:47 am
Posts: 37
Location: Prague; Czech Republic; Europe; Earth
Hello,

I find it helpful to set highlighting in my editor so that immediate arguments look different from others, like this:
[Attachment: 6309_highlight_1.png]

It doesn't fix anything automatically, but the visual distinction makes it easy to spot differences, and I've become accustomed to noticing whether the value is in green (like a comment and, therefore, a literal) or aggressive magenta (indicating a "hot" memory reference). It might be a silly mnemonic, but it works for me.

(I use Linux Gentoo, edit nearly everything in Vim, have a half-transparent picture in the background, and use tabs for indenting, shown as blue "|----" marks.)

_________________
http://micro-corner.gilhad.cz/, http://8bit.gilhad.cz/6809/Expanduino/Expanduino_I.html, http://comp24.gilhad.cz/Comp24-specification.html, and many others

