PostPosted: Tue Dec 22, 2015 6:22 pm 

Joined: Thu Jun 04, 2015 3:43 pm
Posts: 42
This whole discussion makes me wonder whether it would be useful to write in a slightly higher level of abstraction, and let the problem of where to put variable data be handled entirely by the assembler/compiler. Treat ZP as registers and use normal register allocation techniques. From what (little) I know, this can be done pretty well.

And if you're already using a macro assembler it's not that much of a stretch (IMHO).

Then again, I suppose that might be just reinventing another language that already exists.


PostPosted: Wed Dec 23, 2015 3:18 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
magetoo wrote:
This whole discussion makes me wonder whether it would be useful to write in a slightly higher level of abstraction...Then again, I suppose that might be just reinventing another language that already exists.

If high(er) level abstraction is desired then a compiled or interpreted language should be chosen. Assembly language is used to get close to the bare metal and produce a program that executes as rapidly as possible. Hence the notion of using abstraction in assembly language programming seems to be counter-intuitive.

Yes, macros can be very useful and with a certain amount of forethought, can automate the repetitive aspects of assembly language programming and in some cases, as Garth has demonstrated, give the appearance of structured programming to what is an inherently unstructured language. However, misused macros can cause bloat and slow down execution. As is always the case in software development, automation leads to less efficient code.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Sat Dec 26, 2015 3:04 pm 

Joined: Thu Jun 04, 2015 3:43 pm
Posts: 42
BigDumbDinosaur wrote:
If high(er) level abstraction is desired then a compiled or interpreted language should be chosen. Assembly language is used to get close to the bare metal and produce a program that executes as rapidly as possible. Hence the notion of using abstraction in assembly language programming seems to be counter-intuitive.


Assembly language is itself an abstraction. If you want to be absolutist about writing bare metal code I hope you code your projects in machine code directly!

Anything above that implies some level of abstraction, and it's not clear to me there are any clear lines to be drawn here. Symbolic labels and automatic address calculation are taken for granted when writing assembly code, so why not symbolic data-holders and automatic register allocation?


PostPosted: Sat Dec 26, 2015 4:15 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
magetoo wrote:
Assembly language is itself an abstraction. If you want to be absolutist about writing bare metal code I hope you code your projects in machine code directly!

Anything above that implies some level of abstraction, and it's not clear to me there are any clear lines to be drawn here. Symbolic labels and automatic address calculation are taken for granted when writing assembly code, so why not symbolic data-holders and automatic register allocation?

I said "close to the bare metal," and I think you are attempting to equate symbolism with abstraction.

It doesn't matter whether the programmer enters A9 04 in a 65C02 machine language monitor or LDA #4 in a 65C02 assembler. The result will be the generation of one, and only one, machine instruction that will load .A with $04 and do nothing else. LDA is merely a mnemonic device to save the programmer the effort of remembering a set of machine opcodes. The assembler provides the convenience of converting the operand into its binary equivalent and selecting the proper opcode for the immediate addressing mode. That's not an abstraction, as it was up to the programmer, not the assembler, to choose the correct instruction and the correct addressing mode to cause the microprocessor to perform the desired operation. Had it been an abstraction, the programmer could have written something like ACCUMULATOR=4 and the resulting object code would have been (presumably) A9 04.
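The one-to-one mapping described above can be sketched as a lookup table. This is only an illustration, not any particular assembler's implementation: the names `OPCODES`, `OPERAND_BYTES` and `encode` are made up for the example, and only a handful of 65C02 opcodes are listed.

```python
# Hypothetical mini-encoder: one source line in, exactly one instruction out.
# The programmer still picks the mnemonic AND the addressing mode.
OPCODES = {
    ("LDA", "imm"): 0xA9,   # LDA #nn
    ("LDA", "zp"):  0xA5,   # LDA nn
    ("LDA", "abs"): 0xAD,   # LDA nnnn
    ("STA", "zp"):  0x85,   # STA nn
    ("STA", "abs"): 0x8D,   # STA nnnn
}
OPERAND_BYTES = {"imm": 1, "zp": 1, "abs": 2}


def encode(mnemonic: str, mode: str, operand: int) -> bytes:
    """Return the machine code for a single instruction."""
    opcode = OPCODES[(mnemonic.upper(), mode)]
    size = OPERAND_BYTES[mode]
    if not 0 <= operand < 1 << (8 * size):
        raise ValueError("operand does not fit the addressing mode")
    # 6502-family operands are stored low byte first
    return bytes([opcode]) + operand.to_bytes(size, "little")


print(encode("LDA", "imm", 4).hex(" "))  # the same two bytes as typing A9 04
```

Nothing in the table decides anything on the programmer's behalf; it only saves remembering the opcode values.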

The same applies with assigning constants and memory addresses to symbols and labels. Here again, those symbols and labels are a mnemonic device, a convenience, not an abstraction. There are no "variables" in assembly language that are in any way comparable to variables in BASIC, C, FORTRAN or any other 3GL or higher language. You can't say something like SUBTOTAL=123.45 in assembly language and have instructions magically fall into place that will allocate storage for the variable SUBTOTAL and magically deposit the machine representation of 123.45 into that storage space.

Any abstraction that occurs in assembly language will come from the use of macros, which are merely a way of symbolically representing a sequence of instructions—another mnemonic device that is a convenience. The program that results from assembly of the source code will behave the same whether you use the macro or write out the individual instructions.

Address calculation as the assembler processes the source code is also not an abstraction. It's a convenience that saves the programmer from the tedium of manual calculations. As with symbolically representing opcodes with mnemonics and addresses and constants with labels and symbols, the assembler is merely shifting some of the workload from the programmer to the machine.



PostPosted: Sat Dec 26, 2015 5:14 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8538
Location: Southern California
Quote:
let the problem of where to put variable data be handled entirely by the assembler/compiler. Treat ZP as registers and use normal register allocation techniques.

That's pretty much what we do in assembly language anyway, but we give the registers names that are relevant to what we're using them for (rather than R0, R1, R2,...).

Quote:
The program that results from assembly of the source code will behave the same whether you use the macro or write out the individual instructions.

I'll go further and say that the program that results from the assembly of the source code will not only behave the same, but in most cases be exactly the same, and that from looking at the machine language, you wouldn't be able to tell whether it came from macros or from writing out the individual instructions longhand in the source code.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?


PostPosted: Sat Dec 26, 2015 9:49 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
GARTHWILSON wrote:
I'll go further and say that the program that results from the assembly of the source code will not only behave the same, but in most cases be exactly the same, and that from looking at the machine language, you wouldn't be able to tell whether it came from macros or from writing out the individual instructions longhand in the source code.

In either case, no abstraction was involved, only macroinstruction processing by the assembler.



PostPosted: Sat Dec 26, 2015 9:53 pm 

Joined: Thu Jun 04, 2015 3:43 pm
Posts: 42
BigDumbDinosaur wrote:
I said "close to the bare metal," and I think you are attempting to equate symbolism with abstraction.


I interpreted "close" as meaning "as close as possible". I see now you meant something else.


As for the rest, this could easily devolve into multiple pages of arguing about definitions, so I'll just restate what I meant to say and leave it at that:

  • The job of tools like assemblers, compilers, editors, and so on is to make the life of the programmer simpler.
  • One thing a tool can do (and does) is to let the programmer use a different representation for addresses - labels.
  • If the zero page is seen as an extended register set, then register allocation is something that could usefully be handled by a tool like an assembler.
  • To avoid the aliasing problem, the tool could do its magic only on representations of zero page addresses that are distinct from explicit addresses, in effect abstracting this representation, these labels, into a new set of registers (R0, R1 ... Rn).
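The idea in the list above can be sketched as a small interval-based allocator (roughly the "linear scan" approach from compiler writing). Everything here is an assumption made for illustration: the function name `allocate_zp`, the notion that the tool already knows each symbolic register's first and last use, and the pool of zero-page addresses set aside for it. There is no spilling; the sketch simply gives up when the pool runs dry.

```python
from typing import Dict, List, Tuple


def allocate_zp(live_ranges: Dict[str, Tuple[int, int]],
                pool: List[int]) -> Dict[str, int]:
    """Map symbolic names to zero-page addresses taken from `pool`.

    live_ranges[name] = (first_use, last_use), inclusive instruction indices.
    Two names share an address only when their ranges do not overlap.
    """
    free = sorted(pool, reverse=True)       # pop() yields the lowest address
    active: List[Tuple[int, str]] = []      # (last_use, name) still live
    assigned: Dict[str, int] = {}
    for name, (start, end) in sorted(live_ranges.items(),
                                     key=lambda kv: kv[1][0]):
        # Give back the addresses of names that died before `start`.
        still_live = []
        for last, other in active:
            if last < start:
                free.append(assigned[other])
            else:
                still_live.append((last, other))
        active = still_live
        free.sort(reverse=True)
        if not free:
            raise RuntimeError(f"zero-page pool exhausted at {name!r}")
        assigned[name] = free.pop()
        active.append((end, name))
    return assigned


# Three symbolic registers, but only two bytes of zero page are needed:
ranges = {"ptr": (0, 5), "count": (2, 8), "tmp": (6, 9)}
print(allocate_zp(ranges, [0x10, 0x11]))
```

Because the tool only touches names it handed out itself, explicit zero-page addresses written by the programmer stay outside the pool, which is one way to sidestep the aliasing problem.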


PostPosted: Tue Dec 29, 2015 6:41 am 

Joined: Sun Nov 17, 2013 5:15 am
Posts: 12
[last edit 12/28 @ 23:30 ish]

BigDumbDinosaur wrote:
magetoo wrote:
This whole discussion makes me wonder whether it would be useful to write in a slightly higher level of abstraction...Then again, I suppose that might be just reinventing another language that already exists.

Assembly language is used to get close to the bare metal and produce a program that executes as rapidly as possible. Hence the notion of using abstraction in assembly language programming seems to be counter-intuitive.

I see huge value in making something slightly higher than assembler, something that is essentially a full-featured assembler with some convenience features. It lets you get as close to bare metal as you want, BUT you can choose to leave a few things to the translator that either it can do a better job of, or that are tedious, as long as it produces the correct behavior. Something beyond a macro assembler. (A macro assembler doesn't really do this, because the programmer is still in complete control of the code produced, and can know in advance what he's producing as long as he understands macro expansion rules and cares to think it through.)

The key is knowing where it works to let the translator do its magic, and where it won't, and having an adequate way to supply hints that keep the thing from going crazy on you, and transparency into what it's doing.

I have been toying with the idea of an optimizer. I can't tell you how many times I've tried to rearrange or refactor assembly code, or borrow a routine from another project, only to discover through a lot of debugging that there was some optimization I missed that is not valid in the new location.

So basically, I want to be able to write code that may be non-optimal as written, but is straightforward, portable, and reusable. It's only after putting all the subroutines together that you can know all of the optimizations that will actually work. I'm quite OK with letting an optimizer do some of the work for me when I choose to use it. Consider the following two workflows:
  • asm source -> (assembler) -> binaries -> (peephole optimizer) -> optimized binaries
  • asm source -> (static analyzer) -> new source w/ suggested optimizations -> (assembler) -> optimized binaries
Now, I think those are valid and useful tool-chains. If those aren't terrible, then why not:
  • asm source + hints/constraints/options -> (optimizing assembler) -> BETTER optimized binaries + listfile showing optimizations
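As a concrete instance of the kind of rewrite such a tool might make, here is a minimal source-level peephole pass. It is a sketch under stated assumptions, not a description of any existing optimizer: the program is modelled as hypothetical `(label, mnemonic, operand)` tuples, and the only rule is the classic 6502 tail call, `JSR x` followed by `RTS` becoming `JMP x`. That rule is only safe when the called routine does not inspect or adjust its return address on the stack, which is exactly the sort of thing a hint or constraint would have to tell the tool.

```python
from typing import List, Optional, Tuple

Line = Tuple[Optional[str], str, Optional[str]]   # (label, mnemonic, operand)


def peephole_tail_calls(program: List[Line]) -> List[Line]:
    """Replace 'JSR target / RTS' pairs with a single 'JMP target'."""
    out: List[Line] = []
    i = 0
    while i < len(program):
        label, op, arg = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        # Leave a labelled RTS alone: other code may branch to it.
        if op == "JSR" and nxt is not None and nxt[1] == "RTS" and nxt[0] is None:
            out.append((label, "JMP", arg))
            i += 2
        else:
            out.append(program[i])
            i += 1
    return out


source = [
    ("SHOW", "LDA", "#4"),
    (None,   "JSR", "PRINT"),
    (None,   "RTS", None),
]
for line in peephole_tail_calls(source):
    print(line)
```

A listfile entry for each applied rule would give the transparency asked for above.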

Trying to make the front-end or back-end optimizations as efficient as what a translator could do with appropriate hints and constraints is a very tall order, not to mention that an assembler and/or disassembler of some sort must also be present in the optimizer, which is needless duplication and increases development time and complexity.

BigDumbDinosaur wrote:
If high(er) level abstraction is desired then a compiled or interpreted language should be chosen.

I don't think it has to be so black-and-white. There is a huge chasm between high-level languages and assembly. Is there no room for something in between? Asm can be tedious and prone to bugs. Interpreters are very slow, and high-level languages (at least for 6502 architecture) produce slow and very bloated code.

ca65 provides segments, allowing you to keep variables close to the code they refer to, but consolidate all the data somewhere else determined by the linker. Many assemblers also provide alignment options to, for example, keep code or data from crossing page boundaries; they will insert space as needed to stick to those constraints. If I need code to stay within a single page, then the assembler could either pad with NOP statements (no longer a true assembler), or I could do this:
Code:
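    ; ... preceding code ...
    JMP SKIPSPACE       ; hop over whatever padding the directive inserts
    .align 256          ; alignment directive (e.g. ca65's .align; syntax varies by assembler)
SKIPSPACE:
    ; ... page-aligned code ...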

But what if only one byte needed to be inserted to get the alignment? Now the JMP instruction could bump the aligned block up another page, wasting 256 bytes. The assembler could be allowed to insert either 1-2 NOPs or a JMP statement, as appropriate, but again this goes beyond a true assembler. Kick Assembler (I think, from what I've seen) would allow you to write a script that makes some such decisions for you, but even though the programmer wrote that script, it is still not obvious until viewing the compiled output what code was generated. So why not let the assembler provide such scripts that are activated using pseudo-ops (hints), rather than making the coder do it? (BTW, I would still like to be able to write my own code-generating scripts as well.)
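The NOP-or-JMP decision is small enough to sketch. The function name `alignment_padding` and the tuple output format are invented for this example; the only facts relied on are that NOP is one byte and an absolute JMP is three.

```python
from typing import List, Tuple, Union

Pad = Tuple[str, Union[int, None]]


def alignment_padding(pc: int, alignment: int = 256) -> List[Pad]:
    """Decide how to get from `pc` to the next multiple of `alignment`.

    Returns a list of ("NOP", None), ("JMP", target) and ("FILL", count)
    entries. Gaps of 1-2 bytes are too small for a 3-byte JMP, so they
    are filled with NOPs that simply execute through to the aligned code.
    """
    gap = (-pc) % alignment
    if gap == 0:
        return []                       # already aligned, emit nothing
    if gap <= 2:
        return [("NOP", None)] * gap
    target = pc + gap
    pad: List[Pad] = [("JMP", target)]  # skip the dead space in 3 cycles
    if gap > 3:
        pad.append(("FILL", gap - 3))   # bytes never executed
    return pad


for pc in (0x1100, 0x10FF, 0x10FD, 0x1080):
    print(f"${pc:04X}:", alignment_padding(pc))
```

The ("FILL", n) entries are the dead space that could, in principle, hold data instead of blanks.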

Also, why not let the assembler locate some of the data in the padding inserted to the code, or rearrange data so that data itself is used to align data structures, rather than blank space. That is clearly something I would do if I were hand-optimizing and needed to save space, but every time I edit some code, I would have to adjust that optimization and move stuff around. Total PITA.

Don't get me wrong... I love bare metal, and the challenge of coming up with the most efficient code possible. It's better than any puzzle, IMO. I just like the idea of having a tool that lets me do more in less time, some of the time.

BigDumbDinosaur wrote:
As is always the case in software development, automation leads to less efficient code.

If so, optimizers would not exist... but they do. :-)


PostPosted: Tue Dec 29, 2015 7:39 am 

Joined: Sun Nov 17, 2013 5:15 am
Posts: 12
Oh, another one, relating to 65816... Let assembler automatically choose near or far jmp based on target... Do assemblers already do that, or does it have to be adjusted manually?

BTW BigDumbDinosaur, I love your posts and have great respect for you... just debating with you here. :)


PostPosted: Tue Dec 29, 2015 8:36 am 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8538
Location: Southern California
I've used macros in assembly language to insert space as needed to keep tables from crossing 256-byte block boundaries on PIC16 since accessing a table straddling a boundary in PIC16 is incredibly inefficient; but I never tried to take advantage of those places to store a string or whatever.

Quote:
Oh, another one, relating to 65816... Let assembler automatically choose near or far jmp based on target... Do assemblers already do that, or does it have to be adjusted manually?

That could be done with a macro too, but might produce phase errors that would take more passes to resolve.  If the assembler will keep doing passes until it's all resolved, it's probably ok, as they run so quickly now that they don't really keep us waiting.  I had something like that 25 years ago, but I don't remember what the special thing I was doing was.  It took a lot of passes (maybe 20 or 30) to finish.



PostPosted: Tue Dec 29, 2015 9:32 am 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Very interesting ideas, AgentFriday - keep us posted.


PostPosted: Tue Dec 29, 2015 11:15 am 

Joined: Tue Mar 02, 2004 8:55 am
Posts: 996
Location: Berkshire, UK
AgentFriday wrote:
Oh, another one, relating to 65816... Let assembler automatically choose near or far jmp based on target... Do assemblers already do that, or does it have to be adjusted manually?

My assembler does that when you use the structured programming commands. If the target address is out of range of a relative branch then the inverse branch is generated to hop over a JMP. For example:
Code:
00005E' F0034C????        :                 if eq
                                             .repeat 40
                                             lda >0
                                             .endr
000063' AF000000          +                  lda >0
000067' AF000000          +                  lda >0
00006B' AF000000          +                  lda >0
00006F' AF000000          +                  lda >0
000073' AF000000          +                  lda >0
000077' AF000000          +                  lda >0
00007B' AF000000          +                  lda >0
00007F' AF000000          +                  lda >0
000083' AF000000          +                  lda >0
000087' AF000000          +                  lda >0
00008B' AF000000          +                  lda >0
00008F' AF000000          +                  lda >0
000093' AF000000          +                  lda >0
000097' AF000000          +                  lda >0
00009B' AF000000          +                  lda >0
00009F' AF000000          +                  lda >0
0000A3' AF000000          +                  lda >0
0000A7' AF000000          +                  lda >0
0000AB' AF000000          +                  lda >0
0000AF' AF000000          +                  lda >0
0000B3' AF000000          +                  lda >0
0000B7' AF000000          +                  lda >0
0000BB' AF000000          +                  lda >0
0000BF' AF000000          +                  lda >0
0000C3' AF000000          +                  lda >0
0000C7' AF000000          +                  lda >0
0000CB' AF000000          +                  lda >0
0000CF' AF000000          +                  lda >0
0000D3' AF000000          +                  lda >0
0000D7' AF000000          +                  lda >0
0000DB' AF000000          +                  lda >0
0000DF' AF000000          +                  lda >0
0000E3' AF000000          +                  lda >0
0000E7' AF000000          +                  lda >0
0000EB' AF000000          +                  lda >0
0000EF' AF000000          +                  lda >0
0000F3' AF000000          +                  lda >0
0000F7' AF000000          +                  lda >0
0000FB' AF000000          +                  lda >0
0000FF' AF000000          +                  lda >0
000103' 8003              :                 else
000105' EA                :                  nop
000106' EA                :                  nop
000107' EA                :                  nop
                                            endif

I suppose it should use BRL on a 65816 target.

Edit: It does now.
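For anyone curious how the "inverse branch over a JMP" choice interacts with multiple passes, here is a rough model in Python. It is not the code of the assembler shown above, just a sketch with assumed names (`relax_branches`, the item tuples) and assumed sizes: 2 bytes for a short branch, 5 bytes for the long form (a 2-byte inverse branch plus a 3-byte JMP). Branches only ever grow, so the loop must reach a fixed point.

```python
from typing import List, Set, Tuple

# Items: ("label", name) | ("bytes", count) | ("branch", target_label)
Item = Tuple[str, object]

SHORT, LONG = 2, 5   # short branch vs. inverse branch + JMP


def relax_branches(items: List[Item]) -> Tuple[Set[int], int]:
    """Return (indices of branches needing the long form, passes used)."""
    long_form: Set[int] = set()
    passes = 0
    while True:
        passes += 1
        # Lay out the program with the current size guesses.
        addr, labels, starts = 0, {}, []
        for idx, (kind, value) in enumerate(items):
            starts.append(addr)
            if kind == "label":
                labels[value] = addr
            elif kind == "bytes":
                addr += value
            else:
                addr += LONG if idx in long_form else SHORT
        # Promote any short branch whose target is now out of range.
        changed = False
        for idx, (kind, value) in enumerate(items):
            if kind == "branch" and idx not in long_form:
                offset = labels[value] - (starts[idx] + SHORT)
                if not -128 <= offset <= 127:
                    long_form.add(idx)
                    changed = True
        if not changed:
            return long_form, passes


# Same shape as the listing: 40 four-byte instructions inside the IF.
program = [("branch", "else"), ("bytes", 160), ("branch", "end"),
           ("label", "else"), ("bytes", 3), ("label", "end")]
print(relax_branches(program))
```

Growing one branch can push another out of range, which is where the extra passes mentioned earlier in the thread come from.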

_________________
Andrew Jacobs
6502 & PIC Stuff - http://www.obelisk.me.uk/
Cross-Platform 6502/65C02/65816 Macro Assembler - http://www.obelisk.me.uk/dev65/
Open Source Projects - https://github.com/andrew-jacobs


PostPosted: Tue Dec 29, 2015 5:00 pm 

Joined: Sun Nov 08, 2009 1:56 am
Posts: 410
Location: Minnesota
Quote:
But, what if only one byte needed to be inserted to get the alignment? Now the JMP instruction could bump the aligned block up another page, wasting 256 bytes. The assembler could be allowed to insert either 1-2 NOPS or a JMP statement, as appropriate, but again this goes beyond a true assembler.


I disagree. An assembler with conditional macros could easily interrogate the program counter and emit either depending on what it found. It would still be a "true assembler".

I'm not so certain that using skipped space for data storage could be automated so easily, though. I've done it manually but never thought about trying to get an assembler to do it for me. Offhand it would seem to be related to what memory allocators do - it would have to maintain a list of "free spaces" and their sizes, then decide whether to put something in one using one of the ancient strategies ("first fit", "best fit", etc).
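A rough sketch of the "list of free spaces" idea with a first-fit strategy follows. The names `place_first_fit`, `gaps` and `items` are invented for the example, and it assumes the gap addresses are already known, which in a real assembler is the hard part because placing data can itself move later code.

```python
from typing import Dict, List, Tuple


def place_first_fit(gaps: List[Tuple[int, int]],
                    items: List[Tuple[str, int]]
                    ) -> Tuple[Dict[str, int], List[str]]:
    """Drop data items into alignment gaps, first fit.

    gaps:  (start_address, size) of each skipped region, in address order.
    items: (name, size) of each data object, in source order.
    Returns (name -> address for placed items, names that did not fit).
    """
    free = [[start, size] for start, size in gaps]   # mutable copies
    placed: Dict[str, int] = {}
    leftover: List[str] = []
    for name, size in items:
        for gap in free:
            if gap[1] >= size:
                placed[name] = gap[0]
                gap[0] += size      # shrink the gap from the front
                gap[1] -= size
                break
        else:
            leftover.append(name)   # goes to the normal data area
    return placed, leftover


gaps = [(0x10F0, 16), (0x23FA, 6)]
items = [("msg", 12), ("flag", 1), ("table", 8), ("vec", 4)]
print(place_first_fit(gaps, items))
```

Swapping the inner loop for a search over the smallest adequate gap would give "best fit" instead.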


PostPosted: Tue Dec 29, 2015 6:39 pm 

Joined: Fri Aug 30, 2002 1:09 am
Posts: 8538
Location: Southern California
teamtempest wrote:
I'm not so certain that using skipped space for data storage could be automated so easily, though. I've done it manually but never thought about trying to get an assembler to do it for me. Offhand it would seem to be related to what memory allocators do - it would have to maintain a list of "free spaces" and their sizes, then decide whether to put something in one using one of the ancient strategies ("first fit", "best fit", etc).

I can envision using macros to see how much space, if any, needs to be skipped to page-align something, and build a table of these skipped spaces. Then the table would be examined by a macro which forms the data variables and arrays encountered later, even if not in the same pass, and if one of the spaces has enough room, it could use it, and make the necessary adjustments to the table. In my macros to form program structures in 65c02 assembly language, the program pointer keeps getting saved, moved, and restored using the ORG assembler directive. I had to jury-rig a stack in the assembler itself, since my assembler allows assembler variables using SETL (SET Label value, like EQU but it can be changed as many times as you like) but does not offer arrays or a way to index into them during the assembly process without jury-rigging.

Yep, it would be pretty complex, and some assemblers' macro capabilities may be inadequate. However, I don't remember ever having to page-align anything in 65c02 anyway. The only use I can think of for page alignment is if something had to be cycle-exact for timing a loop or process, and the extra cycle taken by branching or indexing across a page boundary would be a problem.



PostPosted: Tue Dec 29, 2015 6:47 pm 

Joined: Sun Nov 17, 2013 5:15 am
Posts: 12
teamtempest wrote:
I disagree. An assembler with conditional macros could easily interrogate the program counter and emit either depending on what it found. It would still be a "true assembler".

Sure, fine with me. Hence my statement "So why not let the assembler provide such scripts that are activated using pseudo-ops (hints), rather than making the coder do it?"

I wasn't trying to define a line between an assembler and something higher... more the opposite. If assemblers already let you do such things, it's not much of a stretch to let them do a little more, or make it easier to invoke. It's a sliding scale, all working toward the same thing. There's value in having it, let the coder decide how much to use.

teamtempest wrote:
I'm not so certain that using skipped space for data storage could be automated so easily, though. . . .

Maybe... But I don't see why not. Relocating linker/loaders do it.

Consider:
  • A data (or BSS) segment can be generated, and a loader can relocate to available RAM.
  • Usually all data is grouped into 2 segments, but this could be more fine-grained if you wanted. There can be multiple of each, giving a relocater more flexibility.
  • The assembler could break the code into multiple code segments as well, leaving any auto-generated gap outside of any code segment, as long as the linker honors the same alignment criteria.
  • The o65 format does not currently support relocation entries for relative branches, but there's nothing preventing them from being included. Only branches outside of a segment require any attention.
  • With relocation data for branches, a linker/loader would appropriately place the two segments close enough to keep the branches in range. In fact, it could even scoot the preceding code up to be right next to the aligned section. Yes, it might change the start address of the code, but that is what relocaters do anyway. If the start address needs to be at a particular place, then that is another constraint the relocater has to honor.
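To make the relocation step concrete, here is a deliberately simplified sketch. It does not follow the o65 file layout; the fixup list is just a hypothetical list of offsets at which a 16-bit little-endian absolute address sits, and `relocate` is an invented name.

```python
from typing import Iterable


def relocate(image: bytes, fixups: Iterable[int],
             old_base: int, new_base: int) -> bytes:
    """Return a copy of `image` patched to run at `new_base`.

    Each fixup is the offset of the low byte of an absolute address.
    Relative branches inside the image need no patching, because the
    distance between branch and target does not change when the whole
    segment moves.
    """
    delta = new_base - old_base
    out = bytearray(image)
    for off in fixups:
        addr = out[off] | (out[off + 1] << 8)
        addr = (addr + delta) & 0xFFFF
        out[off] = addr & 0xFF
        out[off + 1] = addr >> 8
    return bytes(out)


# Assembled for $1000: JMP $1005 / NOP / NOP / RTS
code = bytes([0x4C, 0x05, 0x10, 0xEA, 0xEA, 0x60])
print(relocate(code, [1], 0x1000, 0x2000).hex(" "))
```

A branch that crosses from one segment to another is the case that would need its own kind of entry, since moving the two segments independently changes the distance.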

Add that all up, it works. I'm not trying to suggest that going to such extremes for a little 6502 system is a good idea... Just that all the concepts have been successfully implemented before, and the stuff I proposed only draws on those. Rather than trying to pass all the relevant info to an external linker, it would be relatively simple for the assembler to use the information it already has and generate a monolithic binary with a single alignment constraint that causes all the individual constraints to be preserved. (And most assemblers for 6502 already output one monolithic binary.) Or they could be less monolithic if the assembler outputs to o65.

