All times are UTC

Posted: Fri Feb 13, 2009 6:24 pm
Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
My new code failed. However, I think I know why. I was trying to relocate "zero page" by setting the Direct register to $0300. However, there are several jump instructions that jump into self-modifying code in zero page. These JMP instructions do not use the Direct register; they use absolute addresses.

My main conflict is with the zero page locations used by both Monitor and EhBASIC. Since I am using the first 30 bytes for I/O, this limits the free space. I was hoping to get EhBASIC to use page $0300 for its "zero page".

I'll see how many JMPs there are and try to find a workaround.

Daryl


Posted: Fri Feb 13, 2009 6:45 pm
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Here is perhaps one idea -- if you JMP into zero-page instead of JSR into it, you might want to put direct-page immediately in front of your EhBASIC RAM or ROM image. E.g., let BASIC's direct page sit precisely 256 bytes in front of EhBASIC itself.

Then, you can use the 65816's PC-relative "BRL" instruction to invoke the routine. That way, EhBASIC can be relocated anywhere in bank 0 memory, without having to worry about where precisely it's loaded.

It is a REAL pity that the 65816 lacks greater support for late-binding in software. In fact, PC-relative branches should have been the norm (and not reserved just for conditional branches) from day one back when the 6502 was introduced. Better support for indirection would be nice too.

The 6809 has the 65xx architecture beat hands down in this area, for sure.


Posted: Fri Feb 13, 2009 8:17 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
The '816 does have a branch-relative-long (BRL) and more indexed modes too, but also the PER instruction. For the 6502, figuring out an address relative to where the program counter is at the time is a complex, inefficient process; but the 65816's PER instruction adds the operand to the address of the next instruction (regardless of where you started loading the program), and pushes the result on the stack. From there, you can use it to get to data or program addresses (like with a simulated JSR-relative) that might be different each time you run the program. Stack-relative addressing further expands the possibilities.
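
As a rough illustration of that data-access use, here is a minimal sketch. The label table, the .word directive, and the register-width setup are assumptions made for the example, and it presumes the data bank register matches the program bank (for instance, both are bank 0), since the pointer left on the stack is only 16 bits wide. Depending on the assembler, a directive may also be needed to tell it the registers are 16 bits wide.
Code:
        rep  #$30        ; assumption: native mode; make A, X and Y 16 bits wide
        per  table       ; push the run-time address of table, wherever the code was loaded
        ldy  #2          ; byte offset of the second word
        lda  (1,s),y     ; stack-relative indirect indexed: A = word at table+2
        plx              ; discard the pushed address (pulls 2 bytes with 16-bit index)
        rts
table:  .word $1111, $2222   ; hypothetical data; the directive name varies by assembler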


Posted: Fri Feb 13, 2009 8:30 pm
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
Sadly, it's no substitute for a genuine PC-relative JSR or a JSR with indirect indexed addressing. PER takes a butt-load of time to run (almost as much as a JSR itself). Consider the overhead of this snip of code:
Code:
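.macro JSP  ; Jump Subroutine PC-relative absolute
  pea *+6   ; push return address - 1 (the address of the RTS below); the callee's RTS adds 1
  per \0-1  ; push target - 1, computed relative to the PC
  rts       ; "return" to the target
.endmacro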


The PEA takes a minimum of 5 cycles alone, PER another 6 cycles, and another 6 for the RTS. Ouch -- that's 17 cycles to CALL the subroutine in question. Add another 6 for the subroutine's RTS, for a minimum overhead of 23 cycles.

Similar latencies exist for indirect subroutine calls too (assuming the vector sits in bank 0; longer still if not!)

What's infuriating to me is that I know the 65816/6502 are architecturally fast enough to go much faster; we know this because absolute-indexed addressing modes are almost as fast as pure absolute modes! Grrrr....


Posted: Fri Feb 13, 2009 9:10 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
Quote:
that's 17 cycles to CALL the subroutine

Fortunately the '816 comes in 14MHz minimum now, compared to the 6809's 2MHz. A quick look at Mot's (Freescale's) website seems to indicate the 6809 is no longer in production, but the last I remember, they were still at 2MHz. So even though this JSR relative is not one of the '816's shining areas, it still does it faster, just a little more clumsily.


Posted: Fri Feb 13, 2009 9:12 pm
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
The 68HC11 (a variant of the 6809 architecture) is still in production, and available at MUCH faster speeds than 2MHz, particularly from third parties.


Posted: Fri Feb 13, 2009 9:46 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
I just downloaded a 68HC11 data sheet, and it runs at 3MHz. The 12MHz input clock gets divided by four.


Posted: Sat Feb 14, 2009 10:30 am
Joined: Mon Oct 16, 2006 8:28 am
Posts: 106
kc5tja wrote:
Sadly, it's no substitute for a genuine PC-relative JSR or a JSR with indirect indexed addressing. PER takes a butt-load of time to run (almost as much as a JSR itself). Consider the overhead of this snip of code:
Code:
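.macro JSP  ; Jump Subroutine PC-relative absolute
  pea *+6   ; push return address - 1 (the address of the RTS below); the callee's RTS adds 1
  per \0-1  ; push target - 1, computed relative to the PC
  rts       ; "return" to the target
.endmacro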


The PEA takes a minimum of 5 cycles alone, PER another 6 cycles, and another 6 for the RTS. Ouch -- that's 17 cycles to CALL the subroutine in question. Add another 6 for the subroutine's RTS, for a minimum overhead of 23 cycles.


I'm sorry if this is a dumb question, but why can't one do a PC-relative JSR like this?

Code:
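.macro BSR  ; Branch to Subroutine
  per *+5   ; push return address - 1 (last byte of the BRL); the callee's RTS adds 1
  brl \0    ; 16-bit PC-relative branch to the subroutine
.endmacro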

That's 100% position independent and takes only 10 cycles, 6 for PER plus 4 for BRL.
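
Written out with an explicit label instead of *, the same idea might look like the sketch below. The labels sub and ret are placeholders for the example. Because RTS adds one to the address it pulls, the value pushed has to be the address of the byte just before the instruction to resume at.
Code:
        per  ret-1       ; 6 cycles: push (address of ret) - 1, computed relative to the PC
        brl  sub         ; 4 cycles: 16-bit PC-relative branch to the subroutine
ret:                     ; sub's RTS pulls ret-1, adds 1, and resumes here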


Posted: Sat Feb 14, 2009 5:08 pm
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
I hadn't thought of that; I'm not sure why. Good call. Though, I'm still not happy with those extra four cycles. Subroutine performance on the 65816 is bad enough as it is; with a minimum overhead of 12 cycles (6 for JSR, 6 for the corresponding RTS), it's no wonder people shied away from well-factored software over the years.


Posted: Sat Feb 14, 2009 6:40 pm
Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
The problem wasn't as simple as replacing the JMPs and JSRs.

In addition to those, there were several instructions of the "STA zppointer,y" type. STA has no zp,y addressing mode, so the assembler converted them to abs,y. I had to fix those along with the immediate loads of the upper byte of a zp address, e.g., LDA #>zpptr became LDA #>ZeroPG, where ZeroPG was equated to $0300.
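
To make the addressing-mode trap concrete, here is a minimal sketch assuming the Direct register holds $0300 and the data bank is 0. The names zpptr and zpptr_abs and the offset $10 are hypothetical stand-ins for an EhBASIC zero-page variable; ZeroPG is the equate mentioned above, and the equate syntax varies by assembler.
Code:
ZeroPG    = $0300            ; relocated direct page base
zpptr     = $10              ; hypothetical direct-page variable
zpptr_abs = ZeroPG+zpptr     ; the same variable as an absolute address ($0310)

        lda  zpptr           ; direct addressing: reads $0310 because D = $0300
        sta  zpptr,y         ; STA has no direct,Y form: assembled as absolute,Y, writes $0010+Y
        sta  zpptr_abs,y     ; corrected: absolute,Y aimed at the relocated page
        lda  #>zpptr         ; loads $00, the high byte of the old address
        lda  #>zpptr_abs     ; loads $03, the high byte of the relocated address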

After several hours of picking out the absolute references to addresses in $00xx, and fixing the immediates and a few other places that had assumed the upper address byte was 0, I was able to get most of it to work. However, I finally chose to abandon this effort. I would literally have to read every line of code to figure out where the rest of the bugs lie.

I have reworked the zero page labels so that EhBASIC and my Monitor both fit without overstepping each other. Should have done that first!

I added an EhBASIC command, SYS, to simplify jumps from EhBASIC to the SBC-3 Monitor. I can now cold and warm start EhBASIC. The load and save still need a few tweaks, but that should be easily solved.

I have learned much from this effort and the discussions on this thread that I hope to apply to future projects.

Daryl


Posted: Sat Feb 14, 2009 9:34 pm
Joined: Mon Oct 16, 2006 8:28 am
Posts: 106
kc5tja wrote:
I hadn't thought of that; I'm not sure why. Good call. Though, I'm still not happy with those extra four cycles. Subroutine performance on the 65816 is bad enough as it is; with a minimum overhead of 12 cycles (6 for JSR, 6 for the corresponding RTS), it's no wonder people shied away from well-factored software over the years.

Well, the 6502 ISA is pretty darn old, from well before the time when good programming discipline started to trickle down from academia. For me, the biggest source of frustration is that you can't elegantly switch single instructions between 8- and 16-bit mode (cf. the m68k's .l, .w and .b suffixes). In my programs I find myself using words where bytes would suffice and segregating the code that absolutely needs to access single bytes into its own part of the subroutine; otherwise I end up having reps and seps all over the place.


Posted: Sun Feb 15, 2009 12:21 am
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8544
Location: Southern California
Quote:
otherwise I end up having reps and seps all over the place

I don't switch between 16- and 8-bit much in my '816 Forth, but when I do, I use macros to make it clearer what's happening:
Code:
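ACCUM_16: MACRO
          REP  #00100000B   ; clear M flag: 16-bit accumulator
          ENDM
 ;-------------
ACCUM_8:  MACRO
          SEP  #00100000B   ; set M flag: 8-bit accumulator
          ENDM
 ;-------------
INDEX_16: MACRO
          REP  #00010000B   ; clear X flag: 16-bit index registers
          ENDM
 ;-------------
INDEX_8:  MACRO
          SEP  #00010000B   ; set X flag: 8-bit index registers
          ENDM
 ;-------------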


Posted: Sun Feb 15, 2009 12:31 am
Joined: Sat Jan 04, 2003 10:03 pm
Posts: 1706
I've run into situations where I need to set A and X to different register widths. So, I not only have A8/A16/X8/X16 macros, but I also have AX8, A8X16, A16X8, and AX16 macros as well.
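
For illustration, the four combined macros could be sketched in the same syntax as the ACCUM_16 style macros earlier in the thread. The bodies below are an assumption, not the poster's actual definitions. M is bit 5 and X is bit 4 of the status register, so the matched-width cases take one instruction and the mixed cases take two.
Code:
AX16:     MACRO
          REP  #00110000B   ; clear M and X: 16-bit accumulator and index
          ENDM
 ;-------------
AX8:      MACRO
          SEP  #00110000B   ; set M and X: 8-bit accumulator and index
          ENDM
 ;-------------
A8X16:    MACRO
          SEP  #00100000B   ; set M: 8-bit accumulator
          REP  #00010000B   ; clear X: 16-bit index
          ENDM
 ;-------------
A16X8:    MACRO
          REP  #00100000B   ; clear M: 16-bit accumulator
          SEP  #00010000B   ; set X: 8-bit index
          ENDM
 ;-------------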

Hooray for combinatorial explosions, eh? :)


Posted: Sun Feb 15, 2009 6:42 am
Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
I now have the load and save working. I thought all was good.

However, after a few test programs, I have discovered the floating point is all messed up. 4/2 does not return 2 and 4^2 does not return 16.

I made the same memory mods to EhBASIC in Michal Kowalski's simulator and it all ran correctly.

There must be some 6502 commands running differently in the 65816 native mode that I have overlooked.

The only thing I didn't add is the fixes for the TXS commands. But I'm sure those are correct.

The first version I posted earlier for download also has the FP errors.

Lee, if you are reading, any thoughts????

Daryl


Posted: Sun Feb 15, 2009 4:42 pm
Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1748
Location: Sacramento, CA
I took the problem one step further and set the EhBASIC code to run in emulation mode. Any calls to the system (input, output, load, save) first switch back to native mode. Upon completion, these routines switch emulation mode back on.
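
A wrapper of that kind might be sketched as follows. The names sys_out and mon_output are hypothetical, and the sketch assumes the Monitor routine tolerates being entered with 8-bit registers. Two side effects are worth keeping in mind: XCE swaps the carry and emulation bits, so a carry result from the Monitor does not survive the switch back, and entering emulation mode forces the stack pointer into page 1 and clears the high bytes of X and Y.
Code:
sys_out:
        clc
        xce              ; C=0 swapped into E: native mode, M and X flags still set (8-bit)
        jsr  mon_output  ; hypothetical Monitor entry point
        sec
        xce              ; C=1 swapped into E: back to emulation mode
        rts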

The FP issue is corrected. Now, a FOR/NEXT loop executes only the first pass and then locks up.

This is getting a little frustrating. I'm going to take a step back from this for a while and go work on the SPI interface.

