6502.org Forum
PostPosted: Sun Oct 23, 2016 10:16 pm 
Joined: Sun Oct 23, 2016 8:36 pm
Posts: 5
Hey everyone, I'm in the middle of building my first 128 kByte 65816-based machine with SPI, serial, and an AVR-based RAM loader and clock control. Address decoding is done using an Altera CPLD (MAX7000), which gives me quite a bit of flexibility.

I'm learning the ropes by building it on a breadboard first and plan to make a real PCB in the later stages. The AVR lets me step the clock pulse by pulse while inspecting the address and data lines, and it also provides frequency control from 1 kHz to 10 MHz. Currently it is executing a very simple assembler program that writes output to a serially connected VT100 terminal (currently PuTTY). Total code size is 354 bytes :) I'm pleased that it runs at 10 MHz on the breadboard.

Now to the question: is there any documentation on the WDC816CC calling convention? I'd like to mix C and ASM in my home-brewed OS to go along with the hardware. The PDF from WDC is woefully short on information.


Attachments:
IMG_20161016_165842052.jpg [ 3.39 MiB ]: board with labels
vt100.jpg [ 26 KiB ]: first output of machine
monitor.jpg [ 107.15 KiB ]: hex file uploader and clock controller
PostPosted: Mon Oct 24, 2016 6:30 am 
Joined: Sun Oct 23, 2016 8:36 pm
Posts: 5
Of course, reading the doc further, I find my answer on page 21. I will answer my own question in more detail as I experiment with it.


PostPosted: Mon Oct 24, 2016 8:34 am 
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10975
Location: England
Welcome! It's a good question you've posed. Looking forward to hearing more.


PostPosted: Mon Oct 24, 2016 2:10 pm 
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
Welcome, altera -- nice project! Not too shabby, for someone who's just learning the ropes! 8)

It looks like one of Daryl's SPI Controller Devices you've got there, am I right?

cheers,
Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


PostPosted: Tue Oct 25, 2016 2:11 am 
Joined: Sun Oct 23, 2016 8:36 pm
Posts: 5
Thanks, I'm very happy with how it's turning out.

That's right, that is one of Daryl's SPI controllers. Works well!


PostPosted: Tue Oct 25, 2016 3:26 am 
Joined: Sun Oct 23, 2016 8:36 pm
Posts: 5
Alright, I found a few minutes after work here; I don't have a whole lot of time during the week. I'm going to keep a running log of the investigation here until I collect it all. Maybe I can write it all up when done.

Compiling the following code with WDC816CC:

Code:
void test()
{
   const char * buf = "hello world!\r\n";   
   console_write(buf);
}


This code assumes that console_write is defined elsewhere. It will in fact be written in ASM in a later post. Compiled using:

Code:
wdc816cc.exe -a test.c


This produces the following code; the comments are mine. If I understood anything wrong, feel free to correct me.

Code:
;:ts=8
R0   equ   1
R1   equ   5
R2   equ   9
R3   equ   13
   code
   xdef   __test
   func
__test:
   
   longa   on      ; clearly the compiler is assuming we are running A, X and Y as 16 bits.
   longi   on      ; clearly the compiler is assuming we are running A, X and Y as 16 bits.
   
   
   tsc            ; Transfer Stack Pointer to 16-bit Accumulator
   sec            ; set carry flag -- WHY?
   sbc   #L2         ; subtract L2 = 2 from A store result in A --  create two bytes of storage space on stack, remember stack GROWS DOWN!
   tcs            ; Transfer 16-bit Accumulator to Stack Pointer
   phd            ; Push Direct Page Register to stack      -- keep a copy of the old direct page.
   tcd            ; Transfer 16-bit Accumulator to Direct Page Register -- point the direct page pointer to the stack FAST ACCESS to variables
   

   
buf_1   set   0
   lda   #<L1         ; load address of string
   sta   <L3+buf_1   ; store the address at L3 + buf_1.
   pei   <L3+buf_1   ; Push Effective Indirect Address, aka value in DirectPage 1 as an address pushed onto Stack - the address of the string... Couldn't this be done easier? PEA #L1
   
   ; looks like the string is now on the stack.
   ; for the function it will actually be the 2nd piece on the stack since
   ; a call will push the return address L4 on the stack.
   
   jsr   __console_write
   
   ; Functions called by a C function and C functions themselves return values in the X
   ; register and the Accumulator. The high word of the result, if any, is in the X
   ; register, while the low word is in the Accumulator.
      
L4:
   pld            ; Pull Direct Page Register = e.g. restore the DP to what it was at the beginning of the Func.
   tsc            ; Transfer Stack Pointer to 16-bit Accumulator
   clc            ; Clear carry
   adc   #L2      ; add 2
   tcs            ; Transfer 16-bit Accumulator to Stack Pointer
   rts            ; Done baby, no return values.
L2   equ   2
L3   equ   1
   ends
   efunc
   data
L1:
   db   $68,$65,$6C,$6C,$6F,$20,$77,$6F,$72,$6C,$64,$21,$0D,$0A,$00
   ends
   xref   __console_write
   end
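
For anyone following along, here is a rough C model of what the listing above does to S and D. It is only a sketch: the starting register values, the string address and the return address are made up, and callee_console_write is a hypothetical stand-in for the real routine. The model assumes the callee removes its own 2-byte argument before returning, which is what the pld immediately after the jsr seems to require.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the 65816 state the listing touches (bank 0 only). */
typedef struct {
    uint16_t s;             /* stack pointer */
    uint16_t d;             /* direct page register */
    uint8_t  mem[0x10000];  /* bank 0 memory */
} Cpu;

static void write16(Cpu *c, uint16_t addr, uint16_t v) {
    c->mem[addr] = (uint8_t)(v & 0xFF);
    c->mem[(uint16_t)(addr + 1)] = (uint8_t)(v >> 8);
}

static uint16_t read16(const Cpu *c, uint16_t addr) {
    return (uint16_t)(c->mem[addr] | (c->mem[(uint16_t)(addr + 1)] << 8));
}

/* A 16-bit push leaves the low byte at S+1 and the high byte at S+2. */
static void push16(Cpu *c, uint16_t v) {
    c->s = (uint16_t)(c->s - 2);
    write16(c, (uint16_t)(c->s + 1), v);
}

static uint16_t pull16(Cpu *c) {
    uint16_t v = read16(c, (uint16_t)(c->s + 1));
    c->s = (uint16_t)(c->s + 2);
    return v;
}

/* Hypothetical stand-in for __console_write: reads its argument at 3,s
   (just above the 2-byte return address), then drops the argument
   before returning. Callee clean-up is an assumption. */
static uint16_t callee_console_write(Cpu *c) {
    uint16_t arg = read16(c, (uint16_t)(c->s + 3));
    uint16_t ret = pull16(c);      /* return address pushed by jsr */
    c->s = (uint16_t)(c->s + 2);   /* discard the 2-byte argument */
    (void)ret;                     /* rts would resume at ret + 1 */
    return arg;
}

/* Walks through __test; returns the argument the callee saw. */
static uint16_t run_test(Cpu *c, uint16_t str_addr) {
    uint16_t saved_d = c->d;
    uint16_t a = (uint16_t)(c->s - 2);  /* tsc / sec / sbc #L2 (L2 = 2) */
    c->s = a;                           /* tcs: two bytes reserved for buf */
    push16(c, c->d);                    /* phd */
    c->d = a;                           /* tcd: buf now lives at D+1, D+2 */
    write16(c, (uint16_t)(c->d + 1), str_addr);  /* lda #<L1 / sta <L3+buf_1 */
    push16(c, read16(c, (uint16_t)(c->d + 1)));  /* pei <L3+buf_1 */
    push16(c, 0x1233);                  /* jsr: made-up return address */
    a = callee_console_write(c);
    /* pld only restores D correctly if the argument is already gone. */
    assert(read16(c, (uint16_t)(c->s + 1)) == saved_d);
    c->d = pull16(c);                   /* pld */
    c->s = (uint16_t)(c->s + 2);        /* tsc / clc / adc #L2 / tcs */
    return a;
}
```

Calling run_test with, say, S = $01FF and D = $0200 should leave both registers as they were on entry. It also shows why L3 is 1: the byte at D+0 holds part of the saved direct page value, so the first local starts at D+1.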


Next up: the ASM code for __console_write and the ASM code that loads it.


PostPosted: Tue Oct 25, 2016 7:10 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
Some comments...

    altera wrote:
    Code:
    ;:ts=8
    R0       equ 1
    R1       equ 5
    R2       equ 9
    R3       equ 13
             code
             xdef __test
             func
    __test:   
             longa on              ; clearly the compiler is assuming we are running A, X and Y as 16 bits.
             longi on              ; clearly the compiler is assuming we are running A, X and Y as 16 bits.
             tsc                   ; Transfer Stack Pointer to 16-bit Accumulator
             sec                   ; set carry flag -- WHY?
             sbc #L2               ; subtract L2 = 2 from A store result in A -- create two bytes of storage space on stack,
                                   ; remember stack GROWS DOWN!
             tcs                   ; Transfer 16-bit Accumulator to Stack Pointer

  • longa on is the WDC assembler's pseudo-op that says to generate 16 bit immediate mode operands for accumulator instructions, such as LDA #<operand>. Similarly, longi on tells the assembler to generate 16 bit immediate mode operands for index register instructions, e.g., LDX #<operand>. The compiler itself isn't assuming anything, as it was responsible for generating those pseudo-ops.

  • Implied, but not stated, is that the compiler generated assembly language source code corresponding to main() to set all registers to 16 bits, likely a REP #%00110000 instruction. Without that happening, the 16 bit operand assembled for the SBC #L2 instruction could be wrongly interpreted by the MPU as an 8 bit operand followed by a BRK instruction.

  • TSC and TCS are always 16 bit transfers, regardless of the state of the m bit in the status register (SR). The C in these mnemonics is the hint that this is the case.

  • Carry must be set before subtraction, as the SBC instruction means subtract with carry. In this case, carry is an inverted borrow. We don't know what the state of carry might be at the start of this routine, so carry must be explicitly set.
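
    To make the inverted-borrow point concrete, here is a small C model of a 16 bit, binary-mode SBC. It is a sketch only: sbc16 and the sample stack pointer value are made up for illustration, and decimal mode is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* 16-bit binary-mode SBC: A = A - M - (1 - C).
   On return *carry is 1 if no borrow occurred, 0 if one did. */
static uint16_t sbc16(uint16_t a, uint16_t m, int *carry) {
    assert(*carry == 0 || *carry == 1);
    /* Subtraction is performed as A + ~M + C, as the hardware does. */
    uint32_t r = (uint32_t)a + (uint16_t)~m + (uint32_t)*carry;
    *carry = (r > 0xFFFF) ? 1 : 0;
    return (uint16_t)r;
}
```

    With carry set, $01FF minus 2 gives $01FD as intended. With carry clear, the same subtraction gives $01FC, so the frame would be one byte larger than the epilogue's ADC #L2 expects to release.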

    Quote:
    Code:
             phd                   ; Push Direct Page Register to stack -- keep a copy of the old direct page.
             tcd                   ; Transfer 16-bit Accumulator to Direct Page Register --
                                   ; point the direct page pointer to the stack FAST ACCESS to variables

    buf_1    set 0
             lda #<L1              ; load address of string
             sta <L3+buf_1         ; store the address at L3 + buf_1.
             pei <L3+buf_1         ; Push Effective Indirect Address, aka value in DirectPage 1 as an address pushed onto Stack -
                                   ; the address of the string...Couldn't this be done easier? PEA #L1

  • Pointing DP at the stack is convenient for programming purposes and assists in generating fully relocatable code. However, unless DP starts on a physical page boundary in bank 0, that is, at $00xx00, where xx is any page from $00 to $FF, the performance gain normally seen with direct page accesses will be largely negated, as each load/store access will use the same number of clock cycles as an absolute load/store.

    It should be noted that there is a potential booby trap involved in pointing DP at the stack. If the local operating environment doesn't correctly preserve the MPU's state during API calls or within interrupt handlers, the stack could get trashed due to the operating environment blindly writing on the relocated direct page, inadvertently stepping on parameters, return addresses, etc.

  • The LDA #<L1 instruction could be ambiguous. Not seeing the entire program, I might be inclined to think that only the LSB of L1 would be loaded into the A-accumulator, with who-knows-what loaded into the B-accumulator. You might be reaching for the reset push button after the 65C816 executes that instruction. :roll:

  • PEA is an immediate mode instruction, which is why it isn't used here. If PEA were used the program would have to be self-modifying, which would be unsuitable if the code is to be burned into ROM. Obviously, the WDC compiler can't make any assumptions in that regard, which explains the presence of PEI.

  • The instructions

    Code:
             sta <L3+buf_1
             pei <L3+buf_1

    are odd. The < operator implies least significant byte (LSB) operand generation. However, L3 is defined as 1, making it resolvable to eight bits. Hence there would be no need to use the < operator. Of course, since L3 is defined at the end of the source code, the above instructions constitute a forward reference (more on that below).

    Incidentally, I see nothing defining BUF_1.
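
    Going back to the direct page timing point, the penalty can be written down as a tiny C helper. This is a sketch based on the published LDA timings (direct page: 3 cycles, absolute: 4, plus 1 for a 16 bit accumulator, plus 1 for direct page when the low byte of DP is non-zero). The function names and the sample DP values are mine.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles for LDA dp: 3, +1 if A is 16 bits, +1 if D's low byte != 0. */
static int lda_dp_cycles(uint16_t d, int a16) {
    assert(a16 == 0 || a16 == 1);
    return 3 + a16 + ((d & 0x00FF) != 0 ? 1 : 0);
}

/* Cycles for LDA abs: 4, +1 if A is 16 bits. */
static int lda_abs_cycles(int a16) {
    assert(a16 == 0 || a16 == 1);
    return 4 + a16;
}
```

    With DP pointed at a stack frame such as $01FD, a 16 bit direct page load costs the same five cycles as an absolute load, although the instruction is still one byte shorter.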

    Quote:
    Code:
                                   ; looks like the string is now on the stack.
                                   ; for the function it will actually be the 2nd piece on the stack since
                                   ; a call will push the return address L4 on the stack.

  • Actually, the string itself is not on the stack, only a pointer to it. The string could be almost anywhere in RAM and in fact, would normally be linked into the data section of the object file. Given the fact that the '816 has separate registers for the program and data banks (the PB and DB registers, respectively), the string could be loaded into a different bank than the executable code.
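
    As a rough illustration of the pointer-versus-string point, the C sketch below combines the data bank register with a 16 bit pointer to form the 24 bit address a data access would use. It assumes 16 bit pointers, as the single PEI in the listing suggests. The function name and the sample values are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the data bank register (DB) with a 16-bit pointer to get the
   24-bit address used by an absolute data access. */
static uint32_t data_address(uint8_t dbr, uint16_t ptr) {
    uint32_t addr = ((uint32_t)dbr << 16) | ptr;
    assert(addr <= 0xFFFFFF);
    return addr;
}
```

    The same 16 bit pointer, say $8000, refers to $008000 when DB is $00 and to $018000 when DB is $01, so the caller and callee must agree on DB for the pointer to mean anything.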

    Quote:
    Code:
             jsr __console_write
             
                                   ; Functions called by a C function and C functions themselves return values in the X
                                   ; register and the Accumulator. The high word of the result, if any, is in the X
                                   ; register, while the low word is in the Accumulator.

    L4:      pld                   ; Pull Direct Page Register = e.g. restore the DP to what it was at the beginning of the Func.
             tsc                   ; Transfer Stack Pointer to 16-bit Accumulator
             clc                   ; Clear carry
             adc #L2               ; add 2
             tcs                   ; Transfer 16-bit Accumulator to Stack Pointer
             rts                   ; Done baby, no return values.

    L2       equ 2
    L3       equ 1
             ends
             efunc
             data
    L1:      db $68,$65,$6C,$6C,$6F,$20,$77,$6F,$72,$6C,$64,$21,$0D,$0A,$00
             ends
             xref __console_write
             end

  • The code at L4 is figuring that the accumulator is set to 16 bits upon return from __CONSOLE_WRITE. That is a potentially dangerous assumption to make, given that __CONSOLE_WRITE could be a ROM API that doesn't fully preserve the MPU state. :D

  • If possible, assembly-time constants such as L2 and L3 should be declared *before* any other reference is made to them. Otherwise, a forward reference condition occurs, which may trigger an assembly-time error or cause the assembler to needlessly treat these as 16 bit values and generate less efficient code. Direct page locations should always be declared before reference so the assembler understands that they are to be treated as eight bit addresses when an instruction references them.
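
    The register-width hazard mentioned above (code assembled for a 16 bit accumulator but executed with an 8 bit one) can be sketched in C by looking at how many bytes the MPU consumes for an immediate-mode instruction. The helper names are hypothetical. The byte sequence in the usage note is ADC #$0002 followed by TCS, as in the epilogue at L4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of an immediate-mode accumulator instruction such as
   ADC # ($69) or SBC # ($E9): opcode plus a 1- or 2-byte operand,
   depending on the m flag (m = 1 means 8-bit accumulator). */
static size_t imm_length(int m_flag) {
    assert(m_flag == 0 || m_flag == 1);
    return m_flag ? 2 : 3;
}

/* Returns the byte the MPU would fetch as the next opcode after the
   immediate-mode instruction at code[0]. */
static uint8_t next_opcode(const uint8_t *code, int m_flag) {
    return code[imm_length(m_flag)];
}
```

    For the bytes $69 $02 $00 $1B, a 16 bit accumulator sees ADC #$0002 and then TCS ($1B). An 8 bit accumulator sees ADC #$02 and then fetches $00, which is BRK.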

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Wed Oct 26, 2016 2:47 am 
Joined: Sun Oct 23, 2016 8:36 pm
Posts: 5
Thanks for the insights! The ASM function is coming next but not likely to be today unfortunately.

Side note, all the functions are under my control of course so I can make them behave as required.

