PostPosted: Tue Nov 17, 2015 7:54 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Quote:
Not that you need too much encouragement, but I thought that I'd note that I like to follow your progress on your various projects. Invariably I find that your efforts provide great insights on how I can improve my efforts. I've had virtually the same thoughts regarding task switching that you are adding to your core. I am looking forward to your continued posts on how your implementation pans out.

Thanks for the compliment.

A Unix-like FORK instruction has been added to the core. This allows a Unix paradigm to be used when setting up a task. A sample test of the FORK instruction seems to work: it runs keyboard initialization as a task.
Code:
   FORK   #6         ; use context register #6
   TTA               ; get the running context (pid) (Transfer TR to Acc)
   CMP      #6         ; are we running in context #6 ?
   LBEQ   KeybdInit   ; if so, initialize keyboard
   .... continues with monitor code

The keyboard message from keyboard initialization comes up, and the system gets to the BIOS prompt, so I'm pretty sure the FORK instruction is working. But I can't get the invaders app to run (it hangs), and it's also started with FORK. I'm guessing it's a software problem though.
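To make the FORK / TTA / CMP pattern above concrete, here is a minimal Python model. It is only a sketch: the `Context` fields and the assumption that FORK copies the parent's registers into the new context are illustrative guesses, not the actual FT832 logic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Context:
    tr: int      # task register: number of the context register in use
    pc: int      # program counter (hypothetical value below)
    a: int = 0   # accumulator


def fork(parent: Context, child_id: int) -> tuple[Context, Context]:
    """Assumed FORK behavior: the child is a copy of the parent in context `child_id`."""
    child = replace(parent, tr=child_id)
    return parent, child


def after_fork(ctx: Context) -> str:
    """Mirror of TTA / CMP #6 / LBEQ KeybdInit."""
    a = ctx.tr                      # TTA: transfer TR to the accumulator
    return "KeybdInit" if a == 6 else "monitor"


parent, child = fork(Context(tr=0, pc=0xE000), 6)
```

Both contexts resume at the same PC after the fork; only the value read back by TTA tells them apart, which is the same idea as checking the return value of Unix fork().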

The stack has been modified so that it no longer resides in the data segment. Instead the stack pointer is a flat unsegmented address pointer (or rather, it is based on a stack segment value of zero).
I ran into problems calling BIOS routines where the data segment needed to be switched. This created havoc with the stack since it was located in the data segment.

I've been writing a little invaders app that runs with its own data segment at $10000. However I'd like to be able to call BIOS routines. BIOS routines assume the data segment is $0000. That means the data segment needed to be switched. However, as soon as the DS was changed, the stack was lost because the stack was located by the data segment.
Unfortunately, the only way to change the DS is by pushing a value onto the stack, then popping the DS.
Since the stack was being switched around, interrupts also had to be disabled while this took place. It ended up being a whole bunch of ugly nonsense. 30+ bytes of code just to call a subroutine.
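The problem is easy to see in a tiny address model. This is a sketch only: the stack pointer value is made up, and the only figures taken from the post are the app data segment at $10000 and the BIOS data segment at $0000.

```python
def stack_address_old(ds_base: int, sp: int) -> int:
    """Old scheme: the stack is located through the data segment."""
    return ds_base + sp


def stack_address_new(sp: int) -> int:
    """New scheme: flat stack pointer (stack segment value of zero)."""
    return 0 + sp


APP_DS = 0x10000    # invaders data segment
BIOS_DS = 0x00000   # data segment the BIOS routines assume
sp = 0x01F0         # hypothetical stack pointer value

before_switch = stack_address_old(APP_DS, sp)
after_switch = stack_address_old(BIOS_DS, sp)
```

With the old scheme the same SP value lands at a different physical address as soon as DS changes, so the return address pushed before the switch is no longer where the core looks for it. With the flat scheme the address does not depend on DS at all.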

I'd planned to use shared memory located by using segment zero, as a means to communicate parameters to tasks, rather than the stack. So I thought having the stack in the data segment wouldn't be a big issue. Live and learn.
The other alternative I've been toying with is having four segment registers like the x86.

_________________
http://www.finitron.ca


PostPosted: Thu Nov 19, 2015 2:26 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
I've had some trouble getting the DDR2 ram controller interface to work. A lot of the problem getting the invaders app to work has to do with getting bad responses from ram. I can test the DDR2 ram interface with Supermon and it doesn't always work. Testing the block ram interface works all the time. So I moved the invaders data segment into block ram, and it works much better. Of course it still hangs.
I found and fixed a bug with the segment prefix instructions by trying to get the invaders app to work.
Part of the problem is that there is a clock domain to cross over in the DDR ram controller, so signals have to be registered onto the new domain. Generating a ready signal from the controller is tricky then, as ready needs to go low as soon as ram is selected (not after several clocks for domain crossing), and it needs to remain low until the ram is ready.
Another problem is that timing constraints for the DDR2 memory aren't fully specified. Well, I specified them, but Vivado ignores the specification, claiming there are syntax errors.

I added the impure context switching operations JCR (jump context routine) and RTC (return from context routine). The context is switched; however, not all registers are saved and restored. The .A, .X, .Y, and flags registers are allowed to propagate between contexts. This allows a context switch to be treated a little bit like a subroutine call. This is intended for synchronous context switches only, where the context will not be running asynchronously to the caller.
The JCR and RTC instructions support BIOS calls where it is desirable to switch to the BIOS stack and data segment.
For instance in the invaders app when a keystroke is tested the BIOS is called like this:
Code:
   JSR      Initialize
   JSR      RenderInvaders
.0002:
   JCR    KeybdGetCharNoWaitCtx,7   ; check for char at keyboard (BIOS call)
.0004:
   LDX      #0
.0006:

The '7' is the BIOS context register number.
Then in the BIOS the keyboard routine is called and returns a value in .A and flags.
Code:
KeybdGetCharNoWaitCtx:
   JSR      KeybdGetCharNoWait   ; call the "real" BIOS routine
   RTC      #0               ; pop zero bytes off the stack and return to original context
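Roughly, the JCR/RTC pair behaves like the Python model below. This is a sketch under assumptions: the number of contexts, the split into `Shared` and `Private` register groups, and the `callers` list used to remember the calling context are all illustrative, not taken from the core.

```python
from dataclasses import dataclass


@dataclass
class Shared:
    """Registers that propagate across JCR/RTC."""
    a: int = 0
    x: int = 0
    y: int = 0
    flags: int = 0


@dataclass
class Private:
    """Registers that belong to a single context."""
    pc: int = 0
    sp: int = 0
    ds: int = 0


class Core:
    def __init__(self, contexts: int = 16):   # context count is hypothetical
        self.ctx = [Private() for _ in range(contexts)]
        self.shared = Shared()
        self.current = 0
        self.callers = []                     # assumed way to remember the caller

    @property
    def regs(self) -> Private:
        return self.ctx[self.current]

    def jcr(self, target_pc: int, context: int) -> None:
        """Switch to `context` and start at target_pc; A/X/Y/flags are left alone."""
        self.callers.append(self.current)
        self.current = context
        self.ctx[context].pc = target_pc

    def rtc(self) -> None:
        """Return to the calling context; A/X/Y/flags carry the result back."""
        self.current = self.callers.pop()


core = Core()
core.ctx[0].ds = 0x10000        # app context uses its own data segment
core.ctx[7].ds = 0x00000        # BIOS context (#7 in the example above)
core.jcr(0xF000, 7)             # hypothetical address of KeybdGetCharNoWaitCtx
ds_in_bios = core.regs.ds
core.shared.a = ord("k")        # BIOS returns a character in .A
core.rtc()
```

The point of the model is that the stack pointer and data segment switch with the context, while the result in .A survives the return.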

For a real mind-bender the JCL instruction could be studied. JCL is the long form of the JCR instruction which also allows copying parameters between stacks. So JCL allows stack based parameters.

After task and context switching I'll be having a look at ways to speed up interpreters. There are still about 80 instructions free on the second page of opcodes.

_________________
http://www.finitron.ca


PostPosted: Fri Nov 20, 2015 5:04 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
I've been reviewing the following set of posts on VM support http://forum.6502.org/viewtopic.php?f=10&t=3046

It has been mentioned on the forum that it would be nice if the IP (interpretive pointer) register were an internal cpu register.
Even better if it auto-increments after fetching from memory.
I put some thought into supporting the IP register as an internal core register, and came to the conclusion that the way to do it was to use the existing program counter (at least in a multi-context core). The program counter both fetches from memory and auto-increments. Even better, it uses the instruction cache to cache interpreter code.
Hence I came up with the following.

The following requires multiple register contexts in the processor. It ping-pongs between two register set contexts. One is a special interpretive task that fetches the code; the other is an interpreter that interprets the fetched code.
A special core operating mode called interpretive mode is used to assist the implementation of interpreters. It can be turned on by setting bit 2 in the extended status register. An interpretive mode task uses the PC itself to fetch opcodes into the accumulator register rather than the instruction register.
The PC of the interpretive task takes on the function of the IP register in Forth parlance. In the IFETCH stage of the core for an interpretive task the core then performs a task switch back to the invoking task. The accumulator is passed back to the interpreter where it can be processed. The program counter of the interpretive task is passed back to the invoker in the .X register.

Benefits of using a task are: the bytecode can reside in a different code or data segment than the interpreter itself. It is relatively fast, as internal registers are being used rather than a variable in memory. Fetches are governed by the PC and hence come from the code segment, meaning the instruction cache is used to enhance performance.

The mechanism here allows the interpreter to be written in 32 bit code while at the same time allowing a bytecode implementation.
Task #10 is the special interpretive task in this case. The program for task #10 is perhaps a bytecode.
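As a rough model of what the interpreter sees when it switches to the interpretive task, consider the Python sketch below. The class and method names are made up, and whether .X receives the PC before or after the increment is an assumption (the model returns the incremented value).

```python
class InterpretiveTask:
    """Model of the special task whose PC plays the role of the Forth IP."""

    def __init__(self, code: bytes, pc: int = 0):
        self.code = code
        self.pc = pc

    def tsk(self) -> tuple[int, int]:
        """One switch to the task and back: returns (.A, .X) as seen by the interpreter."""
        a = self.code[self.pc]   # opcode goes to the accumulator, not the instruction register
        self.pc += 1             # the PC auto-increments like any instruction fetch
        return a, self.pc        # assumed: .X holds the PC after the increment


task = InterpretiveTask(bytes([0x10, 0x20, 0x30]))
first = task.tsk()
second = task.tsk()
```

The interpreter never keeps an IP variable in memory; each call to `tsk()` stands in for one TSK #10 round trip.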

; A potential Forth NEXT routine.
Code:
NEXT:
   ; switch to interpretive task to fetch word at interpretive PC
   ; switching to the interpretive task, then returning takes 6 clocks
   TSK      #10
NEXT2:
   STA      W
   LDA      {W}
   ; A second variable is used to allow a double indirect jump
   ; rather than using self-modifying code. As self-modifying code
   ; would require a cache line invalidate that takes 20 cycles
   STA      W2
   JMP      {W2}


; How to perform a branch
Code:
   TSK      #10         ; get the branch displacement
   AAX               ; add .A and .X
   ORA      #$0A000000   ; select context register #10 (in high eight bits of acc).
   JCI               ; indirect jump to the context to set PC (JCI [acc])
   BRA      NEXT2      ; the JCI context switch will cause the next instruction fetch
                  ; so we can skip over the TSK #10
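The arithmetic in the branch sequence can be checked with a few lines of Python. This is a sketch: the 24-bit mask on the new PC and the way JCI splits the accumulator into a context number and a PC are assumptions based only on the comment about the high eight bits.

```python
def branch_target(displacement: int, fetch_pc: int, context: int = 10) -> int:
    """AAX followed by ORA #$0A000000 for the default context of 10."""
    new_pc = (displacement + fetch_pc) & 0x00FFFFFF   # AAX, kept to 24 bits (assumed)
    return (context << 24) | new_pc                   # context number in the high eight bits


def decode_jci(acc: int) -> tuple[int, int]:
    """Assumed JCI operand layout: (context register number, new PC)."""
    return acc >> 24, acc & 0x00FFFFFF


forward = branch_target(5, 0x000100)
backward = branch_target(-3, 0x000100)
```

A negative displacement works in this model because the sum is masked before the context number is merged in.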


; Bytecode interpreter, dispatch
Code:
NEXT:
   TSK      #10         ; fetch a byte
NEXT2:
   ASL
   TAX
   JMP      (JmpTable,X)
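The ASL / TAX / JMP (JmpTable,X) dispatch can be modelled in Python as follows. It is a sketch with made-up opcodes and routine addresses; the only thing it takes from the listing is that the opcode is doubled to index a table of 16-bit entries.

```python
import struct


def op_push1(stack):
    stack.append(1)
    return True


def op_add(stack):
    stack.append(stack.pop() + stack.pop())
    return True


def op_halt(stack):
    return False


# Hypothetical routine addresses and the jump table that points at them.
ROUTINES = {0x8000: op_push1, 0x8010: op_add, 0x8020: op_halt}
JMP_TABLE = struct.pack("<3H", 0x8000, 0x8010, 0x8020)


def dispatch(opcode: int):
    x = (opcode << 1) & 0xFFFF                          # ASL ; TAX
    (target,) = struct.unpack_from("<H", JMP_TABLE, x)  # JMP (JmpTable,X)
    return ROUTINES[target]


def run(bytecode: bytes) -> list:
    stack, pc = [], 0
    while True:
        opcode = bytecode[pc]        # TSK #10: fetch a byte
        pc += 1
        if not dispatch(opcode)(stack):
            return stack


result = run(bytes([0, 0, 1, 2]))    # push1, push1, add, halt
```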

_________________
http://www.finitron.ca


PostPosted: Mon Nov 23, 2015 5:29 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Branch to subroutine instructions BSR and BSL have been added for normal and long addresses respectively. This will help with position independent code.
A far jump to context routine (JCF) has been added.

A couple of changes were made to the stack pointer in the interest of getting memory protection from stack overflow.
- the stack pointer is now only 28 bits in size to make room for a new stack size field in the context register
- the stacks must be in the lower 256MB of memory
- the stack size is now limited according to the stack size field in the context register (00=256,01=4096,10=65536,11=16777216 bytes)
- the stack will "wrap around" within the size boundary
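The wrap-around rule in the list above can be sketched in Python. The size table and the 28 bit width come from the list; the assumption that each stack region is aligned to its own size is mine.

```python
STACK_SIZE = {0b00: 256, 0b01: 4096, 0b10: 65536, 0b11: 16777216}
SP_MASK = (1 << 28) - 1          # the stack pointer is 28 bits wide


def step_sp(sp: int, size_field: int, delta: int) -> int:
    """Move SP by delta bytes, wrapping inside its size-aligned region (assumed alignment)."""
    size = STACK_SIZE[size_field]
    base = sp & ~(size - 1) & SP_MASK      # upper bits stay fixed
    offset = (sp + delta) & (size - 1)     # lower bits wrap around
    return base | offset


underflow = step_sp(0x0001000, 0b00, -1)   # push at the bottom of a 256 byte stack
overflow = step_sp(0x00010FF, 0b00, +1)    # pull at the top of the same stack
```

A runaway push or pull therefore stays inside the task's own stack area instead of walking into a neighbour's memory.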

I had toyed with the idea of stack bound registers but this would increase the complexity and require a larger context image.
The core is now about 10600 LUTs or 16900 LC's and 6,000 lines of code. There's lots of room left in the FPGA (xc7a100t).

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core
PostPosted: Tue Nov 24, 2015 8:08 pm 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Now that I had the core working fairly well, I decided to change it all around.
The way segmentation works has changed in order to reduce the number of bytes moved around and increase functionality.
The 32 bit segment registers have been turned into 16 bit selectors which index a table to get the 32 bit segment value and segment attributes. So there is a level of indirection now to fetch the segment value.
The additional attributes include segment execute and write flags. With the additional segment attributes some memory protection features are added.

My quandary was what to do when there's a segmentation problem. My solution is to execute the break instruction and leave the segmentation status in a register. That way segment faults may even be processed in 8 or 16 bit emulation modes.
The core now has some memory protection via read-only segments and executable segments.
Segment bounds are also checked during a read or write operation.
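A small Python model of the selector scheme might look like this. It is a sketch: the `Descriptor` field names, the table contents, and the use of an exception to stand in for "execute BRK and leave the status in a register" are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    base: int          # 32 bit segment value
    limit: int         # highest valid offset
    writable: bool
    executable: bool


class SegmentFault(Exception):
    """Stands in for the break instruction plus the status register."""

    def __init__(self, status: str):
        super().__init__(status)
        self.status = status


def translate(table: dict, selector: int, offset: int, access: str) -> int:
    d = table[selector]                     # the extra level of indirection
    if offset > d.limit:
        raise SegmentFault("bounds")
    if access == "write" and not d.writable:
        raise SegmentFault("write")
    if access == "execute" and not d.executable:
        raise SegmentFault("execute")
    return (d.base + offset) & 0xFFFFFFFF


# Hypothetical table: selector 1 is read-only code, selector 2 is writable data.
TABLE = {
    1: Descriptor(base=0x00000000, limit=0xFFFF, writable=False, executable=True),
    2: Descriptor(base=0x00010000, limit=0x7FFF, writable=True, executable=False),
}

data_address = translate(TABLE, 2, 0x0123, "write")
```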

_________________
http://www.finitron.ca


PostPosted: Mon Jan 16, 2017 6:57 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
2017/01/15
Added an LLA: (load linear address) prefix instruction to the core, which causes the linear address of the next instruction to be calculated and loaded into the accumulator. The linear address is the address after segmentation has been applied. The instruction was added as a prefix rather than another operation because of the large number of address modes available. It was undesirable to use up a whole bunch of opcodes for an infrequent operation. So
Code:
LLA: LDA {$21},Y

loads the accumulator with the 32 bit contents of address $21 plus the .Y register plus the data segment rather than performing the LDA operation.
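In Python terms the calculation is roughly the following. This is a sketch: it assumes the 32 bit pointer at $21 is read little-endian from the data segment and that the result is truncated to 32 bits.

```python
def lla_indirect_indexed(segment: bytes, ds_base: int, operand: int, y: int) -> int:
    """Model of  LLA: LDA {operand},Y : return the linear address instead of loading."""
    pointer = int.from_bytes(segment[operand:operand + 4], "little")
    return (ds_base + pointer + y) & 0xFFFFFFFF


segment = bytearray(64)
segment[0x21:0x25] = (0x2000).to_bytes(4, "little")   # hypothetical pointer stored at $21
linear = lla_indirect_indexed(segment, ds_base=0x10000, operand=0x21, y=3)
```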
I thought I’d use the DisplayString() routine to show how the core is working. It has an example of stack relative addressing indexed by the .Y register. The “FAR” prefix indicates that the indirect address contains two additional bytes which indicate the selector used in addressing. So the address occupies a total of four bytes on the stack. Indirect long and indirect extra long addressing with a FAR prefix are also possible. The instructions look like: LDA FAR [4,S],Y and LDA FAR {4,S},Y respectively.
Code:
DisplayString:
   PHP            ; push reg settings
   SEP      #$20      ; ACC = 8 bit
   MEM      8      ; tell the assembler 8 bits in use
   LDY      #0
.0002:
   LDA      FAR (4,S),Y   ; a Far short address
   BEQ      .0001
   JSR      SuperPutch   ; put the character
   INY
   BRA      .0002
.0001:
   PLP            ; restore regs settings
   MEM      16      ; tell the assembler 16 bits in use
   RTS      #4      ; pop stack argument


DisplayString() is called in the following manner
Code:
   PEA      8      ; push segment selector part of address
   PEA      msgSSM      ; push the offset part of the address
   JSR      DisplayString   ; call displayString

msgSSM:
   .byte   "Single step mode task starting.",CR,LF,0
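To see why the offset ends up at 4,S, here is a Python model of the stack during the call. It assumes '816-style stack behavior (SP points at the next free byte, the high byte is pushed first, JSR pushes a two byte return address); the addresses and the status byte are made up.

```python
class Stack816:
    def __init__(self, sp: int = 0x01FF):
        self.mem = bytearray(0x10000)
        self.sp = sp

    def push8(self, value: int) -> None:
        self.mem[self.sp] = value & 0xFF
        self.sp -= 1

    def push16(self, value: int) -> None:
        self.push8(value >> 8)       # high byte first, so memory ends up little-endian
        self.push8(value)

    def read16(self, offset: int) -> int:
        """Read the 16 bit value at offset,S."""
        addr = self.sp + offset
        return self.mem[addr] | (self.mem[addr + 1] << 8)


stk = Stack816()
stk.push16(8)          # PEA 8       : segment selector
stk.push16(0xF259)     # PEA msgSSM  : offset (hypothetical address)
stk.push16(0xE123)     # JSR         : return address (hypothetical)
stk.push8(0x30)        # PHP         : status byte (hypothetical)

offset = stk.read16(4)     # what  LDA FAR (4,S),Y  uses as the pointer
selector = stk.read16(6)   # the two additional bytes named by FAR
```

Under these assumptions 1,S is the saved status, 2,S and 3,S are the return address, 4,S and 5,S are the offset, and 6,S and 7,S are the selector, which matches the four bytes that RTS #4 removes.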


_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core
PostPosted: Wed May 24, 2017 3:22 pm 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
2017/05/23
Talk of the 65VM02 http://forum.6502.org/viewtopic.php?f=10&t=4531 in another thread has spurred me on to work on this project some more.
Ported the FT832 test system to a Nexys Video board. There was some trouble getting the keyboard to work until it was identified that pullup resistors for the signal lines were not properly specified. Once spec’d properly the keyboard started working.

Working on software, issues with the segmented architecture arise. Currently, in order to access I/O devices from a generic app, a segment must be set up to establish an address range, then the selector specified during a load or store operation. So updating the LEDs, for instance, looks like STA 5:$7000 where 5 is the selector containing a base address of zero, and $7000 is the address of the LEDs. This adds four bytes to every I/O access. It may not be that bad if I/O were to be accessed by only the BIOS/OS, which could then use the data segment rather than needing to specify a selector. The other gotcha is accessing the BIOS ROM. It would be nice if it could be mapped into every code segment at a specific address. In order to do this, segmentation would have to be bypassed for a ROM address range.
A lot of the I/O is placed in the $00Fxxxxx address range in order to make it available to ‘816 mode. This is likely to be somewhere in the middle of an app’s data segment.

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core / FT832
PostPosted: Thu May 25, 2017 4:50 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
I can't seem to get the screen scrollup code to work. Instead of scrolling the screen up and blanking out the last line, the screen is blanked and characters scroll around on the last line. Code for the scrollup which falls through into blanking a line is posted below. The code is running in 16 bit mode ('816 compatible).

Code:
   8317 00EA36                             ScrollUp:
   8318 00EA36 A0 00 00                        LDY      #0            ; .Y used as index to char
   8319 00EA39 A2 2B 0A                        LDX    #2603         ; number of chars on screen
   8320 00EA3C                             .0001:
   8321 00EA3C 5A                              PHY                  ; save off current .Y
   8322 00EA3D 98                              TYA                        
   8323 00EA3E 18                              CLC                  ; Add double the number of text
   8324 00EA3F 65 4C                           ADC      Textcols      ; columns to .Y to find start of next
   8325 00EA41 18                              CLC                  ; row
   8326 00EA42 65 4C                           ADC      Textcols
   8327 00EA44 A8                              TAY                  
   8328 00EA45 42 DA 42 B7 40                  LDA      FAR {Vidptr},Y   ; .A = Load buffer[textcols+Y]
   8329 00EA4A 7A                              PLY                  ; .Y = restore current .Y
   8330 00EA4B 42 DA 42 97 40                  STA      FAR {Vidptr},Y   ; Store .A in buffer[0+Y]
   8331 00EA50 C8                              INY                  ; advance to next character
   8332 00EA51 C8                              INY                  ; (two bytes per character)
   8333 00EA52 CA                              DEX                  ; decrement total char count
   8334 00EA53 D0 E7                           BNE      .0001
   8335 00EA55 A5 4E                           LDA      Textrows
   8336 00EA57 3A                              DEA
                                           
   8338 00EA58                             BlankLine:
   8339 00EA58 0A                              ASL
   8340 00EA59 A8                              TAY
   8341 00EA5A 42 1B B9 59 F2                  LDA      CS:LineTbl,Y
   8342 00EA5F 0A                              ASL
   8343 00EA60 A8                              TAY
   8344 00EA61 A6 4C                           LDX      Textcols      ; number of chars to clear
   8345 00EA63 A5 36                           LDA      NormAttr
   8346 00EA65 09 20 00                        ORA      #$20         ; space
   8347 00EA68                             .0001:
   8348 00EA68 42 DA 42 97 40                  STA      FAR {Vidptr},Y
   8349 00EA6D C8                              INY                  ; increment to next char
   8350 00EA6E C8                              INY
   8351 00EA6F CA                              DEX                  ; decrement number of chars
   8352 00EA70 D0 F6                           BNE      .0001
   8353 00EA72 60                              RTS

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core
PostPosted: Thu May 25, 2017 9:28 am 
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
I didn't read through all of this topic, perhaps I'm missing something fundamental, but if I understand your code correctly you have 2603 16 bit characters per screen.

When you compute Y+2xTextcols there is a wrong CLC @ 8325 but this is not very important. And you advance your source pointer one full line beyond the screen - I don't know if this matters. Perhaps you should calculate X = X - Textcols before starting the loop.

Assuming you would correct X before looping, then when the branch @ 8334 is not taken Y would point to the first non screen character. You could then simply assemble the "blank" char (8345, 8346) and setup the counter (8344) and then start a loop: DEY, DEY, STA, DEX, BNE.

Hope this helps :)

Arne

edit(1):
2603 characters is really an odd number. Elsewhere you presented a piece of a terminal emulation which seems to have 30 lines by 85 characters. That would be 2550 char/scr.


PostPosted: Thu May 25, 2017 2:00 pm 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Quote:
2603 characters is really an odd number. Elsewhere you presented a piece of a terminal emulation which seems to have 30 lines by 85 characters. That would be 2550 char/scr.

It should actually be 2604 (the screen is 84 x 31). I suppose I could have gone with an 80x25 screen and a large blank area around the text. It's completely programmable anyway, but the internal memory is limited to 4096 characters. The display is a wide-screen monitor so there's some sense to using up more of the display area.

The code should probably have the length of one line subtracted from the total chars as you say. There's no reason to scroll beyond the end of the screen.
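For reference, the intended behavior with the corrected count is easy to state in Python. This is only a model of the algorithm (one list entry per 16 bit character cell), not a translation of the listing; the function name and the test pattern are made up.

```python
def scroll_up(buf: list, cols: int, rows: int, blank: int) -> list:
    """Scroll a row-major text buffer up one line in place and blank the last row."""
    count = cols * (rows - 1)            # one row less than the whole screen
    for i in range(count):
        buf[i] = buf[i + cols]           # copy from the same column one row down
    for i in range(count, cols * rows):
        buf[i] = blank                   # clear the last row
    return buf


COLS, ROWS = 84, 31                      # 2604 characters
screen = [row for row in range(ROWS) for _ in range(COLS)]   # each cell holds its row number
scroll_up(screen, COLS, ROWS, blank=0x20)
```

Copying only `cols * (rows - 1)` cells means the source index never goes past the end of the screen.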

There does not appear to be anything fundamentally wrong with the code causing it to operate the way it does. So I'm guessing it's a hardware problem of some sort.
It's either executing an instruction incorrectly or not fetching the right instructions. It's strange that it runs all the way through without hanging.

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core
PostPosted: Thu May 25, 2017 6:11 pm 
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Perhaps starting somewhere in the middle of the screen and then scrolling only 4 or so lines may help to identify the nature of the bug. You would see whether it still blanks the lines or perhaps does some weird addressing.

Because it clears the screen as you say, there is a good chance that the FAR fetch didn't address the proper cell, or reads zero or something that looks like a blank. I would first insert a couple of NOPs between PLY and STA FAR to verify that it is not position dependent. Then replace the NOPs with code that saves A and Y into a free mem region using absolute,x addressing mode. Then you can figure out what is fetched and where it should go.

You may also try to set the attribute half to something fixed during transfer - perhaps you are saving black characters with a black background? :)

BTW - why are you not using TFR ?


 Post subject: Re: FT816 Core
PostPosted: Thu May 25, 2017 8:01 pm 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
Tried a few things as you suggested, and I got it fixed. I noticed that the backspace key on the keyboard didn't work either.
It was traced to the LDX zp instruction. The LDX zp and LDY zp instructions were screwy. Fixing those instructions cleaned up all sorts of problems.
Now Supermon816 runs! The screen scrolls as expected.

Quote:
BTW - why are you not using TFR ?

TFR ? I assume this is the MVN / MVP instructions ?

The system maintains a far (6 byte) pointer (called Vidptr) to the video buffer in the data segment for each task. This allows a task to write to a virtual screen which may not be displayed depending on the task state. In the optimized system the MVN/MVP instructions could be used, but it's a fair amount of work to set it up from an indirect pointer. I don't think the MVN/MVP can make use of the FAR prefix to indicate far addressing. But it's not impossible to calculate the physical address into a 32 bit reg. It just ain't simple.

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core
PostPosted: Thu May 25, 2017 8:43 pm 
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Ooops - my fault. TFR :lol: of course I meant MVN/MVP - somehow I really dislike most of these 816 mnemonics, they are not well chosen.

Glad that I could help a little.

Concerning the MVN/MVP instructions - especially in the context of far addressing: I didn't test this, but IMHO MVN or MVP doesn't make a page crossing when you wish to move something from page:F123...page+1:4567 to somewhere else. If you are going to implement far block moves, you should handle this in a more universal way. And if you could manage to make this instruction interruptible BUT not always rereading the code... ;)


PostPosted: Thu May 25, 2017 11:38 pm 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
The way I have the MVN / MVP instructions working in 32 bit mode is to ignore the bank values specified in the instruction. The entire 32 bit .X and .Y registers provide the load and store locations which must be in the same data segment. The instruction is interruptible, but for simplicity re-reads the opcode for each move. It's not too bad to re-read the opcode since the instruction is coming from a cache and there's no external bus activity for reading the opcode. All three bytes of the opcode are read in a single clock cycle. All that shows up on the external bus is the load, then the store operation. I also added a FILL operation which does only stores to the data bus and saves a few clock cycles over using MVN/MVP.
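The "re-read the opcode for each move" approach can be pictured with this Python model. It is a sketch: the class is made up, and it counts plain remaining bytes rather than modelling how the real instruction encodes its length.

```python
class BlockMove:
    """One execution of the move opcode transfers one byte; PC advances only at the end."""

    def __init__(self, mem: bytearray, src: int, dst: int, count: int):
        self.mem = mem
        self.x = src              # full 32 bit .X: load location
        self.y = dst              # full 32 bit .Y: store location
        self.remaining = count    # simplified byte counter (assumption)
        self.pc = 0
        self.opcode_fetches = 0

    def step(self) -> None:
        self.opcode_fetches += 1              # opcode re-read (from the cache) every time
        self.mem[self.y] = self.mem[self.x]   # the load, then the store
        self.x += 1
        self.y += 1
        self.remaining -= 1
        if self.remaining == 0:
            self.pc += 3                      # three byte opcode: only now move on


mem = bytearray(b"ABCDxxxx")
move = BlockMove(mem, src=0, dst=4, count=4)
move.step()
move.step()
pc_at_interrupt = move.pc      # an interrupt here returns to the same opcode
move.step()
move.step()
```

Because the PC still points at the move opcode after every step but the last, an interrupt can be taken between steps and the instruction simply resumes with the updated .X and .Y.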

_________________
http://www.finitron.ca


 Post subject: Re: FT816 Core / FT832
PostPosted: Sat May 27, 2017 3:50 am 
Joined: Sun Dec 29, 2002 8:56 pm
Posts: 449
Location: Canada
2017/05/26
FT832. It looks like I left issues with interrupts to be determined. As things are now, unless the core is in 832 mode, it doesn’t save the code selector on the stack during an interrupt, so there’s no way to know what segment to return to. Also, the code selector of the IRQ routine is assumed to be the current code segment. This creates a problem if the core is running in a different code segment when an IRQ occurs. It pulls the IRQ vector from the vector table as normal, but then it will jump to the address in the currently running code segment. It really needs to establish the code segment properly when an IRQ occurs. The code segment needs to be part of the IRQ vector or it needs to be assumed to be some value.
So, the code segment is going to be assumed to be zero for an IRQ routine. If a segmented core is specified then the code segment will be saved on the stack as well as the PC. This takes two extra bytes.
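The planned behavior can be sketched in Python as below. This is a model under assumptions: the push order, the tagged tuples, and the sample values are illustrative; only "code segment zero for the IRQ routine" and "save the code segment as well as the PC on a segmented core" come from the text.

```python
def irq_enter(stack: list, cs: int, pc: int, flags: int, vector: int, segmented: bool):
    """Save the interrupted state and return the (code segment, PC) of the IRQ routine."""
    if segmented:
        stack.append(("CS", cs))      # the two extra bytes
    stack.append(("PC", pc))
    stack.append(("P", flags))
    return 0, vector                  # IRQ routine always runs with code segment zero


def irq_return(stack: list, segmented: bool):
    """Pop the saved state; without a saved CS there is nothing to return to."""
    _, flags = stack.pop()
    _, pc = stack.pop()
    cs = stack.pop()[1] if segmented else None
    return cs, pc, flags


stack = []
handler = irq_enter(stack, cs=3, pc=0x1234, flags=0x30, vector=0xFFEE, segmented=True)
resumed = irq_return(stack, segmented=True)
```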
I’ve been trying to get EnhBasic to run and it’s close. The system gets to the EnhBasic code then hangs. I suspect it’s a bus problem of some sort, as there is an LED status display fail. It’s supposed to display status ‘AB’ but it displays ‘A9’. It’s just one bit difference. EnhBasic is faked out to run in its own 64k block at $01xxxx.

_________________
http://www.finitron.ca

