6502.org http://forum.6502.org/ |
FT816 Core http://forum.6502.org/viewtopic.php?f=10&t=3099 |
Page 3 of 5 |
Author: | Rob Finch [ Tue Nov 17, 2015 7:54 am ] |
Post subject: | Re: FT816 Core / FT832 Core Fork |
Quote: Not that you need too much encouragement, but I thought that I'd note that I like to follow your progress on your various projects. Invariably I find that your efforts provide great insights on how I can improve my efforts. I've had virtually the same thoughts regarding task switching that you are adding to your core. I am looking forward to your continued posts on how your implementation pans out.

Thanks for the compliment.

A Unix-like FORK instruction has been added to the core. This allows a Unix paradigm to be used when setting up a task. A sample test of the FORK instruction seems to work. It runs keyboard initialization as a task.

Code:
    FORK #6          ; use context register #6
    TTA              ; get the running context (pid) (Transfer TR to Acc)
    CMP #6           ; are we running in context #6 ?
    LBEQ KeybdInit   ; if so, initialize keyboard
    .... continues with monitor code

The keyboard message from keyboard initialization comes up, and the system gets to the BIOS prompt, so I'm pretty sure the FORK instruction is working. But I can't get the invaders app to run (it hangs), and it's also started with FORK. I'm guessing it's a software problem though.

The stack has been modified so that it no longer resides in the data segment. Instead the stack pointer is a flat, unsegmented address pointer (or rather, it is based on a stack segment value of zero). I ran into problems calling BIOS routines where the data segment needed to be switched. This created havoc with the stack since it was located in the data segment.

I've been writing a little invaders app that runs with its own data segment at $10000. However, I'd like to be able to call BIOS routines, and BIOS routines assume the data segment is $0000. That means the data segment needed to be switched. But as soon as the DS was changed, the stack was lost because the stack was located by the data segment. Unfortunately, the only way to change the DS is by pushing a value onto the stack, then popping the DS. Since the stack was being switched around, interrupts also had to be disabled while this took place. It ended up being a whole bunch of ugly nonsense: 30+ bytes of code just to call a subroutine.

I'd planned to use shared memory located by using segment zero, rather than the stack, as a means to communicate parameters to tasks. So I thought having the stack in the data segment wouldn't be a big issue. Live and learn. The other alternative I've been toying with is having four segment registers like the x86. |
Author: | Rob Finch [ Thu Nov 19, 2015 2:26 am ] |
Post subject: | Re: FT816 Core / FT832 Core JCR, DDR2 |
I've had some trouble getting the DDR2 ram controller interface to work. A lot of the problem getting the invaders app to work has to do with getting bad responses from ram. I can test the DDR2 ram interface with Supermon and it doesn't always work. Testing the block ram interface works all the time. So I moved the invaders data segment into block ram, and it works much better. Of course it still hangs. I found and fixed a bug with the segment prefix instructions by trying to get the invaders app to work.

Part of the problem is that there is a clock domain to cross in the DDR ram controller, so signals have to be registered onto the new domain. Generating a ready signal from the controller is tricky because ready needs to go low as soon as ram is selected (not after several clocks for domain crossing), and it needs to remain low until the ram is ready. Another problem is that timing constraints for the DDR2 memory aren't fully specified. Well, I specified them, but Vivado ignores the specification, claiming there are syntax errors.

I added the impure context switching operations JCR (jump context routine) and RTC (return from context routine). The context is switched, but not all registers are saved and restored. The .A, .X, .Y, and flags registers are allowed to propagate between contexts. This allows a context switch to be treated a little bit like a subroutine call. This is intended for synchronous context switches only, where the context will not be running asynchronously to the caller. The JCR and RTC instructions support BIOS calls where it is desirable to switch to the BIOS stack and data segment. For instance, in the invaders app, when a keystroke is tested the BIOS is called like this:

Code:
    JSR Initialize
    JSR RenderInvaders
.0002:
    JCR KeybdGetCharNoWaitCtx,7   ; check for char at keyboard (BIOS call)
.0004:
    LDX #0
.0006:

The '7' is the BIOS context register number. Then in the BIOS the keyboard routine is called and returns a value in .A and flags.

Code:
KeybdGetCharNoWaitCtx:
    JSR KeybdGetCharNoWait   ; call the "real" BIOS routine
    RTC #0                   ; pop zero bytes off the stack and return to original context

For a real mind-bender the JCL instruction could be studied. JCL is the long form of the JCR instruction, which also allows copying parameters between stacks. So JCL allows stack based parameters.

After task and context switching I'll be having a look at ways to speed up interpreters. There are about 80 instructions still free on the second page of opcodes. |
Author: | Rob Finch [ Fri Nov 20, 2015 5:04 am ] |
Post subject: | Re: FT816 Core / FT832 Core - VM support |
I've been reviewing the following set of posts on VM support: http://forum.6502.org/viewtopic.php?f=10&t=3046

It has been mentioned on the forum that it would be nice if the IP (interpretive pointer) register were an internal cpu register. Even better if it auto-increments after fetching from memory. I put some thought into supporting the IP register as an internal core register, and came to the conclusion that the way to do it was to use the existing program counter (at least in a multi-context core). The program counter both fetches from memory and auto-increments. Even better, it uses the instruction cache to cache interpreter code. Hence I came up with the following.

The following requires multiple register contexts in the processor. It ping-pongs between two register set contexts. One is a special interpretive task that fetches the code; the other is an interpreter that interprets the fetched code.

A special core operating mode called interpretive mode is used to assist the implementation of interpreters. It can be turned on by setting bit 2 in the extended status register. An interpretive mode task uses the PC itself to fetch opcodes into the accumulator register rather than the instruction register. The PC of the interpretive task takes on the function of the IP register in Forth parlance. In the IFETCH stage of the core for an interpretive task, the core then performs a task switch back to the invoking task. The accumulator is passed back to the interpreter where it can be processed. The program counter of the interpretive task is passed back to the invoker in the .X register.

Benefits of using a task are:
- the bytecode can reside in a different code or data segment than the interpreter itself
- it is relatively fast, as internal registers are used rather than a variable in memory
- fetches are governed by the PC and hence come from the code segment, meaning the instruction cache is used to enhance performance

The mechanism here allows the interpreter to be written in 32 bit code while at the same time allowing a bytecode implementation. Task #10 is the special interpretive task in this case. The program for task #10 is perhaps a bytecode.

; A potential Forth NEXT routine.
Code:
NEXT:
    ; switch to interpretive task to fetch word at interpretive PC
    ; switching to the interpretive task, then returning takes 6 clocks
    TSK #10
NEXT2:
    STA W
    LDA {W}
    ; A second variable is used to allow a double indirect jump
    ; rather than using self-modifying code. As self-modifying code
    ; would require a cache line invalidate that takes 20 cycles
    STA W2
    JMP {W2}

; How to perform a branch
Code:
    TSK #10          ; get the branch displacement
    AAX              ; add .A and .X
    ORA #$0A000000   ; select context register #10 (in high eight bits of acc).
    JCI              ; indirect jump to the context to set PC (JCI [acc])
    BRA NEXT2        ; the JCI context switch will cause the next instruction fetch
                     ; so we can skip over the TSK #10

; Bytecode interpreter, dispatch
Code:
NEXT:
    TSK #10          ; fetch a byte
NEXT2:
    ASL
    TAX
    JMP (JmpTable,X) |
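Editorial sketch: the fetch-and-dispatch loop above can be modelled in software to make the control flow easier to follow. This is a minimal Python sketch, not the FT832 implementation. The opcode numbering, the handler names, and the `VM` class are all hypothetical, and the branch handler assumes the displacement is relative to the PC after the displacement byte has been fetched, which the post does not state.

```python
# Hypothetical software model of the TSK-based fetch and jump-table dispatch.
# Opcode numbers and handler names are illustrative, not taken from the FT832.

OP_HALT, OP_PUSH, OP_ADD, OP_BRA = 0, 1, 2, 3


class VM:
    def __init__(self, bytecode):
        self.code = bytecode
        self.pc = 0          # stands in for the interpretive task's PC (the Forth IP)
        self.stack = []
        self.running = True

    def fetch(self):
        """Model of TSK #10: return the byte at PC, then auto-increment PC."""
        value = self.code[self.pc]
        self.pc += 1
        return value


def op_halt(vm):
    vm.running = False


def op_push(vm):
    vm.stack.append(vm.fetch())


def op_add(vm):
    b = vm.stack.pop()
    a = vm.stack.pop()
    vm.stack.append(a + b)


def op_bra(vm):
    displacement = vm.fetch()
    # Assumption: displacement is added to the PC as it stands after the fetch.
    vm.pc += displacement


# Plays the role of JmpTable in "JMP (JmpTable,X)".
JMP_TABLE = [op_halt, op_push, op_add, op_bra]


def run(bytecode):
    vm = VM(bytecode)
    while vm.running:
        opcode = vm.fetch()      # NEXT: fetch a byte
        JMP_TABLE[opcode](vm)    # dispatch through the table
    return vm.stack
```

For example, `run([OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_HALT])` leaves a single sum on the stack, and a program starting with `OP_BRA, 2` skips the two bytes that follow the displacement.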
Author: | Rob Finch [ Mon Nov 23, 2015 5:29 am ] |
Post subject: | Re: FT816 Core / FT832 Core |
Branch to subroutine instructions BSR and BSL have been added for normal and long addresses respectively. This will help with position independent code. A far jump to context routine (JCF) has been added.

A couple of changes were made to the stack pointer in the interest of getting memory protection from stack overflow.
- the stack pointer is now only 28 bits in size to make room for a new stack size field in the context register
- the stacks must be in the lower 256MB of memory
- the stack size is now limited according to the stack size field in the context register (00=256, 01=4096, 10=65536, 11=16777216 bytes)
- the stack will "wrap around" within the size boundary

I had toyed with the idea of stack bound registers but this would increase the complexity and require a larger context image.

The core is now about 10600 LUTs or 16900 LC's and 6,000 lines of code. There's lots of room left in the FPGA (xc7a100t). |
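Editorial sketch: the wrap-around rule in the list above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the core's logic. It assumes each stack region is aligned to its own size, so that the upper bits of the 28-bit pointer stay fixed while the low bits wrap. The post does not confirm that alignment, and the function name `push_sp` is hypothetical.

```python
# Hypothetical model of a size-limited, wrapping stack pointer.
# Encoding of the size field follows the post: 00=256, 01=4096, 10=65536, 11=16777216.
STACK_SIZES = {0b00: 256, 0b01: 4096, 0b10: 65536, 0b11: 16777216}
SP_MASK = (1 << 28) - 1  # 28-bit stack pointer, i.e. the lower 256MB


def push_sp(sp, size_field, nbytes=1):
    """Return the stack pointer after pushing nbytes, wrapping within the stack."""
    size = STACK_SIZES[size_field]
    # Assumption: the stack region is size-aligned, so the high bits are the base.
    base = sp & SP_MASK & ~(size - 1)
    offset = (sp - nbytes) & (size - 1)
    return base | offset
```

With a 4096 byte stack based at $1000, pushing one byte from $1000 wraps to $1FFF instead of running into the memory below the stack.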
Author: | Rob Finch [ Tue Nov 24, 2015 8:08 pm ] |
Post subject: | Re: FT816 Core |
Now that I had the core working fairly well, I decided to change it all around. The way segmentation works has changed in order to reduce the number of bytes moved around and increase functionality. The 32 bit segment registers have been turned into 16 bit selectors which index a table to get the 32 bit segment value and segment attributes. So there is now a level of indirection to fetch the segment value. The additional attributes include segment execute and write flags. With the additional segment attributes some memory protection features are added.

My quandary was what to do when there's a segmentation problem. My solution is to execute the break instruction and leave the segmentation status in a register. That way segment faults may even be processed in 8 or 16 bit emulation modes.

The core now has some memory protection via read-only segments and executable segments. Segment bounds are also checked during a read or write operation. |
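Editorial sketch: the selector indirection and the attribute checks described above might look roughly like the following Python model. Everything here is an assumption made for illustration: the `Segment` fields, the `SegmentFault` exception (standing in for the break instruction plus status register), the fault names, and the table layout are not taken from the core.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base: int         # 32 bit segment value
    limit: int        # highest valid offset (hypothetical field)
    writable: bool    # write flag
    executable: bool  # execute flag


class SegmentFault(Exception):
    """Stands in for the core executing BRK with a status left in a register."""


def translate(table, selector, offset, write=False, execute=False):
    """Turn selector:offset into a 32 bit linear address, checking attributes."""
    seg = table[selector & 0xFFFF]  # 16 bit selector indexes the table
    if offset > seg.limit:
        raise SegmentFault("bounds")
    if write and not seg.writable:
        raise SegmentFault("write to read-only segment")
    if execute and not seg.executable:
        raise SegmentFault("execute from non-executable segment")
    return (seg.base + offset) & 0xFFFFFFFF
```

As a usage example, a table entry with base $10000 and a read-only attribute translates offset $21 to $10021 for a read, and raises the hypothetical fault for a write.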
Author: | Rob Finch [ Mon Jan 16, 2017 6:57 am ] |
Post subject: | Re: FT816 Core / FT832 Core - load linear address |
2017/01/15

Added an LLA: (load linear address) prefix instruction to the core, which causes the linear address of the next instruction to be calculated and loaded into the accumulator. The linear address is the address after segmentation has been applied. The instruction was added as a prefix rather than another operation because of the large number of address modes available. It was undesirable to use up a whole bunch of opcodes for an infrequent operation. So

Code:
    LLA: LDA {$21},Y

loads the accumulator with the 32 bit contents of address $21 plus the .Y register plus the data segment rather than performing the LDA operation.

I thought I’d use the DisplayString() routine to show how the core is working. It has an example of stack relative addressing indexed by the .Y register. The “FAR” prefix indicates that the indirect address contains two additional bytes which indicate the selector used in addressing. So the address occupies a total of four bytes on the stack. Indirect long and indirect extra long addressing with a FAR prefix are also possible. The instructions look like LDA FAR [4,S],Y and LDA FAR {4,S},Y respectively.

Code:
DisplayString:
    PHP                  ; push reg settings
    SEP #$20             ; ACC = 8 bit
    MEM 8                ; tell the assembler 8 bits in use
    LDY #0
.0002:
    LDA FAR (4,S),Y      ; a Far short address
    BEQ .0001
    JSR SuperPutch       ; put the character
    INY
    BRA .0002
.0001:
    PLP                  ; restore regs settings
    MEM 16               ; tell the assembler 16 bits in use
    RTS #4               ; pop stack argument

DisplayString() is called in the following manner:

Code:
    PEA 8                ; push segment selector part of address
    PEA msgSSM           ; push the offset part of the address
    JSR DisplayString    ; call displayString
    …
msgSSM:
    .byte "Single step mode task starting.",CR,LF,0 |
Author: | Rob Finch [ Wed May 24, 2017 3:22 pm ] |
Post subject: | Re: FT816 Core |
2017/05/23

Talk of the 65VM02 http://forum.6502.org/viewtopic.php?f=10&t=4531 in another thread has spurred me on to work on this project some more. Ported the FT832 test system to a Nexys Video board. There was some trouble getting the keyboard to work until it was identified that pullup resistors for the signal lines were not properly specified. Once spec’d properly the keyboard started working.

Working on software, issues with the segmented architecture arise. Currently, in order to access I/O devices from a generic app, a segment must be set up to establish an address range, and then the selector must be specified during a load or store operation. So updating the LEDs, for instance, looks like STA 5:$7000 where 5 is the selector containing a base address of zero, and $7000 is the address of the LEDs. This adds four bytes to every I/O access. It may not be that bad if I/O were to be accessed only by the BIOS/OS, which could then use the data segment rather than needing to specify a selector.

The other gotcha is accessing the BIOS ROM. It would be nice if it could be mapped into every code segment at a specific address. In order to do this, segmentation would have to be bypassed for a ROM address range. A lot of the I/O is placed in the $00Fxxxxx address range in order to make it available to ‘816 mode. This is likely to be somewhere in the middle of an app’s data segment. |
Author: | Rob Finch [ Thu May 25, 2017 4:50 am ] |
Post subject: | Re: FT816 Core / FT832 |
I can't seem to get the screen scrollup code to work. Instead of scrolling the screen up and blanking out the last line, the screen is blanked and characters scroll around on the last line. Code for the scrollup, which falls through into blanking a line, is posted below. The code is running in 16 bit mode ('816 compatible).

Code:
8317 00EA36                ScrollUp:
8318 00EA36 A0 00 00           LDY #0                ; .Y used as index to char
8319 00EA39 A2 2B 0A           LDX #2603             ; number of chars on screen
8320 00EA3C                .0001:
8321 00EA3C 5A                 PHY                   ; save off current .Y
8322 00EA3D 98                 TYA
8323 00EA3E 18                 CLC                   ; Add double the number of text
8324 00EA3F 65 4C              ADC Textcols          ; columns to .Y to find start of next
8325 00EA41 18                 CLC                   ; row
8326 00EA42 65 4C              ADC Textcols
8327 00EA44 A8                 TAY
8328 00EA45 42 DA 42 B7 40     LDA FAR {Vidptr},Y    ; .A = Load buffer[textcols+Y]
8329 00EA4A 7A                 PLY                   ; .Y = restore current .Y
8330 00EA4B 42 DA 42 97 40     STA FAR {Vidptr},Y    ; Store .A in buffer[0+Y]
8331 00EA50 C8                 INY                   ; advance to next character
8332 00EA51 C8                 INY                   ; decrement total char count
8333 00EA52 CA                 DEX
8334 00EA53 D0 E7              BNE .0001
8335 00EA55 A5 4E              LDA Textrows
8336 00EA57 3A                 DEA
8338 00EA58                BlankLine:
8339 00EA58 0A                 ASL
8340 00EA59 A8                 TAY
8341 00EA5A 42 1B B9 59 F2     LDA CS:LineTbl,Y
8342 00EA5F 0A                 ASL
8343 00EA60 A8                 TAY
8344 00EA61 A6 4C              LDX Textcols          ; number of chars to clear
8345 00EA63 A5 36              LDA NormAttr
8346 00EA65 09 20 00           ORA #$20              ; space
8347 00EA68                .0001:
8348 00EA68 42 DA 42 97 40     STA FAR {Vidptr},Y
8349 00EA6D C8                 INY                   ; increment to next char
8350 00EA6E C8                 INY
8351 00EA6F CA                 DEX                   ; decrement number of chars
8352 00EA70 D0 F6              BNE .0001
8353 00EA72 60                 RTS |
Author: | GaBuZoMeu [ Thu May 25, 2017 9:28 am ] |
Post subject: | Re: FT816 Core |
I didn't read through all of this topic, so perhaps I'm missing something fundamental, but if I understand your code correctly you have 2603 16 bit characters per screen.

When you compute Y+2xTextcols there is a wrong CLC @ 8325, but this is not very important. And you advance your source pointer one full line beyond the screen - I don't know if this matters. Perhaps you should calculate X = X - Textcols before starting the loop.

Assuming you correct X before looping, then when the branch @ 8334 is not taken Y would point to the first non-screen character. You could then simply assemble the "blank" char (8345, 8346), set up the counter (8344), and then start a loop: DEY, DEY, STA, DEX, BNE.

Hope this helps
Arne

edit(1): 2603 characters is really an odd number. Elsewhere you presented a piece of a terminal emulation which seems to have 30 lines by 85 characters. That would be 2550 char/scr. |
Author: | Rob Finch [ Thu May 25, 2017 2:00 pm ] |
Post subject: | Re: FT816 Core / FT832 scrollup |
Quote: 2603 characters is really an odd number. Elsewhere you presented a piece of a terminal emulation which seems to have 30 lines by 85 characters. That would be 2550 char/scr.

It should actually be 2604 (the screen is 84 x 31). I suppose I could have gone with an 80x25 screen and a large blank area around the text. It's completely programmable anyway, but the internal memory is limited to 4096 characters. The display is a wide-screen monitor so there's some sense to using up more of the display area.

The code should probably have the length of one line subtracted from the total chars as you say. There's no reason to scroll beyond the end of the screen.

There does not appear to be anything fundamentally wrong with the code that would cause it to operate the way it does. So I'm guessing it's a hardware problem of some sort. It's either executing an instruction incorrectly or not fetching the right instructions. It's strange that it runs all the way through without hanging. |
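Editorial sketch: the corrected character count discussed above (one line fewer than the full screen) can be checked with a small Python model of the scroll. This is only an illustration of the arithmetic, not a translation of the assembly. The function name and the flat list standing in for the video buffer are hypothetical, and each list element represents one 16 bit character cell.

```python
def scroll_up(buf, cols, rows, blank):
    """Scroll a rows x cols text buffer up by one line, in place."""
    count = (rows - 1) * cols          # copy one line fewer than the full screen
    for i in range(count):
        buf[i] = buf[i + cols]         # cell from the next row down
    for i in range(count, rows * cols):
        buf[i] = blank                 # blank out the last line
    return buf
```

For the 84 x 31 screen this copies 2520 cells and blanks the final 84, so the source index never runs past cell 2603, the last cell of the 2604 cell screen.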
Author: | GaBuZoMeu [ Thu May 25, 2017 6:11 pm ] |
Post subject: | Re: FT816 Core |
Perhaps starting somewhere in the middle of the screen and then scrolling only 4 or so lines may help to identify the nature of the bug: whether it still blanks the lines or perhaps does weird addressing.

Because it clears the screen as you say, there is a good chance that FAR fetching didn't address the proper cell, or reads zero or something that looks like a blank. I would first insert a couple of NOPs between PLY and STA FAR to verify that it is not position dependent. Then replace the NOPs by saving A and Y, using absolute,x addressing mode, into a free mem region. Then you can figure out what is fetched and where it should go.

You may also try to set the attribute half to something fixed during transfer - perhaps you are saving black characters with a black background?

BTW - why are you not using TFR ? |
Author: | Rob Finch [ Thu May 25, 2017 8:01 pm ] |
Post subject: | Re: FT816 Core |
Tried a few things as you suggested, and I got it fixed. I noticed that the backspace key on the keyboard didn't work either. It was traced to the LDX zp instruction. The LDX zp and LDY zp instructions were screwy. Fixing those instructions cleaned up all sorts of problems. Now Supermon816 runs! The screen scrolls as expected.

Quote: BTW - why are you not using TFR ?

TFR? I assume this means the MVN / MVP instructions? The system maintains a far (6 byte) pointer (called Vidptr) to the video buffer in the data segment for each task. This allows a task to write to a virtual screen which may not be displayed, depending on the task state. In the optimized system the MVN/MVP instructions could be used, but it's a fair amount of work to set it up from an indirect pointer. I don't think MVN/MVP can make use of the FAR prefix to indicate far addressing. But it's not impossible to calculate the physical address into a 32 bit reg. It just ain't simple. |
Author: | GaBuZoMeu [ Thu May 25, 2017 8:43 pm ] |
Post subject: | Re: FT816 Core |
Ooops - my fault. By TFR I of course meant MVN/MVP - somehow I really dislike most of these 816 mnemonics, they are not well chosen. Glad that I could help a little.

Concerning the MVN/MVP instructions, especially in the context of far addressing: I didn't test this, but IMHO MVN or MVP doesn't make a page crossing when you wish to move something from page:F123...page+1:4567 to somewhere else. If you are going to implement far block moves, you should handle this more universally. And if you could manage to make this instruction interruptible BUT not always rereading the code... |
Author: | Rob Finch [ Thu May 25, 2017 11:38 pm ] |
Post subject: | Re: FT816 Core / FT832 MVN/MVP |
The way I have the MVN / MVP instructions working in 32 bit mode is to ignore the bank values specified in the instruction. The entire 32 bit .X and .Y registers provide the load and store locations which must be in the same data segment. The instruction is interruptible, but for simplicity re-reads the opcode for each move. It's not too bad to re-read the opcode since the instruction is coming from a cache and there's no external bus activity for reading the opcode. All three bytes of the opcode are read in a single clock cycle. All that shows up on the external bus is the load, then the store operation. I also added a FILL operation which does only stores to the data bus and saves a few clock cycles over using MVN/MVP. |
Author: | Rob Finch [ Sat May 27, 2017 3:50 am ] |
Post subject: | Re: FT816 Core / FT832 |
2017/05/26

FT832. It looks like I left issues with interrupts to be determined. As they are now, unless the core is in 832 mode it doesn’t save the code selector on the stack during an interrupt. So there’s no way to know what segment to return to. Also, the code selector of the IRQ routine is assumed to be the current code segment. This creates a problem if the core is running in a different code segment when an IRQ occurs. It pulls the IRQ vector from the vector table as normal, but then it will jump to the address in the currently running code segment. It really needs to establish the code segment properly when an IRQ occurs. The code segment needs to be part of the IRQ vector or it needs to be assumed to be some value. So, the code segment is going to be assumed to be zero for an IRQ routine. If a segmented core is specified then the code segment will be saved on the stack as well as the PC. This takes two extra bytes.

I’ve been trying to get EnhBasic to run and it’s close. The system gets to the EnhBasic code, then hangs. I suspect it’s a bus problem of some sort, as the LED status display shows the wrong value. It’s supposed to display status ‘AB’ but it displays ‘A9’. It’s just one bit difference. EnhBasic is faked out to run in its own 64k block at $01xxxx. |
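Editorial sketch: the "one bit difference" between the expected and displayed status can be pinned down with an XOR. This small Python helper is hypothetical and assumes the two status values are hex bytes, as they appear in the post.

```python
def differing_bits(expected, observed):
    """Return the bit positions (0 = least significant) where two values differ."""
    diff = expected ^ observed
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]
```

Calling `differing_bits(0xAB, 0xA9)` identifies which single bit reads low, which may help narrow down a suspect data line if the fault really is on the bus.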
All times are UTC |