floobydust wrote: I've taken an approach of modifying my most recent C02 BIOS/Monitor code (was Version 2.04) and now have an interim 2.05 version.
I took some time reading the source code for your BIOS and would like to offer some constructive comments.
- Generally speaking, a simple ROM BIOS doesn't need to vector API calls. The theory behind this is that a BIOS's role is to provide something for the MPU to run following reset. Unless you can think of a scenario where you would need to wedge into a BIOS API function, I suggest you eliminate the indirect jumps, which will conserve RAM, improve execution speed and reduce the risk of a crash due to an inadvertent vector overwrite.
That said, I would retain the indirect jump vectors for the interrupt handlers, as they will be handy for wedging in extra ISR code for test purposes. I did this in my POC BIOS so I could develop the low-level, interrupt-driven SCSI driver routines without having to burn a new ROM each time I changed/fixed something.
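A minimal sketch of such a wedge follows. It assumes the BIOS dispatches IRQs through a RAM vector with JMP (irqvec). The names irqvec, oldirq, wedge and myisr are placeholders for illustration, not symbols taken from either BIOS:

```asm
wedge    php                   ;save current IRQ mask state
         sei                   ;no IRQs while the vector is in flux
         lda irqvec            ;copy the BIOS's handler address...
         sta oldirq            ;so the wedge can chain to it
         lda irqvec+1
         sta oldirq+1
         lda #<myisr           ;point the vector...
         sta irqvec            ;at the test handler
         lda #>myisr
         sta irqvec+1
         plp                   ;restore IRQ mask state
         rts
;
myisr                          ;test ISR code under development goes here
         jmp (oldirq)          ;then continue with the ROM's handler
```

Because the wedge chains to the saved address, the ROM's own ISR still runs and still ends with the usual register restore and RTI.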
- In examining the UART receiver interrupt handler (ISR), I see some things that could be refined to improve serial I/O (SIO) performance. Currently the receiver ISR is as follows:
Code:
UART_RCV  LDY ICNT            ;Get input buffer count (3)
          BMI BUFFUL          ;Check against limit ($80), branch if full (2/3)
          LDA UART_RECEIVE    ;Else, get character from 2691 (4)
;
          LDY ITAIL           ;Get the tail pointer to buffer (3)
          STA IBUF,Y          ;Store into buffer (5)
          INC ITAIL           ;Increment tail pointer (5)
          RMB7 ITAIL          ;Strip off bit 7, 128 bytes only (5)
          INC ICNT            ;increment character count (5)
Consider the following:
- Currently, you are testing for receiver circular queue (RQ, symbol IBUF) space before getting a datum from the UART (note: what you are receiving are datums, not characters). The problem with doing so is that if a receiver IRQ is not serviced with alacrity, an overrun error is likely to occur during sustained, high-speed transfer. This is an important consideration with the 2691, which only has a 4-deep RHR. Better to prevent an overrun than to try to detect it after your data stream has been corrupted.
Also consider that if your receiver is operating at 115.2 Kbps it will interrupt 11,520 times per second during sustained reception. This is because you are not clearing the RHR on each IRQ, only reading one datum from it. The result is that as soon as your ISR terminates and the MPU returns to the foreground it will be interrupted again if the RHR is not empty. That adversely affects foreground performance, because the MPU is executing those pre- and post-amble ISR instructions 11,520 times per second.
A better procedure would be to read the RHR, verify that RQ space is available and if so, queue the datum. If the RQ is full, discard the datum, since there is no place to put it. For more efficient code, the receive part of your ISR should loop back and check to see if the RHR still has datums. If not, the receive part of the ISR is done. Otherwise, repeat the RQ test and datum store code. I will illustrate the technique later on.
- There is no need to maintain a separate datum count for RQ, as a comparison between IHEAD and ITAIL will tell you what you need to know about the state of the queue. If IHEAD = ITAIL, RQ is empty. If (ITAIL+1) AND $7F = IHEAD, RQ is full, because storing one more datum would make the "put" index catch up with the "get" index and the queue would look empty. Those are the only conditions about which your code needs to be concerned.
- The following code fragment is not as efficient as it could be, and is also non-portable:
Code:
          INC ITAIL           ;Increment tail pointer (5)
          RMB7 ITAIL          ;Strip off bit 7, 128 bytes only (5)
          INC ICNT            ;increment character count (5)
The RMB/SMB instructions are non-portable because they are available only on the 65C02 (and not on all 65C02s). Neither the NMOS device (which shouldn't be used in new designs) nor the 65C816 has them; the latter reassigned the corresponding opcodes to other purposes.
You already have ITAIL loaded into .Y, so why not just copy it to .A, increment .A, AND .A to eliminate bit 7 and store the result?
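A minimal sketch of that register-based update, assuming the datum is already in .A and ITAIL is on zero page (which the existing RMB7 requires):

```asm
         ldy itail             ;RQ "put" index (3)
         sta ibuf,y            ;store datum into buffer (5)
         tya                   ;copy index to .A (2)
         inc A                 ;bump it (2)
         and #%01111111        ;strip bit 7, 128 bytes only (2)
         sta itail             ;store new index (3)
```

The four-instruction update takes 9 clocks against 10 for the INC/RMB7 pair. INC A is not available on NMOS parts, but it assembles for any 65C02 and for the 65C816.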
Here's how I would do the receiver ISR. This code eliminates references to ICNT and empties the RHR before moving on:
Code:
UART_RCV:                      ;read datum from UART & attempt to queue it
;
         lda uart_status       ;RHR status (4)
         lsr A                 ;RHR empty? (2)
         bcc tl0010            ;yes, done with receiver... (2,3,4)
;
;   ---------------------------------------
;   branch not taken: 2 clocks
;   branch taken wo/page crossing: 3 clocks
;   branch taken w/page crossing: 4 clocks
;   ---------------------------------------
;
         ldy uart_receive      ;fetch waiting datum (4)
         lda itail             ;fetch RQ "put" index (3)
         inc A                 ;bump it &... (2)
         and #%01111111        ;wrap it (2)
         cmp ihead             ;RQ "get" index (3)
         beq uart_rcv          ;RQ is full... (2,3,4)
;
;   ---------------------------------------------------------
;   If RQ is full we discard the datum & loop back. Doing so
;   clears the RHR & prevents a receiver overrun. This won't
;   happen if the foreground promptly processes the incoming
;   data stream.
;   ---------------------------------------------------------
;
         ldx itail             ;RQ "put" index (3)
         sty ibuf,x            ;queue newest datum (5)
         txa                   ;copy current RQ "put" index (2)
         inc A                 ;bump it,... (2)
         and #%01111111        ;wrap it &... (2)
         sta itail             ;store it (3)
         bra uart_rcv          ;go back for more (3,4)
;
tl0010   ...end of receiver ISR...
The above code addresses a number of things:
- Each pass through the routine starts by determining if the RHR has at least one datum. If not, the routine terminates. Otherwise, the datum is gotten and an attempt is made to store it in the queue (IBUF). If IBUF is full the datum is discarded.
- No queue content counter is used or needed. Testing of the "get" and "put" indices (IHEAD and ITAIL, respectively) tells us all we need to know about the state of IBUF. A byte of zero page space is recovered for other purposes.
- IBUF management avoids the non-portable RMB instruction and uses register-based instructions that work with any 65C02, as well as the 65C816.
- All waiting datums are gotten from the UART in a single interrupt, avoiding the expense of processing back-to-back IRQs. Running at 115.2 Kbps continuous data flow results in 2880 IRQs per second instead of 11,520. Although the overall code has more instructions, the actual processing time is substantially reduced.
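Should the foreground ever need to know how many datums are waiting, the number can be derived from the two indices on demand rather than maintained in the ISR. A minimal sketch, assuming the 128-byte queue and the index names used above:

```asm
         lda itail             ;RQ "put" index
         sec                   ;prepare for subtraction
         sbc ihead             ;minus RQ "get" index
         and #%01111111        ;wrap: .A = datums waiting (0-127)
```

The result cannot exceed 127 when the queue is managed this way, since one slot is left unused so that a full queue can be told apart from an empty one.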
- A significant limitation of the 2691 is the lack of a transmitter FIFO, which means the looping technique illustrated in the receive ISR can't be used when a transmitter IRQ must be serviced. That is, there is no way to avoid the "IRQ storm" that sustained, high-speed transmission will cause (which is why the 26C92, 28L91 or 28L92 are better choices—they have transmit FIFOs). However, some code changes to the transmit ISR can be made to eliminate the OCNT queue counter and avoid the use of the non-portable RMB instruction. Additionally, management of the transmitter can be implemented with smaller code:
Code:
UART_XMT:                      ;fetch datum from transmit queue & write to UART
;
         lda uart_isr          ;get TxD IRQ status (4)
         lsr A                 ;transmitter interrupting? (2)
         bcc tl0020            ;no (2,3,4)
;
;   -------------------------------------------------
;   As the 2691 lacks a TxD FIFO, there is no need to
;   check the TxRDY bit in the status register. This
;   is because the UART will interrupt as soon as all
;   bits in the THR are shifted onto the wire.
;   -------------------------------------------------
;
         ldx ohead             ;fetch TQ "get" index (3)
;
;   --------------------------------------------
;   TQ is the transmit circular queue, aka OBUF.
;   --------------------------------------------
;
         cpx otail             ;TQ "put" index (3)
         beq tl0010            ;TQ is empty... (2,3,4)
;
         lda obuf,x            ;get datum from TQ &... (4,5)
         sta uart_transmit     ;send it (4)
         txa                   ;copy "get" index (2)
         inc A                 ;bump it,... (2)
         and #%01111111        ;wrap it &... (2)
         sta ohead             ;store it (3)
         bra tl0020            ;done for now (3,4)
;
;   -----------------------------------------------------------
;   If TQ is empty, transmitter IRQs must be disabled so as to
;   avoid deadlock. This is accomplished by disabling the
;   transmitter. A flag is set to let the foreground know the
;   transmitter has been disabled.
;
;   Although the below code disables the transmitter without
;   checking the status of the THR, there is no problem in
;   doing so. The UART will not actually shut down the
;   transmitter until the last bit has been shifted out to
;   the wire.
;   -----------------------------------------------------------
;
tl0010   lda #%00001000        ;disable... (2)
         sta uart_command      ;transmitter (4)
         lda #%10000000        ;tell foreground... (2)
         sta txdflag           ;about it (3)
;
tl0020   ...end of transmitter ISR...
The above code does the following:
- The ISR begins by determining if the transmitter is interrupting. If it is not then it must have been disabled and no further processing is required. As the 2691 lacks a transmit FIFO there is no reason to check the status of THR—the UART will interrupt as soon as it shifts the datum in THR out to the wire.
- The queue management technique illustrated in the receive ISR is used here as well, eliminating the need for OCNT. Its location on zero page can be given to the txdflag transmitter status.
- The code that tested the state of THR has been eliminated, as it is unnecessary. The transmitter can be safely disabled while it is still shifting bits out to the wire. The actual shutdown will occur as soon as the final bit has been sent. txdflag is set to %10000000 so the transmit foreground function will know that it will have to restart the transmitter after queuing a datum. The actual flag value can be anything.
- The changes to the receiver ISR (above) require corresponding changes to the receive foreground to accommodate the revised queue management method. Also, the current CHRIN/CHRIN_NW code is needlessly redundant. There is no reason to put a spin loop into the BIOS just to wait for some data to arrive, as the higher level functions that are waiting for data can spin if needed. Also, why consume resources by requiring that one routine JSR to another? Furthermore, the logic involving the use of carry to indicate if there is anything in RQ is contrary to the nearly universal programming practice that uses carry to indicate "true" (carry cleared) or "false" (carry set) status.
The following code replaces the CHRIN/CHRIN_NW functions with a simplified routine that includes the customary use of carry as a status indicator:
Code:
CHRIN:                         ;fetch datum from receive queue w/immediate return
;
;   Calling syntax: jsr chrin
;                   bcs no_datum
;
;   Exit registers: .A: datum or entry value
;                   .X: entry value
;                   .Y: entry value
;                   SR: NV-BDIZC
;                       ||||||||
;                       |||||||+---> 0: datum returned
;                       |||||||      1: queue empty
;                       +++++++----> undefined
;
         phx                   ;preserve &... (3)
         phy                   ;preserve (3)
         ldx ihead             ;fetch RQ "get" index (3)
         cpx itail             ;RQ "put" index (3)
         beq tl0010            ;RQ is empty... (2,3,4)
;
;   -----------------------------------------------------------
;   The ihead -> itail comparison automatically conditions .C.
;   -----------------------------------------------------------
;
         ldy ibuf,x            ;get datum from RQ (4)
         txa                   ;copy "get" index (2)
         inc A                 ;bump it,... (2)
         and #%01111111        ;wrap it &... (2)
         sta ihead             ;store it (3)
         tya                   ;copy datum (2)
         clc                   ;datum gotten (2)
;
tl0010   ply                   ;restore &... (4)
         plx                   ;restore (4)
         rts                   ;return to caller (6)
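A higher level function that must have a datum before it can continue might wrap CHRIN along these lines. This is only a sketch: getdatum and gd0010 are made-up labels, and WAI assumes a WDC part with some periodic IRQ source running, such as a jiffy timer:

```asm
getdatum jsr chrin             ;try to fetch a datum
         bcc gd0010            ;carry clear: datum is in .A
;
         wai                   ;carry set: RQ empty, sleep until next IRQ
         bra getdatum          ;then try again
;
gd0010   rts                   ;return with datum in .A
```

The waiting policy then lives in the caller, and the BIOS routine itself always returns immediately.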
- The changes to the transmitter ISR (above) require corresponding changes to the transmit foreground to accommodate the revised queue and UART management methods. The following code addresses all that, plus offers both blocking and non-blocking modes:
Code:
CHROUT:                        ;write datum to transmit queue
;
;   Calling syntax: clc                   ;block on full queue
;                   ...or...
;                   sec                   ;return on full queue
;                   jsr chrout
;                   bcs queue_full
;
;   Exit registers: .A: entry value
;                   .X: entry value
;                   .Y: entry value
;                   SR: NV-BDIZC
;                       ||||||||
;                       |||||||+---> 0: datum accepted
;                       |||||||      1: queue full
;                       +++++++----> undefined
;
         phy                   ;preserve,... (3)
         phx                   ;preserve &... (3)
         pha                   ;preserve (3)
         tax                   ;keep datum handy (2)
         lda #0                ;pick up... (2)
         ror A                 ;carry &... (2)
         tay                   ;save as a flag (2)
;
tl0010   lda otail             ;fetch TQ "put" index (3)
         inc A                 ;bump it &... (2)
         and #%01111111        ;wrap it (2)
         cmp ohead             ;TQ "get" index (3)
         beq tl0040            ;TQ is full (2,3,4)
;
         txa                   ;recover datum (2)
         ldx otail             ;TQ "put" index (3)
         sta obuf,x            ;queue datum for transmission (4)
         txa                   ;copy current TQ "put" index (2)
         inc A                 ;bump it,... (2)
         and #%01111111        ;wrap it &... (2)
         sta otail             ;store it (3)
         clc                   ;datum queued (2)
         lda #%10000000        ;is the transmitter... (2)
         trb txdflag           ;enabled? (5)
         beq tl0030            ;yes, we're done (2,3,4)
;
tl0020   lda #%00000100        ;wake up... (2)
         sta uart_command      ;transmitter (4)
;
tl0030   pla                   ;restore,... (4)
         plx                   ;restore &... (4)
         ply                   ;restore (4)
         rts                   ;done (6)
;
;
;   handle full queue...
;
tl0040   sec                   ;datum not queued (2)
         tya                   ;blocking? (2)
         bmi tl0020            ;no, exit (2,3,4)
;
         wai                   ;yes, wait for next IRQ &... (3+)
         bra tl0010            ;try again (3,4)
Salient points of this function are:
- In a uni-tasking environment, blocking on any I/O is perfectly acceptable—the MPU really has nothing else it can do until the I/O completes. Should you eventually build an operating system kernel that can support pre-emption you can switch to non-blocking SIO. In the above code, blocking is performed by WAIting for a hardware interrupt, be it a UART IRQ, a jiffy IRQ, etc. If blocking has been disabled the routine will give the transmitter a kick before exiting to make sure it's running.
- It's not advisable to put spin loops into BIOS APIs that involve I/O. If the condition on which the MPU is spinning never materializes your system will go into deadlock, forcing a restart. For example, most Linux kernel I/O APIs do not block unless requested. Instead they return an error code to the calling function. It is up to the higher level function that called the API to spin/sleep/retry, etc.
- The TRB instruction does double duty by first determining if txdflag is set, thereby conditioning .Z. After that, TRB clears the flag. With this arrangement, fewer instructions are required to manage the transmitter and execution speed is improved. Also, this basic technique is easily adapted for use with DUARTs and QUARTs.
- Carry is cleared if the write operation is successful, which it will always be in blocking mode. You should follow that logic everywhere in your BIOS: carry clear = TRUE and carry set = FALSE. In this context, TRUE can also be OKAY and FALSE can be ERROR.
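The points above can be sketched from the caller's side as follows. The routine uses non-blocking mode and gives up after a bounded number of attempts instead of spinning forever. The labels putdatum, pd0010 and pd0020 and the retry limit are illustrative assumptions, as is the use of .X as a scratch counter:

```asm
retries  =  50                 ;hypothetical retry limit
;
putdatum ldx #retries          ;.A = datum on entry, .X is clobbered
pd0010   sec                   ;request non-blocking mode
         jsr chrout            ;attempt to queue datum
         bcc pd0020            ;carry clear: datum accepted
;
         wai                   ;queue full: wait for an IRQ
         dex                   ;out of retries?
         bne pd0010            ;no, try again
;
         sec                   ;yes, report failure to caller
pd0020   rts
```

This relies on CHROUT returning .A and .X unchanged, as its header states, so the datum and the retry counter survive each call. The caller of putdatum tests carry in the same way: clear means OKAY, set means ERROR.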
_________________ x86? We ain't got no x86. We don't NEED no stinking x86!
Last edited by BigDumbDinosaur on Sun Dec 01, 2019 6:51 am, edited 2 times in total.