
All times are UTC




 Post subject: C02 Monitor/BIOS 2.02
PostPosted: Thu Feb 14, 2019 4:45 am 
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
Back at the end of 2017 I completed a new SBC which uses the Philips/NXP SCC2691 UART as a console. My initial BIOS and Monitor have worked well and 3 boards have been flawless for more than a full year of constant running. At the end of last year, I modified EhBasic to work on the Pocket SBC as well and integrated it into the Monitor.

I've since made some updates to the original 2.00 version of the BIOS:
- Cleaned up code and saved some space by re-arranging it a bit
- Added a benchmark timer with 10ms resolution and a maximum count of 65535.99 seconds
- A longer and more detailed BIOS boot message at a fixed location
- Removed a line of code in the panic routine that toggles the X1/X16 test of the UART

Changes to the Monitor have also been made:
- A routine that allows byte level patching to the EEPROM
- A Start and Stop of the new benchmark counter and a routine that shows the elapsed time
- A Xmodem-CRC save utility to complement the Xmodem-CRC load utility
- Fixed a problem in the S-Record processing in the Xmodem-CRC load utility
- A command to launch EhBasic
- A stub and message for the upcoming inline Assembler
- Checks for RAM/EEPROM on byte level editing/patching
- New JMP table entries to support Xmodem-CRC Load/Save, Benchmark timing and Uptime display

There is also a new Flash utility that will update the Monitor code in situ. The BIOS must be functioning properly, as the flash utility relies on it. Note that if you flash bad Monitor code, the system will likely crash after displaying the BIOS message.

The new version 2.02 has been working cleanly in a few SBCs for a while now, hence putting it out here. I also reformatted all of the source files, replacing all tab characters with spaces so the source and listing files look much prettier now.

I'll be adding code to the CMOS version of EhBasic to take advantage of the Xmodem-CRC Load/Save features in the Monitor soon and will post an updated version here when completed. Hopefully some folks will find the Monitor and BIOS code useful. The Monitor currently has 35 functions (one of which only shows a message for the not-yet-implemented Assembler). It still fits within 5KB of space, while the BIOS and I/O are less than 1KB. In the allocated 8KB at the top of memory, that leaves a bit over 2KB of free ROM space.

Attachment:
C02Monitor202.zip [159.43 KiB]

_________________
Regards, KM
https://github.com/floobydust


PostPosted: Sun Feb 17, 2019 5:52 pm
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
Nice work! I like the way you tidied up your white space.

How receptive are you to suggestions on how to make your binaries even smaller, but with identical functionality? I already have one queued up that could save you 16 bytes in C02BIOS2b, but I hesitate to share because I haven't tested it yet, and because it takes a rather obtuse track, reminiscent of something Woz might have whipped up 42 years ago. Say the word, and I'll share it.

I'm pretty sure I could save you at least a few dozen more bytes in C02Monitor2b, again without altering any functionality, but I have a lot on my plate right now, so I'll hold off from adding it to my long to-do list until I get a green light from you.

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


PostPosted: Mon Feb 18, 2019 4:36 am
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
Mike,

First, thanks for the kudos... I also switched editors... from Ultraedit to SlickEdit, the latter being done by one of the former IBM researchers from the old days... I used his editors way back when and really prefer it over UE. Cleaning up the white space did take some time but overall worth it for readability on the source and listing files.

Making the code even smaller... well, I already used one of your routines in the Monitor code, so please do share... I should be able to test it fairly quickly... and I'm also interested in how you can save 16 bytes in the BIOS code.

Now that 2.02 is completed and working as required, I'll be taking another look through the Monitor and hope to be able to save some additional space as well. One thought is to combine the edit memory and edit EEPROM commands. I'm thinking I should be able to save a few bytes there at least.

And many thanks!

_________________
Regards, KM
https://github.com/floobydust


PostPosted: Mon Feb 18, 2019 7:52 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
floobydust wrote:
Back at the end of 2017 I completed a new SBC which uses the Philips/NXP SCC2691 UART as a console... I've since made some updates to the original 2.00 version of the BIOS:
- Cleaned up code and saved some space by re-arranging it a bit

Code:
;**************************************************************************************************
; Character In and Out routines for Console I/O buffer
;**************************************************************************************************
;Character Input routines
;CHRIN_NW uses CHRIN, returns if a character is not available from the buffer with carry flag clear
; else returns with character in A reg and carry flag set. CHRIN waits for a character to be in the
; buffer, then returns with carry flag set. Receive is IRQ driven/buffered with a size of 128 bytes
;
CHRIN_NW       CLC                      ;Clear Carry flag for no character
               LDA      ICNT            ;Get character count
               BNE      GET_CH          ;Branch if buffer is not empty
               RTS                      ;and return to caller
;
CHRIN          LDA      ICNT            ;Get character count
               BEQ      CHRIN           ;If zero (no character, loop back)
;
GET_CH         PHY                      ;Save Y reg
               LDY      IHEAD           ;Get the buffer head pointer
               LDA      IBUF,Y          ;Get the character from the buffer
               INC      IHEAD           ;Increment head pointer
               RMB7     IHEAD           ;Strip off bit 7, 128 bytes only
               DEC      ICNT            ;Decrement the buffer count
;
               PLY                      ;Restore Y Reg
               SEC                      ;Set Carry flag for character available
               RTS                      ;Return to caller with character in A reg

Why maintain a separate datum count in ICNT? The receiver circular queue's (RxQ) pointers (IHEAD and ITAIL) tell you if at least one datum is present (RxQ is what you are referring to as a "buffer"—it is not a buffer). If IHEAD == ITAIL there is no data in RxQ. Conversely, if IHEAD == ITAIL+1 the queue is full. The problem with using ICNT as a qualifier is that it can inadvertently get out of sync with actual queue conditions. It is IHEAD and ITAIL that determine where gets and puts on the queue occur, not a counter.

As a matter of style, routines such as CHRIN are usually written so carry is cleared if a datum is available or set if no datum is available. This style has an indirect benefit of reducing the number of instructions needed to quickly establish the "no data" condition. For example:

Code:
         phy                   ;preserve
         ldy IHEAD             ;RxQ "get" index
         cpy ITAIL             ;RxQ "put" index
         beq rxqempty          ;RxQ is empty (sets carry)
;
         lda IBUF,y            ;get oldest datum from queue
;
;   ————————————————————————————————————————————————————————————————
;   The next two instructions create a potential race condition with
;   the receiver ISR, as the change to IHEAD is not atomic.
;   ————————————————————————————————————————————————————————————————
;
         inc IHEAD             ;bump "get" index &...
         rmb7 IHEAD            ;truncate RxQ to 128 bytes
         clc                   ;datum gotten
;
rxqempty ply                   ;restore &...
         rts                   ;return to caller

In the above, a very quick comparison is all that is needed to determine if the queue has anything. No initialization of carry is required, as the comparison automatically sets carry if RxQ is empty.

The race condition with IHEAD can be averted with a slight code change:

Code:
         phy                   ;preserve
         ldy IHEAD             ;RxQ "get" index
         cpy ITAIL             ;RxQ "put" index
         beq rxqempty          ;RxQ is empty (sets carry)
;
         phx                   ;preserve
         ldx IBUF,y            ;get oldest datum from queue
         tya                   ;current RxQ "get" index
         inc A                 ;bump it &...
         and #%01111111        ;wrap it to 128 byte boundary
         sta IHEAD             ;atomically change RxQ "get"
         txa                   ;hand off datum
         plx                   ;restore
         clc                   ;datum gotten
;
rxqempty ply                   ;restore &...
         rts                   ;return to caller

As for the receiver ISR:

Code:
UART_RCV       LDY      ICNT            ;Get buffer counter
               BMI      BUFFUL          ;Check against limit, branch if full
               LDA      UART_RECEIVE    ;Else, get character from 2691
;
               LDY      ITAIL           ;Get the tail pointer to buffer
               STA      IBUF,Y          ;Store into buffer
               INC      ITAIL           ;Increment tail pointer
               RMB7     ITAIL           ;Strip off bit 7, 128 bytes only
               INC      ICNT            ;increment character count

I'd write that as:

Code:
UART_RCV lda UART_STATUS       ;status register
         lsr A                 ;shift RxRDY into carry
         bcc done              ;RHR empty, done processing receiver
;
         ldx UART_RECEIVE      ;fetch datum from RHR (see below text)
         lda ITAIL             ;RxQ "put" index
         tay                   ;save a copy         
         inc A                 ;bump "put" & ...
         and #%01111111        ;truncate to 128 byte boundary
         cmp IHEAD             ;RxQ "get" index
         beq UART_RCV          ;RxQ is full, discard datum
;
         sta ITAIL             ;save new "put" index
         txa                   ;hand off datum &...
         sta IBUF,y            ;put in RxQ
         bra UART_RCV          ;loop back for next datum
;
done  ...program continues...

The above code does away with the counter and also takes advantage of the 2691's receiver FIFO. Potentially up to four incoming datums can be processed in a single interrupt, which obviously reduces the MPU's workload. I use a somewhat ugly hack to determine if the receiver has any data—it will only work with the single channel NXP UARTs.

Note how the receiver is checked and read before any queue management is performed. You should always read from the UART when it interrupts and RxRDY in the status register indicates data is waiting, even though RxQ may be full. If you don't do this and automatic hardware handshaking is not functioning it is likely an overrun error will occur. If RxQ is full your only recourse is to discard the received datum.
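
For readers who want to experiment with the counter-free queue logic away from the hardware, here is a minimal Python sketch of it. It is only a model under stated assumptions, not the BIOS code: the class name `RxQueue`, the `dropped` statistic, the `service_interrupt` helper and the Python list standing in for the UART FIFO are all hypothetical. It shows the three behaviours described above: empty when the indices match, full when the bumped and masked "put" index would meet the "get" index, and draining every waiting datum in one interrupt while discarding what does not fit.

```python
QSIZE = 128                  # queue size in bytes, as in the thread
MASK = QSIZE - 1             # 0x7F, same effect as AND #%01111111


class RxQueue:
    """Counter-free circular queue: empty when head == tail."""

    def __init__(self):
        self.buf = [0] * QSIZE
        self.head = 0        # "get" index (IHEAD)
        self.tail = 0        # "put" index (ITAIL)
        self.dropped = 0     # hypothetical statistic, not in the posted code

    def isr_put(self, datum):
        """Queue one datum that has already been read from the UART."""
        new_tail = (self.tail + 1) & MASK
        if new_tail == self.head:        # queue full: discard the datum
            self.dropped += 1
            return False
        self.buf[self.tail] = datum
        self.tail = new_tail
        return True

    def get(self):
        """Return (carry, datum); carry False means a datum was fetched."""
        if self.head == self.tail:
            return True, None            # carry set: queue is empty
        datum = self.buf[self.head]
        self.head = (self.head + 1) & MASK   # one store of the new index
        return False, datum


def service_interrupt(queue, fifo):
    """Drain every datum waiting in the modeled UART FIFO in one interrupt."""
    while fifo:                          # stands in for testing RxRDY
        queue.isr_put(fifo.pop(0))
```

One consequence the model makes visible is that a 128-byte queue managed this way holds at most 127 datums, because one slot is given up to tell "full" apart from "empty".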

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Fri Apr 08, 2022 12:17 pm, edited 2 times in total.

PostPosted: Tue Feb 19, 2019 1:01 am
Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1948
Location: Sacramento, CA, USA
BigDumbDinosaur wrote:
... Conversely, if IHEAD == ITAIL+1 the queue is full ...

I believe that your statement covers 99.21875% of the full-queue cases, but that pesky 0.78125% (when IHEAD == ITAIL-127) can be a real show stopper, even if it's only once in a blue moon.
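
The percentages can be checked by brute force. The following Python sketch (an illustration only, assuming a 128-byte queue whose indices are wrapped by masking off bit 7) enumerates every index pair that the masked test treats as "full" and counts how many of them satisfy the literal, unmasked comparison.

```python
QSIZE = 128
MASK = QSIZE - 1

# Every (head, tail) pair for which the masked test reports "full".
full_states = [(h, t) for h in range(QSIZE) for t in range(QSIZE)
               if ((t + 1) & MASK) == h]

# Pairs where the literal comparison IHEAD == ITAIL+1 also holds.
literal = [(h, t) for (h, t) in full_states if h == t + 1]

# Pairs that need the wrap, i.e. IHEAD == ITAIL-127.
wrapped = [(h, t) for (h, t) in full_states if h == t - 127]

print(len(full_states), len(literal), len(wrapped))
print(100 * len(literal) / len(full_states))
```

There are 128 full states; 127 of them match the literal comparison and exactly one (head 0, tail 127) needs the wrap, which is where 99.21875% and 0.78125% come from.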

Anyway, here's my untested mod for the BIOS:
Code:
;
              LDX      #251            ; -5 (zp,x wraparound saves us a cpx#)
SMHD          LDA      SECS+5,X        ; Increment SECS counter then propagate
              INC      SECS+5,X        ;   it through to the MINS, HOURS and
              CMP      SMHDLIM-251,X   ;   DAYS, as needed
              BCC      REGEXT0         ; Exit as early as possible
              STZ      SECS+5,X
              INX
              BNE      SMHD
              BRA      REGEXT0         ; Worst case exit (all 5 bytes zeroed),
;                                          happens once every 179.4 years
;                                        Table of max values for SECS, MINS,
SMHDLIM       .DB     59,59,23,255,255 ; HOURS, DAYSL and DAYSH, respectively
;
should be a drop-in 22-byte replacement for your 38 bytes of:
Code:
;
              INC      SECS            ;Increment Seconds
              LDA      SECS            ;Load it to A reg
              CMP      #60             ;Check for 60 Seconds
              BCC      REGEXT0         ;If not, exit
              STZ      SECS            ;Else, reset Seconds, inc Minutes
;
              INC      MINS            ;Increment Minutes
              LDA      MINS            ;Load it to A reg
              CMP      #60             ;Check for 60 minutes
              BCC      REGEXT0         ;If not, exit
              STZ      MINS            ;Else, reset Minutes, inc Hours
;
              INC      HOURS           ;Increment Hours
              LDA      HOURS           ;Load it to A reg
              CMP      #24             ;Check for 24 Hours
              BCC      REGEXT0         ;If not, exit
              STZ      HOURS           ;Else, reset Hours, inc Days
;
              INC      DAYSL           ;Increment low-order Days
              BNE      REGEXT0         ;If not zero, exit
              INC      DAYSH           ;Else increment high-order Days
              BRA      REGEXT0         ;Then exit IRQ handler
;
... assuming that I didn't make any terrible mistakes from trying to be more clever than I actually am.
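
Since the routine is posted as untested, one cheap sanity check is to model both versions and compare them tick by tick. The Python sketch below is such a model, not a cycle-accurate emulation: the function names are hypothetical, the time fields are held in a five-element list in the order SECS, MINS, HOURS, DAYSL, DAYSH, and zero-page wraparound is imitated with `& 0xFF`.

```python
LIMITS = [59, 59, 23, 255, 255]   # SMHDLIM: max SECS, MINS, HOURS, DAYSL, DAYSH


def tick_table(t):
    """Table-driven update; t is [secs, mins, hours, daysl, daysh]."""
    x = 251                               # LDX #251, i.e. -5 as a byte
    while True:
        i = (5 + x) & 0xFF                # SECS+5,X with zero-page wraparound
        old = t[i]                        # LDA happens before the INC
        t[i] = (t[i] + 1) & 0xFF          # INC
        if old < LIMITS[x - 251]:         # CMP / BCC: no rollover, exit early
            return
        t[i] = 0                          # STZ
        x = (x + 1) & 0xFF                # INX
        if x == 0:                        # BNE falls through after 5 bytes
            return


def tick_unrolled(t):
    """Model of the original straight-line sequence."""
    for i, modulus in ((0, 60), (1, 60), (2, 24)):
        t[i] = (t[i] + 1) & 0xFF          # INC, then LDA / CMP #modulus
        if t[i] < modulus:
            return
        t[i] = 0                          # STZ
    t[3] = (t[3] + 1) & 0xFF              # INC DAYSL
    if t[3] != 0:
        return
    t[4] = (t[4] + 1) & 0xFF              # INC DAYSH
```

Running both from the same starting values and comparing after every call is enough to show that the two models agree for valid field values, including the full rollover from 59, 59, 23, 255, 255 back to all zeros.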

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


PostPosted: Tue Feb 19, 2019 5:19 am
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
Hi BDD,

First, thanks for the amount of time you spent going over my BIOS, it's clear you went through it in great detail... much appreciated of course and I did use some of your recommended changes some time ago (and you're listed in the comments for providing them). I believe over time I might have come to the same realizations and changes, but you provided some nice changes which provided value and I just added them in, so thanks for that.

So, in defense of my seemingly flawed or inefficient coding... much of what I've done is based on some other requirements, so there's a reason behind much of them. In no particular order:

Using a count for the buffer, or queue. This basic circular buffer goes back to the 80's... it's never failed me yet and was patterned after the routines published by Lance Leventhal in the 6502 assembly language subroutines handbook. As the count (ICNT and OCNT) variables are incremented and decremented by the ISR and the CHRIN/CHROUT routines, they don't get out of sync with each other as the 6502 (65C02) only does single-thread execution. If it did fail, the Xmodem download and upload routines would periodically fail, which they don't. This was also recently covered in a thread here as well:

viewtopic.php?f=2&t=5427#p65539

Granted, my approach requires a couple extra page zero locations, but the rest of the code is fairly short and also has the exact same cycle count every time it executes.

The difference in carry set or clear for having a received byte/character. I looked at changing this over a year ago based on another thread and your replies. However, as I was also looking to port EhBasic over to the C02 Pocket, its input routine also required the inverse of the carry flag as I had implemented it. As EhBasic frequently tests the input routine (get character non-waiting) for Ctrl-C, the routine I have is both shorter in size (6 bytes vs 9 bytes) and shorter in execution (13 clock cycles vs 22 clock cycles). The existing code does provide for a faster running EhBasic as less time is spent in the BIOS routine.

Using the Receive FIFO on the 2691... I looked at this during the initial code development. As the 2691 is used for a console I/O, there are monitor commands which are a single character, like Ctrl-Q, Ctrl-R, Ctrl-T, Ctrl-Z, register commands, etc. I also echo some of the commands back to the console, so I purposely setup the UART to generate an interrupt when a character is received vs waiting for the FIFO to appear full. The current BIOS routines are fairly efficient at servicing an interrupt for Receive or Transmit and don't bog anything down.

Using the default baud rate of 38.4K, a single character time is around 260 microseconds. During this time, the CPU has about 1560 clock cycles of execution (at 6MHz). The current ISR requires about 106 clock cycles to receive a character and place it into the queue. Using the Xmodem-CRC download function, the CPU can receive a 128-byte block, calculate the 16-bit CRC, decode Motorola S-Records (S19), perform the checksum and using an offset, place into memory and never miss a block receive or encounter an error.
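
The timing budget quoted above can be reproduced with a few lines of arithmetic. This Python sketch assumes 8N1 framing, i.e. 10 bits on the wire per character; the framing is not stated in the post, so treat it as an assumption. The other figures come straight from the paragraph.

```python
BAUD = 38_400            # bits per second, from the post
BITS_PER_CHAR = 10       # assumption: 8N1 framing (start + 8 data + stop)
CPU_HZ = 6_000_000       # 6 MHz clock, from the post
ISR_CYCLES = 106         # receive ISR cost quoted in the post

char_time_us = BITS_PER_CHAR / BAUD * 1_000_000
cycles_per_char = CPU_HZ * BITS_PER_CHAR / BAUD
isr_load = ISR_CYCLES / cycles_per_char

print(f"{char_time_us:.1f} us per character")
print(f"{cycles_per_char:.1f} cycles per character")
print(f"{isr_load:.1%} of the CPU at full line rate")
```

Under those assumptions a character takes about 260.4 microseconds, or 1562.5 clock cycles, and the receive ISR uses roughly 6.8% of the CPU even when characters arrive back to back.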

While I agree there are multiple ways to skin the cat on this, your routines are very well thought out and no doubt should work cleanly. However, my current code is working without incident and is also short and fast. I would agree with you that the 2692 and later UARTs are better devices compared to the 2691, but I think I've gotten the UART to work exceptionally well in a fairly small BIOS (caveats and all) as a console, upload/download and leveraged most of its capabilities for its intended use.

As this is still a work in progress (as are most hobbies) I'll continue to look at your examples and posts as additional enhancements for the simple code I'm developing.

As a side note, the FTDI UART interface has pretty large buffers and is also configured for RTS/CTS handshake. This is also working based on some tests that I ran on the SBC while running a macro of commands. Note that I will further examine your code and see if I can implement it into the existing BIOS. I currently have over 1KB of available space that I'll be using for additional HW devices.

I do spend time looking to optimize the code for the next version. I will go back to examine your code in more detail and might implement some of your ideas in the next release. And again, thanks for the time you put into examining the code, coming up with some alternate code and explaining the differences in the routines themselves. As with most (if not all) hobbies, it's always a work in progress, getting the best out of a small board setup.

_________________
Regards, KM
https://github.com/floobydust


PostPosted: Tue Feb 19, 2019 7:28 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
barrym95838 wrote:
BigDumbDinosaur wrote:
... Conversely, if IHEAD == ITAIL+1 the queue is full ...

I believe that your statement covers 99.21875% of the full-queue cases, but that pesky 0.78125% (when IHEAD == ITAIL-127) can be a real show stopper, even if it's only once in a blue moon.

I assume you are being facetious.

However, in case my assumption is wrong, ITAIL-127 is not a tested condition. There are only two conditions of concern: IHEAD == ITAIL (queue empty) and IHEAD == ITAIL+1 (queue full). The latter case is unsigned integer arithmetic in which carry is ignored. Furthermore, in my example code, ITAIL+1 is ANDed to clear bit 7, which is also done to IHEAD when it has been incremented in order to keep the circular queue's size at 128 bytes. It's a technique that has been proven in the BIOS of my POC V1.1 unit, which uses a 128 byte queue size.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Tue Feb 19, 2019 2:13 pm
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
barrym95838 wrote:
Anyway, here's my untested mod for the BIOS [timekeeping]...

Okay, this looks quite interesting. I should have some free time later today to put this into the BIOS and give it a test. I'll report back with results once I have some test time completed. Thanks again for sharing this... :D

_________________
Regards, KM
https://github.com/floobydust


PostPosted: Tue Feb 19, 2019 10:38 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
floobydust wrote:
barrym95838 wrote:
Anyway, here's my untested mod for the BIOS [timekeeping]...

Okay, this looks quite interesting. I should have some free time later today to put this into the BIOS and give it a test. I'll report back with results once I have some test time completed. Thanks again for sharing this... :D

Another, potentially more efficient, way to keep time is to use an ever-increasing binary count of the number of seconds and fractions of a second from "some time long ago" (epoch). The archetype would be UNIX time, which is maintained as a 32 bit signed integer, type time_t. In current versions of UNIX and Linux time_t is a 64 bit signed integer, a change used to circumvent the "year 2038" problem associated with the 32-bit time_t field. The epoch in both cases is Thursday January 1 00:00:00.0 GMT 1970. Any date prior to the epoch is represented by a negative integer.

The UNIX/Linux kernel only increments the time fields, of which there are at least two: the time of day, which is traditionally set to the time_t equivalent of UTC+0 when the system comes up; and system uptime, which is initialized to zero at boot time. Both fields are incremented once per second (on systems using the MPU's HPET, there may be another field that records fractions of a second).

An important concept with this method of timekeeping is the kernel does not perform conversions between broken-down time (BDT, which is human-readable) and the internal formats, which are binary integers. Format conversion is handled by user-space functions, e.g., mktime(), that are run whenever a conversion is needed. Functions such as mktime() refer to an environment variable, TZ, which represents the local time zone ("local" relative to the logged-in user). Hence the UTC+0 internal format is compensated for the local time zone during the conversion. Additional compensation for daylight saving time (DST) is also available by reference to another time zone environment variable that describes the difference between DST and standard time. This is done because there are some locales where the difference between standard time and DST isn't an even hour.

This intentional divvying of timekeeping responsibilities between kernel and user space means the kernel interrupt code used to maintain the internal time fields can be very succinct. For example, on an x86-64 machine running the 64 bit Linux kernel, incrementing the time-of-day field can be carried out via a single instruction, since the MPU can handle a 64 bit data type in one register. Even on a 65C02 machine, a 64-bit time_t field can be managed with reasonably succinct code:

Code:
;increment 64-bit time_t with a 65C02...
;
         dec jiffyct           ;current jiffy IRQ count
         bne l0000020          ;not time to update
;
         lda #hz               ;jiffy IRQ frequency (e.g., 100 Hz)
         sta jiffyct           ;reset
         ldx #0                ;time field index
         ldy #s_time_t         ;time field size (8 bytes)
;
l0000010 inc tod,x             ;bump time-of-day
         bne l0000020          ;done w/time-of-day
;
         inx
         dey
         bne l0000010          ;next
;
l0000020 ...program continues...

The above would be executed in the interrupt service routine (ISR) once per jiffy IRQ. The 64-bit tod field, as well as the jiffyct jiffy IRQ counter, would be maintained on page zero for best performance. Centiseconds may be derived from hz - jiffyct, where hz is the jiffy IRQ frequency.
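
For anyone who wants to check the carry logic off-target, here is a rough Python model of the ISR fragment above. It is only a sketch: HZ, S_TIME_T, the state dictionary and the function names are made up for the illustration, and the byte list stands in for a little-endian tod field.

```python
HZ = 100          # assumed jiffy IRQ frequency, as in the example above
S_TIME_T = 8      # time field size in bytes


def make_state():
    """Hypothetical stand-in for the page zero variables."""
    return {"jiffyct": HZ, "tod": [0] * S_TIME_T}   # tod[0] is the LSB


def jiffy_irq(state):
    """Model one pass through the ISR fragment."""
    state["jiffyct"] -= 1                 # dec jiffyct
    if state["jiffyct"] != 0:             # bne: not time to update
        return
    state["jiffyct"] = HZ                 # reset the jiffy counter
    for x in range(S_TIME_T):             # x indexes the field, LSB first
        state["tod"][x] = (state["tod"][x] + 1) & 0xFF   # inc tod,x
        if state["tod"][x] != 0:          # no wrap, so no carry to propagate
            break


def tod_value(state):
    """Assemble the little-endian bytes into one integer."""
    return int.from_bytes(bytes(state["tod"]), "little")


def centiseconds(state):
    """hz - jiffyct, as described above (centiseconds only when HZ is 100)."""
    return HZ - state["jiffyct"]
```

Calling jiffy_irq() HZ times advances tod_value() by one, and the inner loop only walks past the first byte when that byte wraps to zero.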

In POC V1.1, I developed a timekeeping method that is similar to the UNIX archetype, but with some changes. As earlier noted, the archetype's 32-bit time_t value is signed, which complicates conversion to/from BDT. This was a choice forced on Ken Thompson by the PDP-11 environment: the machine's native word is 16 bits, and the 32-bit quantity used to hold the time was signed. Unfortunately, when Linux came about, time_t had to remain signed in order to maintain compatibility with older systems. Even when time_t was expanded to 64 bits it had to remain signed.

Over the years as I've developed large-scale database applications, I've come to the realization that most information encapsulated in databases doesn't use dates earlier than the 20th century (persons' dates-of-birth, for example), which suggested to me that a different epoch could be used along with an unsigned integer. Doing away with a signed integer would simplify the conversion process, as signed arithmetic is not efficiently implemented on the 6502 family. Accordingly, I moved the epoch to Sunday October 1 00:00:00.00 UTC 1752, which corresponds to a time_t value of zero. As all dates are unsigned, a non-zero time_t value represents some point in time after the epoch. This particular epoch was chosen to avoid having to deal with September 1752, a month that was truncated when the British Empire switched from the Julian calendar to the Gregorian one.

As well as the new epoch, I changed time_t to a 48-bit unsigned integer, in which bits 0-39 are the number of seconds that have elapsed since the epoch and bits 40-47 are padding to align the field size to an even number of bytes—word alignment simplifies handling with the 65C816, and can be omitted with the 65C02. The maximum useful date that can be represented by a 40-bit time_t field is Friday December 31 23:59:59.99 UTC 9999, a range of 8247 years. Compare that to the original UNIX 32-bit time_t, which has a useful range of a bit more than 136 years, half of which is prior to the epoch.
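
The range figures above can be sanity-checked with a few lines of Python. This is just a back-of-envelope sketch: the helper names are hypothetical, and it leans on Python's datetime module (proleptic Gregorian calendar) for the day count rather than on anything in the POC firmware.

```python
import datetime

SECONDS_PER_DAY = 86_400
MEAN_YEAR_SECONDS = 365.2425 * SECONDS_PER_DAY   # mean Gregorian year


def range_in_years(bits):
    """Approximate span, in years, of an unsigned seconds counter `bits` wide."""
    return 2 ** bits / MEAN_YEAR_SECONDS


def last_second_of_9999():
    """Seconds from 1752-10-01 00:00:00 to 9999-12-31 23:59:59."""
    epoch = datetime.date(1752, 10, 1)
    last_day = datetime.date(9999, 12, 31)
    days = (last_day - epoch).days
    return (days + 1) * SECONDS_PER_DAY - 1


if __name__ == "__main__":
    print(f"32 bits: about {range_in_years(32):.1f} years")
    print(f"40 bits: about {range_in_years(40):.0f} years")
    print(f"bits needed through 9999: {last_second_of_9999().bit_length()}")
```

The last second of the year 9999 fits comfortably inside 40 bits, but not inside 32.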

Conversion from the seconds count of the time_t format to BDT involves a series of operations on the time_t field that successively extract each BDT field. My algorithm is based upon a Julian date conversion algorithm that I modified and adapted:

Code:
p = S + 2429808
q = 4 * p / 146097
r = p - (146097 * q + 3) / 4
s = 4000 * (r + 1) / 1461001
t = r - 1461 * s / 4 + 31
u = 80 * t / 2447
v = u / 11

Y = 100 * (q - 49) + s + v  →  broken-down year (1752-9999)
M = u + 2 - 12 * v          →  broken-down month (1-12)
D = t - 2447 * u / 80       →  broken-down day of month (1 to 31, depending on month and whether Y is a leap year)

In the above, S is the time_t value reduced to whole days, that is, the seconds count divided by 86,400 with the remainder set aside, so S represents the number of days that have elapsed since the epoch. As I implemented it, the reduction process extracts the hour, minute and second values for the current day by iterative means.

The BDT values are returned as binary integers. For processing convenience, I decided to make all BDT fields 16 bits.
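
A hedged Python sketch of the time_t-to-BDT direction follows. It is not the POC code: it keeps the Julian day step explicit instead of folding it into one constant, EPOCH_JDN (2361239) is my own figure for the Julian day number of October 1, 1752, and the function names are invented for the example. It also uses divmod() where the post describes an iterative reduction.

```python
EPOCH_JDN = 2361239   # assumed Julian day number of 1 October 1752 (Gregorian)


def split_time_t(tm):
    """Split a seconds count into (days, hour, minute, second)."""
    days, rem = divmod(tm, 86_400)
    hour, rem = divmod(rem, 3_600)
    minute, second = divmod(rem, 60)
    return days, hour, minute, second


def days_to_ymd(days):
    """Convert days since the epoch to (year, month, day).

    Same sequence of steps as the p..v formulas above; every intermediate
    value is non-negative, so // behaves like plain integer division.
    """
    p = days + EPOCH_JDN + 68569
    q = 4 * p // 146097
    r = p - (146097 * q + 3) // 4
    s = 4000 * (r + 1) // 1461001
    t = r - 1461 * s // 4 + 31
    u = 80 * t // 2447
    v = u // 11
    year = 100 * (q - 49) + s + v
    month = u + 2 - 12 * v
    day = t - 2447 * u // 80
    return year, month, day
```

A caller would run split_time_t() first and hand the day count to days_to_ymd(), keeping the hour, minute and second values as they are.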

The algorithm I used to convert from BDT to time_t format is also based upon a Julian day algorithm, with adaptations for the date range and epoch:

Code:
m1 = (M - 14) / 12
y1 = Y + 4800
S = 1461 * (y1 + m1) / 4 + 367 * (M - 2 - 12 * m1) / 12 - (3 * ((y1 + m1 + 100) / 100)) / 4 + D - 2393314

The input date is in D (day-of-month, same range as D in the previous algorithm), M (month, 1-12) and Y (year, 1752-9999). S is the resulting count of days since the epoch. S is undefined if nonsensical input values are provided; the function calling the conversion is responsible for trapping garbage input.

The day count is scaled to seconds and the time-of-day added as follows:

Code:
tm = S * 86400 + HOUR * 3600 + MIN * 60 + SEC

where the time-of-day value represented by HOUR, MIN and SEC is 24-hour format (00:00:00-23:59:59), and tm is the final time_t value. Again, nonsensical values will produce undefined results.

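Except for m1, which is -1 for January and February and 0 for every other month, all terms in the above algorithms are positive integers, and arithmetic operations follow the standard rules of algebraic precedence. Due to the range that the time_t date encompasses, arithmetic operations must be 64 bit to avoid overflow during multiplication. Each division discards the fractional part of the quotient. For the positive terms this is the same as applying the floorl() C function, or int(x) in BASIC, but (M - 14) / 12 must truncate toward zero rather than floor for m1 to come out right.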
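
The BDT-to-time_t direction can be sketched in Python along the same lines. Again this is an illustration under assumptions, not the POC routine: EPOCH_JDN (2361239) is my own figure for the Julian day number of October 1, 1752, the function names are hypothetical, and m1 is written as a comparison so that Python's flooring // operator never sees a negative dividend.

```python
EPOCH_JDN = 2361239   # assumed Julian day number of 1 October 1752 (Gregorian)


def ymd_to_days(year, month, day):
    """Convert a Gregorian date to days since the epoch (no input checking)."""
    m1 = -1 if month <= 2 else 0          # (M - 14) / 12 truncated toward zero
    y1 = year + 4800
    jdn = (1461 * (y1 + m1) // 4
           + 367 * (month - 2 - 12 * m1) // 12
           - 3 * ((y1 + m1 + 100) // 100) // 4
           + day - 32075)
    return jdn - EPOCH_JDN


def make_time_t(year, month, day, hour=0, minute=0, second=0):
    """Scale the day count to seconds and add the 24-hour time of day."""
    return (ymd_to_days(year, month, day) * 86_400
            + hour * 3_600 + minute * 60 + second)
```

As in the post, garbage input is the caller's problem; the sketch does no range checking.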

64 bit addition and subtraction on the 65C02 is straightforward. 64 bit multiplication and division is not as trivial to implement but can be accomplished by scaling up existing algorithms. Although the math won't be very speedy, it only has to be carried out when a conversion is needed, not as a matter of routine kernel processing.

Food for your thought.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Mon Feb 10, 2020 5:21 am, edited 1 time in total.

PostPosted: Tue Feb 19, 2019 11:59 pm 
Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
:!: :!: :!:

Really impressive.

Thank you, BDD.


PostPosted: Wed Feb 20, 2019 5:13 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
GaBuZoMeu wrote:
:!: :!: :!:

Really impressive.

Thank you, BDD.

You're welcome!

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Wed Feb 20, 2019 5:55 am 
Joined: Tue Mar 05, 2013 4:31 am
Posts: 1385
BigDumbDinosaur wrote:
Another, potentially more efficient, way to keep time is to use an ever-increasing binary count of the number of seconds and fractions of a second from "some time long ago" (epoch). [...]

Yes, I do like the idea of changing out the current RTC code. As I'm looking to extend the BIOS for additional hardware and minimize the page zero usage, I was thinking about going to an (unsigned) 32-bit seconds counter. This would save a byte in page zero and still yield more than adequate timing from a cold start. It would also have a shorter execution time in the ISR. As I plan to add a Realtime Clock on an expansion board in the near future, this would be something to consider implementing on the next major BIOS release. It also mandates a replacement of the CTRL-T Monitor command to convert the 32-bit time variable to human readable form. I would also plan to initialize the software time from the hardware RTC during a cold start. These are future enhancements, which are still a ways off.

Regarding the code from Mike for the existing RTC, I've already put this into an updated BIOS build and it's currently executing on one of my SBCs... I'll just let it run for a few days here and continue tracking it. As I've got some travel coming up shortly, I'll plan on having one of the SBCs run that code for a couple of weeks. Granted, I can always edit the time variables manually to check for proper increments, but I'll just let it run a couple of days first. (Thanks Mike).

_________________
Regards, KM
https://github.com/floobydust


PostPosted: Wed Feb 20, 2019 6:56 am 
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
I find it easier to update the broken down time struct in the clock interrupt, and then convert to time_t format on request. The conversion only requires a few multiplications, and no division (except divide by 4).


PostPosted: Wed Feb 20, 2019 7:01 am 
Joined: Thu May 28, 2009 9:46 pm
Posts: 8479
Location: Midwestern USA
floobydust wrote:
Yes, I do like the idea of changing out the current RTC code. As I'm looking to extend the BIOS for additional hardware and minimize the page zero usage, I was thinking about going to an (unsigned) 32-bit seconds counter.

A 32-bit unsigned time_t would limit you to a little more than 136 years. You'd have to choose your epoch carefully so you have reasonable range.

Quote:
This would save a byte in page zero...

That's where a 65C816 has a real advantage. The kernel's direct (zero) page can be somewhere else, leaving physical zero page open to other uses.

Quote:
...and still yield more than adequate timing from a cold start.

Actually, you need two counter fields: one that represents the time-of-day and calendar date (as I described earlier; I refer to this field as tm), and another that represents system uptime (referred to as uptime). Both are incremented at one second intervals. The difference is tm is settable, whereas uptime is initialized to zero at boot time and cannot be set.

In my POC V1.1 BIOS, uptime is also used as the reference for time delays. Since uptime increments once per second, the future value of uptime for delay purposes may be computed by adding the number of seconds of delay to the current value of uptime, then looping until uptime becomes equal to the computed future value. When that happens, the delay loop is exited and control is returned to the program that requested the delay. It's not the same as a true sleep() function, as the latter actually finishes by putting the delayed task into catatonia and allowing a different task to run for the duration of the delay.
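
In Python terms the delay scheme looks roughly like the sketch below. The names are hypothetical, and the fake clock only exists so the loop can be exercised without an interrupt source; on the real machine the ISR would be what advances uptime.

```python
def delay(seconds, read_uptime, idle=lambda: None):
    """Busy-wait until uptime equals its current value plus `seconds`.

    read_uptime is any callable returning the current uptime in seconds.
    idle is called on every pass of the wait loop (a no-op by default).
    """
    target = read_uptime() + seconds      # future value of uptime
    while read_uptime() != target:        # loop until the two are equal
        idle()


class FakeUptime:
    """Test double: every call to tick() stands in for one second elapsing."""

    def __init__(self, start=0):
        self.value = start

    def read(self):
        return self.value

    def tick(self):
        self.value += 1
```

With clock = FakeUptime(41), calling delay(5, clock.read, clock.tick) returns once clock.value has reached 46. Testing for equality, as the post describes, relies on the loop polling at least once per second so the target value is never skipped.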

Quote:
It would also have a shorter execution time in the ISR.

Not necessarily. Using the code example I previously posted, the byte in the time field that gets updated the most is the LSB, and even that is touched only once per second; only jiffyct is updated on each jiffy IRQ. That's where the bulk of the processing will be done, and it has nothing to do with the field size. The next byte only gets updated once every 256 passes through the update code, that is, once every 256 seconds. The third byte only gets updated once every 65,536 seconds, etc. So you can see a larger time field doesn't pose as much of a performance penalty as you think.
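
To put numbers on that, here is a small Python sketch that counts how often each byte of the field gets incremented. The function name is hypothetical, and it assumes the field starts at zero.

```python
def inc_counts(seconds, field_bytes=8):
    """Increments applied to each byte over `seconds` one-second updates.

    Byte 0 is the LSB. Starting from an all-zero field, byte k is bumped
    once for every 256**k seconds that elapse.
    """
    return [seconds // (256 ** k) for k in range(field_bytes)]


if __name__ == "__main__":
    per_day_8 = inc_counts(86_400, 8)
    per_day_4 = inc_counts(86_400, 4)
    print("8-byte field, one day:", per_day_8)
    print("4-byte field, one day:", per_day_4)
    print("extra INCs for the wider field:", sum(per_day_8) - sum(per_day_4))
```

Over a full day the 8-byte and 4-byte fields see exactly the same number of increments, because the upper four bytes are not reached until the lower four overflow.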

Arlet wrote:
I find it easier to update the broken down time struct in the clock interrupt, and then convert to time_t format on request. The conversion only requires a few multiplications, and no division (except divide by 4).

That seems to be expensive processing for an ISR.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


PostPosted: Wed Feb 20, 2019 7:36 am 
Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
BigDumbDinosaur wrote:
Arlet wrote:
I find it easier to update the broken down time struct in the clock interrupt, and then convert to time_t format on request. The conversion only requires a few multiplications, and no division (except divide by 4).

That seems to be expensive processing for an ISR.


The ISR only updates the broken down time, which is fast. The conversion is done in application code, when needed.
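
As a rough sketch of the ISR half of this approach (the on-request conversion to time_t is not shown), a once-per-second update of a broken-down time structure could look like the Python below. The field names and the dictionary layout are assumptions made for the illustration, not taken from any posted code.

```python
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def is_leap(year):
    """Gregorian leap year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def month_length(year, month):
    """Number of days in the given month (1-12) of the given year."""
    if month == 2 and is_leap(year):
        return 29
    return DAYS_IN_MONTH[month - 1]


def tick_second(t):
    """Advance broken-down time t (a dict) by one second, in place."""
    t["sec"] += 1
    if t["sec"] < 60:
        return                      # the common case: one increment, one compare
    t["sec"] = 0
    t["min"] += 1
    if t["min"] < 60:
        return
    t["min"] = 0
    t["hour"] += 1
    if t["hour"] < 24:
        return
    t["hour"] = 0
    t["day"] += 1
    if t["day"] <= month_length(t["year"], t["month"]):
        return
    t["day"] = 1
    t["month"] += 1
    if t["month"] <= 12:
        return
    t["month"] = 1
    t["year"] += 1
```

Most calls return after the first comparison, which is why the per-second cost stays low; the calendar logic only runs at midnight.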

