Although this topic describes a programming situation with my POC V1.3 unit, it isn’t specific to that unit; anyone who is using a multichannel UART might find it of interest.
V1.3 has four serial I/O (SIO) channels, requiring an array of eight circular queues (FIFOs) defined in contiguous RAM, four for reception (RxQs) and four for transmission (TxQs). A zero-based index is used to identify which channel is being processed. The general processing methodology I’ve devised can support any even number of SIO channels, theoretically up to 128. The practical limit appears to be 32 channels, although I suspect the volume of processing required to handle continuous, bi-directional traffic on that many channels would leave little time to do much of anything else.
Anyhow, in order to keep a lid on the cycle count in the SIO primitives, two arrays of pointers are set up on direct (zero) page during system POST, one array for the RxQs and the other for the TxQs. Each queue has two pointers assigned to it, referred to for discussion purposes as “get” and “put”. A fetch is executed with LDA (get,X) and a store with STA (put,X), in which get and put are the base addresses of the RxQ and TxQ pointer arrays, respectively. In both cases, the offset in .X is set to channel index × 2.
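For anyone who wants to experiment with the pointer-table idea on a host machine, here is a minimal C sketch of it. It models memory as a flat 64KB array and treats each table entry as a two-byte, LSB-first pointer selected by channel index × 2. The array and function names are hypothetical and exist only for illustration; they are not taken from the V1.3 firmware.

```c
#include <stdint.h>

static uint8_t mem[0x10000];            /* hypothetical model of bank $00 */

/* Store a 16-bit queue address, LSB first, in a channel's pointer slot. */
static void set_pointer(uint8_t base, uint8_t chan, uint16_t addr)
{
    uint16_t slot = (uint16_t)(base + chan * 2u);   /* .X = channel index x 2 */
    mem[slot]     = (uint8_t)(addr & 0xFFu);
    mem[slot + 1] = (uint8_t)(addr >> 8);
}

/* Read back the 16-bit pointer held in a channel's slot. */
static uint16_t get_pointer(uint8_t base, uint8_t chan)
{
    uint16_t slot = (uint16_t)(base + chan * 2u);
    return (uint16_t)(mem[slot] | (mem[slot + 1] << 8));
}

/* Model of LDA (base,X): fetch the byte the selected pointer refers to. */
static uint8_t fetch_ind_x(uint8_t base, uint8_t chan)
{
    return mem[get_pointer(base, chan)];
}

/* Model of STA (base,X): store a byte where the selected pointer refers to. */
static void store_ind_x(uint8_t base, uint8_t chan, uint8_t datum)
{
    mem[get_pointer(base, chan)] = datum;
}
```

With the table based at $A0, channel 2's pointer would occupy $A4 and $A5, and a store through it lands wherever that pointer currently aims.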
Prior to attempting a fetch, it must be determined if there is at least one valid datum in the target queue, which will be true if put != get. Similarly, prior to attempting a store, it must be determined if the target queue can accept a datum, which will be true if put+1 != get. In some cases, it will be useful to know to what extent an RxQ has been filled, e.g., when managing flow control. Assuming a 256-byte queue size, that may be determined with put - get, with borrow discarded.
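The three tests can be modeled on a host in C, with uint8_t standing in for a pointer's LSB so that arithmetic wraps modulo 256 just as it does in an 8-bit register. This is only a sketch for the 256-byte queue size, and the function names are made up for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helpers: 'get' and 'put' are the LSBs of a queue's two
   pointers.  A 256-byte, page-aligned queue is assumed, so 8-bit
   arithmetic wraps exactly where the queue does. */

/* At least one datum is waiting if the pointers differ. */
static bool q256_has_data(uint8_t get, uint8_t put)
{
    return put != get;
}

/* A datum can be stored if advancing put would not collide with get. */
static bool q256_has_room(uint8_t get, uint8_t put)
{
    return (uint8_t)(put + 1) != get;
}

/* Number of bytes queued: put - get with the borrow discarded. */
static uint8_t q256_fill(uint8_t get, uint8_t put)
{
    return (uint8_t)(put - get);
}
```

Note that one location is always left unused, so a 256-byte queue holds at most 255 bytes under this scheme.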
My primary concern with this scheme is the large amount of RAM being consumed to support it. The queue array occupies 2KB (256 × 4 × 2) and the queue pointer array occupies 32 bytes of direct page. There are also pointers to three hardware registers per channel, along with bit fields on direct page that track things such as which receiver’s RTS has been de-asserted and which transmitter has been disabled because it has nothing to transmit. All-in-all, 60 direct-page bytes are dedicated to SIO processing.
Something determined from experimentation with POC V1.1 is that with the MPU running at double-digit Ø2 rates, smaller SIO queues are practical. In V1.3, which runs 28 percent faster than V1.1, the MPU can process the SIO queues nearly 40 times faster than data can come in or go out the serial ports when running at 115.2 Kbps. With that in mind, I’m looking to modify the SIO primitives to work with a 128-byte queue size, which will free up 1KB. That 1KB can be used to rearrange the bank $00 map for better utilization.
Of course, there’s always a catch, and in this case, a smaller queue size complicates queue pointer management. Following each queue fetch or store, the corresponding pointer’s least-significant byte (LSB) is incremented to point to the next location. As the present queue size is 256 bytes, that scheme requires no special processing, since the pointer’s most-significant byte (MSB) is static and the LSB will wrap when the queue’s upper boundary has been reached. Testing for queue space is also trivial—it’s the simple put != get or put+1 != get comparisons. However, such simplicity isn’t possible with a 128-byte queue.
The queue array starts on a page boundary, i.e., $00xx00. Hence an even-numbered channel’s queue will end at $xx7F and an odd-numbered channel’s queue will begin at $xx80. This being the case, incrementing a pointer’s LSB would have to be followed with a check for range and if outside of its queue’s boundaries, fixed up as necessary, i.e., normalized. Additionally, a test for queue space cannot be a simple comparison, since a comparison is really an unsigned subtraction. For example, executing put+1 != get to see if an even-numbered channel’s queue can accept a datum will blow up if put is $7F. put+1 would have to be normalized before the comparison could be carried out.
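To make the normalization concrete, here is a host-side C model of one possible way to handle the 128-byte case. It holds bit 7 of the LSB fixed, since that bit selects the even or odd half of the page, and lets only the low seven bits wrap. The masking approach and the function names are assumptions for illustration, not code from the SIO primitives.

```c
#include <stdint.h>
#include <stdbool.h>

#define Q_MASK 0x7Fu   /* low seven bits index into a 128-byte queue */

/* Advance a pointer LSB by one, wrapping inside its own half-page.
   Bit 7 is preserved: 0 for an even channel, 1 for an odd one. */
static uint8_t q128_next(uint8_t lsb)
{
    return (uint8_t)((lsb & 0x80u) | ((lsb + 1u) & Q_MASK));
}

/* Room for a datum if the normalized put+1 does not land on get. */
static bool q128_has_room(uint8_t get, uint8_t put)
{
    return q128_next(put) != get;
}

/* Bytes queued.  get and put share bit 7, so it cancels in the
   subtraction and only the low seven bits of the result matter. */
static uint8_t q128_fill(uint8_t get, uint8_t put)
{
    return (uint8_t)((put - get) & Q_MASK);
}
```

For example, q128_next($7F) gives $00 and q128_next($FF) gives $80, so put+1 can be compared against get without blowing up at a queue boundary. Whether a masking sequence would cost fewer cycles on the MPU than a branch-based fix-up is not established here.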
Cutting to the chase, I’ve been idly monkeying with some code to come up with a way to manage the pointers. Here’s what I’ve got so far. All it does is increment the pointers, with normalization to keep them within their respective queue boundaries:
Code:
         .opt proc65c02,swapbin ;Kowalski assembler options directive
;===============================================================================
;
;QUEUE POINTER INCREMENTATION SIMULATION
;
;    ———————————————————————————————————————————————————————————————————————
;    This program tests an algorithm for incrementing TIA-232 circular queue
;    pointers using a smaller queue size.  Test conditions are:
;
;    § There are four channels, selected with a zero-based index.
;
;    § Each queue is 128 bytes in size.
;
;    § The base address for the queue array is $BC00.  The algorithm being
;      simulated assumes that the queue array base is page-aligned.
;
;    § An even-numbered channel’s queue starts on a page boundary.  An odd-
;      numbered channel’s queue starts at a page boundary plus the queue
;      size.  Hence, even-numbered channels’ queues are in the range $xx00-
;      $xx7F & odd-numbered channels’ queues are in the range $xx80-$xxFF,
;      where $xx is $BC, $BD, etc.
;
;    In addition to maintaining queue pointers within the stipulated ranges,
;    it is desired to use a minimum of clock cycles & code, although these
;    goals are likely mutually exclusive.
;    ———————————————————————————————————————————————————————————————————————
;
;===============================================================================
;
;TEST DEFINITIONS
;
n_nxpchn =4                    ;number of channels
s_ptr    =2                    ;size of a pointer
s_queue  =128                  ;size of a queue
ptrbase  =$a0                  ;pointer array base address
quebase  =$bc00                ;queue array base address
;
worktmp  =$f0                  ;setup working queue address
;
;===============================================================================
;
         *=$2000
;
;===============================================================================
;
;SET UP TEST ENVIRONMENT
;
setup    lda #>quebase         ;queue array base address MSB
         sta worktmp+1         ;set working address MSB
         lda #<quebase         ;queue array base address LSB
         ldy #0                ;starting channel index
;
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;    Initialize Each Channel’s Queue Pointer
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;
setup010 sta worktmp           ;set working address LSB
         tya                   ;current channel
         asl                   ;pointer array offset
         tax                   ;offset —> .X
         lda worktmp+1         ;set this channel’s queue...
         sta ptrbase+1,x       ;starting boundary MSB
         lda worktmp           ;set this channel’s queue...
         sta ptrbase,x         ;starting boundary LSB
         iny                   ;bump channel index
         cpy #n_nxpchn         ;all channels done?
         beq simstart          ;yes
;
         adc #s_queue          ;no, point to next...
         bcc setup010          ;channel’s queue
;
         inc worktmp+1         ;bump queue pointer MSB
         bra setup010          ;do next channel
;
;===============================================================================
;
;SIMULATED CHANNEL PROCESSING
;
simstart ldy #0                ;starting channel index
;
loop     tya                   ;channel index
         asl                   ;channel offset
         tax                   ;offset —> .X ...
;
;    —-—-—-—-—-—-—-—-
;    chan 0 —> .X = 0
;    chan 1 —> .X = 2
;    chan 2 —> .X = 4
;    chan 3 —> .X = 6
;    —-—-—-—-—-—-—-—-
;
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;    *** Beginning of Test Algorithm ***
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;
         tya                   ;channel index
         lsr                   ;condition carry...
;
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-
;    c == 0 indicates even-numbered channel, queue LSB
;    is in the $00-7F range...
;
;    c == 1 indicates odd-numbered channel, queue LSB
;    is in the $80-FF range.
;
;    For reference, a byte within a given queue is acc-
;    essed with (<dp>,X) addressing, which is why only
;    the pointer LSB is modified.
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-—-
;
         inc ptrbase,x         ;bump address LSB
         bcc is_even
;
;
;    odd-numbered channel...
;
         bmi next              ;LSB still in range
;
         lda #s_queue          ;reset LSB to...
         sta ptrbase,x         ;to $80-FF range
         bra next
;
;
;    even-numbered channel...
;
is_even  bpl next              ;LSB still in range
;
         stz ptrbase,x         ;reset to $00-$7F range
;
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;    *** End of Test Algorithm ***
;    —-—-—-—-—-—-—-—-—-—-—-—-—-—-—
;
next     iny                   ;next channel
         cpy #n_nxpchn         ;all channels processed?
         bcc loop              ;no
;
         brk                   ;yes
         nop
         bra setup
;
         .end

I suspect there may be a more-efficient way to do this, but have yet to see it. Maybe one of you will.
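For anyone wanting to sanity-check the test algorithm without firing up the simulator, its decisions can be mirrored in C on a host. The sketch below follows the same two steps: bit 0 of the channel index plays the role of carry after LSR, and bit 7 of the incremented LSB plays the role of the N flag after INC. The function name is hypothetical.

```c
#include <stdint.h>

/* Host-side mirror of the test algorithm: bump a pointer LSB and
   normalize it to its channel's half of the page. */
static uint8_t bump_lsb(uint8_t chan, uint8_t lsb)
{
    uint8_t carry = chan & 1u;          /* tya / lsr: bit 0 -> carry       */
    uint8_t negative;

    lsb = (uint8_t)(lsb + 1u);          /* inc ptrbase,x                   */
    negative = lsb & 0x80u;             /* N flag after the increment      */

    if (carry) {                        /* odd channel: $80-$FF            */
        if (!negative)                  /* LSB wrapped from $FF to $00     */
            lsb = 0x80u;                /* lda #s_queue / sta ptrbase,x    */
    } else {                            /* even channel: $00-$7F           */
        if (negative)                   /* LSB ran from $7F to $80         */
            lsb = 0x00u;                /* stz ptrbase,x                   */
    }
    return lsb;
}
```

Calling bump_lsb() 128 times in a row for a given channel should bring its LSB back to where it started, which is a quick way to confirm that the pointer never leaves its queue.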
_________________ x86? We ain't got no x86. We don't NEED no stinking x86!