6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Wed May 22, 2024 5:16 pm

All times are UTC




Post new topic Reply to topic  [ 18 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Thu Jun 27, 2013 8:28 am 
Offline

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy
Hi all,
below a piece of software for manage a dynamic local heap in 65C816 system.
The local heap will be entirely contained in a bank of 64k (the size of this local heap will be < $FFFF).
The starting of the heap is in variable hdrptr in page 0 and is not necessary to point to address $0000 of the bank.
Any idea for a global heap in which memory blocks (limited to 64k) may occupy 2 adjacent banks?

(Sorry, but the code is commented in italian)

Code:

; assembled with avocet 65C816 assembler

;----------------------------------------------------------
; Local Heap - only in one bank (max. 64K)
;----------------------------------------------------------
;
; Il blocco di heap locale viene gestito mediante un header ed una lista
; collegata di blocchi free.
; Struttura header del blocco heap:
;
;       WORD    bsize -> dimensione memoria libera nello heap
;       WORD    bnext -> puntatore al prossimo blocco libero
;
; in fase di inizializzazione bsize = tot. mem. heap - 4 e bnext = hdr + 4
;
; I blocchi free sono contenuti in una lista collegata con la seguente
; struttura:
;
;       WORD    bsize -> dimensione blocco + 1 (dispari -> blocco free)
;       WORD    bnext -> next blocco free (= 0 se e' l'ultimo)
;
; I blocchi allocati (non-free) hanno la seguente struttura:
;
;       WORD    bsize -> dimensione blocco (sempre pari)
;               data -> blocco effettivo allocato di dimensione bsize
;
;

; Variabili pagina 0:
        .PAGE0
nsize           .DW     ; dimensione blocco da allocare
bsize           .DW     ; dimensione vera blocco free
splitsz         .DW     ; dimensione blocco splittato
bfree           .DW     ; puntatore blocco free
hdrptr          .DW     ; puntatore header heap

MINFREESIZE     .EQU    $0008   ; minimo valore per allocazione
FIRSTPTR        .EQU    $0006   ; offset primo ptr valido
HEAPHDRSIZE     .EQU    $0004


curptr          .EQU    bfree
pred            .EQU    splitsz

        .CODE
       
; alloca un blocco nello heap locale
; In    - A = nsize (comprensivo dei 2 bytes necessari all'header del blocco)
;         Y = puntatore all'header dello heap
;         DP  = 0 (pagina zero)
;         DBR = banco contenente lo heap
;         CPU in modo 16 bit
; Out   - CF = 0 - X = ptr blocco
;         CF = 1 se allocazione fallita

_NewHeapBlk:
        .LONGA  on
        .LONGI  on

        sta     nsize
        sty     hdrptr
        tyx                     ; X = hdrptr
        lda     !$0000,x        ; A = header.bsize
        cmp     nsize           ; bsize - nsize
        bcc     ?08             ; bsize < nsize -> errore (no spazio)
?01:    txy                     ; Y = pred blocco libero
        ldx     !$0002,y        ; X = Y.bnext -> header blocco libero
        beq     ?08             ; errore - finiti i blocchi liberi
        lda     !$0000,x        ; A = X.bsize
        dec     a               ; vera dimensione del blocco free
        cmp     nsize           ; X.bsize - nsize
        bcc     ?01             ; X.bsize < nsize -> next free block
        beq     ?06             ; X.bsize = nsize
       
        ; CF = 1 -> X.bsize > nsize
        sta     bsize           ; salva bsize
        sbc     nsize           ; A = eccesso di memoria del blocco
        cmp     #MINFREESIZE    ; eccesso di memoria permette split blocco ?
        bcs     ?04             ; blocco free splittabile
        lda     bsize           ; alloca blocco intero X.bsize
        bra     ?06
?04:    ; splitta blocco free
        ; A = size blocco splittato
        ; Y = pred blocco free
        ; X = blocco free
        sta     splitsz         ; salva dimensione blocco splittato
        stx     bfree           ; X -> blocco free
        txa
        clc
        adc     nsize
        tax                     ; X -> blocco splittato
        lda     splitsz
        inc     a               ; marca blocco free
        sta     !$0000,x        ; bsize nuovo blocco splittato
        txa                     ; A = blocco split     
        ldx     bfree
        sta     !$0002,x        ; bfree.bnext = split
        lda     nsize           ; nsize
?06:    ; toglie dalla lista free il blocco X
        ; A = size blocco free
        ; Y = pred blocco free
        ; X = blocco free
        sta     !$0000,x        ; X.bsize nuovo blocco allocato
        lda     !$0002,x        ; X.bnext
        sta     !$0002,y        ; Y.bnext = X.bnext
        ldy     hdrptr
        sec
        lda     !$0000,y        ; header.bsize
        sbc     !$0000,x        ; header.bsize - X.bsize
        sta     !$0000,y
        inx
        inx                     ; X -> campo data
        clc                     ; allocazione OK
        rts
?08:
_hErr:
        sec
        ldy     hdrptr
        ldx     #0              ; return NULL
        rts
       
; ritorna dimensione di un blocco allocato
; In    - X = ptr blocco allocato
;         Y = puntatore all'header dello heap
;         DP  = 0 (pagina zero)
;         DBR = banco contenente lo heap
;         CPU in modo 16 bit
; Out   - CF = 0 - A = size blocco - X = ptr header blocco
;         CF = 1 se ptr X non valido (heap corrotto ?)
_SizeHeapBlk:
        cpx     #FIRSTPTR
        bcc     _hErr           ; ptr non valido !
        dex
        dex                     ; X -> header blocco
        lda     !$0000,y        ; size heap
        sec
        sbc     #HEAPHDRSIZE    ; toglie lunghezza header heap
        sta     bsize           ; bsize = dimensione corrente
        sty     hdrptr
        iny                     ; add 4
        iny
        iny
        iny                     ; Y = first ptr blocco
?04:    sty     curptr
        lda     !$0000,y        ; size blocco
        cpx     curptr          ; confronta X con curptr
        bcc     _hErr           ; X < curptr -> errore !
        beq     ?08             ; X = curptr -> trovato ?
        ; qui CF = 1 !
        and     #$FFFE          ; maschera off bit 0
        sta     nsize           ; dimensione blocco corrente
        lda     bsize
        sbc     nsize           ; update dimensione heap
        bcc     _hErr           ; overflow -> errore !
        beq     _hErr           ; ptr X non trovato !
        sta     bsize
        tya
        clc
        adc     nsize
        tay                     ; Y = next block
        bra     ?04             ; loop
?08:    bit     #$0001          ; test blocco free
        bne     _hErr           ; errore - blocco free
        clc
        ldy     hdrptr
_?01:   rts
       
; rilascia un blocco allocato
; In    - X = ptr blocco allocato
;         Y = puntatore all'header dello heap
;         DP  = 0 (pagina zero)
;         DBR = banco contenente lo heap
;         CPU in modo 16 bit
; Out   - CF = 0 - A = size blocco - X = ptr header blocco
;         CF = 1 se ptr X non valido (heap corrotto ?)
_RlsHeapBlk:
        jsr     _SizeHeapBlk
        bcs     _?01            ; errore - esce

; punto di ingresso nel caso il test del puntatore sia stato fatto
; In    - X = ptr blocco fisico (header blocco)
;         Y = puntatore all'header dello heap
;         A = size blocco
;         CF = 0
_RlsHeapBlk2:
        sta     bsize           ; size blocco
        adc     !$0000,y        ; update size heap
        sta     !$0000,y
        stx     curptr          ; salva curptr
        tyx                     ; X = PRED
        ldy     !$0000,x        ; Y = SCAN
        beq     ?06             ; SCAN = NULL
?02:    cpy     curptr          ; test
        beq     ?03             ; SCAN = CURPTR
        bcs     ?06             ; SCAN > CURPTR
?03:    tyx                     ; X = PRED = SCAN
        ldy     !$0002,x        ; Y = PRED.bnext
        bne     ?02             ; loop SCAN
?06:    ; X = PRED // Y = SUCC // X < curptr < Y
        ; inserisce il blocco nella lista free
        tya                     ; A = SUCC
        stx     pred            ; salva PRED
        ldx     curptr
        sta     !$0002,x        ; curptr.bnext = SUCC
        txa                     ; A = CURPTR
        ldx     pred            ; X = PRED
        sta     !$0002,x        ; pred.bnext = curptr
        tya                     ; A = SUCC
        beq     ?08             ; last block
       
        ; verifica ora se esiste un blocco free adiacente a destra
        sec
        sbc     curptr          ; A = SUCC - curptr
        cmp     bsize           ; se adiacente a destra -> A = 0
        bne     ?08             ; no free block adiacente a destra
       
        ; fonde blocco curptr con blocco SUCC
        lda     !$0000,y        ; SUCC.bsize
        dec     a               ; SUCC blocco free !
        clc
        adc     bsize           ; update size blocco fuso
        sta     bsize
        ldy     curptr          ; Y = CURPTR
        lda     !$0002,x        ; A = pred.bnext
        sta     !$0002,y        ; curptr.bnext = A = pred.bnext

?08:    ; X = PRED // X < curptr
        ; verifica se esiste un blocco free adiacente a sinistra
        cpx     hdrptr          ; PRED = primo blocco heap ?
        beq     ?10             ; si - non esiste blocco adiacente a destra
        ldy     !$0000,x        ; Y = PRED.bsize
        dey                     ; blocco free !
        stx     pred
        tya                     ; A = PRED.bsize
        clc
        adc     pred            ; A = PRED.bsize + PRED
        cmp     curptr          ; se curptr = A blocco adiacente
        bne     ?10             ; non esiste blocco free adiacente a destra
       
        ; fonde blocco curptr con blocco PRED
        tya                     ; A = PRED.bsize
        clc
        adc     bsize
        sta     bsize           ; nuova dimensione blocco free fuso
        ldx     curptr          ; X = curptr
        lda     !$0002,x        ; A = curptr.bnext
        ldx     pred            ; X = PRED
        sta     !$0002,x        ; PRED.bnext = curptr.bnext
        stx     curptr          ; curptr = PRED (blocco fuso)
       
?10:    ; marca il blocco curptr come free
        lda     bsize
        inc     a
        ldx     curptr
        sta     !$0000,x        ; curptr.bsize = bsize + 1
        clc                     ; OK
        rts

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 5:06 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
I can't really decipher what you are attempting to accomplish (not able to read Italian) but I will say that straddling two banks is possible by 16 bit indexing. For example, LDA $018000,X would access bank $02 if X => $8000. Similarly:

Code:
         sep #%00100000        ;8 bit .A
         rep #%00010000        ;16 bit index
         lda #1
         pha
         plb                   ;set default data bank
         ldx #$FFFF            ;arbitrary index value
         lda $8000,X           ;loads from $027FFF
         ...etc...

would reach into bank $02 as well.

Also consider:


Code:
s_word   =2                    ;size of data word
;
         sep #%00100000        ;8 bit .A
         rep #%00010000        ;16 bit index
         lda #$01              ;base bank
         ldx #$8000            ;base address
         stx zeropage          ;set up direct page pointer
         sta zeropage+s_word   ;set up base bank
         lda [zeropage]        ;loads from $018000
         ldy #$8000            ;arbitrary index value
         lda [zeropage],y      ;loads from $020000
         rep #%00100000        ;16 bit .A
         ldy #$FFFF            ;arbitrary index value
         lda [zeropage],y      ;loads from $027FFF,$028000
         ...etc...

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 5:39 pm 
Offline

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy
BigDumbDinosaur wrote:
[size=110]I can't really decipher what you are attempting to accomplish (not able to read Italian) but I will say that straddling two banks is possible by 16 bit indexing.


Hello, well... i know this that you say. My question is different.
In my previous post the code manage a linked list of block of memory, so is possible to implement C-like function as malloc(), free() and realloc(). But is limited to 64k. I'm searching a way to implement a general malloc()/free() function's (i don't believe exist another way than a linked list of free block for implement this) that work at any size of memory.
With processor as 8086 this is easy, using the segmented memory (in paragraph), but with 65C816 the granularity for a global heap is 64k.

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 6:05 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
You should be able to make use of 3-byte pointers with 65816 Addressing mode Zeropage/Direct Indirect Indexed Long (that is [d],y)
Cheers Ed


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 6:46 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
BigEd wrote:
You should be able to make use of 3-byte pointers with 65816 Addressing mode Zeropage/Direct Indirect Indexed Long (that is [d],y)
Cheers Ed

Correct, as I illustrated in my code example.

I think the linked list method is one such technique. Another would be to maintain a small bitmap that relates to "pages" (not to be confused with a 256 byte page) of a fixed size, with a set bit indicating that a page of a given starting address is available. If an allocatable page were equal in size to a 65xx page, then the entire 64KB space of one bank could be managed in a 32 byte bitmap. If a larger space were to be managed the bitmap would be multiples of 32. In any case, the starting address controlled by a bit can be computed with minimal processing overhead. It's going to be a 24 bit quantity, which is readily handled by the '816. The application would still be limited to a maximum of 64KB per allocation. Not sure if this is where granati is going though.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 6:56 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
Ah, so you did. I confess I read your text and skipped the code.
Cheers
Ed


Top
 Profile  
Reply with quote  
PostPosted: Thu Jun 27, 2013 10:43 pm 
Offline

Joined: Mon Jun 24, 2013 8:18 am
Posts: 83
Location: Italy
I explained myself really really bad and I apologize for that.
In the code that I posted at the beginning are implemented the following functions:

1) _NewHeapBlk: allocation of a block of given size; this function return a pointer (16 bit) to the block.

2) _RlsHeapBlk: deallocation of the given block (given pointer 16 bit)

Not need other: one can allocate and free any size (in the 64K), and this work the same as C-like function malloc() and free()
This work why i use a linked-list of free-block; at the beginning i have one only block (free) and any time i call _NewHeapBlk the linked list is updated; the same when i call _RlsHeapBlk. Not need bitmap or other storage methods. And i can allocate any size at any order (is only mandatory that allocation and deallocation are balanced).
The problem is managing an heap that is larger than one bank of 64k. Of course is ever possible to build a linked list, but this list should be spanned over many banks, and the allocation will return a 24 bit pointer.

_________________
http://65xx.unet.bz/ - Hardware & Software 65XX family


Top
 Profile  
Reply with quote  
PostPosted: Fri Jun 28, 2013 2:38 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
granati wrote:
I explained myself really really bad and I apologize for that.
In the code that I posted at the beginning are implemented the following functions...

It's only fair to warn you that when the topic gets on to programming, I tend to become quite pedantic, much to the amusement of some and annoyance of others. :lol: What follows isn't criticism, just some friendly advice.

In this case, I believe you have placed the metaphoric cart in front of the horse. You've presented code that is supposed to implement a memory allocation algorithm (malloc?), but have not adequately described the objectives that your algorithm is expected to achieve. This makes understanding what you are trying to do somewhat difficult, and is contrary to good software development practice. Professional software engineers always thoroughly define the objectives before they start writing code.

May I make a suggestion, both for our benefit and yours? Put together a flow chart or similar describing exactly what you are trying to achieve, commented in English (don't worry about the quality of your English; we'll be able to understand). In other words, think with a top-down approach, which means define the problem to be solved in abstract terms. The details come later once you have a clear picture of what is to be achieved. Succinctly stated, you can't solve a problem if you can't adequately describe it to someone else. If others here (including me) can fully understand your objectives, I think we will be better able to help you find a methodology that will work for you. :D

Incidentally, a workable malloc for the 65C816 already exists in the WDC C compiler, so we know it can be done.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 5:09 pm 
Offline

Joined: Sun Nov 08, 2009 1:56 am
Posts: 388
Location: Minnesota
Quote:
Incidentally, a workable malloc for the 65C816 already exists in the WDC C compiler, so we know it can be done.


I don't think there was ever any doubt about that. The OP could just be having fun working it out for himself, using an incremental approach (start with one 64K block to divvy up, then figure out how to extend that).


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 5:28 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
So, is there a (subtle?) difficulty with just bumping up all the pointers to 3 bytes? Some of the calculations might also need to be 3 bytes, even if an allocated block only needs two bytes for the length. If your two banks are not consecutive, you need another steep to convert between logical addresses and physical addresses.
Cheers
Ed


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 5:28 pm 
Offline

Joined: Sun Nov 08, 2009 1:56 am
Posts: 388
Location: Minnesota
Quote:
Another would be to maintain a small bitmap that relates to "pages" (not to be confused with a 256 byte page) of a fixed size, with a set bit indicating that a page of a given starting address is available. If an allocatable page were equal in size to a 65xx page, then the entire 64KB space of one bank could be managed in a 32 byte bitmap.


I'm having a bit of trouble figuring out how this would work. Not so much the allocation part - if you want N bytes, scan for a sequence M of set bits such that M * "page size" >= N. If you find such a sequence, clear the bits and return a pointer to the first "page". Otherwise return a null pointer.

No, it's more the de-allocation part. If all you have is a pointer to the first page, how do you know how many pages to free in the bitmap?

I suppose you could encode the pointer in some way, taking advantage of the fact that if each allocation must start on a "page boundary" then the value of the low bits of the pointer can be assumed and instead used somehow to indicate the size of the allocation. That might tricky in a hurry though, if I wanted to do any pointer arithmetic (I'd guess the compiler would have to know how to manipulate such a pointer and when it was appropriate to do so).


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 8:25 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
teamtempest wrote:
Quote:
Another would be to maintain a small bitmap that relates to "pages" (not to be confused with a 256 byte page) of a fixed size, with a set bit indicating that a page of a given starting address is available. If an allocatable page were equal in size to a 65xx page, then the entire 64KB space of one bank could be managed in a 32 byte bitmap.

I'm having a bit of trouble figuring out how this would work. Not so much the allocation part - if you want N bytes, scan for a sequence M of set bits such that M * "page size" >= N. If you find such a sequence, clear the bits and return a pointer to the first "page". Otherwise return a null pointer.

No, it's more the de-allocation part. If all you have is a pointer to the first page, how do you know how many pages to free in the bitmap?

The process' u-area (to borrow a term from UNIX) would have to maintain a record of how many pages were allocated for a given address. Or...

Quote:
I suppose you could encode the pointer in some way, taking advantage of the fact that if each allocation must start on a "page boundary" then the value of the low bits of the pointer can be assumed and instead used somehow to indicate the size of the allocation. That might tricky in a hurry though, if I wanted to do any pointer arithmetic (I'd guess the compiler would have to know how to manipulate such a pointer and when it was appropriate to do so).

Since Granati wants to be able to allocate more than 64K of RAM with his malloc routine, he is going to have to use a 24 bit indirect pointer on direct page to point to the starting address, using [<DP>] or [<DP>],Y addressing. So what could be done is store the page address in <DP>, <DP+1> and <DP+2>, and the page count in <DP+3>. Don't forget that you can have multiple direct pages with the '816, so even though you are using up direct page four bytes at a time, you can always get more by manipulating the MPU's DP register.

Incidentally, many systems allocate RAM in 4KB pages. Using that number and the method I just described, a 256KB allocation could be managed with four direct page locations. For more than that, a 16 bit count would have to be maintained elsewhere.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 8:29 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
BigEd wrote:
If your two banks are not consecutive, you need another steep to convert between logical addresses and physical addresses.

That would be somewhat awkward to implement with software alone. However, it's good that you did highlight this possibility. Without logical to physical translation, memory fragmentation would probably ensue.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 8:34 pm 
Offline
User avatar

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10802
Location: England
Surely it's easy in software: the third byte of a pointer (the top byte) takes one kind of value when adding or subtracting lengths, and is substituted for the other kind of value when used as an address indirection. For example, it takes the values 00 and 01 for lengths (it's a logical pointer) and the values 1f and 23 for indirection (we're storing in banks 1f and 23.) A lookup table with two entries is required, if we're allocating into two banks. Probably going to be banks 1 and 2, but I hope you see what I mean.
Cheers
Ed


Top
 Profile  
Reply with quote  
PostPosted: Sat Jun 29, 2013 8:53 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8190
Location: Midwestern USA
BigEd wrote:
Surely it's easy in software: the third byte of a pointer (the top byte) takes one kind of value when adding or subtracting lengths, and is substituted for the other kind of value when used as an address indirection. For example, it takes the values 00 and 01 for lengths (it's a logical pointer) and the values 1f and 23 for indirection (we're storing in banks 1f and 23.) A lookup table with two entries is required, if we're allocating into two banks. Probably going to be banks 1 and 2, but I hope you see what I mean.

Of course, we're proceeding a bit blind at this point, as we don't know exactly what sort of architecture Marco has in mind.

However, I'm thinking in terms of what the MMU in a modern x86 machine does. It maintains a page table for logical <—> physical translation, with addressing for any given process being zero-based. A similar thing could be accomplished in software without too much overhead if page sizes don't have to be too small. Going with the idea of 4KB pages, a translation table would require 16 slots to tabulate the maximum number of pages that can be allocated per bank. If the physical memory that has been allocated to a process is in discontiguous banks this table would say so, since the bank number would be implicitly stated in a 24 bit physical address. The rest of it is integer math.

Realistically, this sort of thing is better accomplished in hardware. I envision where the algorithms needed to do translation in software could eat up more than a few machine cycles. Sufficiently powerful programmable logic could play the role of an MMU that maintains the page table and does the translation. That doesn't help Marco's problem, however.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 18 posts ]  Go to page 1, 2  Next

All times are UTC


Who is online

Users browsing this forum: Google [Bot] and 14 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: