6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Wed May 01, 2024 2:26 pm

All times are UTC




Post new topic Reply to topic  [ 11 posts ] 
Author Message
 Post subject: WDC816CC and farmalloc
PostPosted: Thu Apr 18, 2024 8:49 am 
Offline
User avatar

Joined: Mon Aug 30, 2021 11:52 am
Posts: 256
Location: South Africa
I am struggling to get farmalloc to work using WDC's linker and C compiler and I suspect it's more of a linker problem than a farmalloc problem.

I have the 'C' code:
Code:
void* far_heap_start = (void*)0x12f111;
void* far_heap_end = (void*)0x1fffff;

void main(void)
{
   void*            pvBackground;
   
   for (;;)
   {
      pvBackground = farmalloc(100);
      ...
   }

Compiled and linked using:
Code:
wdc816cc.exe  -bs -DUSING_816 -ml -wr -wu -so -sp -lt -pb -pp -px -o Main.obj Main.c
wdcln.exe -C10000 -D20000 -g -t -sz -HIE -obin\Video.hex Main.obj -lcl -lml

farmalloc initially returns $00000000 (a zero value 32bit pointer) which doesn't make sense as I would expect the first return to be somewhere above $0012f111. As it loops it allocates out successive pointers for a little less than 64KB before executing a BRK instruction. Which also doesn't make sense as I would expect it to be able to allocate about 850KB before it can't any more.

The WDC documentation indicates:
Quote:
Code:
void *heap_start = (void * )0x28000, *heap_end = (void * )0x48000;
The example allocates 128K
So I would expect to be able to allocate out more than 64KB.

The sharp eyed among you will have already noticed that heap_start is not the same as far_heap_start however if I use heap_start and heap_end then I get the following linker error:
Code:
Undefined symbol: ~~far_heap_end
Undefined symbol: ~~far_heap_start

And at this point I should mention I am compiling for the Large memory model (using -ml) and linking against the Large standard and math libraries (-lcl -lml).
~~far_heap_end and ~~far_heap_start are definitely referenced int the cl.lib file. So I think the WDC documentation is incorrect in their example.

If I look at the generated main.lst file I can see that ~~far_heap_start and ~~far_heap_end are correctly externally defined
Code:
   701                        ;void* far_heap_start = (void*)0x12f111;
   702                           data
   703                           xdef   ~~far_heap_start
   704                        ~~far_heap_start:
   705 00:0000: 11 F1 12 00     dl   $12F111
   706 00:0004:                 ends
   707                        ;void* far_heap_end = (void*)0x1fffff;
   708                           data
   709                           xdef   ~~far_heap_end
   710                        ~~far_heap_end:
   711 00:0004: FF FF 1F 00     dl   $1FFFFF
   712 00:0008:                 ends

and the linker seems to link them into cl.lib.

Has anyone managed to use farmalloc before (I think so) and if so please could you post how you did it! I don't know if I'm an idiot or if farmalloc really is broken.


Top
 Profile  
Reply with quote  
PostPosted: Thu Apr 18, 2024 3:01 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8167
Location: Midwestern USA
AndrewP wrote:
I am struggling to get farmalloc to work...Has anyone managed to use farmalloc before (I think so)...

The referenced topic concerned a malloc() implementation Marco Granati was working on for the 65C816.  His design apparently was a clean-sheet implementation with no relationship to the WDC compiler.  From what I was able to understand about it, the heap was limited to 64KB and could not span banks.

Speaking of Marco, he abruptly stopped visiting the forum in January of 2021 after having been fairly active for a number of years.  His native Italy was hit especially hard by COVID...

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Fri Apr 19, 2024 9:20 am 
Offline
User avatar

Joined: Mon Aug 30, 2021 11:52 am
Posts: 256
Location: South Africa
I really hope Marco is well and it's just the usual life obligations that have kept him off this forum; unfortunately his subversion repository hasn't been updated in a number of years too.

BigDumbDinosaur wrote:
a malloc() implementation Marco Granati was working on for the 65C816.
Thanks for pointing out it wasn't the WDC malloc Marco was using, I hadn't noticed that. The more I think about it the more it appears I must be getting something in the linker process wrong. I'll report back here if I solve it.


Top
 Profile  
Reply with quote  
PostPosted: Sat Apr 20, 2024 6:40 pm 
Offline

Joined: Fri Apr 15, 2016 1:03 am
Posts: 136
I'm not familiar with the WDC package, so this is mostly a guess.

Is it expecting
Code:
far_heap_start = $12f111
far_heap_end = $1fffff

instead of
Code:
far_heap_start: dl $12f111
far_heap_end: dl $1fffff


Top
 Profile  
Reply with quote  
PostPosted: Wed Apr 24, 2024 7:04 am 
Offline
User avatar

Joined: Mon Aug 30, 2021 11:52 am
Posts: 256
Location: South Africa
leepivonka wrote:
Is it expecting
Code:
far_heap_start = $12f111
far_heap_end = $1fffff

instead of
Code:
far_heap_start: dl $12f111
far_heap_end: dl $1fffff
Thanks but unfortunately I don't have control over that, that's assembly emitted by WDC's C compiler.

I still haven't found a solution to this (well I have, I just wrote my own) but it's looking more and more like the cl.lib is incomplete.


Top
 Profile  
Reply with quote  
PostPosted: Wed Apr 24, 2024 6:15 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8167
Location: Midwestern USA
AndrewP wrote:
I still haven't found a solution to this (well I have, I just wrote my own) but it's looking more and more like the cl.lib is incomplete.

I suspect Marco Granati had run into something like this, as I know he was using the WDC development tools with his projects.  Perhaps that is what prompted him to conjure a new-and-improved malloc().

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Wed Apr 24, 2024 9:49 pm 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 137
Location: Texas
Strange, does WDC not provide the source code to their libraries so you can debug this sort of thing?

Rolling your own malloc() for a single process running on the 816, wouldn't be terribly difficult.

The simplest implementation I know of would be a simple bump allocator. You just have one pointer that points to the next "free" point in memory, and you increment it by the length you wish to allocate.
Code:
void *next_pointer = (void)0x0012F11; // Or where ever you want to start from

void *malloc(size_t size)
{
    void *rval = next_pointer;

    if ((next_pointer + size) > far_heap_end)
        return NULL;

    next_pointer += size; // Set up for next allocation
    return rval;   
}


The disadvantage to this approach is that you can't really "free" memory after you've allocated it. I've used this method for when I want to do some quick simple OS development though and don't want to write out a full memory manager.


There are several different ways you can go about making a more complex allocator of course. Really depends on what you're looking to do though.

For the small memory area of the 816, I'd probably do it one of two ways:

1) Use a simple bitmap, and divide the memory into chunks. (E.g. a single page, 4KiB on the Intel platform)
This would be about 512 bytes of memory to track this.

Each bit indicates if the that chunk is free (0) or allocated (1)

Nice thing is you can skip over several chunks at once just checking for MAX_INT. You never allocate less than one chunk.

When you release memory, you flip a bit back to 0 in your bitmap.

This is pretty fast, and easy to implement, but can leave lots of large gaps in your memory if the structures of your allocations are frequently small.

FreeBSD, and Linux use something kinda sorta like this with their buddy allocators. They then pair this with a slab allocator that runs in the process space to make it more efficient and fast.


Another approach might be to use a linked list of free blocks. Each block has a pointer to the next free block, and a size indicating how big it is.

You can then take a block and slice it down to the size you need and adjust the pointers as needed.

A bit more complicated to implement, but not horrible. Might leave your memory really fragmented though; which for the 816 is probably not as much of an issue seeing that you're not trying to keep cache lines filled.


Top
 Profile  
Reply with quote  
PostPosted: Thu Apr 25, 2024 9:54 am 
Offline
User avatar

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1400
Location: Scotland
Yuri wrote:
Strange, does WDC not provide the source code to their libraries so you can debug this sort of thing?


Do they still make money selling it? If so then that's maybe why want to keep the recipes secret... Saying that, I'd suggest it's more than time to open source it all. The code improvements may well be worth it. However, I also suspect they may have outsourced the work elsewhere, so maybe there are other issues there. (Licensing, etc.)

Quote:
Rolling your own malloc() for a single process running on the 816, wouldn't be terribly difficult.


Memory allocators are a very well (and often) "solved" problem. Sometimes though it's a case of pick the least worst performing suitable for your needs...

Part of the issue with the '816 is the 64K banks of RAM, so catering for a heap that's > 64KB and an allocator to deal with chunks > 64KB is the challenge - but if the C compiler correctly gives you far/wide pointers and generates the code to manage it, it ought to be fairly straightforward.

When I was looking for an allocator, I chose the classic/standard "first fit" algorithm and initially wrote one in Sweet16 for my 65C02 system, but when I went to the '816 and wanted BCPL, I re-wrote it in BCPL.

It's a linked list and I keep a pointer to the end of the last chunk allocated, so allocating the next chunk should be fast - that fails when you run out of heap, then you re-start the scan from the start, looking for free spaces big enough and then split them into 2 chunks - one the size you want, the other marked as free. You keep doing this until you don't have a free chunk big enough, to split then you do a merge, trying to merge adjacent free chunks into bigger chunks. Lather rinse, repeat.

So in-theory, fragmentation can cause you to run out of memory, even when you have lots of free chunks. Other than deliberate stress testing, I've never had that happen, even when running my system for weeks at a time doing edits, compiles, tests and so on. I have 512KB of RAM in my '816 system.

Another thing I do is implement "guard" words in the allocated chunks - these are checked when I free a chunk to make sure they've not been overwritten. This has caught few program bugs and were worth the effort.

All this in BCPL which compiles to a bytecode which is interpreted at an effective instruction rate of somewhere between 100 to 300K instructions/sec. on my 16Mhz '816 but performance is more than adequate.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Top
 Profile  
Reply with quote  
PostPosted: Thu Apr 25, 2024 3:28 pm 
Offline
User avatar

Joined: Thu May 28, 2009 9:46 pm
Posts: 8167
Location: Midwestern USA
drogon wrote:
Yuri wrote:
Strange, does WDC not provide the source code to their libraries so you can debug this sort of thing?

Do they still make money selling it?

I seem to recall the software is free for the downloading.

Quote:
Part of the issue with the '816 is the 64K banks of RAM, so catering for a heap that's > 64KB and an allocator to deal with chunks > 64KB is the challenge - but if the C compiler correctly gives you far/wide pointers and generates the code to manage it, it ought to be fairly straightforward.

Banks shouldn’t be an issue with data fetches and stores, as RAM can be treated as linear space by use of 24-bit pointers.¹  The real issue is in protecting one process’ heap from encroachment by another process.  That would, I’d think, be outside the purview of a memory allocator.

————————————————————
¹In a practical application, use of 32-bit pointers will result in faster-acting code, since register size changes during pointer manipulation can be largely avoided.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Top
 Profile  
Reply with quote  
PostPosted: Thu Apr 25, 2024 4:06 pm 
Offline
User avatar

Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1400
Location: Scotland
BigDumbDinosaur wrote:
drogon wrote:
Yuri wrote:
Strange, does WDC not provide the source code to their libraries so you can debug this sort of thing?

Do they still make money selling it?

I seem to recall the software is free for the downloading.

Quote:
Part of the issue with the '816 is the 64K banks of RAM, so catering for a heap that's > 64KB and an allocator to deal with chunks > 64KB is the challenge - but if the C compiler correctly gives you far/wide pointers and generates the code to manage it, it ought to be fairly straightforward.

Banks shouldn’t be an issue with data fetches and stores, as RAM can be treated as linear space by use of 24-bit pointers.¹  The real issue is in protecting one process’ heap from encroachment by another process.  That would, I’d think, be outside the purview of a memory allocator.


Lets count the number of multi-tasking OSs for the '816 that are workable right now... I count 1. Mine. Guard words at the start and end of each chunk allocation do work well.

But even in systems like Unix, a process can spawn threads that see the same memory, so the problem doesn't do away there.

This is from the header of my BCPL getvec:

Code:
 *      We have a heap which we split into chunks.
 *      Chunk format:
 *
 *            31 30-23 22     0
 *      +-------+-----+--------+
 *      | InUse | PID | Length |
 *      +-------+-----+--------+
 *      | Guard:      FEEDBEEF |
 *      +----------------------+
 *      | .....................|
 *      |      Data space      |
 *      | .....................|
 *      +----------------------+
 *      | Guard:      BEADCAFE |
 *      +----------------------+
 *
 * Each chunk has 3 words extra and the 23-bit length parameter includes this.
 *      InUse is a flag : top-bit -  with zero for free, top-bit set for in-use
 *      with the other 7 bits being used for the Imp PID.
 *
 *      The guard words are there to help detect heap/chunk corruption at free
 *      or merge times.


What you get back from the call is a pointer to the word after the first guard word.

Quote:
————————————————————
¹In a practical application, use of 32-bit pointers will result in faster-acting code, since register size changes during pointer manipulation can be largely avoided.


Which is exactly what my BCPL system does - everything is a 32-bit word. The trade-off is speed, but the advantages of having just one data type oughtweigh that for me.

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Top
 Profile  
Reply with quote  
PostPosted: Fri Apr 26, 2024 1:18 am 
Offline
User avatar

Joined: Tue Feb 28, 2023 11:39 pm
Posts: 137
Location: Texas
drogon wrote:
Yuri wrote:
Strange, does WDC not provide the source code to their libraries so you can debug this sort of thing?


Do they still make money selling it? If so then that's maybe why want to keep the recipes secret... Saying that, I'd suggest it's more than time to open source it all. The code improvements may well be worth it. However, I also suspect they may have outsourced the work elsewhere, so maybe there are other issues there. (Licensing, etc.)

Quote:
Rolling your own malloc() for a single process running on the 816, wouldn't be terribly difficult.



Many professional software tool sets provide the source code so you can debug issues that come up. Microsoft has provided the code for their standard C library for quite some time, as did Borland, Watcom and many others back in the day as I recall.


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 11 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 9 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: