GARTHWILSON wrote:
very seamless and bug-free in that respect
This only works as long as your program doesn't store absolute addresses in variables. If it does, then you can only relocate at load time.
Quote:
My interest in moving already-loaded programs around comes from the times I've loaded up different applications on the workbench computer, then, after being done with one of the earlier ones, wanting to delete it and make room for something else; but that would leave memory fragmented, without enough contiguous room to load the next thing. In that case, I've had to also dump and re-load what was after the part I was finished with.
There are several ways of working around this problem:
(1) Make your program event-driven, and mandate that your software handle EVENT_BEGIN_RELOCATE and EVENT_END_RELOCATE events, both issued by the kernel of your OS. The idea is that EVENT_BEGIN_RELOCATE gives your code the chance to serialize its state into a form suitable for later deserialization during EVENT_END_RELOCATE. The control flow goes like this: the OS realizes it needs to compact memory, so it issues EVENT_BEGIN_RELOCATE to an application, which uses this time to prepare for the move. The OS then compacts memory by relocating the program's code and data segments. Finally, EVENT_END_RELOCATE is sent to the newly relocated program, which restores its previous state, giving it a chance to fix up addresses.
(2) Mandate that your program NEVER use absolute addresses for internal memory references. This is the method PC/GEOS took, and it not only worked extremely well, it also enabled a "VM file" (virtual memory file) system that could provide raw application state dumps as data files. The result was a user experience where loading and saving application state was effectively instantaneous, and this on a 4.77MHz CPU! To access a chunk of memory, you first have to explicitly "lock" it into physical memory. While locked, the chunk never moves, giving the application(s) the opportunity to touch the memory as needed. When you're done using it, you "unlock" the memory, giving the system the chance to move (or even dump back to disk) the memory.
(3) Use a memory management unit (MMU) to page memory and establish a separate, private address space for each program. This is what nearly all modern OSes do today: every program can be linked to run at the same address (effectively address 0). Only shared libraries are loaded at non-zero addresses, and even then, nobody can guarantee where in the address space they appear, even amongst different address spaces! The kernel is responsible for managing memory in finite units called "pages" (typically 4K each) and for updating the hardware memory-mapping configuration. This is the most difficult approach to implement, but by far the fastest and most convenient from the application programmer's point of view: no PC-relative code needed (except for "floating" shared libraries, as discussed above), no need to lock memory manually, and no relocation events to handle.
(4) Use a bytecode interpreter. This is the slowest possible method, and from what you described, it is what your HP calculators are actually doing, especially with BASIC. Bytecodes can be implemented however you want, including mandating that all code be virtual-PC-relative.
(5) Make all code and data PC-relative. This is, for example, how Win32Forth works.
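As a sketch of approach (1), here's roughly what the relocation-event protocol could look like in C. The event names follow the text above, but the handle_event signature and the kernel's calling convention are hypothetical, purely for illustration: the key idea is that the program keeps its state as base-relative offsets, so only the base pointer needs fixing up after a move.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel events, named as in the text above. */
enum { EVENT_BEGIN_RELOCATE, EVENT_END_RELOCATE };

/* The program keeps internal references as offsets from its base,
 * so only the base needs fixing up after a move. */
static uint8_t *base;       /* current load address                 */
static size_t   cursor_off; /* example state: an offset, NOT a pointer */

void handle_event(int event, uint8_t *new_base) {
    switch (event) {
    case EVENT_BEGIN_RELOCATE:
        /* Serialize state into a position-independent form. Here the
         * state is already an offset, so there's nothing to do; a real
         * program would convert any live pointers to offsets. */
        break;
    case EVENT_END_RELOCATE:
        /* The kernel has moved us; rebuild pointers from offsets. */
        base = new_base;
        break;
    }
}

/* To touch the data, always recompute the pointer from base + offset. */
uint8_t *cursor(void) { return base + cursor_off; }
```

The discipline is the same as with handles: nothing long-lived holds a raw address, so the kernel can slide the program anywhere between the two events.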
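And as a sketch of approach (4), here is a toy bytecode interpreter. The opcodes are invented for illustration, but they show the property that matters: because the branch instruction is virtual-PC-relative, the bytecode can be copied anywhere in memory and still run unchanged, with no fix-ups.

```c
#include <stdint.h>

/* Toy opcodes (hypothetical, for illustration only). */
enum { OP_PUSH, OP_ADD, OP_BRANCH, OP_HALT };

/* Run a bytecode program; returns the top of the stack at OP_HALT. */
int run(const int8_t *code) {
    int stack[16], sp = 0, pc = 0;
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH:   stack[sp++] = code[pc++];       break;
        case OP_ADD:    sp--; stack[sp-1] += stack[sp]; break;
        case OP_BRANCH: pc += code[pc];                 break; /* PC-relative! */
        case OP_HALT:   return stack[sp-1];
        }
    }
}
```

Since all control flow is expressed as offsets from the virtual PC, the "load address" of the bytecode is irrelevant; the interpreter itself is the only thing that needs to be position-aware.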
Quote:
The problem with relocating running code (at least in the 6502 where code is not relative) is not that the relocation tables would have to be stored - but address values sneak into variables (fixed values like jump tables are in the relocation tables too) or into the stack. And it is basically impossible to determine which part of the stack is an address and what is normal data.
This is a solved problem. The solution is to never use actual addresses. Instead, you store handles to data.
A handle, in its simplest possible incarnation, is merely a pointer to a pointer. The next most complex form is an index into a table of pointers. Handles let you play all sorts of neat tricks with memory management without adversely affecting runtime performance. (Your inner-most loops will take a performance hit, but as discussed above, "locking" the memory you're using wins that performance back. Just remember to unlock the memory when you're done.)
The "handle table" is a data structure known, at least in part, to both your application and your memory manager. When a chunk of memory is relocated, the memory manager is responsible for updating its corresponding handle pointer. This frees the application to refer to chunks of memory via the handle, which remains fixed throughout the entire lifetime of the program.
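A minimal sketch of that handle table in C (all the names here -- lock, unlock, mm_move -- are invented for illustration, not any real API): the application only ever stores the handle, and the memory manager is free to move the underlying chunk as long as it updates the table entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HANDLES 8

static char  heap[256];
static char *table[MAX_HANDLES]; /* handle -> current address            */
static int   locks[MAX_HANDLES]; /* lock count; locked chunks can't move */

/* Dereference a handle. The pointer is valid only while locked. */
char *lock(int h)   { locks[h]++; return table[h]; }
void  unlock(int h) { locks[h]--; }

/* The memory manager relocates a chunk during compaction: the data
 * moves and the table entry is updated, but the handle itself never
 * changes, so the application's references stay valid. */
void mm_move(int h, char *dst, size_t n) {
    assert(locks[h] == 0); /* never move locked memory */
    memmove(dst, table[h], n);
    table[h] = dst;
}
```

Application code written against handle h keeps working across any number of mm_move calls, which is exactly why the handle can "remain fixed throughout the entire lifetime of the program."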
Quote:
I had forgotten about the stack; but as long as you don't use preëmptive multitasking and relocate the program and its data space while it has references to outside programs
This is also a solved problem. I invite you to research the programming language Oberon. Each of the activation frames on the stack has a corresponding "type descriptor" associated with it, which its garbage collector requires to properly free unused memory. It can guarantee a piece of memory is free if, and only if, there are no outstanding references to it. Checking variables and CPU registers is easy, but checking the chain of activation frames is difficult, for nothing on the stack distinguishes data from pointers! The frame type descriptors permit the garbage collector to traverse the stack with confidence, clearly disambiguating pointers from non-pointers.
However, you can only get that with a strongly-typed language. Languages like Forth simply cannot handle this. Languages like Scheme, Smalltalk, and Lisp get around this problem either by using tagging, which I won't discuss here, or by simply not using activation frames as they're commonly understood.
Quote:
I'll say that the term "garbage collection" sounds related. How does that work (without getting too complicated)?
Suppose you have a program which allocates a chunk of memory, uses it, then frees it, like so:
Code:
#include <stdlib.h>

void foo() {
    char *myMemory = malloc(16); /* allocate 16 bytes of RAM        */
    UseItHere(myMemory);         /* do something with it            */
    free(myMemory);              /* OK, we're done with it; free it */
}
This allocates some memory, uses it, then releases it so that other parts of the system may use it. If only all code were this simple, we'd never have any problems.
Unfortunately, even without the specter of multitasking, in more complex applications it proves difficult to know when to actually release memory. For example, what actually happens in UseItHere() determines whether or not the free(myMemory); call is a bug.
If UseItHere() adds that newly allocated piece of memory to a data store of some kind, then calling free() on it is a bug, because the memory might still be in use elsewhere in the program. (Let's ignore the fact that it actually is in use by the data store itself.)
So, the truly safest way to handle this is to rewrite the code as follows:
Code:
#include <stdlib.h>

void foo() {
    char *myMemory = malloc(16); /* allocate 16 bytes of RAM */
    UseItHere(myMemory);         /* do something with it     */
}
Notice that we allocate and use the memory, but we never actually free it. Again, what happens in UseItHere() determines whether or not this is a bug -- what if UseItHere() decided not to register the memory in some data store? OOPS -- now we've wasted some memory.
This is called a "memory leak", and I'm positive you've heard that term before.
So, what is the solution? There are only two known solutions to this problem:
(1) Apply formal logic, problem solvers, and static program analysis to determine which chunks of memory get allocated, and when they're to be deallocated. This is the approach that functional-language compilers like MLton use to help reduce garbage-collection overhead. ML alone takes 22 code passes to perform all of its optimizations; MLton adds 13 more. Thankfully, ML compiles so fast that this hit in compile performance isn't that noticeable. Still, imagine trying to maintain a compiler with 35 passes, and compare that to your assembler or C compiler, which typically has only 2.
(2) Rely on garbage collection.
Garbage collection works because of a very simple observation: you can't use a chunk of memory if you don't have a pointer to it. Ignoring the case of synthesizing pointers (which is a type and safety violation, by the way), once you lose the last pointer to a region of memory, that's it -- it's gone forever! You can't get it back, unless you have a pointer cached away somewhere accessible (like the data store UseItHere() might be using).
So, a garbage collector's job is to examine two things -- the program state and the memory state -- and, based on this combined information, dispose of memory that is no longer referenced. It does this, for all intents and purposes, concurrently with the running of your program.
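That observation can be sketched as a toy mark-and-sweep collector (the names and the one-reference-per-object layout are invented to keep the example small): everything reachable from the program's "roots" (variables, stack, registers) gets marked, and anything live but unmarked is swept away.

```c
#include <stddef.h>

#define NOBJS 4

/* A toy object: at most one outgoing reference. */
typedef struct Obj {
    struct Obj *ref;    /* pointer to another object, or NULL */
    int marked;         /* set during the mark phase          */
    int live;           /* is this slot currently allocated?  */
} Obj;

static Obj heap[NOBJS];

/* Mark phase: follow pointers from a root until we hit NULL
 * or something already marked. */
static void mark(Obj *o) {
    while (o && !o->marked) {
        o->marked = 1;
        o = o->ref;
    }
}

/* Sweep phase: anything live but unreached is garbage.
 * Returns the number of objects freed. */
int collect(Obj **roots, int nroots) {
    int freed = 0;
    for (int i = 0; i < NOBJS; i++) heap[i].marked = 0;
    for (int i = 0; i < nroots; i++) mark(roots[i]);
    for (int i = 0; i < NOBJS; i++)
        if (heap[i].live && !heap[i].marked) { heap[i].live = 0; freed++; }
    return freed;
}
```

Note that the collector never asks whether the program *intends* to use an object again; reachability from the roots is the only criterion, which is exactly why dropping your last pointer is what frees memory.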
Notice that BASIC has a DIM statement to allocate memory for arrays. Notice it can also construct arbitrarily large strings. However, it completely lacks any means of disposing of allocated memory. (Some might argue the CLEAR command performs this job, but in reality its purpose is more complex; executing CLEAR inside a running program often results in catastrophe unless you know EXACTLY what you're doing.) This is because BASIC is a garbage-collected environment.
Notice how languages like Java or even Oberon allow you to create new objects, but have no syntactic means of getting rid of them. This is because they, too, utilize garbage collection.
The end results with garbage collected languages are:
(1) ever so slightly slower runtime performance. This is because the underlying CPU simply lacks hardware assistance for implementing "read barriers" and "write barriers", and so these things need to be implemented in software. Most of these barriers can actually be optimized away by advanced compilers, thankfully, which is why performance hits are "slight" and not "god-awful" like they were in the 60s and 70s.
(2) NEVER do you crash the system by following a stray pointer into memory that has been freed out from under you. It's simply impossible, and provably so.
You do get a different kind of resource leak with garbage-collected languages, though -- as long as you have a pointer to a chunk of memory, it will never be discarded. Garbage collectors are not intention detectors, so when you are truly finished using a piece of data, the proper thing to do is to set its pointer to NULL (typically the value zero, but it's best to use your language's symbolic representation; for Pascal/Modula-*/Oberon/Ada, use "nil" or "NIL" instead). That way, you eliminate that reference (without affecting others), giving the collector enough information to do its job.
Statistically speaking, however, comparing the memory leaks arising from not using garbage collection against the resource leaks of garbage-collected systems, it is easy to see which is the lesser of the two evils. A brilliant example is the recent release of Internet Explorer, which will gladly crash itself after many hours of continuous use due to leaked memory. Contrast this with Firefox, which will not leak memory nearly as fast unless you're using plugins (which Firefox can't control), such as the infamous Flash 9 or Flash 10 plugin under Linux.
Some languages, like Smalltalk or Lisp, won't even let you manipulate pointers of any kind, so it's impossible to casually leak resources in those environments. (It's doable, but you actually have to explicitly code for it!)
This raises a good point, too -- if you're planning on writing a program that's extensible with plugins, you can do all the static analysis and even dynamic analysis in the world, and you still won't have a reliable system. Because of those plugins, you might end up with an intractable memory leak that ultimately brings your system down.
This is why I advocate writing code in some dialect of Oberon instead of C: Oberon compiles to native code, just like C does, but has garbage collection. If you write extensible programs, then the extensions, too, are written in Oberon or an Oberon-compatible language. The end result is that you get garbage collection and type safety throughout the system. And it compiles a hell of a lot faster too -- easily a factor of 15 on my box, at comparable levels of optimization. For Enterprise-scale programs, that speed boost is quite significant.
I haven't learned Ada yet, but I understand Ada exhibits most, if not all, of the properties that make Oberon so appealing to me. I'll need to take that language up someday.