Re: dictionary hashing
Posted: Fri Aug 17, 2018 11:20 pm
whartung wrote:
I haven't looked at how F83 FORGETs words. It hashed the words across 4 different paths. But each of those were, effectively, just 4 independent dictionaries, so the word headers were not changed at all to accommodate this technique.
Simply, you'd hash the value, decide which of the 4 lanes to pursue, and then search that lane to find your word.
Similarly, when a word is added, it picks the lane, and just appends the word to that lane.
Simply, you'd hash the value, decide which of the 4 lanes to pursue, and then search that lane to find your word.
Similarly, when a word is added, it picks the lane, and just appends the word to that lane.
I've already experimented with making a kernel where the dictionary is hashed across 4 paths. I used a code primitive >VT, to vocabulary thread, to add an offset to the address of the vocabulary. Each vocabulary had 4 cells for the dictionary paths. >VT exclusively ored the low two bytes of the count with the low two bytes of the first character of the words name, left shifted once ( double the offset, the cells were two bytes ) and added this offset to the address of the vocabulary. The find primitive, (FIND) was used like this
Code: Select all
CONTEXT @ >VT @ (FIND)
Code: Select all
CONTEXT @ >VT (FIND)
Code: Select all
CODE >VT ( ADR VOC -- ADR VOC+OFFSET )
HERE 6 + JSR,
NEXT JMP,
<code to hash name and add offset of 0-3 multiplied by 2 to vocabulary address>
RTS,
Code: Select all
CODE (FIND) ( ADR VOC -- ADR2 FLAG )
' >VT @ 6 + JSR, \\ PERFORM 4 WAY HASH
<REST OF FIND>
Code: Select all
CONTEXT @ (FIND) // (FIND) HASHES NAME AND SEARCHES ALONG CORRECT THREAD
Given the Commodore 64's slow disk drives, it just didn't seem worth it. If I ever get a chance to build an SBC and had source code sent over a serial line from the PC, it would be useful. On the other hand, the alterations to support a faster (FIND) resulted in a smaller memory footprint and made the system feel more responsive when interactively experimenting at the console, so it should be useful on a Commodore computer or an SBC.
Cheers,
Jim
Edit: Actually, my Forth didn't have WITH-WORDS at that time. The extra complexity in WORDS, SEEALL ( a word to test that SEE works correctly for all words in the context vocabulary), and SMUDGE-TEST ( a metacompiler word to test if there are any smudged words, and therefore wasted space, in the new kernel) motivated me to write WITH-WORDS as a common factor for WORDS, SEEALL, and SMUDGE-TEST which I then wrote for my simple linear-list implementation. In a way, experimenting with dictionary hashing helped me improve my non hashed implementation.
Cheers,
Jim