Joined: Fri May 05, 2017 9:27 pm Posts: 895
|
IamRob wrote: In an earlier thread I created a HELP search engine that returned information about a word delimited by a colon.
I tried applying the same search engine to finding any word within a number of screens. But it is a lot slower than I would like it to be. The problem is that using the space as a delimiter, then every word has to be checked for a match.
In the HELP engine, a match is only checked when it comes across a colon. A lot fewer matches are made and the search is quite a bit faster.
Even though a linear search is done, and a match is checked on every word, I still expected it to be faster that it is. The order of the search is this:
1) LOAD SCREEN 2) ENCLOSE each word delimited by a SPACE 3) use MATCH to check each word against the SEARCH word 4) print screen# and line# of each matched word found 5) if not end of screen, increment pointer and return to step #2 6) if not last screen, increment screen and return to step #1 7) end
I limited the search to 16 screens ( 0 SEARCH searches 00..$0F, $10 SEARCH searches $10..$1F and so on ) which is same as searching 16k in memory. At 1 Mhz, it takes 1 min 10 secs to search all 16 screens for each occurrence of the word.
I know I could make the search into a primitive, but I was just wondering if there might be a different way to search random words to make the search programmed entirely in Forth go faster. Reducing the number of times MATCH is used (like in HELP) helped speed things up. But doing a search on every word delimited by spaces is a lot slower. In HELP, a search for a colon is done before using ENCLOSE then entering MATCH.
Any ideas? With the Forths that I am familiar with, LOADing a screen will cause that screen to be interpreted and or compiled. Do you mean read that screen into a buffer?
Fleet Forth's editor is based on the Forth Inc. line editor presented in Forth Dimensions volume 3 issue 3. It has a word S which takes a block number and parses text up to a delimiting ^ and searches all blocks from the current one to one less than the block number that was on the stack. I ran some tests by copying a range of sixteen blocks to blocks used by the C64 Ram Expansion Module to eliminate disk access time as a variable. Depending on the text in the screens and what was searched for, I got results ranging from 11.6 to 20.2 seconds. Some of that time is the time to TYPE the information and the time it takes the Commodore 64 to scroll.
Code: OK CLEAR.TIME $10 FH S LDA ^ ELAPSED IP )Y LDA ^ PHA IP INC 2 8001 IP )Y LDA ^ 4 8001 IP )Y LDA ^ W 1+ STA DEY E 8001 IP )Y LDA ^ W STA CLC F 8001 IP LDA ^ 2 # ADC IP STA 1 8002 IP )Y LDA ^ PHA TYA 3 8003 0 ,X LDA ^ N ,Y STA 9 8003 0FE ,X LDA ^ 0FF ,X ORA 4 8005 IP LDA ^ 2 # ADC IP STA 8 8005 IP )Y LDA ^ PHA INY 2 8006 IP )Y LDA ^ IP 1+ STA 3 8006 IP )Y LDA ^ PHA DEY 4 8007 IP )Y LDA ^ PHA CLC 5 8007 3 ,X LDA ^ 80 # ADC PHA 6 8007 3 ,X STA 2 ,X LDA ^ PHA SEC 7 8007 0 ,X LDA ^ 2 ,X SBC TAY 8 8007 1 ,X LDA ^ 3 ,X SBC PHA 9 8007 0 ,X LDA ^ 2 ,X CMP 3 8008 1 ,X LDA ^ 3 ,X CMP 5 8008 0FE ,X LDA ^ 0FF ,X ORA 8 800A 0FE ,X LDA ^ 0FF ,X ORA 7 800B 0FE ,X LDA ^ 0FF ,X ORA C 800B 0FF ,X LDA ^ PHA 0FE ,X LDA 4 800C 0FF ,X LDA PHA 0FE ,X LDA ^ 4 800C 2 ,X LDA ^ 2 # SBC N 2+ STA 5 800D 3 ,X LDA ^ 0 # SBC N 3 + STA 6 800D 0 ,X LDY 1 ,X LDA ^ 7 800D N )Y LDA ^ TAX INY E 800D N )Y LDA ^ F 800D N )Y LDA ^ 4 800E N 2+ )Y LDA ^ A 800E 0 # LDA ^ 3 800F N )Y LDA ^ IBIT # AND 6 800F 1 # LDA ^ 9 800F 0FF # LDA ^ TAY 4 8010 N 4 + )Y LDA ^ 8 8010 N 4 + )Y LDA ^ C 8010 0:00:20.2 OK CLEAR.TIME $10 FH S PHA ^ ELAPSED IP )Y LDA PHA ^ IP INC 2 8001 IP )Y LDA PHA ^ TYA 3 8003 IP )Y LDA PHA ^ INY 2 8006 IP )Y LDA PHA ^ DEY 4 8007 IP )Y LDA PHA ^ CLC 5 8007 3 ,X LDA 80 # ADC PHA ^ 6 8007 3 ,X STA 2 ,X LDA PHA ^ SEC 7 8007 1 ,X LDA 3 ,X SBC PHA ^ 9 8007 TYA PHA ^ A 8007 0FF ,X LDA PHA ^ 0FE ,X LDA 4 800C 0:00:13.4 OK CLEAR.TIME $10 FH S ADC ^ ELAPSED IP LDA 2 # ADC ^ IP STA 1 8002 IP LDA 2 # ADC ^ IP STA 8 8005 3 ,X LDA 80 # ADC ^ PHA 6 8007 102 ,X ADC ^ 102 ,X STA 7 8009 101 ,X ADC ^ 101 ,X STA 6 800C TYA N ADC ^ 0 ,X STA 2 800F N 1+ ADC ^ 1 ,X STA 4 800F 0:00:12.7 OK 78 1 RAM $10 BMOVE 87 8010 86 800F 85 800E 84 800D 83 800C 82 800B 81 800A 80 8009 7F 8008 7E 8007 7D 8006 7C 8005 7B 8004 7A 8003 79 8002 78 8001 OK CLEAR.TIME $10 FH S DUP ^ ELAPSED R> DUP ^2+ >R @ >BODY ! ; 3 8002 R> DUP ^COUNT + >R ; 9 8002 AP0 @ DUP ^! ; F 8002 DUP ^>IN @ UMIN 9 8005 DUP ^NEGATE UNDER+ UNDER+ ; A 8005 DUP ^0 B 8006 DUP ^0<> + NEGATE ROT + 5 800A DUP ^SCR ! CR ." SCR# " U. 8 800C 0:00:12.9 OK CLEAR.TIME $10 FH S DROP ^ ELAPSED DROP ^PAUSE ; 6 8001 DROP ^PAUSE ; C 8001 'STREAM DROP ^ >IN @ C/L /MOD 9 800C 0:00:11.6 OK CONSOLE
The word FH ( from here ) when editing, will add the number in SCR to the value on the stack. When loading a block, it will add the value in BLK to the number on the stack.
The heart of the editor's search function is the word MATCH . It takes the address and count of a string to search ( in this case, the address of the block buffer and $400) and the address and count of the string to find ( in this case, the contents of the editors find buffer, FBUF , a 65 byte buffer). MATCH returns the offset to the character just past the search string in the string to search and a flag.
|
|