 Post subject: A robust RAM test
PostPosted: Fri Aug 24, 2018 7:21 pm 

Joined: Wed Jul 18, 2018 12:12 pm
Posts: 96
Earlier this year I repaired a TRS-80 which wound up having a bad static RAM module. The RAM module consisted of four 2K RAMs. The correct chip selects were being generated, and the address and data lines were OK. Something internal to one of the RAM chips was causing writes to some RAM addresses to land on top of others. It just so happened that part of the cold boot process on this machine copies some code from ROM to RAM, into the same RAM chip that holds the stack, and this copy thrashed the stack. (I captured the first 7 data bus lines on my LA, using the 8th channel to look at the RD signal, which served as the clock for decoding the captured data. Thankfully someone had done a great annotated ROM dump, so it was possible to follow along with what it was doing whilst booting.)

Anyhow, a few days ago I was watching a video of someone working on an old Commodore PET which wound up having a problem with its 'extended' RAM (which I think was everything over the 16K the base model came with). See, this is 6502 related :) His RAM testing utility turned out not to test the 'extended' RAM, so he wrote his own. His initial results were somewhat confusing because, much as in my case, the RAM was not writing or reading where it was supposed to, so a simple RAM test would not discover the fault.

That got me to wondering how one would go about writing a robust RAM test. By robust I mean the test would be tailored to the particular hardware being tested. If you have an older 8 bit system with something like eight 8Kbit RAM chips, i.e. 1 bit per chip you could have one bad RAM chip cause one bit of every address to be wrong. If you had a single 32K RAM chip with the same result your determination would be different. Then of course there are a multitude of different chip select logic failures which could throw your test for a loop.

I tried doing some searching here on the forum and elsewhere but did not come across any good references as to what would constitute a good RAM test, i.e. how to construct your test in such a way as to hopefully not be led down the primrose path.


 Post subject: Re: A robust RAM test
PostPosted: Fri Aug 24, 2018 8:17 pm 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Doing a RAM test on a 6502 system is somewhat more difficult than with any (?, perhaps most) other 8 bit CPUs because the 6502 lacks a 16 bit register. In order to address any location you need at least two reliable RAM cells in page 0. A workaround could be a self-modifying piece of code running in the RAM of a 6532, mapped anywhere. Then you would use something like LDA absolute and STA absolute and modify the addresses with INC and DEC.

There are various strategies to test RAM - difficult to say how robust they are. A common strategy is called "walking bit(s)" (or similar, it's quite a while ago). The function is simple: clear all RAM to zero (and verify that). Then set the LSB of the first RAM cell. Verify that it is set and that all other locations are still zero. Clear carry and ROL through all (°) RAM locations. Verify again. Repeat until this single bit is shifted out. Then repeat the whole thing with the first two bits set. And so on. Once you reach $FF everywhere you may continue with certain patterns like $AA, $55, $33, $CC, $81, $18 and the like. If you have to do this for, say, 32K RAM on a 1 MHz system this may take hours :shock:

(°): ROLing through all RAM locations is time consuming and additionally it "refreshes" the RAM more frequently than perhaps intended. So this is something questionable.


Regards
Arne


 Post subject: Re: A robust RAM test
PostPosted: Fri Aug 24, 2018 8:32 pm 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Digging into an old archive I found the attached "RAM TEST PROGRAM NACH ELEKTOR APRIL 1982"
ELEKTOR was a monthly electronics publication in Europe. They built several microprocessor systems with various CPUs; one was the so-called "JUNIOR COMPUTER" using a 6502 :D
Whether the program was for that machine or something older - I don't know. It is possible that I just copied their test mechanism into some 6502 code (although I myself usually avoid program code in page 0). You may take it as "an idea". No warranties :lol:

Attachment:
RAMTEST.ASM [2.22 KiB]
Downloaded 291 times


Ah, it uses at least one subroutine (OUTXAH) from the SYM-1. This routine prints a 16 bit hex number (high byte in X, low byte in A) IIRC.


Regards,
Arne


 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 8:34 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
GaBuZoMeu wrote:
Doing a RAM test on a 6502 system is somewhat more difficult than with any (?, perhaps most) other 8 bit CPUs because the 6502 lacks a 16 bit register. In order to address any location you need at least two reliable RAM cells in page 0.

Any 6502 (or 65C802/65C816 in emulation mode) RAM test is going to need some zero page (ZP) storage to test absolute RAM, which means ZP itself has to first be qualified, using only the MPU's registers during testing. Fortunately, MPU registers alone can touch all of ZP.

The first step would be to write at least two different checkerboard patterns (conventionally, %10100101 and %01011010) to all ZP locations, execute a short delay and then compare the accumulator (which has the test pattern) against all test locations. If that procedure succeeds, it tells us ZP can be addressed and apparently is able to hold data.
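A minimal sketch of that checkerboard step, written in the same style as the code later in this post. It is an illustration under assumptions, not code from the thread: the labels, the length of the delay loop and the use of carry to report the result are all made up for the example, and it assumes a plain 6502 with RAM at $00 and $01 (not a 6510, where those are I/O port registers). Only A, X and Y are used, so no RAM has to be trusted yet.

```asm
         *=$2000
;
         lda #%10100101        ;first pattern
;
pass     ldx #0                ;ZP location index
;
fill     sta $00,x             ;write pattern to every ZP cell
         inx
         bne fill              ;X wraps to zero after 256 cells
;
         ldy #0                ;short delay (assumed: 256 iterations)
delay    dey
         bne delay
;
verify   cmp $00,x             ;cell still holds the pattern?
         bne badram            ;no, bad RAM...abort
         inx
         bne verify
;
         eor #%11111111        ;invert pattern (%01011010)
         cmp #%10100101        ;back at the first pattern?
         bne pass              ;no, run the second pass
;
         clc                   ;good ZP RAM
         brk
         nop
;
badram   sec                   ;bad ZP RAM
         brk
         nop
```

Because every cell holds the same value during a pass, this step alone cannot notice two addresses landing on the same cell; it only shows that each cell can hold both patterns.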

The next test would be walking bits. Walking bits would take two forms to be thorough. The first would write %00000000 into the test location, set carry and then rotate 9 times. Here it gets interesting. Ideally, the bit shifted from carry into the test location will be the only bit circulating in that memory cell. That being the case, carry should be set when the loop exits. However, that doesn't prove the memory cell didn't mess up. The final test would be to load the accumulator from the test location to prove that it still contains %00000000. If it doesn't, a bit ended up getting "stuck" on and the test has failed. The following is code that does this test:

Code:
         *=$2000
;
         ldx #0                ;ZP location index
         txa                   ;initialize
         sec                   ;test "bit"
;
loop0010 sta $00,x             ;clear test cell
         ldy #9                ;bit shift iterations
;
loop0020 rol $00,x             ;rotate away
         dey                   ;step counter
         bne loop0020          ;not done
;
         bcc badram            ;RAM defective...abort
;
         lda $0,x              ;any "stuck" bits?
         bne badram            ;yes, bad RAM...abort
;
         inx                   ;we done?
         bne loop0010          ;no, do next
;
         clc                   ;good ZP RAM
         brk                   ;done
         nop
;
badram   sec                   ;bad ZP RAM
         brk                   ;done
         nop

Naturally, the above code would be in ROM. Execution time according to the Kowalski simulator is 29,959 clock cycles.

The next test would be inverted walking bits. In this one, the pattern %11111111 is written into the test location, carry is cleared and 9 rotations are executed. If carry is clear when the loop exits, the test location is incremented, which should change its contents to %00000000. If it doesn't, there is a bit stuck off and the test has failed. The following is code that does this test:

Code:
         *=$2000
;
         ldx #0                ;ZP location index
         clc                   ;test "bit"
;
loop0030 lda #%11111111        ;initialize...
         sta $00,x             ;test cell
         ldy #9                ;bit shift iterations
;
loop0040 rol $00,x             ;rotate away
         dey                   ;step counter
         bne loop0040          ;not done
;
         bcs badram            ;RAM defective...abort
;
         inc $0,x              ;any "stuck" bits?
         bne badram            ;yes, bad RAM...abort
;
         inx                   ;we done?
         bne loop0030          ;no, do next
;
         brk                   ;good ZP RAM...done
         nop
;
badram   sec                   ;bad ZP RAM
         brk                   ;done
         nop

Execution time is 30,979 cycles. All ZP locations will be cleared when testing has successfully completed.

Beyond that, the really paranoid among us would do individual address line testing. That would mean writing different patterns into the ZP addresses that correspond to only one bit in the LSB of the address being set. Hence the test addresses would be $0000, $0001, $0002, $0004, $0008, $0010, $0020, $0040 and $0080. After a short delay, each of those locations would be checked to see if the unique pattern for that location is present. Needless to say, a compare error would mean there's a major hardware malfunction.
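The address line test described above could look something like the following sketch. It is only an illustration: the labels, the delay length and the choice of pattern (the inverted address, so that each of the nine locations gets its own value) are assumptions, not something taken from the thread. As before, only MPU registers are used.

```asm
         *=$2000
;
         ldx #%10000000        ;start with A7 set
;
wrloop   txa                   ;build the pattern from the address...
         eor #%11111111        ;so each location gets its own value
         sta $00,x
         txa
         beq wrdone            ;address $00 was the last one
         lsr a                 ;next lower address line
         tax
         jmp wrloop
;
wrdone   ldy #0                ;short delay (assumed: 256 iterations)
delay    dey
         bne delay
;
         ldx #%10000000        ;same address sequence again
;
vfloop   txa
         eor #%11111111        ;expected pattern
         cmp $00,x
         bne badram            ;wrong value, addressing fault
         txa
         beq good              ;all nine locations checked
         lsr a
         tax
         jmp vfloop
;
good     clc                   ;address lines A0-A7 look sane
         brk
         nop
;
badram   sec                   ;address line fault
         brk
         nop
```

If, say, A3 were stuck low, the write to $08 would land on $00 and then be overwritten by the write to $00, so the later read of $08 would return the wrong pattern.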

If ZP survives these tests absolute RAM testing can commence, since ZP should be trustworthy for pointers and counters. The same test regimen could be used on absolute RAM, assuming you are willing to endure what could be a lengthy test, even on a 65C02 system running at 14 MHz. The first step would be to qualify the stack, using the same regimen used on ZP. If stack RAM checks out, your test functions can safely call subroutines and otherwise make use of the stack. The address line test would be similar to that for ZP, except the address pattern would be $0100, $0200, $0400, $0800, $1000, $2000, $4000 and $8000.

The address line tests can also be inverted so that only one of the address lines is at logic zero.

In a 65C816 system with extended RAM, detailed testing could take a long time. There are two approaches to testing extended RAM: using 16-bit indexing and incrementing the DB register each time the end of a bank is reached, or using a 24-bit direct page (DP) pointer to touch extended RAM. The former method should be faster, since all 24-bit addressing modes incur a one clock cycle penalty compared to their 16-bit counterparts. Also, the first method allows the RAM in each bank to be accessed with DP addressing, since the 16-bit index will effectively make direct page appear as 64KB of RAM. I haven't actually tested either method to date, but will once POC V2 is fully operational.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 9:46 am 

Joined: Thu Aug 23, 2018 7:10 am
Posts: 89
Location: CyberBunker
as for 65256 srams, those can just be tested externally by dropping them into an at28c256 capable eprom programmer, writing a file to it, leaving it on for a while and reading/verifying the file back.

the on-board full test passes each byte of the first 32k, including the stack area, through all 256 values before moving on to the next. sufficient delays and clearing of the remaining bus capacitance voltages (in case of 'chip not present' errors) are taken care of by the 6502 simply loading the next instruction, so no need to specifically handle that: 6502s do not have any cache, so in between each memory operation there are always a bunch of different bytes going over the bus anyway (the next opcode and its parameters) ;) the full test takes about 2 minutes on a 1mhz system and around 30 seconds on an 8mhz system. should an error be encountered it simply stops at that point and prints an error followed by the available ram. should zero page itself be defective, it'll drop out either immediately or at the first iteration of the loop anyway. hmm. i actually think our printstring, printhexbyte, ttychar and ttyenter routines still work without ram; other than one of them (don't quite remember which one) they are also all reentrant. now, what you can't do is JSR to the memory test, as it wipes the stack (twice actually) and resets the stack pointer, should it have been called some other way than by /RESET going low. actually just jumping to the reset address re-initializes everything like it was a cold start, so also the memory test and the stack pointer ;) the memory test can potentially be skipped on warm starts by checking if the value in the location counter is still the value it should be, in which case the memory test was already passed.

ah yeah. printhexbyte uses the stack, therefore ram, (pha high nibble -> pla -> second nibble) to store intermediary results, all of those functions are subroutines called with JSR and therefore won't return properly if there is no working ram at the stack page. so basically 'no ram at all?' no output on the terminal. just death (it won't even RTS properly from initializing the ACIAs with no ram on the stack page)

tbh tho, the ram test is there to provide some guarantee on the stable operation of the unit, if it's broken in any way it simply should not continue boot whatsoever, halt and blink some led rather than run the risk of it doing unexpected stuff (randomly opening the gate or turning off the lights or something ;) . it's not so much about informing users about anything.

the ramtest also securely wipes out all data that may remain in (some) memory modules from the last powerdown/reset (mainly dram scammingly marketed as sram - not that we buy such trash but it could potentially be used) it's actually more about that than about informing the user about some broken chips


as for on board ram tests in full test mode: (cycles each byte through all 256 values, $00 to $ff, then moves on to the next, leaving memory zeroed out at the end)

.COLDSTART
SEI
CLD
LDX #$FF
TXS
JSR !ACIAINIT
JSR !TTYCLEAR

....

; DISPLAY MEMORY TEST DATA
LDX #>TXTSTARTINGMEMORYTEST
LDY #<TXTSTARTINGMEMORYTEST
JSR !PRINTSTRING
JSR !TTYENTER

;$00,$01,$02,$03 ARE SKIPPED AS THESE ARE INTERNAL I/O PORTS ON EITHER THE 6509 OR 6510 CPUS

; RUN MEMORY TEST
LDA #$00
STA $04 ; TEST RAM LOCATION LSB - HOLDS RESULT
STA $05 ; TEST RAM LOCATION MSB - HOLDS RESULT
STA $06 ; TEMPORARY TEST VALUE VARIABLE
; INITIALIZE Y
LDY #$07 ; START AT THE BYTE AFTER THE UP TO 4 6509/6510 MMU REGISTERS AND OUR OWN COUNTER INCLUDE REST OF ZEROPAGE AND STACK
.MEMTESTLOOP
LDA $06 ; TEST VALUE LOCATION
STA ($04),Y
CMP ($04),Y
BNE ~MEMTESTEND
INC $06 ; ROTATE EACH BYTE THROUGH ALL VALUES FIRST, AS OTHERWISE WE OVERWRITE OUR VARIABLES IN PAGE ZERO
BNE ~MEMTESTLOOP
LDA $06
STA ($04),Y ; SET RAM BYTE TO ZERO BEFORE NEXT ONE
INY ; NEXT BYTE WITHIN MEMORY PAGE
BNE ~MEMTESTLOOP
INC $05 ; NEXT MEMORY PAGE
LDX $05
CPX #$80 ; 32KB RAM $0000-$7FFF
BNE ~MEMTESTLOOP
.MEMTESTEND
STY $04 ;STORE THE LSB SO $04-$05 CONTAINS THE HIGHEST WORKING RAM ADDRESS PLUS 1 INCLUDING THE LSB
; RESET STACK ANYWAY IN CASE OF MEMORY FAILURES BEFORE CALLING ANY SUBROUTINES
LDY #$00
LDA #$00 ; STACK INITIALIZE VALUE
.STACKINITLOOP
STA $0100,Y ; STACK LOCATION
INY
BNE ~STACKINITLOOP
LDX #$FF ; RESET STACK POINTER
TXS
LDX #>TXTMEMORYTEST
LDY #<TXTMEMORYTEST
JSR !PRINTSTRING
LDA #$80
CMP $05
BEQ ~MEMTESTOK
LDX #>TXTFAIL
LDY #<TXTFAIL
JMP !MEMTESTSTATUS
.MEMTESTOK
LDX #>TXTOK
LDY #<TXTOK
.MEMTESTSTATUS
JSR !PRINTSTRING
JSR !TTYENTER

LDX #>TXTMEMDETECTED
LDY #<TXTMEMDETECTED
JSR !PRINTSTRING
LDA $05
JSR !PRINTHEXBYTE
LDA $04
JSR !PRINTHEXBYTE
JSR !TTYENTER

JSR !STARTMONITOR


 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 12:24 pm 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
Thanks BDD,

It hadn't occurred to me that the zero page is covered by the index registers and could be verified this way.

There is one sort of error the test doesn't recognize: address errors (within the RAM chip). You are treating all bits - and in paranoid mode all bit patterns - of just one cell. You don't check whether working on cell X has a side effect on another location.

That was why I zeroed all RAM - but now I think it would be sufficient to do this for a region the size of the chips in question - in Jeff_Birt's case 2K.

This still assumes no design flaw and no errors in bus drivers (if any) or the chip select hardware. Malfunctions in that circuitry can be verified easily (and fast): zero all locations, then invert one RAM-chip-sized region, verify that all others are still 0, and so on.


Regards
Arne


 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 8:52 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
GaBuZoMeu wrote:
There is one sort of error the test doesn't recognize: address errors (within the RAM chip). You are treating all bits - and in paranoid mode all bit patterns - of just one cell. You don't check whether working on cell X has a side effect on another location.

Due to the way individual cells are addressed within the RAM, the single address line test would likely ferret out such an error. There is no one absolutely foolproof RAM test, which is why you have to balance thoroughness against time consumption. My experience is that if the checkerboard and walking bits tests as I described are successful, the likelihood of an undiscovered defect in RAM is quite small.

Quote:
That was why I zeroed all RAM - but now I think it would be sufficient to do this for a region the size of the chips in question - in Jeff_Birt's case 2K.

In my POC designs' firmware, the absolute memory test is non-destructive. The reason is in the event I have to hard-reset following a crash, the RAM image of what was running will remain for a postmortem. In POC V1, stack RAM, which is not at $000100 as it would be with a 65C02, is destructively tested. In POC V2, the RAM at $00D800-$00DEFF is destructively tested, as the firmware's direct page and stack are in that range. In both cases, the physical zero page is destructively tested, as described in my previous post.
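The post does not say how the non-destructive test works internally. One common way to do it, shown here purely as an assumed sketch and not as the POC firmware's method, is to save each byte in a register, write and verify its complement, then restore and verify the original. The pointer location ptr, the page under test ($0200) and the labels are hypothetical, and the page must not hold the test code, the pointer or the stack.

```asm
ptr      =$00                  ;ZP pointer (hypothetical, ZP already qualified)
;
         *=$2000
;
         sei                   ;cells are briefly corrupted, keep IRQs out
         lda #$00
         sta ptr               ;pointer LSB
         lda #$02              ;page under test (assumption)
         sta ptr+1
         ldy #0
;
test     lda (ptr),y           ;fetch current contents
         tax                   ;keep a copy in X
         eor #%11111111        ;complement every bit
         sta (ptr),y
         cmp (ptr),y           ;complement held?
         bne badram
         txa
         sta (ptr),y           ;put the original back
         cmp (ptr),y           ;original held?
         bne badram
         iny
         bne test              ;do all 256 bytes of the page
;
         clc                   ;page is good, contents preserved
         brk
         nop
;
badram   sec                   ;bad RAM (failing cell is not restored)
         brk
         nop
```

Every bit is exercised as both a zero and a one, but like any single-cell test this says nothing about one address disturbing another.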

Quote:
This still assumes no design flaw and no errors in bus drivers (if any) and the chip select hardware.

The single-address-bit test will usually expose such errors. Once the design has been proved in this fashion such a test is not routinely required, since glue logic hardware generally doesn't fail the way DRAM does (SRAM seldom fails once it gets past the "infant mortality" phase).

 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 8:58 pm 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
cb3rob wrote:
.COLDSTART
SEI
CLD
LDX #$FF
TXS
JSR !ACIAINIT
JSR !TTYCLEAR

...etc...

Source code is easier to read if you surround it with the "code" and "/code" markers.

Code:
.COLDSTART
SEI
CLD
LDX #$FF
TXS
JSR !ACIAINIT
JSR !TTYCLEAR

....etc....

 Post subject: Re: A robust RAM test
PostPosted: Sat Aug 25, 2018 11:05 pm 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
BigDumbDinosaur wrote:
In my POC designs' firmware, the absolute memory test is non-destructive. The reason is in the event I have to hard-reset following a crash, the RAM image of what was running will remain for a postmortem. In POC V1, stack RAM, which is not at $000100 as it would be with a 65C02, is destructively tested. In POC V2, the RAM at $00D800-$00DEFF is destructively tested, as the firmware's direct page and stack are in that range. In both cases, the physical zero page is destructively tested, as described in my previous post.
I remember reading something about RAM verification during startup in your POC description. I am so confident in the immediate response of my various 6502 boards that I would never delay it by running a RAM test unprompted :) . So far I have only once run a RAM test - just to test the test :roll:
And so far I have only come across one RAM chip that failed to work for no known reason.

Quote:
...since glue logic hardware generally doesn't fail
Older PALs and PROMs that use fuse techniques tend to malfunction after years - it appears that the remains of the burned fuse can migrate and refill the gaps they came from. Sadly, I cannot remember the source of that information.
Quote:
...the way DRAM does (SRAM seldom fails once it gets past the "infant mortality" phase).
I have heard that DRAMs are less reliable. Do you have any sources for this information?


Regards
Arne


 Post subject: Re: A robust RAM test
PostPosted: Sun Aug 26, 2018 1:35 am 

Joined: Sun Jun 30, 2013 10:26 pm
Posts: 1950
Location: Sacramento, CA, USA
I was thinking that one could employ two pseudo-random number generators (one for the address and other for the value). The type of PRNG that cycles through a given set of values exactly once before repeating. Set them loose with known seeds, writing to RAM until the address PRNG cycles around, then reset the seeds and cycle through once again, this time comparing to RAM instead of writing to it. Then two more passes with the same seeds but using the one's complement for the value, for a total of "only" four passes. What types of bit and/or addressing errors would be able to slip through that, assuming that you can easily tailor an address PRNG with a cycle that matches the footprint of the RAM you are testing?
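To make the idea concrete, here is a rough sketch for a single 256-byte page. Everything in it is an assumption made for illustration: the page address, the seeds, the ZP location invert, the labels, and the use of an 8-bit Galois LFSR with tap mask $B8 (which should give a maximal period of 255) for both generators. It also assumes ZP and the stack have already been qualified, since it uses a ZP variable and JSR.

```asm
page     =$0200                ;page under test (assumption)
invert   =$00                  ;ZP: $00 = true values, $FF = inverted
aseed    =$01                  ;address generator seed (nonzero)
dseed    =$a5                  ;value generator seed (nonzero)
poly     =$b8                  ;tap mask, assumed maximal-length
;
         *=$2000
;
         lda #$00
         sta invert
;
pass     ldx #aseed            ;X = address generator state
         ldy #dseed            ;Y = value generator state
;
wr       tya
         eor invert
         sta page,x            ;write pseudo-random value
         jsr step
         cpx #aseed            ;address sequence wrapped?
         bne wr
;
vf       tya                   ;both generators are at their seeds again
         eor invert
         cmp page,x            ;same value still there?
         bne badram
         jsr step
         cpx #aseed
         bne vf
;
         lda invert
         eor #%11111111
         sta invert
         bne pass              ;second round uses inverted values
;
         clc                   ;four passes done, page is good
         brk
         nop
;
badram   sec                   ;bad RAM
         brk
         nop
;
step     tya                   ;advance value generator
         lsr a
         bcc step10
         eor #poly
step10   tay
         txa                   ;advance address generator
         lsr a
         bcc step20
         eor #poly
step20   tax
         rts
```

An LFSR never reaches state zero, so offset 0 of the page is not visited and would need separate handling. Covering a larger RAM footprint would need a wider address generator matched to that footprint.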

_________________
Got a kilobyte lying fallow in your 65xx's memory map? Sprinkle some VTL02C on it and see how it grows on you!

Mike B. (about me) (learning how to github)


 Post subject: Re: A robust RAM test
PostPosted: Sun Aug 26, 2018 2:39 am 

Joined: Wed Mar 01, 2017 8:54 pm
Posts: 660
Location: North-Germany
This is an interesting strategy. But I am by no means enough of a math mage to prove or refute it.

Your PRNGs generate certain patterns, but only two (true and inverted) for each memory location. Any cell (single bit) that is stuck will be detected in either phase 2 or phase 4. But if an internal addressing error maps just two or four external addresses onto the same bit, then it could slip through, I think :twisted:

So the question is whether such an error could be real :shock: From a technical point of view I assume such a malfunction cannot occur for only two or four bits. And the more bits are affected, the smaller the chance of it staying undiscovered. :)


Regards
Arne


 Post subject: Re: A robust RAM test
PostPosted: Sun Aug 26, 2018 4:11 am 

Joined: Thu May 28, 2009 9:46 pm
Posts: 8514
Location: Midwestern USA
GaBuZoMeu wrote:
BigDumbDinosaur wrote:
Quote:
...since glue logic hardware generally doesn't fail

Older PALs and PROMs that use fuse techniques tend to malfunction after years - it appears that the remains of the burned fuse can migrate and refill the gaps they came from. Sadly, I cannot remember the source of that information.

I was referring to discrete logic, which is manufactured rather than programmed.

Speaking of PAL failures, the earlier versions of the Commodore 64 had a PLA that was an "off-the-shelf" product produced by Signetics, if memory serves me correctly. Those units seldom had problems as they aged. However, Jack Tramiel was relentless about reducing the cost of the C-64 and the result was MOS Technology, by then the "Commodore Semiconductor Group" (CSG), decided to concoct an in-house version of the Signetics PLA. Unfortunately, while CSG's design was okay, the fab process wasn't and the replacement PLAs proved to be nowhere near as reliable as the Signetics products.

A C-64 that failed usually fell victim to a power supply malady or a PLA that could no longer remember what it was supposed to do. Efforts to replace the PLA with other types of logic have been checkered, as the C-64 is a timing quagmire, typical of many home computers of the era. The best results were obtained by using a specific ST Micro 45ns PROM (OTP EPROM) as a poor-man's PLA. That PROM was a rarity among PROMs, in that it would not glitch the output (data) pins when the input pattern on the address pins changed.

Quote:
...the way DRAM does (SRAM seldom fails once it gets past the "infant mortality" phase).

I have heard that DRAMs are less reliable. Do you have any sources for this information?

None to which I can point you, although I know from experience that the older generations of DRAM occasionally suffered single-bit failures in the field. Also, DRAM is sensitive to background radiation from space, which is why ECC RAM was developed for server use.

 Post subject: Re: A robust RAM test
PostPosted: Mon Aug 27, 2018 1:41 am 

Joined: Mon May 12, 2014 6:18 pm
Posts: 365
My Commodore 64 has stopped working, so I have been watching repair videos lately. This one shows a guy using a "dead test ROM" to figure out exactly which subsystem is malfunctioning:

https://www.youtube.com/watch?v=Znh6tyVLG-E

My idea would be to unplug the 6502 and take control of the bus with a $1-2 5v microcontroller. The AVR ones can be programmed with SPI, so you could even use your (other) 6502 system to program them. You would have to obey the clock timing the system is generating, but you could have complete control of everything and not have to worry about whether zp was bad or anything like that. Add a $1-2 LCD and you have a Commodore tester for $3. This would be more robust than a test cart since it can still communicate even when the video or audio is dead, and you don't have to count screen flashes to diagnose the problem.
EDIT: And you can calculate checksums of the ROMs or even check the whole ROM if you have enough flash in the microcontroller to hold a copy.


 Post subject: Re: A robust RAM test
PostPosted: Mon Aug 27, 2018 1:00 pm 

Joined: Wed Jul 18, 2018 12:12 pm
Posts: 96
Thanks for all the responses.

W.R.T. the walking bit test. I'm at a loss to understand why this is 'better' than simply the alternating checkerboard patterns, i.e. with the alternating checkerboard you have tested that each bit can be set to both a zero and a one. It is a simple test though, just bit shifting through, so you are reading/writing to that address multiple times so maybe that is the advantage? This is the sort of thing that would have to be tailored to the system being tested to be of the greatest benefit.

The single address test I think would be a big benefit to older systems that very often use banks of 1Kbitx8, 8kbitx8 where address decoding logic failures cause odd problems. Even the internal addressing failure on the 2K SRAM I had was hard to find because it overwrote the stack area.

I would not suggest an exhaustive RAM test as described here as a boot up procedure but rather a diagnostic procedure for an ailing system.

As far as RAM and decoding logic not failing, it sure does. These are common problems on 30-40 year old systems. Both SRAM and DRAM and the decoding logic do fail and being able to diagnose the problem accurately is important.

For systems like the C64 there is a 'dead test' cartridge which tries to test the DRAM as one of the first things it does. The results are mixed to say the least as the address decoding logic, etc. failures can mimic the same symptoms so it is not as simple as plugging the cartridge in and replacing the chip(s) it says are bad. No matter if you use a cartridge or something fancier you still have to use the same address decoding logic so the results are likely to not be much better other than having an alternative form of indication which could be done by simply blinking an LED on the cartridge. It might be possible to 'hold' the bus in a given state though to allow for interactive testing, i.e. your smart diagnostic cart would keep trying to access the same memory location so you can check the state of the chips involved to help track down a problem. Something like a bad PLA can give the same symptoms of other parts failing and it might change as the system warms up which adds more fun to the mix.


 Post subject: Re: A robust RAM test
PostPosted: Mon Aug 27, 2018 1:50 pm 

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
A checkerboard couldn't tell if you only had two bytes of memory: so any stuck-at fault in the address bus other than A0 can't be detected.

Also, if you only ever write AA and 55, you could have a short between two even or two odd data lines and you wouldn't see it. (Well, not a short as such, because your program wouldn't run, but a short internal to the RAM.)
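A walking-one test with an immediate read-back is one way to catch that kind of coupling, since only one data bit is high at a time. The following is a sketch for zero page, using registers only; the labels and the carry-flag result convention are assumptions for the example.

```asm
         *=$2000
;
         ldx #0                ;ZP location index
;
cell     lda #%00000001        ;start with bit 0
;
nextbit  sta $00,x             ;write a single set bit
         cmp $00,x             ;exactly that bit reads back?
         bne badram            ;no, stuck or coupled bit
         asl a                 ;walk the bit left
         bne nextbit           ;until it falls off the top
;
         sta $00,x             ;leave the cell cleared
         inx                   ;next cell
         bne cell
;
         clc                   ;good ZP RAM
         brk
         nop
;
badram   sec                   ;bad ZP RAM
         brk
         nop
```

If two data bits were coupled inside the RAM, writing %00000100 might read back as, say, %00010100, and the compare would fail. A walking-zero variant (a single clear bit in a field of ones) would cover the opposite polarity.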

The things to check are things like: are there 8 independent bits of data or is there some coupling; is there as much memory as I expect, or is there some aliasing; when I write to one location, is only one location affected; can I always write a zero in any given situation as readily as a one, and vice versa; do nearby writes perturb a stored value; does a value persist over some tens of milliseconds at least.
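For the aliasing question in particular, one option is to fill the whole range with a value derived from each cell's address and only verify after the fill is complete. The sketch below assumes 32KB of RAM, a ZP pointer at $00 that has already been qualified, and testing from page $02 upward; the names ptr, ramlo and ramhi and the choice of pattern (address LSB exclusive-ORed with address MSB) are all made up for the example.

```asm
ptr      =$00                  ;ZP pointer (assumed already verified)
ramlo    =$02                  ;first page to test (assumption)
ramhi    =$80                  ;first page past RAM (assumption: 32KB)
;
         *=$2000
;
         lda #0
         sta ptr               ;pointer LSB stays zero, Y indexes the page
         lda #ramlo
         sta ptr+1
         ldy #0
;
fill     tya                   ;pattern = address LSB...
         eor ptr+1             ;exclusive-ORed with address MSB
         sta (ptr),y
         iny
         bne fill
         inc ptr+1             ;next page
         lda ptr+1
         cmp #ramhi
         bne fill
;
         lda #ramlo
         sta ptr+1             ;rewind (Y is zero here)
;
check    tya
         eor ptr+1             ;expected pattern
         cmp (ptr),y
         bne badram            ;cell was overwritten or never written
         iny
         bne check
         inc ptr+1
         lda ptr+1
         cmp #ramhi
         bne check
;
         clc                   ;no aliasing seen
         brk
         nop
;
badram   sec                   ;aliasing or bad cell
         brk
         nop
```

Two addresses that differ in exactly one address bit always get different patterns, so a single stuck or missing address line should show up. An 8-bit pattern cannot be unique across more than 256 addresses, though, so an alias that flips the same bit position in both address bytes would go unnoticed; a second pass with a different pattern function would narrow that gap.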

It's true though that you would generally be doing white-box testing, where you have some idea of the implementation and what the likely failures are.

