I am working on a project to disassemble an 8k cartridge. The initial disassembly is done and the resulting source code assembles. The hashes match, so it technically works. But there aren't many macros yet and I'm still a long way from fully understanding the code.
Early on I noticed there were several variants of the cartridge, at least according to the TOSEC database. Even with the cartridge header stripped off, several dumps had different SHA-1 hashes. Some of these will no doubt turn out to be minor dump errors, but I suspect at least a few of them are bug fixes between production runs. I'd like to sort that all out, if I can.
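As a rough sketch of how the dumps could be sorted into variants, the following groups them by the SHA-1 of their payload and lists the first offsets at which two payloads differ. The `header_size` parameter and the file names are assumptions for illustration; the right header size depends on the container format of each dump.

```python
import hashlib
from collections import defaultdict

def payload_sha1(dump, header_size=0):
    """Return the SHA-1 hex digest of a dump with its first header_size bytes removed."""
    return hashlib.sha1(dump[header_size:]).hexdigest()

def group_by_hash(dumps, header_size=0):
    """Map each distinct payload hash to the sorted names of the dumps sharing it.

    `dumps` is a dict of {name: bytes}; the names are whatever you choose.
    """
    groups = defaultdict(list)
    for name, data in dumps.items():
        groups[payload_sha1(data, header_size)].append(name)
    return {digest: sorted(names) for digest, names in groups.items()}

def first_differences(a, b, limit=16):
    """Return up to `limit` offsets at which two payloads differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y][:limit]
```

A single differing byte is more likely to be a dump error or a small patch; a run of differences clustered in one routine looks more like a revision.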
I read "Using a running VICE session for development" and realized just how powerful the VICE machine language monitor is. It quickly became clear that, with effort, it could be used for very rigorous unit testing.
So, I'm starting to think about how to use a unit test harness. Something like the Test Anything Protocol (TAP) appeals to me: it's text based, human readable, and easy to log. I realize 99% of this would be an effort in higher level scripting, likely with Bash or Perl. I could use TAP to check my macros and various chunks of code. With effort, I could probably integrate the VICE ML monitor too. Technically my assembler, 64tass, has .assert and .check functionality, but its use is undocumented and discouraged.
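For reference, TAP output is just a plan line (`1..N`) followed by one `ok` or `not ok` line per test. A minimal sketch of a producer, assuming each check is a callable that returns a truthy value on success (the function names here are made up for illustration):

```python
def run_checks(checks):
    """Run (description, callable) pairs; a check passes if it returns a truthy value."""
    results = []
    for description, check in checks:
        try:
            passed = bool(check())
        except Exception:
            passed = False          # an exception counts as a failed test
        results.append((description, passed))
    return results

def tap_report(results):
    """Format (description, passed) pairs as Test Anything Protocol lines."""
    lines = ['1..%d' % len(results)]
    for number, (description, passed) in enumerate(results, start=1):
        status = 'ok' if passed else 'not ok'
        lines.append('%s %d - %s' % (status, number, description))
    return '\n'.join(lines)
```

Because the output is plain text, any TAP consumer can summarize a run regardless of what language produced it.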
Is anyone else leveraging unit tests to bug hunt or sanity-check their code? What are some tools and techniques I should be aware of? Any pointers and suggestions would be welcome. Also, even though their use is discouraged due to a pending syntax change, do any of you know how to get .assert and .check to work?
What Sorts of Tools Do You Use to Unit Test Your Code?
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
Yes, I extensively unit test all my assembly code. I use the pytest framework in Python because it's by far the best unit test framework of the many dozens I've seen and the several I've written in the last twenty years.
At the moment you can find all of what's discussed below in my 8bitdev repo.
In my system, the file is first assembled with an assembler of choice. Currently my (rather horrible) top-level build script and the loaders support The Macroassembler AS and the ASxxxx assembler suite, but others would be easy enough to add. (The main work is in writing the code to read your assembler's symbol table output.) Then the unit test framework starts and, for each test, sets up a CPU simulator (currently available are py65 for 6502 and my own for 6800), loads the object file into it, loads the symbol table, and runs the test.
Here's a sample set of 10 unit tests for a 6502 routine called `bi_readhex`, which, given a pointer to an ASCII representation of a hexadecimal number, converts it to a "bigint" (arbitrary-precision) binary number and stores that in an output buffer.
Code: Select all
# Buffers used for testing deliberately cross page boundaries.
INBUF = 0x6FFE
OUTBUF = 0x71FE

@pytest.mark.parametrize('input, output', [
    (b'5', b'\x05'),
    (b'67', b'\x67'),
    (b'89A', b'\x08\x9A'),
    (b'fedc', b'\xFE\xDC'),
    (b'fedcb', b'\x0F\xED\xCB'),
    (b'80000', b'\x08\x00\x00'),
    (b'0', b'\x00'),
    (b'00000000', b'\x00'),
    (b'087', b'\x87'),
    (b'00000087', b'\x87'),
])
def test_bi_readhex(m, R, S, input, output):
    print('bi_readhex:', input, type(input), output)
    m.deposit(INBUF, input)
    m.depword(S.buf0ptr, INBUF)
    m.depword(S.buf1ptr, OUTBUF)
    size = len(output) + 2  # length byte + value + guard byte
    m.deposit(OUTBUF, [222] * size)  # 222 ensures any 0s really were written
    m.call(S.bi_readhex, R(a=len(input)))
    bvalue = m.bytes(OUTBUF+1, len(output))
    assert (len(output), output, 222,) \
        == (m.byte(OUTBUF), bvalue, m.byte(OUTBUF+size-1))
Some notes to help explain this:
1. The test is parametrized, allowing me to use the same code body for many tests. The `input` and `output` parameters are specified right there in the decorator; the other three parameters are `m`, the simulated machine, `S`, the symbol table loaded from the assembler output, and `R`, a class allowing me to construct "register set" objects (there will be more on this below). All three of those are "fixtures"; simply adding an `m` to the parameter list tells pytest to go find the setup code for the simulated machine, run it, and pass in the object it produces.
2. The `print` statement prints to stdout; this is captured by pytest and won't be shown unless the test fails. (Though you can ask it to show output even from successful tests if you like.)
3. You can see that there are functions to deposit bytes and words into the simulator's memory. Here this is used to set up the input buffer and the pointers to the input and output buffers. `INBUF` and `OUTBUF` are just the constants defined earlier in the test code. `buf0ptr` and `buf1ptr` are symbols in the assembly code; `S.buf0ptr` returns the value of `buf0ptr`, which in this case is the address in memory where we store the pointer to that buffer.
4. `m.call()` starts executing code in the simulator; it starts at the given address (the `bi_readhex` symbol, here) and counts JSRs and RTSs until it finds the final RTS, where it stops and returns, unless it encounters a BRK instruction, in which case a (Python) exception will be thrown. (The list of "stop" opcodes can be specified, as can a different limit on the number of instructions to execute before throwing an exception.) If your JSRs and RTSs don't match, there are other ways of calling the code and running it to a given point, exiting without an exception on encountering a given opcode, etc.
5. `m.call` also takes a register set (which includes flags); here you can see that we set only register A, loading it with the length of the input buffer.
6. After it returns, we fetch some bytes from the simulator's memory and then assert that various values are what we expect them to be. There's almost never any need to write your own assertion functions; simply `assert EXPRESSION` and if it fails pytest will take it apart and show you the pieces, even telling you things like which individual elements in a list (or in this case, a sequence of bytes) are different from what's expected. That's why I can combine all my values above into 3-tuples and compare them; pytest will tell me which individual values in the tuples did not match and drill down even further into those if they're structured values.
This test unfortunately doesn't demonstrate register/flag comparisons, but those are done with objects constructed with R(), which can have "don't care" values to be used in comparisons. So typically I'd do something like `assert R(x=0x33, Z=1) == m.regs` to test just the X register value and Z flag, and on failure it would give me back something like the following, where the hyphens indicate the "don't care" values in the expected result:
Code: Select all
____________________________ test_bi_readhex[67-g] _____________________________
src/m65/bigint.pt:54: in test_bi_readhex
assert R(x=0x33, Z=1) == m.regs
E assert Unexpected Registers values:
E 6502 pc=---- a=-- x=33 y=-- sp=-- ------Z-
E 6502 pc=1069 a=FE x=00 y=FF sp=FF nv--diZC
----------------------------- Captured stdout call -----------------------------
bi_readhex: b'67' <class 'bytes'> b'g'
It's worth mentioning that this sort of testing can also replace using a debugger in many circumstances; it's not difficult (but should be made easier!) to have the simulator stop at specified addresses and print out the current values of whatever registers and memory are of interest, for example. I can also generate execution traces, but those too want more work (for example, they currently don't show what memory was changed at every step).
Right now this whole thing is not really "productized" for use by others; the framework should be in a separate repo, with documentation and tutorials, etc. etc. I'm planning to get around to that one day, but it's still under pretty heavy development at the moment. However, I'm happy to do support, pair programming sessions, whatever, to help anybody who's interested in getting up to speed on this stuff.
Quote:
I realize 99% of this would be an effort in higher level scripting, likely with Bash or Perl.
Yeah, as someone who's been using Bourne shell since the '80s, Perl since the '90s, Ruby from the early 2000s onwards, and, over the last few years, Python, I can say you definitely should simply start with Python. I frequently ignore my own advice and use Bash to get something started and most of the time I regret it. (My top-level `Test` script in that repo is an excellent example.) The difference isn't as vast with Perl or Ruby, but it's still there and hurts in some important areas. (For example, you can't get something like pytest in Ruby or Perl because they don't give you access to the compilation system; pytest actually compiles the Python code in your tests differently from normal in order to instrument it so it can take apart structured variables in the way mentioned above.)
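To illustrate the "don't care" register comparison described earlier in this post, here is a minimal standalone sketch. It is not the code from the 8bitdev repo; the class name `Regs`, its field list, and its output format are assumptions made up for this example.

```python
class Regs:
    """A 6502 register set in which any field left as None means "don't care"."""
    FIELDS = ('pc', 'a', 'x', 'y', 'sp', 'N', 'V', 'D', 'I', 'Z', 'C')

    def __init__(self, **values):
        unknown = set(values) - set(self.FIELDS)
        if unknown:
            raise TypeError('unknown register(s): %s' % ', '.join(sorted(unknown)))
        for name in self.FIELDS:
            setattr(self, name, values.get(name))

    def __eq__(self, other):
        if not isinstance(other, Regs):
            return NotImplemented
        for name in self.FIELDS:
            mine, theirs = getattr(self, name), getattr(other, name)
            # A field is compared only when both sides specify a value.
            if mine is not None and theirs is not None and mine != theirs:
                return False
        return True

    def __repr__(self):
        def show(name, width):
            value = getattr(self, name)
            return '-' * width if value is None else '%0*X' % (width, value)
        regs = 'pc=%s a=%s x=%s y=%s sp=%s' % (
            show('pc', 4), show('a', 2), show('x', 2), show('y', 2), show('sp', 2))
        # Upper case = flag set, lower case = flag clear, hyphen = don't care.
        flags = ''.join(
            '-' if getattr(self, f) is None else (f if getattr(self, f) else f.lower())
            for f in ('N', 'V', 'D', 'I', 'Z', 'C'))
        return '6502 %s %s' % (regs, flags)
```

With this, `assert Regs(x=0x33, Z=1) == actual` checks only X and the Z flag, and the `repr` of each side shows hyphens for the unspecified fields.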
Curt J. Sampson - github.com/0cjs
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
cjs wrote:
Yeah, as someone who's been using Bourne shell since the '80s, Perl since the '90s, Ruby from the early 2000s onwards, and, over the last few years, Python, I can say you definitely should simply start with Python. I frequently ignore my own advice and use Bash to get something started and most of the time I regret it. (My top-level `Test` script in that repo is an excellent example.) The difference isn't as vast with Perl or Ruby, but it's still there and hurts in some important areas. (For example, you can't get something like pytest in Ruby or Perl because they don't give you access to the compilation system; pytest actually compiles the Python code in your tests differently from normal in order to instrument it so it can take apart structured variables in the way mentioned above.)
I know what you mean with Bash. Sometimes Bash is the right answer for simple problems. The reality is, I often write simple prototype code in Bash and use that code as a rough outline. Then, I rewrite everything in whatever language I'm going to actually use, often Python. I like Python for API-to-API type stuff, but I don't usually find low level work appealing in the language.
I was experimenting with Raku (ex-Perl 6) for a while. It has grammars, which are named regexes with recursion thrown in to allow for some very complex parsing to take place. After that, I began an oddly intense Perl 5 kick that isn't slowing down. There is an interesting proposal called Cor; it's a new object model. It is sort of Ruby-like to my eye. That proposal got me to give Perl a second look after a long hiatus, one-liners and short one-off "data munging" scripts notwithstanding. It's neat to see how much the language has changed over the years.
I never did get into Ruby, though I do find a lot of Ruby code is visually appealing. I think the next language I'm going to try to tackle is Forth, just because it's so different from everything else. On that note, DurexForth (C64) looks very cool.
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
I unit test my code using Py65Mon launched from a Makefile. No fancy framework, I just look for known good output.
Here's a link to my repo:
https://github.com/Martin-H1/6502/blob/ ... n/Makefile
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
load81 wrote:
I know what you mean with Bash. Sometimes Bash is the right answer for simple problems.
Quote:
I like Python for API-to-API type stuff, but I don't usually find low level work appealing in the language.
Quote:
I was experimenting with Raku (ex-Perl 6) for a while. It has grammars, which are named regexes with recursion thrown in to allow for some very complex parsing to take place.
----------
¹ "Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems"
Curt J. Sampson - github.com/0cjs
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
Martin_H wrote:
I unit test my code using Py65Mon launched from a Makefile. No fancy framework, I just look for known good output.
After instantiating and extending the py65 monitor class, you can load binaries into RAM, peek and poke things in memory, single step, set breakpoints, and run at full speed. You can also simulate hardware at various addresses by "subscribing" to reads or writes at the addresses you want - it will run your function to determine how to react. We added a 32-bit cycle counter to time Forth words.
Here is the test script we came up with.
https://github.com/scotws/TaliForth2/bl ... alitest.py
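The idea of "subscribing" to reads or writes at chosen addresses can be sketched without any emulator at all. The class below is a standalone stand-in written for this example, not py65's own implementation, and the port address `PUTC` is hypothetical.

```python
class WatchedMemory:
    """64 KiB of RAM whose reads and writes can be intercepted per address."""

    def __init__(self):
        self._ram = bytearray(0x10000)
        self._read_hooks = {}
        self._write_hooks = {}

    def on_read(self, address, hook):
        """hook(address) returns the byte the simulated device presents."""
        self._read_hooks[address] = hook

    def on_write(self, address, hook):
        """hook(address, value) is called instead of storing to RAM."""
        self._write_hooks[address] = hook

    def __getitem__(self, address):
        hook = self._read_hooks.get(address)
        return hook(address) & 0xFF if hook else self._ram[address]

    def __setitem__(self, address, value):
        hook = self._write_hooks.get(address)
        if hook:
            hook(address, value & 0xFF)
        else:
            self._ram[address] = value & 0xFF


PUTC = 0xF001            # hypothetical character-output port
captured = []            # everything the "program" writes to the port
mem = WatchedMemory()
mem.on_write(PUTC, lambda address, value: captured.append(chr(value)))
for byte in b'ok':       # stands in for the CPU storing to the port
    mem[PUTC] = byte
```

A test can then compare `''.join(captured)` against known good output, which is the same check a Makefile-driven run does by eye.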
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
load81 wrote:
Is anyone else leveraging unit tests to bug hunt or sanity-check their code? What are some tools and techniques I should be aware of? Any pointers and suggestions would be welcome. Also, even though their use is discouraged due to a pending syntax change, do any of you know how to get .assert and .check to work?
Hi!
For the FastBasic unit testing, I wrote my own emulator library: https://github.com/dmsc/mini65-sim/ . It emulates the full Atari 8-bit OS, but not the Atari hardware, so it can be used to test all the command line tools and BASIC samples.
Using this emulator, I built a simple test framework at https://github.com/dmsc/fastbasic/tree/master/testsuite , it reads test definition files like this: https://github.com/dmsc/fastbasic/blob/ ... -input.chk
Code: Select all
Name: Test statement "INPUT"
Test: run-fp
Input:
1
2
.
Output:
Start
?1 1
1 2
?18
"Test" says which test to apply; "run-fp" means compile with the floating-point compiler, then run the resulting program. "Input" data is passed to the emulator as console input, and "Output" data is checked against what the program actually prints. The above is accompanied by the following BASIC program: https://github.com/dmsc/fastbasic/blob/ ... -input.bas
Code: Select all
' Test for statement "INPUT"
? "Start"
input a%
? err(), a%
input ; b%
? err(), b%
input a%
? err()
Note that the emulator is used first to run the command line compiler, so the full process is tested as it would work on the Atari.
Have Fun!
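A test definition file of the kind shown in the post above is simple enough to parse in a few lines. The following is only a sketch inferred from that one example, not the FastBasic test suite's actual parser; it assumes `Key: value` header lines and that `Input:` and `Output:` each start a block of literal lines.

```python
def parse_test_definition(text):
    """Parse 'Key: value' headers and 'Input:' / 'Output:' blocks into a dict."""
    fields = {}
    section = None
    for line in text.splitlines():
        if line in ('Input:', 'Output:'):
            # Start collecting literal lines for this block.
            section = line[:-1].lower()
            fields[section] = []
        elif section is not None:
            fields[section].append(line)
        elif ':' in line:
            key, value = line.split(':', 1)
            fields[key.strip().lower()] = value.strip()
    return fields

def check_output(definition, actual_output):
    """Return True if the program's output lines match the expected ones."""
    return actual_output.splitlines() == definition.get('output', [])
```

The `input` list would be fed to the emulated program as console input, and `check_output` applied to whatever it prints.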
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
The .assert and .check directives in 64tass are not for code testing purposes as outlined above.
These directives were added a long time ago to prevent mistakes when programming banked memory systems. I needed them because often the wrong memory configuration was used, which resulted in memory trashing or garbage reads. Also, certain functions were only supposed to be called if the memory area(s) they operated on were available. Or worse, those functions could have been banked out themselves.
It was a sort of hack and their use was complicated. However, they served their purpose and I got rid of a lot of bugs in my code while suffering their limitations.
I've chosen not to document them to discourage their use, as they will go away at some point once I figure out a proper replacement for them.
Somewhat platform specific but more on topic I think:
https://www.commocore.com/repository/c64unit
https://github.com/martinpiper/BDD6502
http://www.cactus.jawnet.pl/attitude/?a ... 8&which=15
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
soci wrote:
The .assert and .check directives in 64tass are not for code testing purposes as outlined above.
Your assembler is rock solid. You should absolutely have a Patreon or a cryptocurrency address for users to donate to.
Re: What Sorts of Tools Do You Use to Unit Test Your Code?
soci wrote:
Somewhat platform specific but more on topic I think:
https://www.commocore.com/repository/c64unit
https://github.com/martinpiper/BDD6502
http://www.cactus.jawnet.pl/attitude/?a ... 8&which=15
I've had only a brief look at them so far, but I do have a couple of comments on them.
I'm not seeing much use of test generation from parameters in any of these systems. This is something that in my experience is not used so much in testing high-level languages, but I've found I use it quite heavily in testing assembly code. For example, in the article on CommTest they have the following tests for a "subtract" function which I find quite typical:
Code: Select all
context("when address is $0000") {
  it("results in address = $ffff") {
    writeWordAt(address, 0x0000)
    call
    assert(readBytesAt(address, 2) === Seq(0xff, 0xff))
  }
}
context("when address is $0001") {
  it("results in address = $0000") {
    writeWordAt(address, 0x0001)
    call
    assert(readBytesAt(address, 2) === Seq(0x00, 0x00))
  }
}
context("when address is $0100") {
  it("results in address = $00ff") {
    writeWordAt(address, 0x0100)
    call
    assert(readBytesAt(address, 2) === Seq(0xff, 0x00))
  }
}
Code: Select all
@pytest.mark.parametrize('input, result', [
    (0x0000, 0xFFFF), (0x0001, 0x0000), (0x0100, 0x00FF),
])
def test_subtract(m, S, input, result):     # machine, Symbol table, test parameters
    m.depword(S.address, input)
    m.call(S.subtract)
    assert result == m.word(S.address)
You'll have noticed there are some plain English descriptions in the ConnTest test cases above. Moving towards such "plain English" descriptions is characteristic of "BDD," or "Behaviour-Driven Development." BDD6502 actually writes the tests in such form, as in this example:
Code: Select all
Scenario: Simple Score add test
Given I start writing memory at $400
Given I write the following bytes
| Score_ZeroCharacter+3 | Score_ZeroCharacter+4 | Score_ZeroCharacter+5 | Score_ZeroCharacter+6 | Score_ZeroCharacter+7 | Score_ZeroCharacter+8 | Score_ZeroCharacter+9 |
Given I start writing memory at $500
Given I write the following hex bytes
| 05 04 06 04 03 01 |
When I set register a to lo($500)
When I set register x to hi($500)
When I execute the procedure at ScoreAdd for no more than 103 instructions
Then I hex dump memory between $400 and $407
Then I expect to see $3ff equal 0
Then I expect to see $400 equal Score_ZeroCharacter+3
Then I expect to see $401 equal Score_ZeroCharacter+4
Then I expect to see $402 equal Score_ZeroCharacter+7
Then I expect to see $403 equal Score_ZeroCharacter+0
Then I expect to see $404 equal Score_ZeroCharacter+2
Then I expect to see $405 equal Score_ZeroCharacter+4
Then I expect to see $406 equal Score_ZeroCharacter+9
Then I expect to see $407 equal 0
Code: Select all
def test_simple_score_add(m, S, R):     # machine, Symbol table, Register set constructor
    m.deposit(0x400, score_zchar[3:10])
    m.deposit(0x500, b'\x05\x04\x06\x04\x03\x01')
    m.call(S.ScoreAdd, R(a=LSB(0x500), x=MSB(0x500)))
    # $3FF through $407 is 9 bytes: guard, 7 score characters, guard.
    assert b'\x00' + score_zchar[3:10] + b'\x00' == m.bytes(0x3FF, 9)
Curt J. Sampson - github.com/0cjs