Re: reverse engineering Robotron 2084 for the Apple II
Posted: Fri Mar 08, 2019 10:47 pm
fschuhi wrote:
Thank you for putting in the time, that was a great intro to working with your tool. Very much appreciated.
Quote:
The ad hoc emulation with Shift-R is awesome. I didn't know that WFDis can do that. With regard to the task at hand, it was helpful to see how a subroutine's return address on the stack can be used to access data immediately after the JSR. That's an important idiom to know.
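For anyone following along, here is a rough Python sketch of that inline-data idiom. It is not taken from Robotron or from WFDis; the function names, the zero terminator, and the memory layout are made up for illustration. It relies only on standard 6502 behavior: JSR pushes the address of its own last byte (high byte first), and RTS pulls that address and adds 1.

```python
def jsr(stack, pc):
    """Model JSR at address pc: push the address of its last byte (pc + 2)."""
    ret = pc + 2
    stack.append(ret >> 8)      # high byte is pushed first
    stack.append(ret & 0xFF)    # then the low byte


def rts(stack):
    """Model RTS: pull low, pull high, resume at that address + 1."""
    lo = stack.pop()
    hi = stack.pop()
    return ((hi << 8) | lo) + 1


def read_inline_data(memory, stack, terminator=0x00):
    """Hypothetical subroutine body: read terminated data placed after the JSR.

    Pulls the return address, walks the inline bytes, then pushes an
    adjusted address so that RTS resumes just past the terminator.
    """
    lo = stack.pop()                # PLA: low byte of (return address - 1)
    hi = stack.pop()                # PLA: high byte
    ptr = ((hi << 8) | lo) + 1      # first byte after the JSR instruction
    data = bytearray()
    while memory[ptr] != terminator:
        data.append(memory[ptr])
        ptr += 1
    # ptr sits on the terminator; RTS adds 1, so push ptr unchanged.
    stack.append(ptr >> 8)
    stack.append(ptr & 0xFF)
    return bytes(data)


# Assumed layout: JSR $0900 at $0800, followed by "HI", a zero terminator,
# and the next real instruction at $0806.
memory = bytearray(0x10000)
memory[0x0800:0x0806] = bytes([0x20, 0x00, 0x09, 0x48, 0x49, 0x00])
stack = []
jsr(stack, 0x0800)
inline = read_inline_data(memory, stack)
resume_at = rts(stack)
```

The practical consequence for disassembly is that the bytes right after such a JSR must be marked as data, and tracing has to resume at the adjusted return address rather than at JSR + 3.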
Quote:
Prepending labels with L and S is sensible, so I have added this right away to my workbench. I would have expected that jumps were labeled with 'J', but you stick to 'L' here as well. Any particular reason why?
Quote:
I had originally lowercased mnemonics and addresses (as is common for x86). Most of the snippets on this forum are uppercase, though. In WFDis you decided to lowercase everything. What is your reasoning behind deviating from the common "style guide"?
Studies back in the day of single-case text also found that lowercase is easier to read than uppercase, but lowercase-only was deemed too disrespectful for certain names and titles that should be Proper Cased, so they went with upper-only. I tend to find lowercase more readable as well when looking at full listings.
Quote:
I lack the skill of keeping the structure of even smaller stretches of code in mind.
Quote:
For this reason I have put most of the work so far into automated structural analysis of the code. I do this in Python, because the emulator is written in Python, too. BTW Excel doesn't play a big role in the project, for me it's just a convenient notebook with additional intelligence and easy interface to the emulator and tracer.
So it doesn't make a lot of sense to me to try to map all of this hand-wrangled bit banging into some clean single model of execution. But that also depends on what you mean by "structural analysis".
Quote:
I think using a manual disassembler is compatible with my reverse engineering approach. I lean on the ideas in Don Lancaster's "Tearing Into Machine-Language Code". He advises against using tools to disentangle the code and rather advocates "Do the dull stuff yourself!". But his method also assumes that those who want to reverse engineer should be versed in 6502. A bit more automated structure discovery is necessary to help my learning.
- "Holes" of uncalled data surrounded by large code areas are often also code.
- A9 is "LDA #xx" which very commonly starts code paths.
- In my video example where one of the pointers went to an area with something like "00 00 10 20 30 ff ff ff" I assumed that it was not code, because it appears more structured like bitwise data.
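Since you are doing your analysis in Python anyway, the three hints above could be turned into a crude scoring function along these lines. This is only a sketch: the weights, the run-length threshold, and the function names are arbitrary choices of mine, and the result is a guess to be checked by eye, not a verdict.

```python
def longest_run(data):
    """Length of the longest run of identical consecutive bytes."""
    best = run = 0
    prev = None
    for b in data:
        run = run + 1 if b == prev else 1
        prev = b
        best = max(best, run)
    return best


def guess_region_kind(data, surrounded_by_code=False):
    """Guess whether an unreferenced region is code or data.

    Hypothetical scoring, one term per hint:
      +1  the hole sits between large code areas
      +1  the first byte is A9 (LDA #imm), a common way to start a code path
      -2  a run of three or more identical bytes, which looks like table data
    """
    score = 0
    if surrounded_by_code:
        score += 1
    if data[:1] == b"\xa9":
        score += 1
    if longest_run(data) >= 3:
        score -= 2
    if score > 0:
        return "code?"
    if score < 0:
        return "data?"
    return "unknown"


table_like = bytes.fromhex("0000102030ffffff")   # the pattern from the video
code_like = bytes.fromhex("a9008d0004")          # LDA #$00 / STA $0400
```

With these inputs, `guess_region_kind(table_like)` comes out as a data guess, and `guess_region_kind(code_like, surrounded_by_code=True)` as a code guess.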
What I find the most useful by far is giving names to things, especially variables and subroutines. Even when they're just guesstimates, once you name something and look at its various uses you can piece together a picture of what it's for, in a somewhat bottom-up fashion, but you need to focus on things that will reveal the most. There are a few good starting points that can anchor some of this understanding:
- Accesses to well-known I/O addresses (keyboard/joystick inputs, video registers)
- Accesses to screen memory
- Functions that are called from many places (usually indicates main loop or small utility functions)
- Writes to known system or software vectors (which will generally point to code)
- Calls into ROM
- Initialization code can reveal the overall memory layout
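Several of these starting points can be pulled out of a listing mechanically. Here is a minimal sketch, assuming your tracer can hand you already-decoded instructions as (pc, mnemonic, operand) tuples, where operand is an absolute address or None. That tuple shape and all names here are my assumptions, not anything from your project. The addresses are the standard Apple II ones: $C000 keyboard data, $C010 keyboard strobe, $C030 speaker, $0400-$07FF text page 1, $2000-$5FFF the two hi-res pages, and $D000 and up as ROM provided no language card RAM is banked in.

```python
from collections import Counter

# A few well-known Apple II soft switches; extend as needed.
IO_NAMES = {0xC000: "KBD", 0xC010: "KBDSTRB", 0xC030: "SPKR"}


def classify_address(addr):
    """Return a short label for well-known Apple II addresses, else None."""
    if addr in IO_NAMES:
        return IO_NAMES[addr]
    if 0x0400 <= addr <= 0x07FF:
        return "text page 1"
    if 0x2000 <= addr <= 0x3FFF:
        return "hi-res page 1"
    if 0x4000 <= addr <= 0x5FFF:
        return "hi-res page 2"
    if addr >= 0xD000:
        return "ROM"  # assumes no language card RAM is banked in
    return None


def find_anchors(instructions):
    """Count JSR targets and collect accesses to well-known addresses.

    `instructions` is an iterable of (pc, mnemonic, operand) tuples, a
    hypothetical format; operand is an absolute address or None.
    """
    call_counts = Counter()
    notes = []
    for pc, mnemonic, operand in instructions:
        if operand is None:
            continue
        if mnemonic == "JSR":
            call_counts[operand] += 1
        kind = classify_address(operand)
        if kind is not None:
            notes.append((pc, mnemonic, operand, kind))
    return call_counts, notes


sample = [
    (0x0800, "LDA", 0xC000),   # read keyboard
    (0x0803, "JSR", 0x0900),
    (0x0806, "JSR", 0x0900),
    (0x0809, "STA", 0x0400),   # write to text screen
    (0x080C, "JSR", 0xFDED),   # monitor COUT
    (0x080F, "RTS", None),
]
call_counts, notes = find_anchors(sample)
```

Sorting `call_counts.most_common()` gives a first list of subroutines worth naming, and each entry in `notes` is a place where a guessed name such as "read_key" or "draw_something" can be pinned down quickly.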
Quote:
I'm going to continue working on the analysis over the weekend, let's see if I am able to share interim results. I would certainly consider transferring at least some of the automatically generated info to the listing in WFDis (e.g. caller/callee), so it would be nice if your next version re-enables inline comments. But that's just a nice to have, the tool is certainly powerful as it is. Thanks again for the help!