TMorita wrote:
The Java case and the 6502 case are totally different cases.
The Java case was analyzing which derived types could be stored in a particular variable and then called through a particular interface-dispatching point in the code. There is a limited enumeration of types that end up there, and the setting of them is direct. This is quite similar to analyzing a 16-bit integer on the 6502 that is manually used as a jump vector.
It has nothing to do with selfmod or type-safety or verification or branch targets internal to Java functions; it's purely a problem of "what values/derived types can this location, which is directly set from literals elsewhere in the code, hold?", which is a very limited problem.
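To make the shape of that problem concrete, here's a small sketch with Python standing in for the Java case (the class names are invented): every assignment to the variable is a direct construction, so the dispatch point has a small, closed set of possible concrete types.
Code:
class SquareWave:
    def render(self):
        print("square")

class NoiseChannel:
    def render(self):
        print("noise")

class Mixer:
    def __init__(self, mode):
        # The only assignments to `channel` in the whole program, both
        # direct constructions, so analysis can enumerate the set
        # {SquareWave, NoiseChannel} at the dispatch point below.
        self.channel = SquareWave() if mode == "tone" else NoiseChannel()

    def tick(self):
        self.channel.render()   # the "interface-dispatching point"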
Quote:
So, there are no unrestricted jump instructions in Java bytecode.
Would you consider the invoke* instructions or calling a Runnable.run() to be unrestricted? Those are the closest equivalent of a JMP(ind) in a working 6502 program. The Java manifestation of the exact same analysis problem simply sits at a higher level than it does on the 6502.
Quote:
My experience differs. I wrote a 65816 disassembler which follows control flow to discover the state of the M and X bits. Most of the cases I found were not amenable to shallow analysis.
And I agree with that. Note that enumerating the values of a vector does not determine when those values hold, only that they are values which get put there at some point. It involves no state tracking except for the tiny stretch between the LDA and STA instructions that write the vector, where the constant is propagated. Your flag example requires state tracking across large swaths of code, which is a much larger problem.
Quote:
Imagine you have a C compiler, and the optimization algorithm is theoretically faulty and only produces correct code for "many if not most" of the cases.
That is the wrong way of looking at it. A better analogy is an optimizer that only performs optimizations where it can statically prove the required properties, and leaves the code alone when it can't.
Analysis brings forth known, known-complete, and/or known-incomplete information about code, all of which adds to the system's understanding of it. No individual piece of analysis necessarily yields the whole truth about a system on its own, and if you refuse any input beyond "proven fact" then you end up with a simplistic, underpowered analysis. Too many people are simply afraid of, or don't know how to work with, "unknown"s and "might be"s, which is disappointing in a field like this that really depends on them. Converging unknowns is a late-stage process; resolving types and enumerations happens earlier and iterates as dependencies resolve and new information arrives.
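As a minimal sketch of what carrying "known-incomplete" information forward could look like (the structure and names here are mine, not from any real tool):
Code:
from dataclasses import dataclass, field

@dataclass
class Enumeration:
    """A set of discovered values plus a completeness flag."""
    values: set = field(default_factory=set)
    complete: bool = True   # flips to False on any unresolvable input

    def add_known(self, value):
        self.values.add(value)

    def add_unknown(self):
        # Keep what we've found: the values are still real indicators,
        # we just stop claiming the list is exhaustive.
        self.complete = False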
Quote:
Given the instruction sequence:
JMP (zp)
where the contents are not known, and a 4 megabit NES cartridge such as Final Fantasy III, the number of possible branch destinations is 512K.
So, with your approach, you would have to analyze 512K entry points into the code and check if each one executes valid instruction sequences, and if so, generate the corresponding equivalent code for each one.
Then you have not read my approach at all. I agree that what you describe would be a dumb approach; however, such brute-force analysis can still be useful (see the link at the end of the post).
My approach does the equivalent of the following:
- Note what locations JMP(ind) instructions use as vectors. (JMPs found in known code areas)
- Note any stores into those vector locations. (Stores found in known code areas)
- See if basic constant propagation gives that store a known value.
- Associate the low- and high-byte stores together when they obviously chain.
Vector setting, like opcode self-modification, is generally done by unparameterized code, and the vast majority of cases use a simple LDA #imm, STA vec pair. This is a very common pattern for setting vectors on the 6502 (I'm no expert on idiomatic 65816 code, how vectors are set there, or the complications the banking registers bring). If there are other writes to that zp location which don't resolve to a propagated constant, then the enumeration set becomes known-incomplete, or the vector may serve other purposes as well (an LDA zp[+1] would be a good indicator of that).
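Here's a minimal sketch of those steps in Python, run over a toy list of already-decoded instructions rather than raw bytes; the tuple format and all names are invented for illustration:
Code:
from dataclasses import dataclass, field

@dataclass
class Vector:
    """Enumerated targets of one JMP (ind) vector, plus completeness."""
    values: set = field(default_factory=set)
    complete: bool = True

def enumerate_vectors(code):
    """code: decoded instructions from known code areas, in order,
    as (mnemonic, operand) tuples, e.g. ("LDA_IMM", 0x34)."""
    vectors = {}    # zp address of vector low byte -> Vector

    # Step 1: note which zp locations JMP (ind) uses as vectors.
    for op, arg in code:
        if op == "JMP_IND":
            vectors.setdefault(arg, Vector())

    # Steps 2-4: find stores into those locations, propagate the
    # constant from a preceding LDA #imm, chain low/high bytes together.
    acc = None      # accumulator contents, when a known constant
    halves = {}     # vector address -> {"lo": byte, "hi": byte} so far
    for op, arg in code:
        if op == "LDA_IMM":
            acc = arg
        elif op == "STA_ZP":
            base = arg if arg in vectors else arg - 1
            if base in vectors:
                if acc is None:
                    vectors[base].complete = False  # unknown store seen
                else:
                    half = "lo" if arg == base else "hi"
                    halves.setdefault(base, {})[half] = acc
                    pair = halves[base]
                    if "lo" in pair and "hi" in pair:
                        vectors[base].values.add(pair["lo"] | pair["hi"] << 8)
                        halves[base] = {}
        elif not op.startswith("ST"):
            acc = None  # stores don't clobber A; anything else might
    return vectors

Feeding it [("LDA_IMM", 0x00), ("STA_ZP", 0xFE), ("LDA_IMM", 0xC1), ("STA_ZP", 0xFF), ("JMP_IND", 0xFE)] yields {0xC100} for the vector at $FE, and any store whose value can't be propagated simply flips the completeness flag instead of silently producing a wrong answer.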
Again, this is tractable because programmatic calculation of jump vector values (and of selfmod opcodes) is rare on the 6502. Such calculation would very obviously break this method, but in a way the method itself notices: the unknown stores to the vector get flagged. Forth interpreters are an example where this wouldn't work, but the common dynamic dispatch via indirect jump used in assembly programs would be handled.
Quote:
Another complication: The JMP (zp) could jump to code in RAM. This code would not be there in a ROM image, so the static analyzer would need to determine this. So not only would the static recompiler need to analyze and possibly generate 512k entry points into ROM code, it would also need to detect code copied from ROM to RAM.
That is correct. Knowing what's in code space is a different problem from determining the values of an enumeration whose few possible values are all set from literals. However, if the zp location is only ever used as a jump vector, static analysis can at least determine the jump-vector values themselves, even when the code at that spot in RAM is unknown.
Quote:
Yet another complication: Probably the most common reason why code would be copied from ROM to RAM in a NES environment is to use self-modifying code, such as a fast memory copy routine. So not only would the static recompiler need to analyze and possibly generate 512k entry points into ROM code, and detect code copied from ROM to RAM, it would also need to generate all the possible self-modifying code variations which could be generated in RAM.
An interesting property here is that an LDA #imm, STA zp pair is the same readable byte sequence regardless of where in memory it sits, so a pattern match can hunt for those occurrences in the raw image (likewise LDY/STY and LDX/STX, obviously), though matches aren't guaranteed to be part of actually executed code and such a scan doesn't benefit from constant-propagation analysis.
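A hedged sketch of that hunt over a raw image, in Python (the opcode values are the standard 6502 encodings; the function and its output format are my own invention):
Code:
# Standard 6502 opcodes for the load-immediate and store instructions
# named above; the scan function itself is invented for illustration.
LOAD_IMM  = {0xA9: "LDA", 0xA2: "LDX", 0xA0: "LDY"}
STORE_ZP  = {0x85: "STA", 0x86: "STX", 0x84: "STY"}
STORE_ABS = {0x8D: "STA", 0x8E: "STX", 0x8C: "STY"}

def hunt_imm_store_pairs(rom):
    """Yield (offset, value, target) for every load-imm/store pair,
    matching registers (LDA with STA, LDX with STX, LDY with STY).
    Matches are only candidates: the bytes may be data, or code that
    never actually executes, exactly as cautioned above."""
    for i in range(len(rom) - 4):
        load, store = rom[i], rom[i + 2]
        if load not in LOAD_IMM:
            continue
        reg = LOAD_IMM[load][-1]            # register letter: A, X or Y
        if STORE_ZP.get(store, "").endswith(reg):
            yield (i, rom[i + 1], rom[i + 3])                      # zp
        elif STORE_ABS.get(store, "").endswith(reg):
            yield (i, rom[i + 1], rom[i + 3] | rom[i + 4] << 8)    # abs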
Again, gathering clues and insight into the possibilities is a whole lot more valuable than simply saying "it's not exact, so don't do any of it, ever!" You need to be able to teach the computer how to try things and assemble partial information and indicators into real results.
Quote:
Yet another complication: The static recompiler detects code copied from ROM to RAM, and it detects four byte writes to the code in RAM (memory copy), but it cannot determine all possible values which can be written to the self-modifying code. So in this case the number of possible permutations is 2^32 or about 4 billion possible permutations, so the static recompiler would need to generate 4 billion routines, each representing a possible permutation of the self-modifying code.
Self-modification of opcodes is generally done via LDA #imm, STA abs patterns as well, where abs is a known-executing point in memory. Hunting those is exactly the same as hunting literal jump vectors: they're a limited set of enumerated, literal values being stored into one suspect location. Self-modified operands are the equivalent of RAM variables (on the 6502, where operands are always whole bytes or words, unlike architectures where operand fields are embedded bitwise into code words).
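As a tiny sketch of that equivalence (the span map, helper, and addresses are hypothetical): once the analysis knows where instructions live, a store into byte 0 of an instruction modifies an opcode, while a store into any later byte modifies an operand, i.e. writes a "variable".
Code:
def classify_store(target, instr_spans):
    """instr_spans: maps each known-executing instruction's address to
    its length in bytes. Hypothetical helper, purely for illustration."""
    for addr, length in instr_spans.items():
        if addr <= target < addr + length:
            return "opcode" if target == addr else "operand"
    return "data"

# e.g. an LDA abs at $C011 is three bytes: the opcode at $C011 and a
# word operand at $C012-$C013, so a store to $C012 is effectively a
# write to the routine's "source address" variable.
assert classify_store(0xC012, {0xC011: 3}) == "operand"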
Quote:
Yet another complication: The self-modifying routine in RAM copies from source to dest address. However, on a cartridge with an MMC such as MMC1 or MMC3, the memory map changes depending on the contents of the bank register. From the NESdev wiki, for an MMC1:
Yes; like relocating code in the first place, that is the separate problem of figuring out where your executable code ends up.
Quote:
So IMHO it's not practical to allow unknowns in a 6502 static recompiler.
I think this is your problem here. I never said that I'm writing a static recompiler, nor do I think it's a generally tractable problem (for many reasons beyond just the semantic discovery). I'm just saying that a few analysis features people call impossible aren't. You said "[JMP (x) where x is a RAM address] obviously causes huge problems for static recompilation", and I disagree that it's a huge problem. There are other huge problems, though: the banking, code copying, speedcode/JIT generation, code decompression, CPU and other chip state tracking, etc.
Here's a great talk from last year (slides at the bottom of the page) about doing analysis of binary code where the instruction set & encoding are unknown! There's a ton you can do with analysis; it can reveal very definite indicators about what's what in code, even when that seems "impossible".
(also apologies if some of this is confusing, as I'm still dealing with medical issues. Just ask and I'll try to explain better)