PostPosted: Mon Nov 04, 2019 5:20 pm 
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
Hello everyone,
I am a Spanish EE student and this is my first post in this forum.

I am designing a 6502-inspired CPU in Logisim, with the intention of building the design from TTL ICs later on. My primary goal is to learn about computer architecture, assembly programming, etc.

I have a basic model that can execute some instructions using immediate and zero-page addressing modes, but I'm stuck on implementing absolute addressing. I don't see how the two address bytes following an instruction can be loaded into the address bus registers: loading the first byte changes the address being read, which makes it impossible to read the following byte. An intermediate register of some sort seems to be needed, but I am unable to identify one on the 6502's block diagram (viewtopic.php?t=1744) or find information on the subject.
Any ideas or documentation on the subject? Thank you.


Last edited by JuanGg on Wed Nov 27, 2019 4:14 pm, edited 1 time in total.

PostPosted: Mon Nov 04, 2019 6:48 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Welcome! We can usually answer questions like these with the help of visual6502. What I would guess is that the ALU will be used - it often acts as a temporary holding place. In effect it's a pipeline register which can be used as a temporary.

And yes, this seems to be the case:
http://www.visual6502.org/JSSim/expert. ... 8e0403eaea


PostPosted: Wed Nov 06, 2019 4:46 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
Thanks! It seems that I can't get away with my simple ALU-accumulator setup (see attached). I'll have to look more into that, and learn about visual6502, as it seems to be a valuable resource. But as I am the one designing the CPU, I can add an additional temporary register for these kinds of operations if I need to. It doesn't have to replicate a 6502; I just wanted it to be vaguely similar in order to benefit from existing documentation. We'll see.
Attached is also a general picture of the whole CPU.


Attachments: Cpu.JPG, ALU.JPG

PostPosted: Wed Nov 06, 2019 5:02 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Yes indeed, a temporary register will do - the 6800 and I think the 6809 used that approach, as does the Z80, so you're in good company!


PostPosted: Sun Nov 10, 2019 9:30 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
I have added a temporary register to my ALU (see attached). That enables me to do absolute addressing, and also to increment/decrement the X and Y registers without modifying the accumulator. The problem now is that I need 7 clocks to execute the instruction, plus 2 for fetching the opcode, which makes 9. I have a 3-bit micro-instruction counter, which can only count 8 steps, so it does not fit.
Those seven clocks are:
-Program Counter -> Address registers
-Read low address from memory -> Temp register, increment program counter
-Program Counter -> Address registers
-Read high address from memory -> Address high register
-Temp register -> Address low register, increment program counter
-Read operand from memory -> Temp register
-ALU result -> Accumulator
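
The seven steps above can be traced with a small software model. This is only a sketch under assumptions: the register names (`mar`, `tmp`) and the flat `mem` list are illustrative and not taken from the Logisim design, and ORA is used as the example operation.

```python
# Minimal model of the seven micro-steps for an absolute-addressed ORA.
# Register names are illustrative only; they are not taken from the design.

def ora_absolute(mem, pc, acc):
    """Run the seven micro-steps; return (new_acc, new_pc, clocks)."""
    clocks = 0

    mar = pc                      # 1. Program Counter -> address registers
    clocks += 1
    tmp = mem[mar]                # 2. low address byte -> temp register
    pc = (pc + 1) & 0xFFFF        #    ...and increment the program counter
    clocks += 1
    mar = pc                      # 3. Program Counter -> address registers
    clocks += 1
    mar_hi = mem[mar]             # 4. high address byte -> address high register
    clocks += 1
    mar = (mar_hi << 8) | tmp     # 5. temp register -> address low register
    pc = (pc + 1) & 0xFFFF        #    ...and increment the program counter
    clocks += 1
    tmp = mem[mar]                # 6. operand -> temp register
    clocks += 1
    acc = (acc | tmp) & 0xFF      # 7. ALU result -> accumulator
    clocks += 1
    return acc, pc, clocks


mem = [0] * 0x10000
mem[0x0201] = 0x34                # low byte of the operand address
mem[0x0202] = 0x12                # high byte of the operand address
mem[0x1234] = 0x0F                # the operand itself
print(ora_absolute(mem, pc=0x0201, acc=0xF0))
```

The model starts with the PC already pointing at the first operand byte, i.e. after the two fetch clocks, so the count it returns covers only the execute part of the instruction.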

I have 'simulated' ORA with absolute addressing on visual6502, so I can see which control signals fire and trace them on the block diagram. It takes 4 cycles, but as I see it, each cycle has two distinct steps in which different control signals are enabled. (I suppose each step is a clock?)

As I don't care about speed and I'd rather keep things simple (i.e. no pipelining), I may increase the micro-instruction counter to 4 bits so that instructions can take as long as 16 clocks, which may be handy for relative addressing. Doing so would mean going for bigger decoder ROMs (although I have a workaround in mind).

I also found this diagram: http://faculty.cs.niu.edu/~berezin/463/ ... cpugen.gif, which is on a page about the 6502. I could do it that way, but then I'd need separate inc/dec logic for the X and Y registers.
Juan


Attachments: ALU_TMP.JPG, ORA_ABS.JPG

PostPosted: Mon Nov 11, 2019 1:57 am
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Since you're not trying to fully emulate a 6502, then I will let you know how I approached this problem since I too was not trying to emulate a 6502 in a cycle accurate manner.

I use two temp registers. One for the first instruction operand, and a second for the second instruction operand. The first instruction operand can be an immediate operand, a zero page address, or a signed displacement of a branch instruction. The second operand is always the high byte of a 16-bit direct address.

I also allow the memory address registers to be loaded from either the program counter (which has its own dedicated incrementer like you've shown above) or some combination of the two temporary registers.

An ALU memory operand always comes from the first temporary register, and a memory address operand comes from the two temporary registers.

One other thing that I do is that I load 0x00 or 0xFF into the second temporary register when loading the first temporary register depending on whether the operand being fetched is the displacement for a branch instruction or not. I load a 0xFF into the second register if the instruction is a branch and the most significant bit of the first operand is a logic 1. Otherwise, I load a 0x00.
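
The sign-extension trick described above can be sketched as follows. This is not MichaelM's implementation, only an illustration of the rule as stated; the function names and the `is_branch` parameter are hypothetical.

```python
def load_operand_registers(operand, is_branch):
    """Return (op1, op2) as loaded when the first operand byte is fetched.

    op2 is preset to 0xFF only for a branch whose displacement has its
    most significant bit set; otherwise it is preset to 0x00.
    """
    op1 = operand & 0xFF
    op2 = 0xFF if (is_branch and (op1 & 0x80)) else 0x00
    return op1, op2


def branch_target(pc, op1, op2):
    """Add the 16-bit value held in the two temporaries to the PC."""
    return (pc + ((op2 << 8) | op1)) & 0xFFFF


op1, op2 = load_operand_registers(0xFE, is_branch=True)   # displacement -2
print(hex(branch_target(0x0202, op1, op2)))
```

With the high temporary preset this way, a branch target becomes a plain 16-bit addition, and a zero-page address needs no special handling because its high byte is already 0x00.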

_________________
Michael A.


PostPosted: Fri Nov 15, 2019 9:24 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
MichaelM wrote:
Since you're not trying to fully emulate a 6502, then I will let you know how I approached this problem since I too was not trying to emulate a 6502 in a cycle accurate manner.

I use two temp registers. One for the first instruction operand, and a second for the second instruction operand...


Thanks for the information, I'll have to look into that if my approach doesn't work out. I have added two input registers for the ALU: A and B. That makes instructions a fair bit slower, so I had to make the microcode counter 4-bit wide. I am prioritizing hardware simplicity over speed, so no problem. So far, by using those registers I have been able to implement zero page indexed, absolute indexed, and X indirect. So not looking bad so far, but we'll see.

As I ran out of address lines on the microcode ROMs, I have turned reset and interrupts into regular instructions. When IRQ goes low, the next opcode to be executed is forced to be 0x0, the BRK instruction. I am effectively ignoring the "padding byte" and the "b" flag, as I read that BRK is not used much. Regarding reset, I am using the unused opcode 0xff in a similar way. This also makes it possible to trigger a reset from software. I suppose this is not a "proper" way of doing this; I may change it later and get bigger ROMs.
Juan


Attachments: ALU2.JPG

PostPosted: Sat Nov 16, 2019 6:48 pm
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
JuanGg wrote:
As I ran out of address lines on the microcode ROMs, I have turned reset and interrupts into regular instructions. When IRQ goes low, next opcode to be executed is forced to be 0x0, the BRK instruction.

This statement appears to indicate that you were running the RST and IRQ signals into your microcode ROMs. I think that this is an inefficient use of the address lines to your microcode ROM. First, RST should have an independent effect on some of your internal registers, such as your program / instruction counter (PC), your microprogram / microinstruction counter (uPC), and your Memory Address Register (MAR). When RST is asserted, these registers / counters should be forced to a known state.

In the case of the PC and the MAR, they should point to the reset vector location: 0xFFFC. In my implementation, I don't actually set the PC explicitly when RST is asserted. Instead, I either increment it, i.e. PC <= PC + 1, or I load it with the address I'm presenting on the memory address bus, i.e. the value assigned to MAR. Therefore, on RST, I force the memory address to 0xFFFC, and capture that address into the PC at the completion of the first read cycle of the reset sequence. The uPC is forced to 0x00 (8-/9-bit microprogram address bus), and I proceed from there.

To handle interrupts, I conditionally test the internal INT signal, i.e. INT = FE(NMI) | IRQ & IE, and load the MAR as required with either the NMI vector address or the IRQ vector address. (Note: FE(NMI) is a function that latches until serviced when a falling edge on the external nNMI signal is detected.)
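
The interrupt condition above can be modelled in a few lines. This is a sketch under assumptions, not the actual logic: the class and method names are made up, `irq` and `ie` are treated as active-high internal signals, and the vector addresses are the standard 6502 ones (NMI at 0xFFFA, IRQ/BRK at 0xFFFE).

```python
class InterruptLogic:
    """Models INT = FE(NMI) | IRQ & IE with a falling-edge latch on nNMI."""

    NMI_VECTOR = 0xFFFA
    IRQ_VECTOR = 0xFFFE

    def __init__(self):
        self.prev_nmi_n = 1        # previous sample of the active-low nNMI pin
        self.nmi_pending = False   # FE(NMI): latched until serviced

    def sample(self, nmi_n, irq, ie):
        """Sample the inputs once per clock and return the INT signal."""
        if self.prev_nmi_n == 1 and nmi_n == 0:
            self.nmi_pending = True          # falling edge detected
        self.prev_nmi_n = nmi_n
        return bool(self.nmi_pending or (irq and ie))

    def vector(self):
        """Pick the vector to load into the MAR; servicing NMI clears its latch."""
        if self.nmi_pending:
            self.nmi_pending = False
            return self.NMI_VECTOR
        return self.IRQ_VECTOR


logic = InterruptLogic()
logic.sample(nmi_n=1, irq=False, ie=True)
print(logic.sample(nmi_n=0, irq=False, ie=True), hex(logic.vector()))
```

In this model the microcode would test the value returned by `sample` only at instruction boundaries, so no microcode ROM address lines are spent on the interrupt inputs.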

This approach may make the implementation a bit slower since it places logic in the address path, but it allows the microcode ROM to be populated with microcode sequences as required by the instructions and addressing modes. In other words, the microcode ROM does not need four different microinstructions for each instruction sequence element to account for RST or INT being asserted. In the case of RST, it is not a conditional signal, and in the case of INT, as defined above, the current instruction sequence is completed and the INT sequence is only performed at the boundary between instructions.

Just out of curiosity, are you using a microprogram sequencer or just a loadable counter? If you're using just a loadable counter, you may consider using a simple microprogram sequencer like the Fairchild 9408. You can find a description of this obsolete microprogram sequencer in the 1976 Fairchild Macrologic manual (page 86) archived on bitsavers.org here. (Note: big PDF) Alternatively, you can view a derivative of this particular microprogram sequencer written in Verilog here.

_________________
Michael A.


PostPosted: Tue Nov 19, 2019 6:43 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
I'm going to write a general description of what I have now, but it will be subject to changes I'm sure:

-Microcode and microcode ROMS:
I have five 8K ROMs as the control unit (I think I'm going to need another one soon).
I was indeed using a couple address lines for reset and interrupts in the beginning, but this is how I have currently set it up:

Address bit:   12 11 10 9 8 7 6 5 | 4    | 3 2 1 0
Field:         OPCODE             | FLAG | MICROCODE COUNTER

So each opcode has two different cases depending on the state of the flag. The flag is selected from N, V, Z and C just after the fetch cycle and saved in a flip-flop while the instruction executes. Those two cases are identical for all instructions except conditional branches. Micro-instructions are sequenced with just a 4-bit counter that can be reset when the execution of the current instruction finishes. I think this does the job for now. When an interrupt or reset occurs, special opcodes (0x0 and 0xff) are read into the IR when the last instruction ends. Using the "constant generators", just some transistors that pull data bus lines high when needed, the reset and interrupt vectors are read, and execution continues from there.
ROM contents are generated by means of a Python program, in which I get to define instructions, opcodes and which control signals fire and when.
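
A generator along these lines might look like the sketch below. It is not the actual program: the control signal names, the two-step fetch sequence and the LDA example are all assumptions made for illustration. Only the address layout (opcode in bits 12-5, flag in bit 4, counter in bits 3-0) and the five 8-bit ROMs come from the description above.

```python
ROM_SIZE = 1 << 13      # 8K entries: 8 opcode bits + 1 flag bit + 4 counter bits
NUM_ROMS = 5            # five 8-bit ROMs side by side -> 40 control lines

# Hypothetical control signals, one bit position each.
SIGNALS = {"PC_TO_MAR": 0, "MEM_TO_IR": 1, "PC_INC": 2,
           "MEM_TO_A": 3, "RESET_COUNTER": 4}


def word(*names):
    """Build one control word from the names of the signals it asserts."""
    w = 0
    for name in names:
        w |= 1 << SIGNALS[name]
    return w


def rom_address(opcode, flag, step):
    """Address layout: opcode in bits 12-5, flag in bit 4, counter in bits 3-0."""
    return (opcode << 5) | (flag << 4) | step


# Assumed fetch sequence. It is copied into every opcode's slot because the
# IR still holds the previous opcode while the fetch steps run.
FETCH = [word("PC_TO_MAR"), word("MEM_TO_IR", "PC_INC")]

# opcode -> (steps when flag = 0, steps when flag = 1)
LDA_IMM = [word("PC_TO_MAR"), word("MEM_TO_A", "PC_INC", "RESET_COUNTER")]
MICROCODE = {0xA9: (LDA_IMM, LDA_IMM)}


def build_roms(microcode):
    """Return one bytes object per ROM, ready to be written to image files."""
    control = [0] * ROM_SIZE
    for opcode in range(256):
        for flag in (0, 1):
            for step, w in enumerate(FETCH):
                control[rom_address(opcode, flag, step)] = w
    for opcode, variants in microcode.items():
        for flag, steps in enumerate(variants):
            for i, w in enumerate(steps):
                control[rom_address(opcode, flag, len(FETCH) + i)] = w
    # Split each wide control word into byte slices, one slice per ROM.
    return [bytes((w >> (8 * i)) & 0xFF for w in control) for i in range(NUM_ROMS)]


roms = build_roms(MICROCODE)
print(len(roms), len(roms[0]), hex(rom_address(0xA9, 1, 3)))
```

Each element of `roms` could then be written out in whatever file format the simulator or EEPROM programmer expects.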

-ALU
The ALU is made of two 8K ROMs, each acting as a 4-bit ALU. Its final operations are yet to be defined, but I have implemented the usual OR, XOR, AND, NOT, ADC, SUBC, INC, DEC, ASL, LSR, ROL, ROR... we'll see which are needed. Also, adding with 0 and carry comes in handy for address calculation. For LSR and ROR, a multiplexer is used to switch carry bits around so I can shift to the right.
Again, a Python program calculates all possible cases and generates a file that I can load into the ROMs.
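
A table generator for one such 4-bit slice could be sketched like this. The address layout (4-bit operation code, carry-in, operand A, operand B, which happens to add up to 13 address bits), the output format and the operation codes are assumptions, not the real ROM layout, and only four of the operations are shown.

```python
OPS = {"OR": 0, "AND": 1, "XOR": 2, "ADC": 3}   # hypothetical operation codes


def slice_output(op, cin, a, b):
    """One 4-bit slice: result in bits 3-0, carry-out in bit 4."""
    if op == OPS["OR"]:
        r = a | b
    elif op == OPS["AND"]:
        r = a & b
    elif op == OPS["XOR"]:
        r = a ^ b
    elif op == OPS["ADC"]:
        r = a + b + cin           # may reach 31, so bit 4 is the carry-out
    else:
        r = 0                     # operation not defined in this sketch
    return r & 0x1F


def slice_address(op, cin, a, b):
    """Assumed layout: op in bits 12-9, carry-in in bit 8, A in 7-4, B in 3-0."""
    return (op << 9) | (cin << 8) | (a << 4) | b


def build_slice_rom():
    """Enumerate every input combination, as the ROM itself would see them."""
    rom = bytearray(1 << 13)
    for op in range(16):
        for cin in (0, 1):
            for a in range(16):
                for b in range(16):
                    rom[slice_address(op, cin, a, b)] = slice_output(op, cin, a, b)
    return bytes(rom)


def alu8(rom, op, a, b, cin=0):
    """Chain two slices: the low slice's carry-out feeds the high slice."""
    lo = rom[slice_address(op, cin, a & 0xF, b & 0xF)]
    hi = rom[slice_address(op, lo >> 4, a >> 4, b >> 4)]
    return ((hi & 0xF) << 4) | (lo & 0xF), hi >> 4


rom = build_slice_rom()
print(alu8(rom, OPS["ADC"], 0xFF, 0x01))
```

Both slices can share the same table in this model, since the only difference between them is where the carry-in comes from.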

Two input registers, A and B, feed the ALU, whose output goes straight into the data bus. These registers serve as temporary holding places during address calculation for the several addressing modes.

-MISC.
I have the Instruction Register, Accumulator, Stack Pointer and the usual X and Y index registers, which have to be incremented or decremented by means of the ALU.
The Program Counter is made out of four 4-bit loadable counters and is divided into High and Low.
Two memory address registers, High and Low, handle the 16 address lines.
Everything is connected to an 8-bit data bus with pull-down resistors, but there is also an "auxiliary bus" to transfer the contents of the Program Counter High to the Address High Register, this way, PC contents can be put into the address registers in one clock cycle.
On both the data and the auxiliary bus, constant generators are connected to put the addresses of the reset and interrupt vectors on the bus.

-What comes next:
Finishing the ALU
The Status register needs a complete redesign, as different flags are affected by different ALU operations (right now all ALU operations affect all flags except the interrupt disable). The appropriate flags can be set and cleared from the control unit, but it needs more work. I may need additional control lines to handle that.
I have a working implementation of all addressing modes, but I'm not happy with the relative mode used for branches. My current implementation takes 15 clocks: it subtracts 128 from the PC, adds 128 to the signed offset to turn it into a positive number, and then adds the two. So I have to work on that too, by sign-extending the offset and doing a 16-bit addition.
Stack operations.
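
The two ways of computing a branch target mentioned above, the current subtract-128/add-128 method and the planned sign extension, give the same result, which can be checked exhaustively. This is a sketch with made-up function names; it models only the arithmetic, not the clock counts.

```python
def branch_biased(pc, offset_byte):
    """Current method: bias both values by 128 so the offset is never negative."""
    base = (pc - 128) & 0xFFFF
    biased = (offset_byte + 128) & 0xFF      # signed offset + 128, range 0..255
    return (base + biased) & 0xFFFF


def branch_sign_extended(pc, offset_byte):
    """Planned method: sign-extend the offset to 16 bits, then add once."""
    high = 0xFF if offset_byte & 0x80 else 0x00
    return (pc + ((high << 8) | offset_byte)) & 0xFFFF


# Compare the two methods for every offset at a few program counter values.
for pc in (0x0000, 0x0202, 0x80FF, 0xFFFF):
    for offset in range(256):
        assert branch_biased(pc, offset) == branch_sign_extended(pc, offset)

print(hex(branch_sign_extended(0x0202, 0xFE)))   # offset 0xFE is -2
```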

Just some thoughts. Bear in mind that I have little to no idea of what I'm doing, so any suggestions are welcome. Find attached a screenshot of my Logisim simulation.
Juan


Attachments: Logisim_simulation.png

PostPosted: Tue Nov 19, 2019 7:49 pm
Joined: Mon Apr 23, 2012 12:28 am
Posts: 760
Location: Huntsville, AL
Looks like you're on your way to successfully implementing a simile of a microprogrammed 6502.

One thing to consider is that by applying the opcode to the upper address lines of your microprogram ROM, you're effectively attempting to use a ROM as a PLA. The two types of devices are quite different. Your approach will lead to a much larger microprogram than necessary since you're not sharing any of the microroutines for any of the addressing modes.

One idea to consider is to create a decode ROM which provides the controls for the ALU and the address of a microroutine in your microprogram which handles the addressing mode and the fetching of the operands, i.e. a microsequence ROM. When the operands are ready, it simply executes the ALU operation defined by your decode ROM.

This approach should result in a substantial reduction in the size of your microprogram ROM. In fact it may be possible to reduce both the decode ROM and the microsequence ROM into a CPLD which is the closest analog to a PLA that is available today. (Note: at least one CPLD for each ROM. Combining the two functions into a single CPLD may not be feasible.)
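
The decode ROM idea can be illustrated with a toy model. Everything here is hypothetical (the step names, the routine contents, the tuple format); it only shows how several opcodes can share one microroutine per addressing mode. The opcodes are the standard 6502 encodings for ORA and AND.

```python
# One shared microroutine per addressing mode (step names are made up).
MICROROUTINES = {
    "imm": ["PC_TO_MAR", "MEM_TO_B,PC_INC", "ALU_EXEC"],
    "zp":  ["PC_TO_MAR", "MEM_TO_MARL,PC_INC", "MEM_TO_B", "ALU_EXEC"],
    "abs": ["PC_TO_MAR", "MEM_TO_TMP,PC_INC", "PC_TO_MAR",
            "MEM_TO_MARH,PC_INC", "MEM_TO_B", "ALU_EXEC"],
}

# opcode -> (ALU operation, addressing mode)
OPCODES = {
    0x09: ("OR", "imm"),  0x05: ("OR", "zp"),  0x0D: ("OR", "abs"),
    0x29: ("AND", "imm"), 0x25: ("AND", "zp"), 0x2D: ("AND", "abs"),
}


def layout(routines):
    """Place the routines back to back; return start addresses and the ROM."""
    start, rom = {}, []
    for mode, steps in routines.items():
        start[mode] = len(rom)
        rom.extend(steps)
    return start, rom


start, microsequence_rom = layout(MICROROUTINES)

# Decode ROM: opcode -> (ALU operation, start address of the microroutine).
decode_rom = {op: (alu, start[mode]) for op, (alu, mode) in OPCODES.items()}

print(len(microsequence_rom), decode_rom[0x0D], decode_rom[0x2D])
```

In this toy case six opcodes need 13 microinstructions in total instead of 26, because ORA and AND reuse the same three routines and differ only in the ALU operation stored in the decode ROM.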

Good luck on your project, and keep us posted on your progress.

_________________
Michael A.


PostPosted: Sun Nov 24, 2019 9:09 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
I hope so :)

I may consider that route, but as of now, for ease of development (for me anyway), I'll stick to just EEPROMS. Optimization will come later.

I have completely redesigned the status register, and this is what I have now (see attached). Now I'm able to choose which flags get modified by each ALU operation, plus set and clear individual flags, and also read and write the whole register to and from the data bus. I'm sure there is a simpler implementation, but this does the job for now. As I'm not implementing decimal mode, there's no D flag.
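
Choosing which flags an operation may modify amounts to applying a per-operation mask. The sketch below assumes the standard 6502 status bit positions and the documented 6502 flag effects for three instructions; the function and table names are made up and the attached circuit may work differently.

```python
# Standard 6502 status register bit positions.
N, V, Z, C = 0x80, 0x40, 0x02, 0x01

# Which flags each operation is allowed to modify (documented 6502 behaviour).
FLAG_MASK = {
    "ORA": N | Z,
    "ADC": N | V | Z | C,
    "ASL": N | Z | C,
}


def update_status(status, alu_flags, op):
    """Copy only the masked flags from the ALU; keep every other bit as it was."""
    mask = FLAG_MASK[op]
    return (status & ~mask & 0xFF) | (alu_flags & mask)


# ORA produces a negative result while carry is set: N changes, C survives.
print(bin(update_status(C, N, "ORA")))
```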

Maybe I should rename the thread...

Juan


Attachments: Status_register.png

PostPosted: Wed Nov 27, 2019 4:26 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
Renamed the thread...

I've updated my microcode to work with the new Status Register and tidied up the code that generates the ROM files a bit. I also configured a mouse macro that updates Logisim's ROMs with the new microcode. It makes development much faster...

The new problem I have run into is the carry bit getting modified on address calculations. On instructions such as ORA or AND, I latch the carry bit into a flip-flop as if I was to take a branch and that enables me to set or clear carry at the end of the instruction depending on the original state.

On the other hand, branch instructions require me to latch other flags into that flip-flop to decide on whether to perform the branch or not. So, if the branch is taken, the carry flag will get used in address calculations and thus get modified. I remember reading somewhere about some sort of hidden carry used for this...I may have to look into that.
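
One possible reading of the "hidden carry" idea is a separate carry bit that only address calculations can see, so the programmer-visible C flag is never touched. The sketch below is an assumption about how that could work, not a description of the 6502's internals; all names are hypothetical.

```python
def alu_add(a, b, carry_in):
    """8-bit add returning (result, carry_out)."""
    total = a + b + carry_in
    return total & 0xFF, total >> 8


class Cpu:
    def __init__(self):
        self.c_flag = 0          # programmer-visible carry in the status register
        self.hidden_carry = 0    # internal carry used only for address arithmetic

    def indexed_address(self, base_lo, base_hi, index):
        """Absolute-indexed address calculation that never touches c_flag."""
        lo, self.hidden_carry = alu_add(base_lo, index, 0)
        hi, _ = alu_add(base_hi, 0, self.hidden_carry)
        return (hi << 8) | lo


cpu = Cpu()
cpu.c_flag = 1
print(hex(cpu.indexed_address(0xF0, 0x12, 0x20)), cpu.c_flag)
```

The same internal bit could carry from the low byte to the high byte when a branch offset is added to the PC, which would leave the flip-flop that holds the branch condition free for its original purpose.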

I have no experience in assembly programming, is it a big deal if a branch sets or clears the carry flag?
Juan


PostPosted: Wed Nov 27, 2019 4:53 pm
Joined: Sat Dec 01, 2018 1:53 pm
Posts: 727
Location: Tokyo, Japan
JuanGg wrote:
I have no experience in assembly programming, is it a big deal if a branch sets or clears the carry flag?

My experience is pretty limited, but it sure looks like a big deal to me. For example, if you're doing arithmetic operations on multibyte numbers with a loop, losing the carry in that loop would break the operation. Here's an actual example from code I wrote recently:

Code:
            clc
            adc [toutbufptr],y  ; add binary value of digit
            sta [toutbufptr],y  ;  into LSB
            lda #0              ; prepare to propagate carry
            dey
50$:        adc [toutbufptr],y  ; propagate carry
            dey
            bne 50$             ; not done; continue propagation
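
The effect can also be shown with a small model. This is a generic multibyte addition written for illustration, not a translation of the routine above; the `branch_clobbers_carry` switch is hypothetical and stands for a CPU whose branch instruction modifies C.

```python
def add_multibyte(a, b, branch_clobbers_carry=False):
    """Add two little-endian byte lists the way an ADC loop would."""
    carry = 0                            # CLC before the loop
    result = []
    for x, y in zip(a, b):
        total = x + y + carry            # ADC
        result.append(total & 0xFF)      # STA
        carry = total >> 8
        if branch_clobbers_carry:
            carry = 0                    # the loop branch destroys C here
    return result


# 0x00FF + 0x0001, least significant byte first
print(add_multibyte([0xFF, 0x00], [0x01, 0x00]))
print(add_multibyte([0xFF, 0x00], [0x01, 0x00], branch_clobbers_carry=True))
```

With a well-behaved branch the carry out of the low byte reaches the high byte; with a branch that clears C, the high byte never sees it and the sum comes out wrong.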

_________________
Curt J. Sampson - github.com/0cjs


PostPosted: Wed Nov 27, 2019 4:54 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Indeed, one can live with anything, but one of the nice things about the 6502 is the very handy set of choices as to which instructions set which flags.

It sounds to me like you might need a bit of state to control whether a branch is taken, but it's not clear why that has to be the same as the carry bit.


PostPosted: Wed Nov 27, 2019 5:04 pm
Joined: Mon Nov 04, 2019 4:53 pm
Posts: 103
Location: Spain
I see, I supposed so...Back to the drawing board. I'll have to re-structure things a bit.
Thank you.
Juan

