PostPosted: Tue Jan 11, 2011 2:21 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
What, in the 6502, drives T0 low? With my own fumbling I've been unable to deduce the logic that drives T0 low. I've mostly figured out the other tcstate bits:

Code:
tcstate[1] <= tcstate[0];

if(rdy0)
     tcstate[5:3] <= 3'b111;
else
     tcstate[5:3] <= tcstate[4:2];

tcstate[2] <= !sync;


sync is controlled by some fetch flip-flop, and fetch is controlled by tcstate[0].

As far as I can tell, T0 is driven high after PLA34 is high (which goes high when T0 goes low). i.e.
Code:
assign pla[34] = !tcstate[0];

if(pla[34]) tcstate[0] <= 1'b1;


However, I don't know what drives it low. It seems to be related to a rdy0 signal. i.e.

Code:
if(rdy0 & tcstate[0]) tcstate[0] <= 1'b0;


I don't know what drives rdy0, though. I've tried reading the transistor-level schematic and the physical layout on visual6502, but my skills are still weak, so I haven't made any progress.


Context:
I am working to implement a 6502 core in Verilog that is gate-level accurate with the work published by visual6502.org. I'm doing it for my own educational purposes, so I don't mind if others have done this already.

I don't know much about reading physical chip layouts; how to read the transistors and wires and then piece those together into gates. Which is exactly why I'm doing this! :D So please excuse my ignorance.

T0 seems to control when the next instruction is fetched, so it's important to have before I can get even the most basic parts of my 6502 core working :cry:

P.S. Sorry if this is posted to the wrong section. I'm working in Verilog, so I would have posted it in that section, but my question isn't actually related to verilog.

Any help is greatly appreciated. Thank you.


PostPosted: Tue Jan 11, 2011 11:51 am

Joined: Tue Jul 05, 2005 7:08 pm
Posts: 990
Location: near Heidelberg, Germany
IIRC T0 is active in the last cycle of the previous opcode already. There may be a PLA (decoder) output to drive it low.

André


PostPosted: Tue Jan 11, 2011 7:21 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Hi Xor
Welcome! It's worth taking a look at Peter Monta's FPGA netlist tools project. He takes the visual6502 data structures and makes a verilog netlist which is as high-level as he can get it - mostly logic gates, with some low-level stuff for the datapath busses.

The project contains the resultant verilog, so you don't even need to compile and run it. I believe the verilog contains many meaningful signal names from the visual6502 project too.

Peter's project produces a model which behaves as a 6502 - it doesn't in itself add any explanations as to how or why - so it should be complementary to your project

Cheers
Ed


PostPosted: Tue Jan 11, 2011 8:14 pm

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
BigEd: That's a fantastic resource, thank you! It's still low level, which will still give me an opportunity to learn the architecture; the how and why. It will be a lot easier to read than the physical layout or transistor netlist, though :P

Yesterday I actually built a tool in JavaScript which takes the visual6502 netlist and allows me to enter a node and explore the transistors and connected nodes that drive it. So, for example, I could type in "clock1" and it'll search out, one depth at a time, what drives it.

It was handy, but I'm still new to mentally mapping nmos transistors into logic, so I didn't get very far.

Anyway, thank you again, that will be immensely helpful! :D


PostPosted: Tue Jan 11, 2011 8:35 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Hi Xor
That tool sounds interesting - feel free to add it to visual6502 by joining in on github, or if you like, add an MIT-style license and send it to me and I'll see if I can merge it in.

The other day I found myself using the shift-click function on visual6502, which shows you all the nodes which are presently connected by pass gates. As you step through the simulation, you can see the busses connected and disconnected, and see which circuits are reading or writing. (I was working on this page about one of the unassigned opcodes.)

Cheers
Ed


PostPosted: Wed Jan 12, 2011 12:53 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
Hello BigEd:

I'll see what I can do. I'm neck deep in verilog right now trying to slowly decode what drives clock1. Thank you again for the link to that project!

There seems to be the occasional redundancy in the logic. For example node 17 and clock1 are the same thing, as far as I can tell. I guess that exists due to an electrical design decision in the actual 6502.

It'd be nice to write a tool that makes simple optimizations like that, so people working on projects like mine can start from a succinct netlist.


PostPosted: Wed Jan 12, 2011 5:22 am

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
Xor wrote:
It'd be nice to write a tool that makes simple optimizations like that, so people
working on projects like mine can start from a succinct netlist.


If you run the verilog through a synthesizer tool, it should report which signals are unused or duplicated.


PostPosted: Wed Jan 12, 2011 7:28 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
Well, thanks to the netlist->verilog project I've been able to make some progress. It's not much progress, but some is better than none at all.

So far I've distilled the logic into this:

Code:
always @ (posedge cclk)
   pipe_T0 <= clock1;

always @ (posedge cp1_v)
begin
   clock2 <= pipe_T0 | notRdy0;
   clock1 <= ((n_1215 & _TWOCYCLE) | n_109) & ((!pipe_T0 & !notRdy0) | pipe_T0);
end


This is highly distilled; I manually deconstructed the logic from what the netlist->verilog project gives (which is rather low-level and cluttered with analog processing elements). clock2 works as I expected: it is clock1 delayed by one cycle, except that it is forced high while notRdy0 is high.

clock1 is driven by various signals. Half of it drives clock1 high when it was low in the previous full-cycle. The other half is what drives it low (time to fetch a new instruction).

Again, it's not much, but I'm really happy to have made some progress. I verified my equations against visual6502 with a few random predictions done by hand, and they seem to be correct. Now I need to continue my work and find out what drives n_1215, _TWOCYCLE, and n_109.

As a side note, notRdy0 is always 0 in the example program running on visual6502. I'm not sure why? Is it because the external memory is always "ready" in the visual6502 simulation? From what I can tell, notRdy0 tells the 6502 that RAM is busy (thus, not ready). So if I read this logic correctly, when clock1 goes low it will stay low as long as notRdy0 is high. In other words, the 6502 will sit and wait for RAM to stop goofing off. Neat! :D
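One small observation on the clock1 equation above: the last factor, (!pipe_T0 & !notRdy0) | pipe_T0, looks like it reduces to pipe_T0 | !notRdy0 (absorption: A | (!A & B) = A | B). A throwaway testbench along these lines checks all four input combinations. The module name clock1_term_check and the wire names original and simplified are made up for this sketch; they are not taken from the netlist.

```verilog
`timescale 1ns/1ns

// Exhaustively compares the last factor of the clock1 equation with its
// simplified form. All names in this module are illustrative only.
module clock1_term_check;
   reg pipe_T0, notRdy0;
   integer i, mismatches;

   wire original   = (!pipe_T0 & !notRdy0) | pipe_T0;
   wire simplified = pipe_T0 | !notRdy0;

   initial begin
      mismatches = 0;
      for (i = 0; i < 4; i = i + 1) begin
         {pipe_T0, notRdy0} = i[1:0];
         #1;   // let the continuous assignments settle
         if (original !== simplified) begin
            mismatches = mismatches + 1;
            $display("MISMATCH pipe_T0=%b notRdy0=%b", pipe_T0, notRdy0);
         end
      end
      if (mismatches == 0)
         $display("both forms agree for all inputs");
      $finish;
   end
endmodule
```

Read this way, the factor is 0 only when pipe_T0 is low and notRdy0 is high, which fits the observation that clock1, once low, stays low for as long as notRdy0 is high.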


PostPosted: Wed Jan 12, 2011 11:21 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
PHEW! It looks like I finally have enough logic built up to get something working. I just need to plug in a partial PLA and it should correctly fetch instructions and run the timing control:

Code:
module mos_6502(clk, bi_data);

   input clk;
   inout [7:0] bi_data;

   reg [7:0] pd = 8'd0;
   reg clock1, clock2;


   // cclk domain
   reg pipeUNK11, pipeUNK23, pipeUNK35, pipeUNK40, pipeUNK41, pipe_T0;

   // cp1 domain
   reg n_24_v, n_653_v, n_666_v;

   wire cclk = clk;
   wire cp1 = !clk;


   /////////////////////////////////////////////////////////////////
   // Pre-Decoder
   wire fetch_v = pipeUNK11;
   wire clearIR_v = !fetch_v;

   wire [7:0] pd_clearIR = clearIR_v ? 8'd0 : pd;

   wire PD_xxxx10x0_v = !(pd_clearIR[0] | n_1083_v | pd_clearIR[2]);
   wire PD_1xx000x0_v = !(n_1605_v | pd_clearIR[0] | pd_clearIR[3] | pd_clearIR[4] | pd_clearIR[2]);
   wire PD_0xx0xx0x_v = !(pd_clearIR[7] | pd_clearIR[4] | pd_clearIR[1]);
   wire PD_xxx010x1_v = !(n_1083_v | pd_clearIR[4] | n_409_v | pd_clearIR[2]);
   wire PD_n_0xx0xx0x_v = !PD_0xx0xx0x_v;

   wire TWOCYCLE = (PD_n_0xx0xx0x_v & PD_xxxx10x0_v) | (PD_1xx000x0_v | PD_xxx010x1_v);
   /////////////////////////////////////////////////////////////////
   
   
   assign n_347_v = ~((op_T2_mem_zp_v|op_T3_mem_zp_idx_v|op_T3_mem_abs_v|op_T4_mem_abs_idx_v|op_T5_mem_ind_idx_v));
   assign n_790_v = ~((op_asl_rol_v|op_lsr_ror_dec_inc_v));
   assign n_368_v = ~((x_op_T3_plp_pla_v|op_T2_jmp_abs_v|op_T4_jmp_v|op_T5_rti_rts_v|xx_op_T5_jsr_v|op_T2_php_pha_v));
   


   wire n_1716_v = ~(op_T3_branch_v | n_653_v | !n_368_v);


   always @ (posedge cclk)
   begin
      pipeUNK11 <= !n_666_v;
      pipeUNK23 <= clock1;
      pipeUNK35 <= n_1716_v & pipeUNK23;
      pipeUNK40 <= !(n_790_v | n_347_v);
      pipeUNK41 <= n_24_v;

      pipe_T0 <= clock1;

      pd <= ~bi_data;
   end


   always @ (posedge cp1)
   begin
      n_24_v <= pipeUNK40;
      n_653_v <= !pipeUNK41;
      n_666_v <= pipeUNK23;
      
      // Timing Control
      clock1 <= (pipeUNK35 & !TWOCYCLE) | !pipe_T0;
      clock2 <= pipe_T0;
   end

endmodule



For now, I'm ignoring any nodes that are at fixed logic levels in the visual6502 demo. There's plenty of logic missing from the above that I assume has to do with instructions not exercised in the visual6502 demo; or corner cases (crossing page boundaries, for example). I'll eventually get back to those, but I'd like something partially working first. I have a separate, more complete file, which has TODO marks on all the incomplete signals.

At a quick glance, it looks like it handles two-cycle instructions as a special case, with a quick two-stage shift register to time those. The rest are handled by longer, conditional shift registers, I guess.
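To make the two-cycle special case a bit more concrete, here is a rough sketch that evaluates the TWOCYCLE predecode for a handful of opcodes. It assumes the PD_ signal names can be read literally as bit patterns (bit 7 leftmost, x = don't care) applied to the opcode as fetched, which is also how the un-inverted expressions later in this thread read. The module name twocycle_check, the function twocycle, and the task check are invented for the example, and the opcode list is only a small sample.

```verilog
`timescale 1ns/1ns

module twocycle_check;

   // Predecode terms written straight from the pattern in each signal name.
   function twocycle;
      input [7:0] op;
      reg p_xxxx10x0, p_1xx000x0, p_0xx0xx0x, p_xxx010x1;
      begin
         p_xxxx10x0 =  op[3] & !op[2] & !op[0];
         p_1xx000x0 =  op[7] & !op[4] & !op[3] & !op[2] & !op[0];
         p_0xx0xx0x = !op[7] & !op[4] & !op[1];
         p_xxx010x1 = !op[4] &  op[3] & !op[2] &  op[0];
         twocycle = (!p_0xx0xx0x & p_xxxx10x0) | p_1xx000x0 | p_xxx010x1;
      end
   endfunction

   // Prints one line per opcode and flags any disagreement.
   task check;
      input [7:0] op;
      input       expected;
      begin
         if (twocycle(op) === expected)
            $display("ok   %h -> %b", op, twocycle(op));
         else
            $display("FAIL %h -> %b (expected %b)", op, twocycle(op), expected);
      end
   endtask

   initial begin
      check(8'hEA, 1'b1);   // NOP
      check(8'hA9, 1'b1);   // LDA #imm
      check(8'hA2, 1'b1);   // LDX #imm
      check(8'h48, 1'b0);   // PHA (three cycles)
      check(8'hAD, 1'b0);   // LDA abs (four cycles)
      $finish;
   end
endmodule
```

PHA matches xxxx10x0 but is knocked out by the 0xx0xx0x term, which appears to be how the push and pull instructions are kept out of the two-cycle group.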


PostPosted: Wed Jan 12, 2011 7:16 pm

Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10793
Location: England
Nice! I can see how this could grow into an annotated version which works exactly as the NMOS part, but well-structured and commented so it can be understood at a high level.

I particularly like the idea of growing it organically starting with instruction fetching.

Keep us updated!

Cheers
Ed

ps. yes, the visual6502 as presently released just has RDY held high, which is like single-cycle memory and is the normal case. The prerelease version on github can be stalled/unstalled for specific cycles by constructing a suitable URL.


PostPosted: Thu Jan 13, 2011 4:04 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
Thank you for the encouragement, BigEd.

After putting in the PLA, IR fetching logic, and fixing a few mistakes, I got it to work:

Image

For now, the test module manually feeds the correct data to the external databus. I'll need to add the logic for the address lines before it can run on its own.

Anyway, for the quick test, tcstate[0] and tcstate[1] were correct, which is what I've been working on.

With a little duct tape here, some bubble gum there, and one paperclip I should have a fully working 6502 :P

By the way, whoever coded the "Trace These Too" feature on visual6502: many, many thanks! It's been so fantastically useful in debugging my code.


PostPosted: Thu Jan 13, 2011 7:14 am

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
Looks good. I'm also interested in the progress.

I did wonder about the dual clock. Right now you've inverted the clock, and are testing both edges. The simulator won't care, but have you tried running this through a synthesizer? As far as I know, they usually aren't too happy with dual-edge clocking.

An alternative may be to double the clock frequency, and run the phi1 stuff from even edges, and phi2 from odd edges using an extra enable signal.
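A minimal sketch of that idea, under the assumption that a single fast clock plus a toggling phase flag is acceptable: one always block, with the "phi2" registers enabled on one fast edge and the "phi1" registers on the next. The names two_phase_enable, clk2x, phase, phi2_stage and phi1_stage are hypothetical placeholders, not signals from the code in this thread.

```verilog
`timescale 1ns/1ns

// One fast clock; a phase flag decides which group of registers updates.
module two_phase_enable(
   input            clk2x,        // runs at twice the 6502 clock rate
   input            reset,
   input      [7:0] d,
   output reg [7:0] phi2_stage,   // stands in for the cclk-domain registers
   output reg [7:0] phi1_stage    // stands in for the cp1-domain registers
);
   reg phase;   // 0: this edge acts as phi2, 1: this edge acts as phi1

   always @(posedge clk2x) begin
      if (reset) begin
         phase      <= 1'b0;
         phi2_stage <= 8'h00;
         phi1_stage <= 8'h00;
      end else begin
         phase <= !phase;
         if (!phase) phi2_stage <= d;            // "phi2" enable
         else        phi1_stage <= phi2_stage;   // "phi1" enable
      end
   end
endmodule

// Small testbench that just prints the two stages as data moves through.
module two_phase_enable_tb;
   reg        clk2x = 1'b0;
   reg        reset = 1'b1;
   reg  [7:0] d     = 8'h00;
   wire [7:0] phi2_stage, phi1_stage;

   two_phase_enable dut(clk2x, reset, d, phi2_stage, phi1_stage);

   always #5 clk2x = !clk2x;

   initial begin
      $monitor("t=%0t d=%h phi2_stage=%h phi1_stage=%h",
               $time, d, phi2_stage, phi1_stage);
      #12 reset = 1'b0;
      #10 d = 8'hA9;
      #40 d = 8'hEA;
      #40 $finish;
   end
endmodule
```

Everything stays in one clock domain this way, at the cost of needing a clock twice as fast as the emulated 6502 clock.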


PostPosted: Thu Jan 13, 2011 8:14 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
Arlet, yeah, the clocking is a little weird. I updated the clocking code to more closely model how the 6502 generates its clock. From what I remember in the documentation it pushes the edges of the two clocks away from each other; far enough that it satisfies setup and hold times.
When I synthesize I'll try to set up a PLL or two to correctly generate the dual clock.

Thank you for commenting on that!


After a bit more muscle work I've extracted the necessary bits for running the AB, ADL, and PCL. So the chip can now manage its own PC and set its AB correctly. The other tcstates are also simulated correctly now.

Code:
`timescale 1ns/1ns

module mos_6502(clk, ab, bi_data);

   input clk;
   inout [7:0] bi_data;
   output reg [15:0] ab = 16'h0000;


   reg [7:0] pd = 8'h00;
   reg [7:0] ir = 8'h00;
   reg [5:0] tcstate = 6'b111111;
   reg [7:0] pcl = 8'h00, pch = 8'h00;
   reg ADL_ABL = 1'b1;   // Load ADL into Address Bus Register Low
   reg I_PC = 1'b0;   // Increment Program Counter
   reg ADD_ADL = 1'b1, PCL_ADL = 1'b0, S_ADL = 1'b0, DL_ADL = 1'b0;   // Flags that select the source for Address Data Low
   reg [7:0] DL = 8'h00, S = 8'h00, ADD = 8'h00;
   wire [7:0] ADL = S_ADL ? S : (PCL_ADL ? pcl : (DL_ADL ? DL : (ADD_ADL ? ADD : 8'h00)));

   // Status Register
   reg p0 = 1'b0, p1 = 1'b1, p2 = 1'b1, p3 = 1'b0, p4 = 1'b1, p6 = 1'b0, p7 = 1'b0;

   wire clock1 = tcstate[0], clock2 = tcstate[1];


   // cclk domain
   reg pipeUNK11 = 1'b0, pipeUNK23 = 1'b0, pipeUNK35 = 1'b0, pipeUNK40 = 1'b0, pipeUNK41 = 1'b1, pipe_T0 = 1'b0, pipeBRtaken_v = 1'b0;

   // cp1 domain
   reg n_24_v = 1'b1, n_653_v = 1'b0, n_666_v = 1'b0;

   reg cclk, cp1, shifted_clk;
   always #100 shifted_clk = clk;

   always @ (posedge shifted_clk) cclk <= 1'b1;
   always @ (negedge clk) cclk <= 1'b0;

   always @ (negedge shifted_clk) cp1 <= 1'b1;
   always @ (posedge clk) cp1 <= 1'b0;


   /////////////////////////////////////////////////////////////////
   // Pre-Decoder
   wire fetch_v = pipeUNK11;
   wire clearIR = !fetch_v;

   wire [7:0] pd_clearIR = clearIR ? 8'd0 : pd;

   wire PD_xxxx10x0_v = !(pd_clearIR[0] | !pd_clearIR[3] | pd_clearIR[2]);
   wire PD_1xx000x0_v = !(!pd_clearIR[7] | pd_clearIR[0] | pd_clearIR[3] | pd_clearIR[4] | pd_clearIR[2]);
   wire PD_0xx0xx0x_v = !(pd_clearIR[7] | pd_clearIR[4] | pd_clearIR[1]);
   wire PD_xxx010x1_v = !(!pd_clearIR[3] | pd_clearIR[4] | !pd_clearIR[0] | pd_clearIR[2]);
   wire PD_n_0xx0xx0x_v = !PD_0xx0xx0x_v;

   wire TWOCYCLE = (PD_n_0xx0xx0x_v & PD_xxxx10x0_v) | (PD_1xx000x0_v | PD_xxx010x1_v);
   /////////////////////////////////////////////////////////////////
   

   // PLA
   // TODO: This should be a module that takes the appropriate inputs and
   // gives a large 130 bit output. We can then write an include that
   // assigns named wires to the bit array.
   `include "pla_decode.v"
   
   
   assign n_256_v = ~((op_T5_ind_x_v|op_T0_brk_rti_v|op_T0_jmp_v|op_T5_rts_v|op_T4_v|op_T5_rti_v|op_T3_v));
   assign n_347_v = ~((op_T2_mem_zp_v|op_T3_mem_zp_idx_v|op_T3_mem_abs_v|op_T4_mem_abs_idx_v|op_T5_mem_ind_idx_v));
   assign n_790_v = ~((op_asl_rol_v|op_lsr_ror_dec_inc_v));
   assign n_368_v = ~((x_op_T3_plp_pla_v|op_T2_jmp_abs_v|op_T4_jmp_v|op_T5_rti_rts_v|xx_op_T5_jsr_v|op_T2_php_pha_v));
   

   wire n_1716_v = ~(op_T3_branch_v | n_653_v | !n_368_v);
   wire n_1286_v = ~(op_brk_rti_v | x_op_jmp_v | op_jsr_v | clock1);
   wire n_1211_v = ~(op_T5_jsr_v | op_T2_branch_v | n_1286_v | !n_666_v | op_T2_abs_access_v);
   wire n_182_v = clock1 & !op_T5_rts_v & n_1211_v;
   wire n_1619_v = !(op_T2_branch_v | n_182_v);
   wire n_620_v = (_op_branch_bit7_v | _op_branch_bit6_v | !p1) & (!p0 | !_op_branch_bit6_v | _op_branch_bit7_v);
   wire BRtaken_v = (ir[5] | !(_op_branch_bit7_v | _op_branch_bit6_v | !p1) ) & (!ir[5] | n_620_v);


   always @ (posedge cclk)
   begin
      pipeUNK11 <= !n_666_v;
      pipeUNK23 <= clock1;
      pipeUNK35 <= n_1716_v & clock1;
      pipeUNK40 <= !(n_790_v | n_347_v);
      pipeUNK41 <= n_24_v;

      pipe_T0 <= clock1;
      pipeBRtaken_v <= !(n_1619_v | (BRtaken_v & op_T2_branch_v));

      pd <= bi_data;

      // Bus Control Signals
      ADL_ABL <= !n_653_v & n_24_v;
      ADD_ADL <= !n_256_v;
      PCL_ADL <= op_T5_jsr_v | op_T2_branch_v | op_T2_abs_access_v | ( !( op_brk_rti_v | x_op_jmp_v | op_jsr_v ) & !clock1 ) | !n_666_v;
      S_ADL <= op_T2_stack_v | op_T0_jsr_v;
      DL_ADL <= op_T2_ind_v | op_T2_zp_zp_idx_v;
   end


   always @ (posedge cp1)
   begin
      n_24_v <= !pipeUNK40;
      n_653_v <= !pipeUNK41;
      n_666_v <= pipeUNK23;

      if(ADL_ABL)
         ab[7:0] <= ADL;

      // Increment PC?
      I_PC <= pipeBRtaken_v | PD_xxxx10x0_v;

      if(!I_PC)
         {pch, pcl} <= {pch, pcl} + 16'd1;

      if(fetch_v)
         ir <= pd_clearIR;
      
      // Timing Control
      tcstate[0] <= (pipeUNK35 & !TWOCYCLE) | !pipe_T0;
      tcstate[1] <= pipe_T0;
      tcstate[2] <= tcstate[1];

      if(!tcstate[0])
         tcstate[5:3] <= 3'b111;
      else
         tcstate[5:3] <= tcstate[4:2];
   end

endmodule



My short-term goal is to get the visual6502 example program working. To that end I'll start the arduous task of bashing my now swollen head into the keyboard until verilog code eventually falls out of it.
Or just get DL, AC, and the associated control signals working ...


PostPosted: Thu Jan 13, 2011 8:33 am

Joined: Tue Nov 16, 2010 8:00 am
Posts: 2353
Location: Gouda, The Netherlands
I don't know how often this is possible, but when you're all done with the first implementation, it may be good to try to identify latches that are clocked by two phases, and substitute them with a single edge flip flop.

In my verilog 6502 model, I started from the other end. I decided on a single clock from the beginning, and didn't worry too much about 100% authentic behavior at every bus cycle. I just tried to get as close as possible.

It would be interesting to see how close these two approaches can get, and how they differ in area, speed, and accuracy.


PostPosted: Fri Jan 14, 2011 1:05 am

Joined: Mon Jan 10, 2011 11:53 pm
Posts: 19
Arlet: To be honest, I would much prefer to have done a single clock implementation from the beginning :P My job affords me the luxury of working with single clock systems all the time, with only the occasional need to cross clock domains (which I despise doing and debugging). But building this accurate implementation of the 6502 is a great learning experience for me, if for nothing else than working with dual-clocks.

As a side note, I believe the code I've written thus far is ever so slightly incorrect. It's registered, whereas the 6502 uses latches. So, for example:

Code:
always @ (posedge cp1)
begin
   n_24_v <= !pipeUNK40;
end


Should really be:

Code:
always @ *
if (cp1)
begin
   n_24_v <= !pipeUNK40;
end


which should correctly simulate a latch. I've always worked with registers, so I'm not even sure if the above would synthesize on the FPGAs I work with.

So far I've just worked around the differences but I may go back and update the code to be more accurate.
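For anyone wanting to see the difference in simulation, a small sketch along these lines puts the two styles side by side. The module name latch_vs_reg and the signals d, q_reg and q_latch are made up for illustration; only cp1 is borrowed from the code above, and the timing is arbitrary.

```verilog
`timescale 1ns/1ns

module latch_vs_reg;
   reg cp1     = 1'b0;
   reg d       = 1'b0;
   reg q_reg   = 1'b0;   // edge-triggered: samples d only at the rising edge
   reg q_latch = 1'b0;   // level-sensitive: follows d while cp1 is high

   always @(posedge cp1)
      q_reg <= d;

   always @*
      if (cp1)
         q_latch <= d;

   initial begin
      #10 cp1 = 1'b1;   // enable opens while d is 0
      #10 d   = 1'b1;   // d changes while cp1 is still high
      #1  $display("cp1 high: q_reg=%b q_latch=%b", q_reg, q_latch);   // q_reg=0 q_latch=1
      #9  cp1 = 1'b0;   // enable closes, latch holds its value
      #10 d   = 1'b0;
      #1  $display("cp1 low:  q_reg=%b q_latch=%b", q_reg, q_latch);   // q_reg=0 q_latch=1
      $finish;
   end
endmodule
```

The register misses the change on d because it only looks at the rising edge of cp1, while the latch passes it through and then holds it once cp1 drops. Whether that difference matters for a given node depends on whether its input can change while the enabling phase is high.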

