6502.org Forum  Projects  Code  Documents  Tools  Forum

All times are UTC




Posted: Mon Apr 06, 2020 9:37 am

Joined: Sun Apr 05, 2020 6:11 am
Posts: 2
Hello! I want to embark on a 65C816-based computer project. I made a schematic of the computer with the RAM and ROM as well as the glue logic, and I was about to put in two VIAs until I realized that I didn't want to have to place and route all of the 7400 glue logic when I finish. I have decided to use an EPM7032 CPLD to replace the glue logic, to keep assembly simple and the design readable, which leaves me with a half-finished schematic and no I/O. In addition to the VIAs, I also intend to add a 28L92 UART and an LM2576-3.3 regulator for 3.3V components such as an SD card or the CPLD. However, I still want to know whether I designed the board correctly, handling quirks such as the bank address and VPA/VDA, and whether it can run with a 16MHz oscillator (effective clock rate 8MHz). The intended memory map is as follows:
Code:
 ADDR. RANGE    | FUNCTION
----------------|----------
                | $XX = 0 THRU $7F
 $XX0000-XXBFFF | 512K ONBOARD RAM (MIRRORS EVERY 16 BANKS)
 $XXC000-XXFFFF | 16K ROM (MIRRORS IN EVERY BANK)
 $800000-FFFFFF | I/O AREA

In addition to this mapping, there is a JK flip-flop on the board (one of 3) that was intended to connect to the output of a VIA. This flip-flop, when triggered, should remove the ROM from the memory map, resulting in this:
Code:
 ADDR. RANGE    | FUNCTION
----------------|----------
 $000000-7FFFFF | 512K ONBOARD RAM (MIRRORS EVERY 16 BANKS)
 $800000-FFFFFF | I/O AREA
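
For anyone following along, the two maps above can be written down as a small Python model. This is only a sketch of the intent described in the post, not the logic on the schematic; the function name `decode` and the `rom_off` flag are made up for illustration.

```python
def decode(addr: int, rom_off: bool = False) -> str:
    """Return the region a 24-bit address falls in, per the two maps above."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    if addr & 0x800000:
        return "IO"               # A23 set: banks $80-$FF
    if not rom_off and (addr & 0xFFFF) >= 0xC000:
        return "ROM"              # top 16K of every bank until switched out
    return "RAM"

# Compare the boot-time map with the map after the flip-flop is triggered.
for a in (0x000000, 0x00C000, 0x01FFFF, 0x7FFFFF, 0x800000):
    print(f"${a:06X}: boot={decode(a)}, after ROM_OFF={decode(a, True)}")
```

The only difference between the two maps is the `rom_off` term, which is the job the flip-flop does in hardware.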

The point of that was to have a 16K bootstrap ROM that would load a 48K image into bank 1 from an external memory such as an SD or CF card, then long-jump into it, which would trigger the flip-flop and present some sort of user program selection interface, which would then load a program from the external memory. Without further ado, here is my schematic:
Attachment:
File comment: Early 65C816 SBC schematic
r16.pdf [82.89 KiB]
Downloaded 134 times

All praise and/or criticism is appreciated, even if it's only a matter of personal preference :D


Posted: Mon Apr 06, 2020 9:43 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8509
Location: Midwestern USA
robbie wrote:
Hello! I want to embark on a 65C816-based computer project...Without further ado, here is my schematic...All praise and/or criticism is appreciated, even if it's only a matter of personal preference :D

Can you please post a monochrome version so I can read it? Thanks.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Posted: Mon Apr 06, 2020 10:33 am
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
Hi,

Welcome and great to see another '816 project.

Based on my own '816 board, I'd suggest putting IO in Bank 0, if for no other reason than that you can use LEDs on the VIA pins as a boot/debug aid which you can control in 6502 emulation mode at power-on time, or in '816 native mode, without going to the hassle of using either data-bank fiddling or 24-bit load/store instructions.

I have 256 bytes mapped for IO at $00.FE00 in mine, leaving the top 256 bytes in bank 0 for 'rom' vectors, the rest (of the 512K) is RAM. (It's actually all RAM, no ROM)

And yes, CPLD to replace TTL glue logic gets my vote. I use 2 x 22v10 GALs in my '816 board.

Reset circuit - look at the DS1813 to save board space with those resistors/capacitors and a gate.

Good luck and do keep us informed.

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Mon Apr 06, 2020 10:58 am
Joined: Sat Dec 01, 2018 1:53 pm
Posts: 730
Location: Tokyo, Japan
robbie wrote:
In addition to this mapping, there is a JK flip-flop on the board (one of 3) that was intended to connect to the output of a VIA. This flip-flop, when triggered, should remove the ROM from the memory map....

Can you explain in detail the logic behind how this works? I'm looking at the schematic, but not quite getting it.

_________________
Curt J. Sampson - github.com/0cjs


Posted: Mon Apr 06, 2020 12:37 pm
Joined: Thu Dec 11, 2008 1:28 pm
Posts: 10986
Location: England
Welcome, robbie! I was a bit surprised by your memory map, until I saw the second version, which is the one usually in play after boot.

An advantage of using a CPLD is that you will be able to iterate on your glue logic, so long as you have enough signals available: that is, so long as the CPLD has all the inputs it will need.

It's unusual to put I/O in a high bank, I think, and Gordon certainly has a point. I think it can work though - it's not a show-stopper.


Posted: Tue Apr 07, 2020 7:25 am

Joined: Sun Apr 05, 2020 6:11 am
Posts: 2
BigDumbDinosaur wrote:
Can you please post a monochrome version so I can read it? Thanks.

Attachment:
r16.pdf [81.63 KiB]
Downloaded 78 times

drogon wrote:
Based on my own '816 board, I'd suggest putting IO in Bank 0, if for no other reason than that you can use LEDs on the VIA pins as a boot/debug aid which you can control in 6502 emulation mode at power-on time, or in '816 native mode, without going to the hassle of using either data-bank fiddling or 24-bit load/store instructions.

Thanks for the tip. Not only would that be more useful for testing, but I could also take advantage of the movable direct page as a size optimization if I want to unroll an IO-intensive sequence. The reason I had the IO starting in bank $80 is that I was using the A23 line as an IO select line to emulate the dual-address-space system of the Z80, but I do see how that's a little ridiculous with only 512K of SRAM at the moment (I do intend to upgrade to Garth Wilson's 4M module eventually, when I feel that I am pushing the limits of 512K). With that said, I have decided to adopt your $00.FE00 IO page into my memory map.
drogon wrote:
leaving the top 256 bytes in bank 0 for 'rom' vectors, the rest (of the 512K) is RAM. (It's actually all RAM, no ROM)

What exactly does this mean? Were the vectors loaded by the software? How did you achieve bootstrapping?
cjs wrote:
Can you explain in detail the logic behind how this works? I'm looking at the schematic, but not quite getting it.

In the "address decoding" segment of the schematic, there's a JK flip-flop with the /K input tied high and the J input connected to a "ROM_OFF" label that doesn't actually go anywhere. If it were connected to a VIA, the system would be able to turn off the ROM by asserting ROM_OFF, killing the AND chain that powers the ROM selector and completely disabling ROM.
BigEd wrote:
An advantage of using a CPLD is that you will be able to iterate on your glue logic, so long as you have enough signals available: that is, so long as the CPLD has all the inputs it will need.

After posting this, I saw the forum post about the ATF150x series being pin-compatible with the EPM7xxx series CPLDs and decided to switch to that in order to avoid using a 3.3V regulator. I'll still need to use one eventually if I go the SD route, but that's more of a stretch goal at the moment. Even though I originally wrote that I would use an EPM7032 (equivalent ATF1502), I realized that 32 macrocells is probably not enough to replace all of the glue logic considering some experiments I did with a VGA timer on a CPLD a couple of months ago.
-----------------------------------------------------------------------
Now that I've decided to use a CPLD, I've started writing some Verilog to handle the glue logic. The ultimate goal right now is to replace all 7400-series logic except for the latch and bus transceiver, as a) their job is too mundane to be worth consolidating IMO, and b) as 74ABT chips, they are probably faster than the equivalent would be on a CPLD. Here's what I've written so far:
Code:
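// Stretches each ROM access by pulling RDY low for one PHI2 cycle.
// rom_sel is active low, matching the addr_decoder output.
module eeprom_waitstate (
  input rom_sel, phi2, resb,
  output reg rdy
);
  always @(posedge phi2, negedge resb) begin
    if (!resb) rdy <= 1;          // reset: CPU runs freely
    else rdy <= rom_sel || !rdy;  // ROM selected: drop RDY, release it on the next edge
  end
endmodule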
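
A quick way to see what the `rdy <= rom_sel || !rdy` update does is to step it by hand. The Python sketch below mirrors only that one expression, with `rom_sel` treated as active low (as the address decoder drives it); the helper names `step_rdy` and `simulate` are illustrative, and this is not a timing-accurate model of the CPLD.

```python
def step_rdy(rdy: int, rom_sel_n: int) -> int:
    """Next RDY value at a rising PHI2 edge: rdy <= rom_sel || !rdy."""
    return 1 if (rom_sel_n or not rdy) else 0

def simulate(rom_sel_n_per_edge):
    """Apply one active-low rom_sel sample per PHI2 edge, starting from reset."""
    rdy = 1                       # value forced by reset
    trace = []
    for sel in rom_sel_n_per_edge:
        rdy = step_rdy(rdy, sel)
        trace.append(rdy)
    return trace

# ROM deselected, then selected for four edges, then deselected again.
print(simulate([1, 0, 0, 0, 0, 1]))   # → [1, 0, 1, 0, 1, 1]
```

While the ROM stays selected, RDY alternates low and high, which is the "one wait state per ROM access" pattern, assuming each stretched access spans two PHI2 cycles.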

Code:
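module addr_decoder (
  input [23:0] address,
  input phi2, resb, vda, vpa, rom_off,
  output ram_sel, rom_sel, via1_sel, via2_sel, uart_sel
);
  reg [4:0] state;  // one-hot: {uart, via2, via1, rom, ram}
  reg rom_en;       // set by reset, cleared by rom_off until the next reset

  always @(posedge phi2, negedge resb) begin
    if (!resb) begin
      state <= 5'b00000;
      rom_en <= 1'b1;
    end else begin
      if (rom_off) rom_en <= 1'b0;
      if (vda || vpa) begin
        // Constants are 24 bits wide to match the address bus
        // (a 16'h literal cannot hold six hex digits).
        if (address < 24'h00C000) state <= 5'b00001;
        else if (address < 24'h00FE00) state <= rom_en ? 5'b00010 : 5'b00001;
        else if (address < 24'h00FE10) state <= 5'b00100;
        else if (address < 24'h00FE20) state <= 5'b01000;
        else if (address < 24'h00FE30) state <= 5'b10000;
        // $00FE30-$00FFFF (expansion and vectors) is also ROM until switched out,
        // per the memory map below.
        else if (address < 24'h010000) state <= rom_en ? 5'b00010 : 5'b00001;
        else state <= 5'b00001;      // banks $01 and up are always RAM
      end else state <= 5'b00000;    // no valid address: deselect everything
    end
  end

  // Chip selects are active low.
  assign ram_sel  = !state[0];
  assign rom_sel  = !state[1];
  assign via1_sel = !state[2];
  assign via2_sel = !state[3];
  assign uart_sel = !state[4];
endmodule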

I have not yet written a controller for the /OE and /WE lines or for the /CE line on the 74ABT245. Here's the intended memory map from the address decoder:
Code:
 ADDR. RANGE    | FUNCTION
----------------|----------
 $000000-00BFFF | ONBOARD SRAM (512K)
 $00C000-00FDFF | BOOTSTRAP ROM/ONBOARD SRAM (SWITCHABLE BY TRIGGER)
 $00FE00-00FE0F | VIA1 REGISTERS
 $00FE10-00FE1F | VIA2 REGISTERS
 $00FE20-00FE2F | UART REGISTERS
 $00FE30-00FEFF | BOOTSTRAP ROM/ONBOARD SRAM (RESERVED FOR EXPANSION)
 $00FF00-00FFFF | BOOTSTRAP ROM/ONBOARD SRAM (VECTOR AREA)
 $010000-FFFFFF | ONBOARD SRAM (MIRRORS AT BANK $10, $20, ETC.)
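
One way to check a decoder against this table is to keep a tiny reference model next to the simulation and compare the two at the region boundaries. Here is a hedged Python sketch of such a model; `select` and `rom_en` are names chosen for illustration, and the model follows the table above rather than any particular Verilog.

```python
def select(addr: int, rom_en: bool = True) -> str:
    """Name the device the table above assigns to a 24-bit address."""
    if not 0 <= addr <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    if addr < 0x00C000 or addr >= 0x010000:
        return "RAM"
    if 0x00FE00 <= addr <= 0x00FE0F:
        return "VIA1"
    if 0x00FE10 <= addr <= 0x00FE1F:
        return "VIA2"
    if 0x00FE20 <= addr <= 0x00FE2F:
        return "UART"
    # Everything else in $00C000-$00FFFF is ROM until the trigger fires.
    return "ROM" if rom_en else "RAM"

# Spot checks at the region boundaries.
for a in (0x00BFFF, 0x00C000, 0x00FE00, 0x00FE10, 0x00FE2F, 0x00FFFC, 0x010000):
    print(f"${a:06X}: boot={select(a):4s} switched={select(a, rom_en=False)}")
```

Note that the vector area ($00FF00 and up) has to read as ROM at boot, otherwise the reset vector would be fetched from uninitialised RAM.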

One problem I've already discovered is that all of the IO lines I intend to use don't fit on the 36 IOs of the 44-pin EPM7064 and therefore ATF1504. While Atmel does make a 64-IO 84-pin ATF1504, it's an original design and is therefore not supported by Quartus II 13.0. I thought of some possible options:
  • If I get really clever, I would be able to use the 36-IO chip by chopping off the least-significant nibble of the address line and removing the ROM_OFF line, pushing the design into using exactly 36 IOs. Instead, I would replace the ROM_OFF line with VPA && VDA && A16, meaning that execution has transferred over to the program loaded in bank 1. This is the least flexible option, and is also probably the least kosher.
  • Make custom Quartus II definitions so that I could use the 84-pin Atmel chip. I don't even know if this is possible.
  • I don't really know much about ProChip Designer, but I submitted a license application. Is the software free? If so, this is probably the best option.
  • Translate the project to CUPL. I don't really know much about CUPL, but it doesn't seem like it's nearly as friendly for sequential logic as Verilog.
-----------------------------------------------------------------------
Lastly, I just have a couple of questions about the future of the design. I understand that the memory map is very basic, but just how extensible is it? How advanced of a system could I build with this setup? What bells and whistles could I add to fill up the entire address space? In addition, would you add any IO chips other than the VIAs and the UART?


Posted: Tue Apr 07, 2020 8:35 am
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
robbie wrote:
drogon wrote:
Based on my own '816 board, I'd suggest putting IO in Bank 0, if for no other reason than that you can use LEDs on the VIA pins as a boot/debug aid which you can control in 6502 emulation mode at power-on time, or in '816 native mode, without going to the hassle of using either data-bank fiddling or 24-bit load/store instructions.


Thanks for the tip. Not only would that be more useful for testing, but I could also take advantage of the movable direct page as a size optimization if I want to unroll an IO-intensive sequence. The reason I had the IO starting in bank $80 is that I was using the A23 line as an IO select line to emulate the dual-address-space system of the Z80, but I do see how that's a little ridiculous with only 512K of SRAM at the moment (I do intend to upgrade to Garth Wilson's 4M module eventually, when I feel that I am pushing the limits of 512K). With that said, I have decided to adopt your $00.FE00 IO page into my memory map.


I'll say something that may appear somewhat controversial, but you'll only save one or 2 clock ticks per access by moving the direct page to the IO region. I know it's been suggested, and maybe even implemented in the past, but is it worth it if you're just transferring a few bytes to/from a VIA or a serial port at a time? Consider taking an interrupt to read characters from a serial port - your interrupt handler now has to save the current DP, set a new one, do the read, restore the DP, then do the rest of the tidying up. I'm not personally convinced shaving a couple of cycles off the actual access is worth the overhead of fiddling with the DP register - I may be wrong though - when I looked at it, my thoughts were along the lines of ... "I'm using a pointer in the direct page for the data transfer, so it's not going to work that well" ...

Same for RAM - 512KB. It's very easy to just slap down 4MB (or whatever), and maybe you have plans for it, but as an exercise, work out how long it takes to clear 4MB of RAM with an 8MHz 65816 clock... (best will be 7 cycles per byte plus overhead to change banks). The SNES has 128KB of user RAM, and the Apple IIgs had multiple MB of RAM, but it did have bigger applications like word processors, spreadsheets, and a graphical user interface that could make use of it. It's worthwhile having a good think about what you might want - putting a current-day operating system onto it would be fantastic, but remember 8/16 bits, 8-16MHz ...
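
As a rough worked version of that exercise, here is a short Python sketch. It uses the 7 cycles per byte and 8MHz figures from the paragraph above and ignores the bank-change overhead, so the real number would be slightly higher; the function name is just for illustration.

```python
RAM_BYTES = 4 * 1024 * 1024   # 4MB
CYCLES_PER_BYTE = 7           # best case quoted above
CLOCK_HZ = 8_000_000          # 8MHz

def clear_time_seconds(n_bytes, cycles_per_byte=CYCLES_PER_BYTE, clock_hz=CLOCK_HZ):
    """Time to touch every byte once, ignoring per-bank setup overhead."""
    return n_bytes * cycles_per_byte / clock_hz

print(f"4MB:  {clear_time_seconds(RAM_BYTES):.2f} s")      # → 4MB:  3.67 s
print(f"512K: {clear_time_seconds(512 * 1024):.2f} s")     # → 512K: 0.46 s
```

So clearing the full 4MB takes a few seconds at best, against under half a second for 512K.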

robbie wrote:
drogon wrote:
leaving the top 256 bytes in bank 0 for 'rom' vectors, the rest (of the 512K) is RAM. (It's actually all RAM, no ROM)

What exactly does this mean? Were the vectors loaded by the software? How did you achieve bootstrapping?


Vectors for both the hardware and my operating system.

It's a long story... All explained here: https://projects.drogon.net/6502-ruby/ which is the 6502 version which my 816 board also uses, but in essence: No ROMs. There is an external bit of "magic" that can populate the RAM of the 6502/816 with a ~200 byte bootloader plus the hardware vectors, which can then communicate with the same external bit of magic to do stuff like serial and disk (SD card) IO. When my OS initialises, it copies a block of about 120 bytes worth of vectors to the top of $00FF00 for the hardware and OS (software) vectors, so the '816 can take interrupts and BRK correctly and application programs have a well-known mechanism to do things like print characters, open disk files, and so on.

I have 8 banks of 64KB of RAM from $000000 through $07FFFF with a single gap at $00FE00 for IO. I only have one hardware device (a single 65C22 VIA) because that's all I need for now, but to further decode that region down would be fairly trivial if I ever needed it.

The OS vectors are JMPs into the OS at fixed locations in $00FFxx and they mimic the same ones as the Acorn MOS did on the BBC Micro back in 1981. I had to make some changes when I moved to the '816 due to clashes with the extended hardware vectors, but it still enables me to run some old Acorn ROM images like BBC Basic...

Cheers,

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Tue Apr 07, 2020 9:02 am
Joined: Sat Dec 01, 2018 1:53 pm
Posts: 730
Location: Tokyo, Japan
drogon wrote:
I'll say something that may appear somewhat controversial, but you'll only save one or 2 clock ticks per access by moving the direct page to the IO region. I know it's been suggested, and maybe even implemented in the past, but is it worth it if you're just transferring a few bytes to/from a VIA or a serial port at a time?

I'm the one that usually seems to be suggesting using the DP for I/O, and I don't think it's at all controversial that it's not worth it for just a few bytes.

The application I was thinking of was for mass storage I/O where you're moving tens of kilobytes. There you're talking hundreds of thousands of cycles of copying, so the overhead of saving and restoring the DP is essentially zero.

But there turned out to be a bit more to it than that: my technique involved not only using the direct page for I/O (which saved a single cycle per byte) but also using indexed absolute addressing (which implies self-modifying code) to save another two cycles. At that link I spell out a couple of simple loops and go from 16 to 13 cycles using this, for almost a 20% speed increase.

That's not insignificant, but it's definitely in the "it depends on the individual" area of the answers to, "Is it worthwhile?"

Personally I'd put I/O in the zero bank just to keep the option open, unless I was aiming for the simplest possible address decoding and wanted lots of RAM for DPs and stacks.
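
To put numbers on the 16-to-13-cycle figure above, here is a small Python sketch. The cycle counts come from this post; the 32 KiB transfer size is only an illustrative stand-in for "tens of kilobytes", and the function name is made up.

```python
def speedup(cycles_before: int, cycles_after: int):
    """Return (fraction of time saved, fractional throughput gain)."""
    time_saved = (cycles_before - cycles_after) / cycles_before
    throughput_gain = cycles_before / cycles_after - 1
    return time_saved, throughput_gain

saved, gain = speedup(16, 13)
print(f"time per byte down {saved:.1%}, throughput up {gain:.1%}")

TRANSFER_BYTES = 32 * 1024    # illustrative size, not from the post
print(f"{TRANSFER_BYTES * (16 - 13)} cycles saved over {TRANSFER_BYTES} bytes")
```

Measured as time saved per byte this is just under 19%, and measured as extra throughput it is about 23%, so "almost 20%" sits between the two ways of counting.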

_________________
Curt J. Sampson - github.com/0cjs


Posted: Tue Apr 07, 2020 9:14 am
Joined: Wed Feb 14, 2018 2:33 pm
Posts: 1488
Location: Scotland
cjs wrote:
drogon wrote:
I'll say something that may appear somewhat controversial, but you'll only save one or 2 clock ticks per access by moving the direct page to the IO region. I know it's been suggested, and maybe even implemented in the past, but is it worth it if you're just transferring a few bytes to/from a VIA or a serial port at a time?

I'm the one that usually seems to be suggesting using the DP for I/O, and I don't think it's at all controversial that it's not worth it for just a few bytes.

The application I was thinking of was for mass storage I/O where you're moving tens of kilobytes. There you're talking hundreds of thousands of cycles of copying, so the overhead of saving and restoring the DP is essentially zero.

But there turned out to be a bit more to it than that: my technique involved not only using the direct page for I/O (which saved a single cycle per byte) but also using indexed absolute addressing (which implies self-modifying code) to save another two cycles. At that link I spell out a couple of simple loops and go from 16 to 13 cycles using this, for almost a 20% speed increase.

That's not insignificant, but it's definitely in the "it depends on the individual" area of the answers to, "Is it worthwhile?"

Personally I'd put I/O in the zero bank just to keep the option open, unless I was aiming for the simplest possible address decoding and wanted lots of RAM for DPs and stacks.


OK, that's a valid point.

I was pondering the lack of a stride register for block move (so you could set it to zero for the writes, then block move N bytes to one address). However, one thing may help here: if you were to dedicate a whole 64K bank to bulk IO, then you could block move into it. All the device attached to that bank has to do is take a byte on the write strobe for that bank, and as long as it can buffer up (e.g.) 256 or 512 bytes at a time at full clock speed, it could end up being remarkably fast (7 clocks per byte). That's not as fast as DMA, which could potentially be 1 clock per byte, but it keeps the hardware simple, so it might be worth trying for things like an IDE interface (compact flash cards, etc.).
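
For a feel of what those per-byte costs mean in practice, here is a back-of-envelope Python sketch. It assumes the 8MHz effective clock discussed earlier in the thread and a 512-byte sector, and it ignores setup overhead; the helper names are illustrative.

```python
CLOCK_HZ = 8_000_000   # 8MHz effective clock, as discussed earlier in the thread

def bytes_per_second(cycles_per_byte, clock_hz=CLOCK_HZ):
    """Sustained transfer rate for a given per-byte cycle cost."""
    return clock_hz / cycles_per_byte

def sector_time_us(cycles_per_byte, sector_bytes=512, clock_hz=CLOCK_HZ):
    """Microseconds to move one sector at that per-byte cost."""
    return sector_bytes * cycles_per_byte / clock_hz * 1e6

for name, cpb in (("block move", 7), ("ideal DMA", 1)):
    print(f"{name}: {bytes_per_second(cpb) / 1e6:.2f} MB/s, "
          f"{sector_time_us(cpb):.0f} us per 512-byte sector")
```

At 7 clocks per byte a 512-byte sector moves in roughly 448 microseconds, against 64 microseconds for a one-clock-per-byte DMA.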

-Gordon

_________________
--
Gordon Henderson.
See my Ruby 6502 and 65816 SBC projects here: https://projects.drogon.net/ruby/


Posted: Tue Apr 07, 2020 10:52 am
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3367
Location: Ontario, Canada
cjs wrote:
20% speed increase. [...] That's not insignificant, but it's definitely in the "it depends on the individual" area of the answers to, "Is it worthwhile?"

Thanks for the clear-eyed view. Yes, exactly as you say, it does depend on the individual -- or on the situation, to be precise. In some situations IO in Direct Page is a "meh" or even a non-starter. But in an IO-bound situation, 20% (for example) could potentially be a deal-maker. :!:

I welcome discussions of the cost/benefit of IO in Direct Page because situations and priorities do vary, and greatly so. IMO, anyone who has a yes/no answer to the "is it worth it" question should consider adopting a more flexible outlook.

6502 / 'C02 is a somewhat different kettle of fish, but FWIW see the thread "major speedup with 65C02 I/O mapped into zero-page, ..."
I wrote:
  • in vintage microcomputers, free addresses in Z-pg are unobtainium. That's a regrettable reality.
  • Unfortunately, yesterday's reality engenders a mindset that some modern-day builders seem to accept without question -- namely, that nothing merits the sacrifice of Z-pg. But in the context of a new design, the tradeoff might be attractive -- or even compelling.

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Tue Apr 07, 2020 11:50 am

Joined: Mon May 21, 2018 8:09 pm
Posts: 1462
One thing to keep in mind is that putting I/O in a high bank, setting the DBR to that bank and using absolute addressing (not absolute long!) is only one cycle slower than putting it in bank zero, setting the DPR and using DP addressing. You can even do it in emulation mode if you need to. And it has the distinct advantage of reducing the fragmentation of the memory map in bank zero.

You can also do some hardware mapping tricks to permit using 16-bit operations (to halve the addressing-mode overhead), or even the block-move instructions (which are 7 cycles per byte moved, regardless of bank, but require self-modifying code). For the latter, say you have a block of 256 bytes mapped to a single I/O register; you can now block-move up to 256 bytes to or from that register, almost twice as fast as the naive loop using bank zero, the DPR, and a self-modifying indexed-long store. And that's something you can conveniently do if you've dedicated, say, bank $FF to I/O.
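
The "256 bytes mapped to a single I/O register" idea can be illustrated with a toy Python model. This is a sketch under stated assumptions: the window base `$FF0000`, the class name, and the `block_move` helper are all hypothetical, and the helper only mimics the ascending destination addresses a block move would generate.

```python
WINDOW_BASE = 0xFF0000   # hypothetical base of the 256-byte window in bank $FF
WINDOW_SIZE = 0x100

class WindowedRegister:
    """A single data register that answers to every address in the window."""
    def __init__(self):
        self.received = []

    def write(self, addr: int, value: int) -> bool:
        # The low 8 address bits are ignored: any hit in the window is the register.
        if WINDOW_BASE <= addr < WINDOW_BASE + WINDOW_SIZE:
            self.received.append(value & 0xFF)
            return True
        return False

def block_move(dev, dest: int, data: bytes) -> None:
    """Mimic a block move: one write per byte to ascending destination addresses."""
    for offset, byte in enumerate(data):
        dev.write(dest + offset, byte)

dev = WindowedRegister()
block_move(dev, WINDOW_BASE, bytes(range(256)))
print(len(dev.received), dev.received[:4])   # → 256 [0, 1, 2, 3]
```

Because the decoder ignores the low address bits inside the window, the device sees the bytes arrive in order even though the CPU thinks it is filling 256 consecutive locations.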


Posted: Tue Apr 07, 2020 8:18 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8545
Location: Southern California
Note that some "long" op codes are missing, most notoriously BIT. (INC and DEC are nice too when you just want to change bit 0 of an output port. Those are missing as well.) So if DBR is usually 0, keeping I/O in bank 0 will have another advantage there.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

