All times are UTC

Posted: Tue May 17, 2022 3:44 pm
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
Digging more into WINCUPL, I was able to extract more information regarding whether using negative or positive logic uses more PLD resources. For the case of the ATF22V10C and address decoding in particular I think the answer is no.

I analyzed the fuse changes for each pin after activating the Fuse Plot in the DOC file (files attached for the curious). Switching from negative to positive logic for my seven output pins resulted in the following (see post 47 for the PLD files):

  • the mode fuse on four pins changed, and
  • one pair of fuses on three pins swapped from blown to not blown or vice versa.

Code:
Pin     Neg     Pos         Mode Fuse   Swapped Fuses
16      CLKB    !CLKB           x           1
17      WE      !WE                         1
18      ROM_CS  !ROM_CS         -
19      RAM_CS  !RAM_CS         -
20      OE      !OE                         1
21      IO_CS   !IO_CS          -
23      IRQ     !IRQ

LEGEND    x : mode fuse not blown in pos, blown in neg
          - : mode fuse blown in pos, not blown in neg
          1 : one pair of fuses for pin swapped state

For the ATF22V10C at least, it seems each output pin has a mode fuse that sets whether the output is active high or active low. Whether that fuse is blown for positive or negative logic doesn't reflect the use of more or fewer PLD resources, since it seems the fuse can't be used for anything else.

The swapped fuses also don't reflect the use of more or fewer resources, since in the case above the two fuses of a pair simply swap states.

What I found most interesting and worthy of an update here:
  • not all of the mode fuses changed,
  • the mode fuse changes weren't always in the same direction,
  • the pins with swapped fuses all had simple logic equivalencies, and,
  • CLKB was the only pin with a mode and swapped fuse change.

It seems as if WINCUPL is doing some internal optimizations (note I haven't selected any of WINCUPL's optimization options for this analysis).

Again, I've just looked at this for the ATF22V10C and basic address decoding. It's possible that the logic used here is easier for WINCUPL to optimize and these results won't hold for more complicated situations. In that case positive logic might lead to more efficient PLD usage. At the same time, I know almost nothing about all of this and my analysis could be totally wrong. So any feedback regarding it would be appreciated.


Attachments:
pos_logic_fuse_plot.doc [17.97 KiB]
neg_logic_fuse_plot.doc [17.97 KiB]

Posted: Tue May 17, 2022 8:40 pm
Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
tmr4 wrote:
Digging more into WINCUPL, I was able to extract more information regarding whether using negative or positive logic uses more PLD resources. For the case of the ATF22V10C and address decoding in particular I think the answer is no...

All of my experience with CUPL has been with the CPLDs, not the GALs. In playing around with positive vs. negative logic, I did see a greater use of PTs with the latter. That being the case, apparently there are some significant differences in the internal architecture of the two PLD types. As always, YMMV.

One thing I noted in examining your two .DOC files (note to the uninitiated: a .DOC file generated by CUPL compilation is not an MS Word document) is the huge amount of logic being consumed to generate EXRAM. I suspect that is the result of the idiosyncratic memory map you’ve set up. Also not helping is that a GAL doesn't have buried nodes to assist with bank-bit generation and the aggregation of certain kinds of address bus conditions. In your application, I believe the availability of buried nodes would have significantly reduced logic consumption.

Edit: Fixed a typo.

_________________
x86?  We ain't got no x86.  We don't NEED no stinking x86!


Last edited by BigDumbDinosaur on Wed May 18, 2022 4:34 am, edited 2 times in total.

Posted: Tue May 17, 2022 10:56 pm
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
BigDumbDinosaur wrote:
One thing I noted in examining your two .DOC files ... is the huge amount of logic being consumed to generate EXRAM. I suspect that is the result of the idiosyncratic memory map you’ve set up.

Yes. With the overlapping memory map that I used in the original post with
Code:
EXRAM = Address:[20000..7FFFF]
EXRAM only used 2 product terms.

Product term usage went way up when trying to refine the memory map to isolate a specific page, as I was doing with both bank 0 ROM and the I/O in bank 2. Usage went way down with 1k blocks and even further with 4k blocks. Of course this isn't really surprising. It takes less logic to describe a large block than a small one. But I was surprised at the large increase to isolate a single page.
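
To see roughly why the count climbs, here is a minimal Python sketch (not WinCUPL output) that splits an address range into aligned power-of-two blocks, each of which can be decoded by one product term. The block count is only an upper bound on what a real minimizer would produce, and the position of the carved-out page (the bottom of bank 2) is a hypothetical choice for illustration.

```python
def aligned_blocks(lo, hi):
    """Split the inclusive range [lo, hi] into aligned power-of-two blocks.

    Each block fixes some high address bits and ignores the rest, so each
    block can be decoded by a single product term.
    """
    blocks = []
    while lo <= hi:
        # Largest power-of-two size that keeps `lo` aligned...
        size = (lo & -lo) if lo else 1 << hi.bit_length()
        # ...shrunk until the block no longer overshoots `hi`.
        while lo + size - 1 > hi:
            size >>= 1
        blocks.append((lo, lo + size - 1))
        lo += size
    return blocks


# The overlapping map from the post: EXRAM = Address:[20000..7FFFF]
whole = aligned_blocks(0x20000, 0x7FFFF)

# Hypothetical refinements: carve a 256-byte, 1k or 4k hole out of the
# bottom of bank 2 (the hole position is an assumption, not from the post).
minus_page = aligned_blocks(0x20100, 0x7FFFF)
minus_1k = aligned_blocks(0x20400, 0x7FFFF)
minus_4k = aligned_blocks(0x21000, 0x7FFFF)

for name, blocks in [("whole", whole), ("minus page", minus_page),
                     ("minus 1k", minus_1k), ("minus 4k", minus_4k)]:
    print(name, len(blocks))
```

Under these assumptions the unrefined range needs two blocks, which agrees with the two product terms reported above, and the block count grows as the hole gets smaller.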


Posted: Wed May 18, 2022 4:37 am
Joined: Thu May 28, 2009 9:46 pm
Posts: 8144
Location: Midwestern USA
tmr4 wrote:
With the overlapping memory map that I used in the original post with
Code:
EXRAM = Address:[20000..7FFFF]
EXRAM only used 2 product terms.

Product term usage went way up trying to refine the memory map to isolate a specific page, as I was doing with both bank 0 ROM and the I/O in bank 2.

That’s the problem with granular decoding, whether with discrete gates or programmable logic. You eat up a lot of logic, sometimes for results whose value is questionable.



Posted: Wed May 18, 2022 5:24 am
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
BigDumbDinosaur wrote:
That’s the problem with granular decoding, whether with discrete gates or programmable logic. You eat up a lot of logic, sometimes for results whose value is questionable.

Point taken. Whether I have 256 bytes, 1k, 4k or 10k of ROM in Bank 0 doesn't really matter except to see how far something can be pushed. It was a good learning experience.


Posted: Wed May 18, 2022 8:38 am
Joined: Fri Jul 09, 2021 10:12 pm
Posts: 741
You've probably already seen it, but in case not, the datasheet has a big diagram of the internal structure of this device (page 11 I think), which shows all the sums (different OR gate widths for different outputs), the feedback paths, "free" inverters, etc. I'm pretty sure I've also seen a decent diagram of the macrocell structure explaining how it "changes" in registered modes, but apparently not in this datasheet.

It is very good at anything which can be expressed as a sum of a fairly low number of products, but terrible if you end up needing a product of sums. Since inverts are basically free, you can usually transform one into the other, and wincupl seems to do a fairly good job of this. So the problems only really start when you need a sum of products of sums, for example.
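
As a small illustration of the "free invert" point, here is a Python truth-table check on a made-up four-input function (the function and names are hypothetical, chosen only for the example). Multiplying out the product of sums gives four product terms, while its complement needs only two, so a device that can invert the output for free can implement the cheaper form.

```python
from itertools import product


def pos(a, b, c, d):
    # Product of sums: awkward for a sum-of-products array.
    return (a or b) and (c or d)


def pos_expanded(a, b, c, d):
    # Multiplied out into a sum of products: four product terms.
    return (a and c) or (a and d) or (b and c) or (b and d)


def pos_complement(a, b, c, d):
    # De Morgan applied to the complement: only two product terms.
    return (not a and not b) or (not c and not d)


# Exhaustively confirm all three forms agree on every input combination.
for bits in product([False, True], repeat=4):
    assert pos(*bits) == pos_expanded(*bits)
    assert pos(*bits) == (not pos_complement(*bits))
```

Whether WinCUPL actually picks the two-term form for a given design is not something this sketch shows; it only confirms the two forms are logically equivalent.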

The size of a range is not a problem, it's the complexity of the range that matters. Complex ranges waste scarce resources partly because you require more input pins (and waste output macrocells by using them as inputs instead) and partly because it requires nesting more sums within products (which also requires extra macrocells after a point).

So a simple range just requires checking that the top N bits of the address have specific values. This only requires one product term. For example, $c0-$ff only requires checking that the top two bits are set - that's one product term.

As soon as you want to exclude a smaller range within that, though, you need to either add another sum term before the product (which this hardware doesn't support unless you explicitly introduce another macrocell with its own sum-of-products as an input to the main one) or expand out the logic into a larger number of product terms summed together.

Again for example, $c8-$ff could be written as:

Code:
b7=1 & b6=1 & !(b5=0 & b4=0 & b3=0)


But this invert is not a free one, it splits the product up and requires the subexpression to be either computed in a separate macrocell (which I think Wincupl only does if you are explicit) whose feedback output can be inverted for free, or converted into a sum and expanded into more product terms:

Code:
   b7=1 & b6=1 & !(b5=0 & b4=0 & b3=0)
=> b7=1 & b6=1 & (b5=1 # b4=1 # b3=1)
=> (b7=1 & b6=1 & b5=1) # (b7=1 & b6=1 & b4=1) # (b7=1 & b6=1 & b3=1)

This fits within one sum term, but now requires three product terms as inputs rather than just one - and note that we now need five address bit inputs overall covering the entire range from high to low. The smaller the excluded range is, the more address bits and product terms you require here, which becomes a problem if you run out of one or the other.
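
The expansion above can be checked exhaustively. This is a minimal Python sketch that treats the value as an 8-bit number with bits b7..b0 and confirms that the nested form and the three-product-term form both select exactly $c8-$ff; the function names are just for illustration.

```python
def bit(value, n):
    """Return bit n of value as 0 or 1."""
    return (value >> n) & 1


def nested(value):
    # b7=1 & b6=1 & !(b5=0 & b4=0 & b3=0)
    b7, b6, b5, b4, b3 = (bit(value, n) for n in (7, 6, 5, 4, 3))
    return bool(b7 and b6 and not (not b5 and not b4 and not b3))


def expanded(value):
    # (b7 & b6 & b5) # (b7 & b6 & b4) # (b7 & b6 & b3)
    b7, b6, b5, b4, b3 = (bit(value, n) for n in (7, 6, 5, 4, 3))
    return bool((b7 and b6 and b5) or (b7 and b6 and b4) or (b7 and b6 and b3))


# Both forms must agree with a plain range comparison for all 256 values.
for value in range(0x100):
    in_range = 0xC8 <= value <= 0xFF
    assert nested(value) == in_range
    assert expanded(value) == in_range
```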

Now you may have good reasons for wanting to include fragments of page zero along with larger ranges, but hopefully this explains a bit why it is costly. Actually I'm now doubting how useful this post is, compared to what I thought it would be when I started writing it - but I'll send it anyway just in case!


Posted: Wed May 18, 2022 9:29 am
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
gfoot wrote:
You've probably already seen it, but in case not, the datasheet has a big diagram of the internal structure of this device (page 11 I think), which shows all the sums (different OR gate widths for different outputs), the feedback paths, "free" inverters, etc. I'm pretty sure I've also seen a decent diagram of the macrocell structure explaining how it "changes" in registered modes, but apparently not in this datasheet.

Yeah, I've seen it, but never found it useful. And now that I looked at it more closely, I realized it's not even complete, but partially cropped. The Lattice GAL22V10 datasheet is much more informative and has the macrocell diagram you mention.

gfoot wrote:
Now you may have good reasons for wanting to include fragments of page zero along with larger ranges, but hopefully this explains a bit why it is costly.

I doubt many would consider it a good reason, but I figured bank 0 is a lot like a big zero page on the 6502 so why waste it with something I can put elsewhere. Nothing more than that. Of course it's highly unlikely I'll ever have a need for the space I've created. I could have just kept the same memory map I had with my 6502 build and treated bank 1 and up as expanded RAM. Seems a bit boring though and I definitely wouldn't have learned as much going that way.

gfoot wrote:
Actually I'm now doubting how useful this post is, compared to what I thought it would be when I started writing it - but I'll send it anyway just in case!

Been there, done that. Your post made me go back and look at both of the datasheets I downloaded when I got the PLD months ago. I came away with a better perspective. I've actually learned enough through all of this to understand the more detailed Lattice datasheet now.


Posted: Wed May 18, 2022 10:10 am
Joined: Sat Jul 24, 2021 1:37 pm
Posts: 282
tmr4 wrote:
gfoot wrote:
You've probably already seen it, but in case not, the datasheet has a big diagram of the internal structure of this device (page 11 I think), which shows all the sums (different OR gate widths for different outputs), the feedback paths, "free" inverters, etc. I'm pretty sure I've also seen a decent diagram of the macrocell structure explaining how it "changes" in registered modes, but apparently not in this datasheet.

Yeah, I've seen it, but never found it useful. And now that I looked at it more closely, I realized it's not even complete, but partially cropped. The Lattice GAL22V10 datasheet is much more informative and has the macrocell diagram you mention.


Wow, thanks for sharing, that one is so much more detailed and clear. The one for the GAL16V8 from Lattice is also much better.

_________________
BB816 Computer YouTube series


Posted: Wed May 18, 2022 1:24 pm
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
akohlbecker wrote:
Wow, thanks for sharing, that one is so much more detailed and clear. The one for the GAL16V8 from Lattice is also much better.

Thanks, I hadn't thought of downloading the GAL16V8 datasheet. Interestingly, the ATF16V8 datasheet is much better than the ATF22V10 one. I have the chip but hadn't looked at the datasheet since it doesn't work for my current design.


Posted: Thu May 26, 2022 11:33 am
Joined: Tue Aug 11, 2020 3:45 am
Posts: 311
Location: A magnetic field
tmr4 on Sun 15 May 2022 wrote:
Ultimately I plan on a small handheld unit.


You might want to avoid PLD in portable applications due to power consumption. A blank PLD may consume 105mA and a programmed PLD may consume more than 247mA.

Regarding the memory map, it is possible to place I/O in all banks. This scheme is not popular with GARTHWILSON, BigDumbDinosaur or Dr Jefyll for the very simple reason that it fragments RAM. However, it simplifies address decode, reduces address decode latency and reduces 65816 program size/cycles when accessing I/O. Specifically, if the same I/O is in all banks then you never need to make a long reference to I/O.
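
A minimal Python sketch of the difference in decode, assuming a 24-bit address and a hypothetical I/O page number (the thread only says the I/O currently sits in bank 2, not which page). Mirroring I/O into every bank means the select logic never looks at the bank bits at all.

```python
IO_BANK = 0x02  # from the thread: I/O currently lives in bank 2
IO_PAGE = 0xDF  # hypothetical page number, for illustration only


def io_select_single_bank(addr):
    """I/O in one bank only: compares A23..A8, i.e. 16 address bits."""
    return (addr >> 16) == IO_BANK and ((addr >> 8) & 0xFF) == IO_PAGE


def io_select_all_banks(addr):
    """I/O mirrored in every bank: compares A15..A8 only, bank bits ignored."""
    return ((addr >> 8) & 0xFF) == IO_PAGE


# The mirrored decode responds in any bank; the single-bank decode does not.
print(io_select_single_bank(0x02DF10), io_select_single_bank(0x7FDF10))
print(io_select_all_banks(0x02DF10), io_select_all_banks(0x7FDF10))
```

The cost is the one already mentioned: that page is no longer available as RAM in any bank.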

tmr4 on Mon 16 May 2022 wrote:
Moving to the 65816, I add at least two chips for the bank address latch and data bus buffer, and likely a larger SRAM. But I had no space on the PCB, see attached image. Something has to give. Right now I don't know what it will be.


If working with DIP chips, it is possible to make mixed chip stacks. For example, address decode and AND gates for interrupts may fit under one 6522. Likewise, a 74x74 for a two-phase clock may fit under the processor. If you don't require more than four bank latch bits, Dr Jefyll has previously suggested using a 74x157 in a transparent latch configuration. With or without a two-phase clock, this may be configured such that an inverter is not required. This reduces latency. It also fits under a DIP processor.

Anyhow, get creative. There is space for two 16 pin DIP chips under each 40 pin DIP chip.

_________________
Modules | Processors | Boards | Boxes | Beep, Beep! I'm a sheep!


Posted: Thu May 26, 2022 6:25 pm
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
Sheep64 wrote:
You might want to avoid PLD in portable applications due to power consumption. A blank PLD may consume 105mA and a programmed PLD may consume more than 247mA.

Thanks. Yeah, I'm aware of the PLD power draw. I'm not planning a battery-powered device though, but one I can use out in my greenhouse where I can plug it in. I have thought of going to a 3.3 volt design for a battery-powered unit, but I'll leave that for another day. Still, I'll find it hard giving up the flexibility of the PLD.

Sheep64 wrote:
Regarding memory map, it is possible to place I/O in all banks...

Interesting idea. I hadn't thought of that.

Sheep64 wrote:
... However, it simplifies address decode, reduces address decode latency and reduces 65816 program size/cycles when accessing I/O. Specifically, if the same I/O is in all banks then you never need to make a long reference to I/O.

I started this project with a 65C02 and designed my Forth operating system for size rather than speed. I've been having fun converting it to the 65816 where my new memory map pretty much required long references throughout, so doing it for I/O as well wasn't a big deal. In essence I wasn't getting size or speed, but with the 65816, size wasn't really important anymore.

I suppose that's a good thing since the direct threading I was using practically begged for subroutine threading. That actually got me some speed gains as eliminating NEXT with the RTL saved me quite a few cycles.

Sheep64 wrote:
Anyhow, get creative. There is space for two 16 pin DIP chips under each 40 pin DIP chip.

Yeah. I've seen several of the clever design ideas on this site to fit more in a small area. I suppose many/most go out the door though with the recommendations I've been getting to go to PLCCs and surface mount chips.


Posted: Fri May 27, 2022 2:49 am
Joined: Fri Dec 11, 2009 3:50 pm
Posts: 3346
Location: Ontario, Canada
tmr4 wrote:
Sheep64 wrote:
Anyhow, get creative. There is space for two 16 pin DIP chips under each 40 pin DIP chip.

Yeah. I've seen several of the clever design ideas on this site to fit more in a small area. I suppose many/most go out the door though with the recommendations I've been getting to go to PLCCs and surface mount chips.
Right. I'm always on the lookout for creative solutions, and in years gone by I've done my share of chip stacking! :P But surface mount does change the picture considerably. If a DIP-40 can be replaced with a PLCC (or better yet a flat pack), the increased density happens automatically.

tmr4 wrote:
[...] eliminating NEXT with the RTL saved me quite a few cycles.
Whoa.. what?? You were using Return Long as a form of threading?? That would imply 24-bit addresses, which is rather provocative. Perhaps I've misunderstood your remark.

-- Jeff

_________________
In 1988 my 65C02 got six new registers and 44 new full-speed instructions!
https://laughtonelectronics.com/Arcana/ ... mmary.html


Posted: Fri May 27, 2022 4:14 am
Joined: Sat Feb 19, 2022 10:14 pm
Posts: 147
Dr Jefyll wrote:
Whoa.. what?? You were using Return Long as a form of threading?? That would imply 24-bit addresses, which is rather provocative. Perhaps I've misunderstood your remark.

I probably was a bit brief/inexact in my explanation. My 6502 Forth was DTC, which, when converted for my 65816 multi-bank memory map, resulted in a 44 cycle NEXT, yes using long addressing. Using long addressing, I figured I might as well switch to STC, where the DTC jump to NEXT is basically replaced with an RTL. And all of DTC NEXT and possibly DOCOL are replaced in STC with a JSL.

It's doubled the size of my compiled Forth code from the 6502 version, but I've got a lot more memory to work with and I get the 16-bit performance boost and dramatically faster "NEXT" with this new version.
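
A back-of-the-envelope version of that trade-off in Python. Only the 44-cycle NEXT comes from the description above; the JSL and RTL cycle counts and the 2-byte DTC cell size are assumptions taken from the standard 65816 instruction timings and a typical 16-bit-cell 6502 Forth, so treat the result as a rough sketch.

```python
DTC_NEXT_CYCLES = 44  # quoted above for the long-addressing DTC NEXT
JSL_CYCLES = 8        # assumed: 65816 JSL timing
RTL_CYCLES = 6        # assumed: 65816 RTL timing
DTC_CELL_BYTES = 2    # assumed: 16-bit cell per compiled word in the 6502 DTC
JSL_BYTES = 4         # opcode plus 24-bit target address

# Call/return overhead per word in STC versus NEXT per word in DTC.
stc_overhead = JSL_CYCLES + RTL_CYCLES
cycles_saved = DTC_NEXT_CYCLES - stc_overhead

# Growth in compiled code size per word reference.
size_ratio = JSL_BYTES / DTC_CELL_BYTES

print(stc_overhead, cycles_saved, size_ratio)
```

With those assumed figures the 4-byte JSL per word is consistent with the compiled code doubling in size.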


Posted: Tue Jun 14, 2022 12:20 pm
Joined: Tue Aug 11, 2020 3:45 am
Posts: 311
Location: A magnetic field
tmr4 on Thu 26 May 2022 wrote:
I wasn't getting size or speed, but with the 65816, size wasn't really important anymore.


And with agriculture, speed isn't important either. I think the most demanding part of hydroponics is that water pumps should stop within 15 seconds. Maybe 30 seconds.

Regarding PLCC, it is rapidly becoming an obsolete format and that may be a problem if your agricultural system is commercial. It may be preferable to move from DIP to surface mount and skip PLCC entirely. If you have components on both sides, intentional surface mount* has better density than three tiers of chip stacked DIP. Whereas, through hole PLCC sockets may have worse density than two tiers of chip stacked DIP.

*On the 6502 Forum, any chip is a surface mount chip. You merely have to splay the pins wide enough.



Posted: Tue Jun 14, 2022 7:35 pm
Joined: Fri Aug 30, 2002 1:09 am
Posts: 8428
Location: Southern California
Sheep64 wrote:
*On the 6502 Forum, any chip is a surface mount chip. You merely have to splay the pins wide enough.

There is a tool to J-lead thru-hole parts, but I have not seen one myself. That would make these parts take less room than making them gull-wing.

Another way to do it is called the "I-lead," where you leave the pin straight but cut it just below the shank, and use a solder fillet to make a good connection to the foil.

Yet another is to remain thru-hole but stagger the rows so they don't interfere with each other. Then you solder single-row (not double-row) sockets on one side, insert the ICs that go on the other side and solder them, then insert the ICs that go on the first side into their sockets. Another way is to just solder from the component side since the other side will be covered with ICs by that time.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources
The "second front page" is http://wilsonminesco.com/links.html .
What's an additional VIA among friends, anyhow?

