Use GAL as a small eprom

GARTHWILSON · Post by **GARTHWILSON** » Sat Jul 25, 2015 7:58 pm

i_r_on wrote:

2 bytes to go, current size is 34 bytes.

edit: Oops sorry, I just missed reinitialization in interrupt routine... it's now 32 bytes, thanks!

You can do another trick with the BIT# op code 89:

Code:

NMI:    CLC
        .DB  $89
IRQ:    SEC
        ROL
        DEX
        <etc.>

and there might be more that could be done. If there's an NMI, the SEC op code becomes an operand for the BIT# op code, and BIT# only affects Z, not V, N, C, A, or anything else.
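For anyone who wants to check the byte stream, here is a minimal Python sketch that decodes the same five bytes from each entry point. The opcode table covers only the opcodes in the snippet and the `decode` helper is made up for illustration. It treats $89 as the two-byte BIT# described above; BIT# is a 65C02 opcode, so what $89 does on an NMOS part would need checking separately.

```python
# Opcode -> (mnemonic, length in bytes); only what the snippet above uses.
OPCODES = {
    0x18: ("CLC", 1),
    0x38: ("SEC", 1),
    0x2A: ("ROL A", 1),
    0xCA: ("DEX", 1),
    0x89: ("BIT #", 2),   # immediate operand follows the opcode
}

# NMI: CLC / .DB $89 / IRQ: SEC / ROL / DEX
ROM = bytes([0x18, 0x89, 0x38, 0x2A, 0xCA])
NMI_ENTRY = 0   # offset of the NMI label
IRQ_ENTRY = 2   # offset of the IRQ label

def decode(rom, start):
    """Return the instruction stream the CPU sees when entering at `start`."""
    out, pc = [], start
    while pc < len(rom):
        name, size = OPCODES[rom[pc]]
        if size == 2:
            name += "$%02X" % rom[pc + 1]   # operand byte is swallowed
        out.append(name)
        pc += size
    return out

print("NMI entry:", decode(ROM, NMI_ENTRY))
print("IRQ entry:", decode(ROM, IRQ_ENTRY))
```

Entering at NMI, the SEC byte ($38) shows up only as the BIT# operand, so carry stays clear going into the ROL; entering at IRQ, SEC executes normally.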

i_r_on · Post by **i_r_on** » Sat Jul 25, 2015 8:17 pm

GARTHWILSON wrote:

and there might be more that could be done. If there's an NMI, the SEC op code becomes an operand for the BIT# op code, and BIT# only affects Z, not V, N, C, A, or anything else.

Thanks, I've noted this one too. If I find a bug, a spare byte or two will be very useful.

@hoglet : I use WinCUPL. Is there an extra optimization needed if I just get the products from the page you linked? I might try this using GAL22V10D after I test the rom code with an eprom.

hoglet · Post by **hoglet** » Sat Jul 25, 2015 8:36 pm

i_r_on wrote:

@hoglet : I use WinCUPL. Is there an extra optimization needed if I just get the products from the page you linked? I might try this using GAL22V10D after I test the rom code with an eprom.

I would start by just using WinCUPL and expressing the data as a truth table (see section 2.3.1 of the manual):
http://ecee.colorado.edu/~mcclurel/Atme ... oc0737.pdf
Make sure to include don't cares for unused addresses.

Only if this seems to struggle with too many product terms would I resort to manual optimization.
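Before running the tool, a rough feel for the worst case can be had from a few lines of Python. This is only a sketch: it counts, per data bit, how many used addresses hold a 1 and how many hold a 0. With one product term per address and no minimization at all, a bit needs at most the smaller of the two counts, provided the output polarity can be inverted. Unused addresses are simply left out of both counts, which is what treating them as don't cares amounts to here. The ROM contents are placeholders, not the loader from this thread.

```python
def term_upper_bounds(rom):
    """For each data bit, return (ones, zeros, worst-case product terms)."""
    bounds = {}
    for bit in range(7, -1, -1):
        ones = sum((byte >> bit) & 1 for byte in rom)
        zeros = len(rom) - ones
        # Either implement the 1s directly, or the 0s with an inverted output.
        bounds["D%d" % bit] = (ones, zeros, min(ones, zeros))
    return bounds

# Placeholder contents, NOT the actual boot loader.
rom = bytes([0x18, 0x89, 0x38, 0x2A, 0xCA, 0x40])

for name, (ones, zeros, bound) in term_upper_bounds(rom).items():
    print(name, "ones:", ones, "zeros:", zeros, "worst case terms:", bound)
```

The real figure after minimization can only be lower than this bound, since neighbouring addresses can share a term.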

Dave

Dr Jefyll · Post by **Dr Jefyll** » Sat Jul 25, 2015 11:10 pm

hoglet wrote:

Another thought...

The number of product terms a data bit requires might actually be less than the number of 1's (or 0's if inverted).

Thanks, Dave -- I quit before dealing with this point. Your explanation is better than what I could have come up with anyway!

Code:

   JMP $0000

i_r_on, you can save one more byte by replacing the JMP with a forward BVS that wraps around from page $FF to page $00. (V is always set at that point, because the BVC loop only exits once SO has set it.) On the 'C02, BRA will work as well. BRA and BVS are only two bytes; JMP of course is three.

Also a warning: as is, there's a threat that a single IRQ will be recognized multiple times. AFAIK the simplest solution is to limit the width of the low pulses applied to IRQ. IRQ needs to be high again before the ISR concludes. Interrupts get re-enabled after RTI pops the flags off the stack. If IRQ remains low then one or more extra (spurious) responses will result.
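A very simplified Python model of the level-sensitive behaviour, in case it helps to see the numbers. It assumes the ISR is entered at cycle 0 and that every RTI immediately re-enters the ISR if the line is still low; interrupt latency and the foreground instruction are ignored, and the cycle counts are made up.

```python
def isr_entries(pulse_cycles, isr_cycles):
    """Count ISR entries caused by one low pulse on a level-sensitive IRQ.

    pulse_cycles: how long the IRQ line is held low
    isr_cycles:   cycles from ISR entry until RTI re-enables interrupts
    """
    entries = 0
    t = 0
    while t < pulse_cycles:      # line still low when interrupts are enabled
        entries += 1
        t += isr_cycles          # next chance to be interrupted is after RTI
    return entries

print(isr_entries(10, 30))    # pulse ends well inside the ISR
print(isr_entries(100, 30))   # pulse far too long
```

With a 30-cycle ISR, a 10-cycle pulse gives one entry, while a 100-cycle pulse is serviced four times in this model.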

i_r_on · Post by **i_r_on** » Sat Jul 25, 2015 11:39 pm

Dr Jefyll wrote:

Code:

   JMP $0000

i_r_on, which 6502 are you using? If it's a CMOS chip (65C02) then the JMP can be replaced with a forward BRA that wraps around from page $FF to page $00. This saves one byte, as a BRA is shorter than a JMP.

Also a warning: as is, there's a threat that a single IRQ will be recognized multiple times. AFAIK the simplest solution is to limit the width of the low pulses applied to IRQ. IRQ needs to be high again before the ISR concludes. Interrupts get re-enabled as the flags get popped off the stack by RTI. If IRQ is still low after this then an extra (spurious) response will result.

Unfortunately I don't use 65c02, it's an UM6502 by UMC. UM6502 is easier to find and much cheaper than 65c02 here.

After pulling IRQ low I wait at least ten 6502 cycles, then pull it high again, so the pin goes high while the 6502 is still handling the interrupt. The 10 cycles break down as 7 cycles for the interrupt to trigger plus 3 cycles for the instruction already executing to finish. The foreground operation is always "LOOP BVC LOOP", which takes 3 cycles per pass while the transfer continues. Previously I used a 10 microsecond wait and didn't have any problems; the Arduino probably adds some overhead of its own when pulling the pin low and high. I haven't yet looked at the actual pulses from the Arduino with my logic analyzer. I'll measure the pulse lengths when I find time.

Before triggering another interrupt, the controlling micro waits a little longer than the interrupt handler needs to do its job. Say the handler takes 30 microseconds; then the controlling micro waits 40 microseconds. Optimal times can be found by testing.
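The sender side can be sketched in a few lines of Python. This is only an illustration under assumptions: the 10 and 40 microsecond figures are the example values from this post, the wait is assumed to start when the pulse ends, IRQ is taken to mean a 1 and NMI a 0 (as in the SEC/CLC snippet earlier in the thread), and bits are assumed to go out MSB first because the receiver shifts left. The `schedule` helper is hypothetical.

```python
def schedule(byte, pulse_us=10, gap_us=40):
    """Return ([(start_us, pin), ...], total_us) for sending one byte."""
    events = []
    t = 0
    for bit in range(7, -1, -1):              # MSB first (assumption)
        pin = "IRQ" if (byte >> bit) & 1 else "NMI"
        events.append((t, pin))               # pull `pin` low at time t
        t += pulse_us + gap_us                # pulse, then wait for the ISR
    return events, t

events, total_us = schedule(0xA5)
for start, pin in events:
    print("%4d us: pulse %s" % (start, pin))
print("one byte takes", total_us, "us")
```

At these example timings one byte takes 400 microseconds, so a 32-byte block would need roughly 12.8 ms.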

Dr Jefyll · Post by **Dr Jefyll** » Sat Jul 25, 2015 11:56 pm

i_r_on wrote:

Unfortunately I don't use 65c02, it's an UM6502 by UMC.

This will also work; I edited my post after you read it -- sorry!

As for limiting pulse widths, you seem to have the matter under control. But, FWIW, an alternative approach can avoid the issue entirely. Instead of using IRQ and NMI to enter data, and using SO (set overflow) to cue termination, you could use NMI (0) or NMI and SO together (1) to enter data, and use IRQ to cue termination. The advantage is that SO is edge-triggered, like NMI. Hence an excessively long pulse is harmless.

i_r_on · Post by **i_r_on** » Sun Jul 26, 2015 12:23 am

Dr Jefyll wrote:

This will also work; I edited my post after you read it -- sorry!

Good, another byte to save for potential bugs.

Dr Jefyll wrote:

As for limiting pulse widths, you seem to have the matter under control. But, FWIW, an alternative approach can avoid the issue entirely. Instead of using IRQ and NMI to enter data, and using SO (set overflow) to cue termination, you could use NMI (0) or NMI and SO together (1) to enter data, and use IRQ to cue termination. The advantage is that SO is edge-triggered, like NMI. Hence an excessively long pulse is harmless.

The issue with S.O. is that the 6502 has to check for it, sync to it, and clear it. You can set the overflow flag from outside but you can't clear it that way. And you must clear it in the foreground code since interrupts will pull the status register from stack when they finish.
From the very start I chose an asynchronous approach so as not to deal with timing issues. I already need to respect the same timing for the NMI, by the way: if I try to send another bit while the 6502 is still handling the NMI, the whole transfer gets corrupted.

For faster transfers even S.O. alone can be used by the way.. here is an implementation I found on web : http://www.ele.uva.es/~jesus/6502copy/proto.html

Dr Jefyll · Post by **Dr Jefyll** » Sun Jul 26, 2015 1:18 am

i_r_on wrote:

And you must clear it in the foreground code since interrupts will pull the status register from stack when they finish.

Good point. A PLP / CLV / PHP sequence in the ISR will correct the issue, but that requires three bytes. However, once the thing is rewritten the necessary space may appear. For example, in this alternative scheme termination is cued by IRQ, so the IRQ vector can point straight to $0000 -- no JMP/BRA/BVS $0000 is required.
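The effect of the stacked status register can be modelled in Python. This is a sketch, assuming the SO edge arrives before the interrupt sequence pushes P and that the ISR samples V as the data bit; the `receive` function and its flag are invented for the illustration.

```python
V = 0x40  # overflow bit in the status register P

def receive(bits, fix_stacked_v):
    """Model the NMI (0) / NMI+SO (1) scheme, one NMI per data bit.

    fix_stacked_v: if True, the ISR does PLP / CLV / PHP so that the copy
    of P restored by RTI has V clear.
    """
    p = 0x00                      # foreground status, V initially clear
    received = []
    for bit in bits:
        if bit:
            p |= V                # SO edge sets V from outside
        stacked = p               # interrupt sequence pushes P
        received.append(1 if p & V else 0)   # ISR samples V as the data bit
        if fix_stacked_v:
            stacked &= ~V & 0xFF  # PLP / CLV / PHP on the stacked copy
        p = stacked               # RTI pulls P back off the stack
    return received

sent = [1, 0, 1, 1, 0, 0, 1, 0]
print("with fix:   ", receive(sent, True))
print("without fix:", receive(sent, False))
```

Without the fix, the first 1 sticks: RTI keeps restoring V set, so every later bit reads as 1.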

i_r_on wrote:

For faster transfers even S.O. alone can be used by the way.. here is an implementation I found on web : http://www.ele.uva.es/~jesus/6502copy/proto.html

Cool -- thanks!

BTW another termination strategy is to fix the number of bytes as 128 (or 256). Then the INX can be followed by a BMI (or BEQ) that's taken for termination. To get 128 (or 256) bytes you may have to add some padding, which is a disadvantage. But on the plus side you have one less signal to pass between the controlling micro and the 6502. And I expect the code would be shorter than any other option mentioned so far.
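A small Python check of the counting, assuming X starts at 0 and that one byte is stored before each INX. The `pad` helper and its fill value are just for illustration; any filler would do.

```python
def bytes_until_exit(branch):
    """Count bytes stored before the branch after INX is taken.

    branch: "BMI" exits when bit 7 of X becomes set,
            "BEQ" exits when X wraps around to zero.
    """
    x = 0
    count = 0
    while True:
        count += 1                 # one byte stored at index X
        x = (x + 1) & 0xFF         # INX
        negative = bool(x & 0x80)  # N flag after INX
        zero = (x == 0)            # Z flag after INX
        if (branch == "BMI" and negative) or (branch == "BEQ" and zero):
            return count

def pad(payload, size, fill=0xEA):
    """Pad payload to exactly `size` bytes ($EA is NOP, an arbitrary choice)."""
    if len(payload) > size:
        raise ValueError("payload longer than the fixed block size")
    return bytes(payload) + bytes([fill]) * (size - len(payload))

print(bytes_until_exit("BMI"), bytes_until_exit("BEQ"))
print(len(pad(b"\x01\x02\x03", 128)))
```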

i_r_on · Post by **i_r_on** » Sun Jul 26, 2015 2:52 am

Dr Jefyll wrote:

BTW another termination strategy is to fix the number of bytes as 128 (or 256). Then the INX can be followed by a BMI (or BEQ) that's taken for termination. To get 128 (or 256) bytes you may have to add some padding, which is a disadvantage. But on the plus side you have one less signal to pass between the controlling micro and the 6502. And I expect the code would be shorter than any other option mentioned so far.

Actually my plan was to deploy a slightly longer, more versatile and speed-optimized version of this loader to zero page and run it there.

Like this:

1. The 6502 is reset, and the 32-byte boot ROM waits for the actual loader, 64 to 256 bytes long.
2. Once this loader has been transferred and started, it copies itself to the last page of memory and runs from there, changing the reset, IRQ and NMI vectors.
3. The micro transfers the actual payload, which is invoked after loading.

In my case the S.O. pin is required by this actual loader for speedy operation. But of course in another setup one could use a CIA or another chip to transfer the rest of the program some other way. Indeed it's a nice trick.

cbscpe · Post by **cbscpe** » Sun Jul 26, 2015 8:01 pm

I would try to check with WinCUPL. Let it select the PINs, so it can put the Data outputs of the bit that requires more product terms to the appropriate OLMC (the GAL22V10 has different numbers of PT for the OLMCs). Then you can use the TABLE instruction of WinCUPL which would look like

Code:

TABLE ADDRESS => [D7..0] {
	[00] => 'h'ab;
	[01] => 'h'cd;
	[02] => 'h'ef;
}

It's more trial and error, but you get a first impression of how feasible it is to put a bootloader into a GAL. Perhaps you could even use an additional input to the GAL, instead of using SO, that would modify the wait cycle. Then you could use it for a 65C816 as well.
I actually tried your last posted bootloader and WinCUPL shows a maximum of 8 PTs. That would even fit in a GAL16V8.
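Typing the table by hand gets tedious, so here is a hedged Python sketch that prints a TABLE block in the same layout as the snippet above from a list of ROM bytes. The entry syntax is copied from that snippet and has not been checked against WinCUPL here; the `cupl_table` function and the three sample bytes are placeholders.

```python
def cupl_table(rom, address="ADDRESS", data="[D7..0]"):
    """Format ROM bytes as a TABLE block in the style posted above."""
    lines = ["TABLE %s => %s {" % (address, data)]
    for addr, byte in enumerate(rom):
        lines.append("\t[%02x] => 'h'%02x;" % (addr, byte))
    lines.append("}")
    return "\n".join(lines)

# Placeholder bytes; substitute the assembled loader image.
print(cupl_table(bytes([0xAB, 0xCD, 0xEF])))
```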

Cheers

Peter

cbscpe · Post by **cbscpe** » Sun Jul 26, 2015 9:00 pm

I just had another idea. What if you set carry after reset, before waiting for the bootloader, so that only the NMI entry clears carry and then falls through to the IRQ routine? As far as I know, the status is just saved on the stack when the interrupt is taken and C, V, Z and N are otherwise left untouched, i.e. the interrupt routine sees what the main program had at the time of the interrupt.

Code:

       SEC                      ; carry set once, before the wait loop
WAITBOOTLOADER
       BVC WAITBOOTLOADER


NMIROUTINE
       CLC                      ; NMI entry clears carry, then falls through
IRQROUTINE
...


       RTI                      ; this will set carry again
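A Python model of the idea, as a sketch only. The body of the IRQ routine is elided in the post, so the model assumes it is a ROL A that shifts the carry in as the data bit (as in the earlier SEC/CLC snippet); `receive_byte` is a made-up name.

```python
C = 0x01  # carry bit in the status register P

def receive_byte(pins):
    """pins: sequence of "NMI" (bit 0) or "IRQ" (bit 1), first bit first."""
    p = C                      # foreground did SEC once before the wait loop
    a = 0
    for pin in pins:
        stacked = p            # interrupt pushes P with carry still set
        if pin == "NMI":
            p &= ~C & 0xFF     # NMIROUTINE: CLC, then fall through
        # IRQROUTINE (assumed): ROL A shifts the carry in as the data bit
        carry_in = p & C
        carry_out = (a >> 7) & 1
        a = ((a << 1) | carry_in) & 0xFF
        p = (p & ~C & 0xFF) | carry_out
        p = stacked            # RTI restores P, so carry is set again
    return a

print(hex(receive_byte(["IRQ", "NMI", "IRQ", "NMI", "NMI", "IRQ", "NMI", "IRQ"])))
```

The point is that the IRQ path needs no SEC of its own: the carry it sees is the one the foreground code set, and RTI puts it back every time.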

i_r_on · Post by **i_r_on** » Sun Jul 26, 2015 9:14 pm

cbscpe wrote:

I would try to check with WinCUPL. Let it select the PINs, so it can put the Data outputs of the bit that requires more product terms to the appropriate OLMC (the GAL22V10 has different numbers of PT for the OLMCs). Then you can use the TABLE instruction of WinCUPL which would look like

Code:

TABLE ADDRESS => [D7..0] {
	[00] => 'h'ab;
	[01] => 'h'cd;
	[02] => 'h'ef;
}

It's more trial and error, but you get a first impression of how feasible it is to put a bootloader into a GAL. Perhaps you could even use an additional input to the GAL, instead of using SO, that would modify the wait cycle. Then you could use it for a 65C816 as well.

Cheers

Peter

Thanks for the info, I'll test it when I find time. Last time, decoding the address into a chip select signal for the SwinSID was a lousy job. It failed intermittently. GALs have a clock input, which differentiates them from discrete logic circuits.

But first I should test this 32-byte loader with EPROMs. If it doesn't work then all of this is useless (for me).

I'd like to upgrade to a 65C816 but I have plenty of stuff yet to implement for this existing work. I'm still learning.

Actually it could even be reduced to a single pin if one wants: just trigger NMI with, say, 16 different timings and have the foreground task measure the time, transferring 4 bits of data per interrupt.
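A rough Python sketch of how such a timing code could look. The base gap and step size are invented numbers, the high nibble is assumed to go first, and the 6502 side is reduced to "measure the gap and round to the nearest step".

```python
BASE_US = 50    # hypothetical shortest gap between two NMI pulses
STEP_US = 10    # hypothetical spacing between adjacent nibble values

def gap_for_nibble(n):
    """Gap the sender leaves before the next NMI to signal nibble n."""
    if not 0 <= n <= 15:
        raise ValueError("nibble out of range")
    return BASE_US + n * STEP_US

def nibble_for_gap(gap_us):
    """Receiver side: quantize a measured gap back to a nibble."""
    n = round((gap_us - BASE_US) / STEP_US)
    return max(0, min(15, n))

def encode(data):
    gaps = []
    for byte in data:
        gaps.append(gap_for_nibble(byte >> 4))     # high nibble first
        gaps.append(gap_for_nibble(byte & 0x0F))
    return gaps

def decode(gaps):
    nibbles = [nibble_for_gap(g) for g in gaps]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))

gaps = encode(b"\xA5\x3C")
print(gaps)
print(decode([g + 4 for g in gaps]))   # survives a few microseconds of jitter
```

The step size has to stay comfortably above the timing jitter of both sides, which is what limits how many values one interrupt can carry.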

If one has other means to transfer data (I/O peripherals) once the bootloader is in, then a 1-pin bootloading interface would be really nifty. In my case I don't have such a setup, though I want to introduce a VIA (instead of a CIA*) into my circuit to support digi SIDs. Then I can switch to this method for a speedy parallel transfer.

*: The VIA seems to be adaptable to the CIA's role: it has reversed registers and could stand in for a CIA for certain tasks with proper cabling.

i_r_on · Post by **i_r_on** » Sun Jul 26, 2015 9:26 pm

@cbscpe : Yes, that optimization would certainly work, good way of taking back what RTI steals.

bogax · Post by **bogax** » Sun Jul 26, 2015 10:19 pm

if you used the overflow flag as the data in
and one of the interrupts as a clock, you
could use the other interrupt as the jump to
the start of your code.

if you cleared the accumulator for each byte and started
by clocking in a 1 as an end of byte flag (and then your
8 bits of data) the accumulator could be your bit counter
and clear the carry while shifting in the data
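the sentinel idea can be sketched in Python like this. it is only a model: the data bit is passed straight into the shift, so how it gets from the overflow flag into bit 0 is left out, and the helper names are made up.

```python
def rol(a, bit_in):
    """8-bit rotate left through carry: returns (new_a, carry_out)."""
    carry_out = (a >> 7) & 1
    return ((a << 1) | bit_in) & 0xFF, carry_out

def frame(byte):
    """Sender side: a 1 as end-of-byte flag, then 8 data bits, MSB first."""
    return [1] + [(byte >> i) & 1 for i in range(7, -1, -1)]

def receive(stream):
    """Receiver side: the accumulator doubles as the bit counter."""
    received = []
    a = 0                        # accumulator cleared for each byte
    for bit in stream:
        a, carry = rol(a, bit)   # carry comes out clear for the first 8 shifts
        if carry:                # sentinel fell out: A holds the 8 data bits
            received.append(a)
            a = 0
    return received

print(receive(frame(0xA5) + frame(0x00) + frame(0xFF)))
```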

cbscpe · Post by **cbscpe** » Mon Jul 27, 2015 5:37 am

That's a good idea. But instead of using SO I would use another input of the GAL that, only for the first instruction of the data input interrupt routine, would produce either the CLC or the SEC opcode. As they differ in only one bit, that would affect only one output equation.
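The one-bit difference is easy to confirm in Python. The `first_isr_byte` function is a hypothetical stand-in for what the GAL would output at that one address, with `data_in` as the extra input pin.

```python
CLC = 0x18   # 0001 1000
SEC = 0x38   # 0011 1000

def first_isr_byte(data_in):
    """Byte presented at the first ISR address: CLC for 0, SEC for 1."""
    return CLC | ((data_in & 1) << 5)   # only D5 depends on the input

print("differing bits: %02X" % (CLC ^ SEC))
print("%02X %02X" % (first_isr_byte(0), first_isr_byte(1)))
```

So only the D5 equation would need the extra input as a term; the other seven outputs stay as they are.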
