6502.org Forum  Projects  Code  Documents  Tools  Forum
It is currently Sat Apr 27, 2024 9:37 pm

All times are UTC




Post new topic Reply to topic  [ 12 posts ] 
Author Message
PostPosted: Thu Jan 25, 2024 1:04 am 
Offline

Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
For my VGA controller project, I need program with identical execution paths in every 31.5KHz interrupt. There are 800 clocks between each interrupt where 640 clocks are for active video display, and 160 clocks are available to do simple tasks (actually about 140 clocks due to interrupt overhead). There are two tasks to do, one is very simple: picking which 80 contiguous bytes of data to display for current horizontal line; the second one is more complex: sampling the PS2 keyboard one bit at a time, assemble into a byte and write into PS2 queue. The PS2 queue will be read out and put on the display during the vertical retrace period.

The two tasks are executed in the back porch portion of the horizontal scan. Back porch is the time from horizontal sync interrupt to the active video display. This means the two tasks must have exactly the same execution path for every branch instruction so every active horizontal scan will start at the same time from the horizontal sync interrupt. I can attest that it is critical to get the path exactly the same otherwise the screen will flicker when I enter keystrokes.

Right now I have a working program by tediously counting each branching instruction and padded with string of 1-cycle illegal instructions so every branching path will finish at the same time. I'm afraid to change the program! Is there a tool/methodology for managing this kind of problem?
Bill


Top
 Profile  
Reply with quote  
PostPosted: Thu Jan 25, 2024 2:48 am 
Offline
User avatar

Joined: Fri Aug 30, 2002 9:02 pm
Posts: 1681
Location: Sacramento, CA
While not video, I wrote a special real-time clock program for the 6502 badge that was presented to Bill Mensch as a gift. It also had multiple length paths that dealt with LED refresh and time counting (rolling from 59 to 00 and incrementing the next unit).

I used the Kowalski Simulator and installed breakpoints at the top and bottom of each path and used the cycle counter function to indicate total time through each path. I was then able to add some NOP or other combinations of opcodes that balanced each path. Sometimes, adjusting one would affect another path, so it took many iterations to dial in the right sequencing.

This was much easier than tracking individual instruction times.

This project is what drove me to start fixing and update the simulator code as the cycle counting logic was not 100 accurate. I believe it nearly accurate now.

The Simulator can be downloaded from my website.

thanks!
Daryl

_________________
Please visit my website -> https://sbc.rictor.org/


Top
 Profile  
Reply with quote  
PostPosted: Thu Jan 25, 2024 4:08 am 
Offline

Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
I think cycle-accurate simulator is helpful. Although in this particular case I have had very little time therefore the program is necessarily small, so I already have the few types of instruction's execution time memorized. The program has many branches so it is rather tedious tracing through all branches and balancing them. Then after all that I changed my mind and have to do it all over again. sigh!
Bill


Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 2:53 am 
Offline

Joined: Mon Feb 20, 2012 1:46 pm
Posts: 23
Location: America
That sounds easy. Here's one approach:
*=$c000
ldx #1
beq s1
bne s3 ; 5 cycles
s1 bne s2 ; 5 cycles
s2 nop ; is zero (path 1)
jmp c1
s3 nop ; is non-zero (path 2)
jmp c1
c1 nop ; continue here

Basically, I follow each branch with its opposite condition, so only one of the pair is ever taken. If the first beq is taken, that's 3 cycles, followed by 2 cycles from the bne not taken. If the beq is not taken, that's 2 cycles followed by the bne which is then taken. So it's 3+2 or 2+3, either way that's 5. Then at the end of the branch, both paths need to take a jmp to return to a common code path. In one case that will seem redundant, but actually any 3 cycle instruction could be put there.
As far as each code path, you will have to manage that yourself, putting nop's in one of them. Also watch for page boundary crossings, which could complicate things.


Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 4:06 am 
Offline

Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
I see. Thank you. Following that thought process, balancing out each branch path could also be done with adding 1-cycle illegal instruction in the branch-not-taken path.

I have very few cycles to spare, so branch taken/not-taken pair is too time consuming. Even an illegal instruction in the branch-not-taken path is expensive. Right now my longest path is all branch-not-taken which is 2 clocks, then I try to balance out all the branch-taken paths and eventually return to the end of the longest path.
Bill


Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 5:33 am 
Offline

Joined: Mon Feb 20, 2012 1:46 pm
Posts: 23
Location: America
I see. There are no 1-cycle instructions, even illegal ones. However, a trick to balance the branch not taken paths to add an extra cycle, is to either load a value from a table across a page boundary, or load a value from an absolute address rather than zero page, converting a 3 cycle instruction to a 4 cycle one.
For example,
;path taken
lda $40
...
;path not taken
lda $0040

Same thing, but using a slower instruction variation.

Anyhow, other than that, you will have to count yourself. There is an assembler that does this for you. C64 Studio: https://github.com/GeorgRottensteiner/C64Studio
Is is geared towards several 8-bit home computers but can be used for any 6502 single board computer as well.
When you Build your file, a listing shows the cycle times for each line.


Attachments:
cycle times.jpg
cycle times.jpg [ 32.27 KiB | Viewed 2702 times ]
Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 5:41 am 
Offline

Joined: Mon Feb 20, 2012 1:46 pm
Posts: 23
Location: America
Another idea, you can avoid branches with a better algorithm, and it seems your PS2 decoding logic is a prime suspect for that. For example, instead of checking if a certain bit is set, you can lookup values in a table which takes a constant time.


Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 2:33 pm 
Offline

Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
Illegal instructions on column 3 of instruction decode ($03, $13, $23,...$f3) are executed in 1 cycle. I think column B illegals are also executed in 1 cycle. I used illegal instruction $03 extensively for delay.

My PS2 decode has 5 branches from the main path; one of the branch (HouseKeep) has 3 more branches. So there are quite a number of branches to manage. I used an illegal instruction delay tap with multiple entries to simplify calculation. Reducing the number of branches would be a good thing. I included the PS2 decode program in case you are interested. The main path is at the end, under the comment line ;@@@@@@.

PS2 decode is called every horizontal sync interrupt (31.5KHz) that samples PS2 data and clock via PS2CLKReg which contains PS2Data on data bit[7] and PS2CLK on data bit[6]. Based on PS2CLK, it either do housekeeping function or processing data. It gets weedy after that... Plenty of room for improvements since the code is only 3 days old and I just got it working. I'll think about how to use lookup table instead. One advantage of programmable hardware is I can change the hardware to simplify software.
Bill


Attachments:
PS2_decode.zip [1.24 KiB]
Downloaded 37 times
Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 3:25 pm 
Offline
User avatar

Joined: Fri Jan 26, 2024 5:47 am
Posts: 37
Location: Prague; Czech Republic; Europe; Earth
I did see this Ben Eater video, where he use shift registers to catch full PS/2 message and read it afteward. Maybe something similar could save you some time/cycles here and there? I cannot say, as I do not understand deep enought and you may rejected this already.

_________________
http://micro-corner.gilhad.cz/, http://8bit.gilhad.cz/6809/Expanduino/Expanduino_I.html, http://comp24.gilhad.cz/Comp24-specification.html, and many others


Top
 Profile  
Reply with quote  
PostPosted: Sun Jan 28, 2024 4:02 pm 
Offline

Joined: Fri Dec 21, 2018 1:05 am
Posts: 1076
Location: Albuquerque NM USA
I'm limited by available logic inside the CPLD so I don't have enough flip flops to construct a 8-bit shift register. I have enough to do 4-bit shift register, but I don't think that's useful. There may be a different way. My focus is using as little hardware as possible.
Bill


Top
 Profile  
Reply with quote  
PostPosted: Wed Feb 07, 2024 6:23 am 
Offline

Joined: Mon Feb 20, 2012 1:46 pm
Posts: 23
Location: America
Hi,
It wasn't mentioned in your original post which CPU, now I realize its 65C02. I'm not familiar with this version. I see there are some 1-cycle commands (though, they merge before an interrupt can occur).

I wouldn't be able to work well on the PS/2 reading without the cycles memorized, nor it is obvious what the algorithm is without dissecting the code, so I'm afraid I'm not much help.


Top
 Profile  
Reply with quote  
PostPosted: Thu Feb 08, 2024 4:02 pm 
Offline

Joined: Sun Apr 26, 2020 3:08 am
Posts: 357
if you have a spare register you can use the #$55/AA method.

The Y- reg (or X) is pre-loaded with #$55 (%01010101)

then use this

clc. ; to change Acc to %10101010
tya
bmi s1
sec. ; to change Acc to %01010101
... ;needs a 1-cycle instruction here
s1 rol
tay


Top
 Profile  
Reply with quote  
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 12 posts ] 

All times are UTC


Who is online

Users browsing this forum: No registered users and 15 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to: