I've rewritten my alternate ACIA servicing code for the CBM Plus/4 and posted it to my GitHub. In the end, I was not able to get any confirmation of the flaky bit 7 of the Status register. But since testing only involved one computer, I decided to bypass any possible problem by testing the transmit and receive flags individually instead of bit 7.
But I wanted to revisit the second question I had in the original post because others using the 6551 might be affected by what I found. That involves what happens when your IRQ servicing routine writes a byte to the 6551 Transmit Data Register. If nothing is currently being shifted out, the byte written to the Data register will immediately be transferred to the Shift register, and a new interrupt will be generated immediately - because the Data register is empty again. This will bring the IRQ line low again, and since IRQ is level-triggered in the 6502, when the current interrupt service completes, a new one will follow on immediately.
If there is a transmit queue, and if additional bytes remain in the queue, then the second interrupt will let you write a new byte to the Data register, and another interrupt will not be triggered because the first byte is still being shifted out, so the Data register is not empty. This allows for truly continuous transmission if you can keep the queue at least partially full.
But if there is nothing else in the queue, the second interrupt will be pointless, and will just waste time, because nothing will happen in it.
As I said in the original post, the Plus/4 has a one-byte transmit queue. So since the second interrupt follows immediately after the first one, there is no opportunity for the terminal code to fill the queue again. And indeed I found that there were two interrupts generated for every byte transmitted. At high baud rates, that can compromise performance. So my solution was to wait just a bit after writing the byte to the Data register, and then read the Status register again. That clears the second interrupt and lets IRQ go back high. The delay is needed because the new interrupt is not actually generated immediately, but after a period of time that is a function of the baud rate. So a longer delay is needed for slow baud rates than for fast ones.
But in reading the Status register again, there is a possibility that a byte has just been received, and if not serviced, that byte will be lost. So what I ended up with was servicing the transmit first, and if I've sent a byte, I wait a while, then OR the Status register with a copy of the original Status read. So if a byte was received on either Status read, that bit will be set, and I service that receive.
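For anyone who wants to see the order of operations, here is a rough sketch in C rather than the actual 6502 assembler from my repo. The fake_acia struct, read_status, short_delay and service_irq are all made-up names for illustration, and the "chip" is just a scripted stand-in so the logic can be run on a PC. The only real-hardware facts it relies on are that bit 3 of Status is the receive-full flag and bit 4 is the transmit-empty flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { ST_RDRF = 0x08,   /* Status bit 3: receive Data register full   */
       ST_TDRE = 0x10 }; /* Status bit 4: transmit Data register empty */

/* Toy stand-in for the chip (hypothetical): Status reads are scripted. */
typedef struct {
    uint8_t status_reads[2]; /* values returned by successive Status reads */
    int     reads_done;      /* how many Status reads have happened        */
    uint8_t rx_data;         /* what the receive Data register holds       */
    uint8_t tx_written;      /* last byte written to transmit Data         */
} fake_acia;

typedef struct { bool sent; bool received; uint8_t rx_byte; } irq_result;

static uint8_t read_status(fake_acia *a)
{
    assert(a->reads_done < 2);               /* the script only has two entries */
    return a->status_reads[a->reads_done++];
}

/* Placeholder for the baud-dependent wait described in the post. */
static void short_delay(void) { }

irq_result service_irq(fake_acia *a, bool have_tx, uint8_t tx_byte)
{
    irq_result r = { false, false, 0 };
    uint8_t status = read_status(a);         /* first read of Status */

    /* Service transmit first. */
    if ((status & ST_TDRE) && have_tx) {
        a->tx_written = tx_byte;
        r.sent = true;
        short_delay();
        /* Second read clears the follow-on interrupt. ORing keeps a
           receive flag seen on either read, so no byte is dropped. */
        status |= read_status(a);
    }

    /* Then service receive using the combined flags. */
    if (status & ST_RDRF) {
        r.rx_byte = a->rx_data;
        r.received = true;
    }
    return r;
}
```

The point of the OR is that the receive check runs once, after both reads, against the union of the two Status values.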
I assume everyone but Commodore would use a real multi-byte transmit queue. But even so, it might make sense after writing the byte to transmit Data, to wait a bit, then read Status again, and if Data is empty, write another byte to it if there is such a byte in the queue. Doing that could save an interrupt.
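Here is a minimal sketch of that top-up idea in C, again with a toy model instead of real hardware. Everything in it (fake_acia, txq, settle_delay, service_tx, the 16-byte queue size) is hypothetical and only meant to show the flow: write one byte, wait, check whether Data is empty again, and write a second byte if the queue has one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QSIZE 16u  /* arbitrary size for this sketch */

/* Simple ring buffer standing in for a multi-byte transmit queue. */
typedef struct { uint8_t buf[QSIZE]; unsigned head, tail; } txq;

static bool q_empty(const txq *q) { return q->head == q->tail; }

static bool q_put(txq *q, uint8_t b)
{
    if (q->head - q->tail == QSIZE) return false;   /* full */
    q->buf[q->head++ % QSIZE] = b;
    return true;
}

static uint8_t q_get(txq *q)
{
    assert(!q_empty(q));
    return q->buf[q->tail++ % QSIZE];
}

/* Toy transmitter model: one shift register plus one Data register. */
typedef struct { bool shifter_busy; bool tdr_full; int writes; } fake_acia;

static bool tdre(const fake_acia *a) { return !a->tdr_full; }

static void write_tdr(fake_acia *a, uint8_t b)
{
    (void)b;
    a->writes++;
    if (!a->shifter_busy)
        a->shifter_busy = true;  /* idle: byte moves straight to the shifter */
    else
        a->tdr_full = true;      /* busy: byte waits in the Data register    */
}

/* Placeholder for the baud-dependent wait before the flag is valid. */
static void settle_delay(void) { }

/* Returns how many bytes were written during this interrupt (0, 1 or 2). */
int service_tx(fake_acia *a, txq *q)
{
    int sent = 0;
    if (tdre(a) && !q_empty(q)) {
        write_tdr(a, q_get(q));
        sent++;
        settle_delay();
        /* If the first byte went straight to the shifter, Data is empty
           again, so top it up now instead of taking another interrupt. */
        if (tdre(a) && !q_empty(q)) {
            write_tdr(a, q_get(q));
            sent++;
        }
    }
    return sent;
}
```

Starting from an idle transmitter with several bytes queued, the first call writes two bytes (one lands in the shifter, one in Data), and later calls write one byte each as Data empties.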
The only reference I found for the delay was in the Rockwell datasheet for their 6551. It said when a byte is written to the Data register, and then transferred to the shift register, the Data-register-empty flag bit is not valid until after 1/16 of a bit period. Apparently this also applies to generating the interrupt. I was not able to measure that accurately, but tested how much of a delay was needed for the double interrupts to transition to single interrupts, and found that the needed delay gets longer as the baud rate goes down. My code is complicated enough that it already takes longer than the needed delay at 4800 baud and faster, so no extra delay is added there. But for slower baud rates I go into a short delay loop. "Slower" means bit 2 of the Control register is cleared. For very slow baud rates (600 and below), I do not delay since servicing the second interrupt might take less time than the needed delay, and in any case at 600 baud a few extra interrupts per second probably don't really matter.
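To put some numbers on that, here is a small C sketch of the decision. It is not lifted from my assembler code. The function names are made up, and the bit tests are based on my reading of the 6551 baud rate table, where the low four bits of Control select the rate (%1000 = 1200 up through %1011 = 3600, %1100 = 4800 and above, %0111 = 600 and below). Using bit 3 to detect "600 and below" is an assumption for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the extra delay loop would be used: 1200 to 3600 baud,
   i.e. Control bit 3 set and bit 2 clear (assumed encoding, see above). */
bool needs_delay_loop(uint8_t control)
{
    return (control & 0x0C) == 0x08;
}

/* 1/16 of a bit period in whole microseconds (rounded down). */
uint32_t sixteenth_bit_us(uint32_t baud)
{
    assert(baud > 0);
    return 1000000u / (16u * baud);
}
```

By this arithmetic 1/16 of a bit is about 52 microseconds at 1200 baud and about 13 microseconds at 4800 baud, which fits with the interrupt code itself being slow enough to cover the wait at the faster rates.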
Hope this is useful.
https://github.com/gbhug5a/My_CBM_stuff/tree/main/Plus4_IRQ_ACIA_error