Don't forget that if you can use the 65c02, you also have TSB (test and set bits) and TRB (test and reset bits). If you already have the mask in the accumulator, the read-test-modify-write process to memory takes only one instruction, and the Z flag tells you whether any of the masked bits were set beforehand. The addressing modes available for these are absolute and zero page-- no indirects or indexeds (if I can make that a word).
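Here is a minimal sketch of TSB and TRB in use. FLAGS, its address, and the choice of bit 2 are made up for illustration, and the equate and label syntax will vary with your assembler.

```asm
FLAGS   = $10           ; hypothetical zero-page variable

        LDA #%00000100  ; mask selecting bit 2
        TSB FLAGS       ; Z reflects A AND FLAGS, then FLAGS = FLAGS OR A
        BNE WAS_SET     ; taken if bit 2 was already set before the TSB
        ; ...get here if bit 2 was clear (it is set now)...
WAS_SET:
        TRB FLAGS       ; A still holds the mask: same test, then bit 2 is cleared
```

Neither instruction changes the accumulator, so the same mask can be reused as shown.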
If you have Rockwell's or WDC's 65c02, you further have the BBS (branch on bit set), BBR (branch on bit reset), SMB (set memory bit), and RMB (reset memory bit) available for zero page.
The first two, BBS and BBR, are 3-byte instructions. The first byte (the op code) actually also tells which bit to test (meaning you don't ever need a mask in the accumulator). The second byte tells which address in zero page to test. The third byte tells how far in which direction to branch if appropriate (meaning you don't have to follow it with a conditional branch instruction). It's all wrapped up in a single instruction.
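As a rough sketch, assuming a hypothetical zero-page byte STATUS whose bit 7 gets set by something else (an interrupt routine, for example), the test and the branch look like this. The labels are made up, and some assemblers want the bit number as a separate operand (BBR 7,STATUS,WAIT) instead of as part of the mnemonic.

```asm
STATUS  = $20               ; hypothetical zero-page status byte

WAIT:   BBR7 STATUS,WAIT    ; loop here until bit 7 of STATUS becomes 1
        BBS0 STATUS,ERROR   ; then branch to ERROR if bit 0 is set
        ; ...normal path: bit 7 set, bit 0 clear...
ERROR:
        ; ...error handling...
```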
The other two, SMB and RMB, are 2-byte instructions. The first byte (the op code) actually also tells which bit to set or clear (again meaning you don't ever need a mask in the accumulator). The second byte tells which address in zero page to modify.
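A short sketch of SMB and RMB, again with a made-up zero-page variable FLAGS. As with BBS and BBR, some assemblers take the bit number as a separate operand.

```asm
FLAGS   = $10           ; hypothetical zero-page variable

        SMB3 FLAGS      ; set bit 3 of FLAGS; A and the status flags are untouched
        RMB3 FLAGS      ; clear bit 3 again
```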
As BitWise mentioned, simply using the BIT instruction (regardless of what's in the accumulator, and without affecting it) will set or clear the N flag according to bit 7 of the contents of the address you test, and set or clear the V flag according to bit 6 of the contents of the address you test. This gives special convenience if you connect individual I/O bits you want to test to bits 7 and 6 of I/O ICs like the 6522. You can use the BIT instruction immediately followed by a conditional-branch instruction, with no need to affect the accumulator or use AND or ORA.
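For illustration, here is what that looks like with inputs wired to bits 7 and 6 of a port. VIA_PB and its address are assumptions, as are the labels.

```asm
VIA_PB  = $6000         ; hypothetical address of a 6522's port B

        BIT VIA_PB      ; N = bit 7, V = bit 6 of the port; A is unchanged
        BMI PB7_HIGH    ; taken if PB7 reads 1
        BVS PB6_HIGH    ; taken if PB6 reads 1 (and PB7 was 0)
        ; ...get here if both inputs are low...
PB7_HIGH:
        ; ...handle PB7 high...
PB6_HIGH:
        ; ...handle PB6 high...
```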
Another convenience for output is to put a bit you want to toggle on bit 0 of an I/O IC. As long as the program keeps track of the state of that bit, you can turn it on or off with a single instruction by using INC or DEC on the port. If it's 0 and you want to change it to 1, you don't need to load the port, ORA #1, and store it back-- just INC it. A single instruction takes care of it. The other bits (and the accumulator) will be unaffected. If it's a 1 and you want to make it a 0, just DEC it.
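In code, assuming a hypothetical output port called PORT whose bit 0 drives the line, and assuming the program knows the current state of that bit:

```asm
PORT    = $6000         ; hypothetical output port, bit 0 is the line to flip

        INC PORT        ; bit 0 known to be 0: xxxxxxx0 becomes xxxxxxx1
        ; ...later...
        DEC PORT        ; bit 0 known to be 1: xxxxxxx1 becomes xxxxxxx0
```

It works because incrementing a value whose bit 0 is 0 never carries into bit 1, and decrementing a value whose bit 0 is 1 never borrows from it.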
You can use the same trick if you want to have two output bits toggle opposite each other, going from 01 to 10 (in binary) and vice-versa. INC takes 01 to 10, and DEC takes 10 to 01. No need to LDA addr, ORA#, AND#, STA addr, or even LDA addr, EOR#, STA. Just one instruction, INC or DEC on the I/O port where you have the two out-of-phase lines connected on port bits 0 and 1, will do the job.
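A sketch of the comparison, with a made-up port address and assuming bits 1 and 0 currently read 01:

```asm
PORT    = $6000         ; hypothetical port, out-of-phase lines on bits 1 and 0

        ; The long way: 8 bytes, 10 cycles, and it uses the accumulator
        LDA PORT
        EOR #%00000011  ; flip both bits
        STA PORT

        ; The short way: 3 bytes, 6 cycles, accumulator untouched
        INC PORT        ; xxxxxx01 becomes xxxxxx10
        ; ...and to go back...
        DEC PORT        ; xxxxxx10 becomes xxxxxx01
```

The cycle counts are for absolute addressing. The EOR version has the advantage of not needing to know which state the bits are in, so the INC/DEC version only pays off when the program keeps track.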
The 6522 has special functions available for PB7 and PB6. PB7 can be made to toggle automatically every time T1 times out, and PB6 can be used as an external clock input for T2, which then counts pulses on that pin. Simple application examples would include using PB7 to produce annunciator beeps while the processor is doing something else (so the processor doesn't have to stop everything it's doing just to babysit the port that's putting out a square wave), and using the PB6 input as a frequency or event counter that only interrupts the processor when it gets to a count that you've selected. Note that a tone put out on PB7 this way is generated by the timer hardware, so it will not be affected by interrupts.
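As a rough sketch of the beep setup, assuming the 6522 sits at a made-up base address and runs from a 1MHz phase-2 clock. The register offsets are the standard 6522 ones; the timer value is my own arithmetic for a tone of about 1kHz (PB7 inverts every N+2 cycles in free-run mode, so N = 498 = $01F2 gives a 500-cycle half period). Check the data sheet for your part, including whether DDRB bit 7 needs setting.

```asm
VIA      = $6000        ; hypothetical base address of the 6522
VIA_T1CL = VIA+4        ; T1 counter/latch, low byte
VIA_T1CH = VIA+5        ; T1 counter, high byte
VIA_ACR  = VIA+11       ; auxiliary control register

        LDA #%11000000
        TSB VIA_ACR     ; ACR bits 7,6 = 11: T1 free-run, square wave out on PB7
        LDA #$F2
        STA VIA_T1CL    ; low byte goes into the latch
        LDA #$01
        STA VIA_T1CH    ; writing the high byte loads the counter and starts T1
```

After that the tone runs with no further attention from the processor. This assumes the T1 interrupt is left disabled in the IER, which is its state after reset.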
You should download WDC's programming manual from the lower-right area of their front page at www.westerndesigncenter.com . I think it's the best there is.