A quick correction is necessary first.
PHY pushes the current contents of the Y register; PHA pushes the accumulator. Neither takes an operand, so you can't write 'PHY $#01': the register (Y or A) is implied by the opcode. Your code should look like this:
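A minimal sketch of the corrected sequence, assuming the intent was to push the constant $01 by way of the Y register (the value is only an illustration, and PHY needs a 65C02 or later):

```asm
        LDY #$01        ; load the immediate value $01 into Y
        PHY             ; push Y; the register is implied by the opcode
```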
Don't forget that the same stack holds return addresses. When a subroutine returns, the stack should (usually) be exactly as it was on entry, or RTS will return to some random location and crash. Whatever you push, make sure you pull it off again, for instance when preserving registers.
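As a sketch of that register-preserving pattern, assuming a 65C02 (PHX, PHY, PLX and PLY don't exist on NMOS parts) and a made-up label MYSUB:

```asm
MYSUB:  PHA             ; save A
        PHX             ; save X
        PHY             ; save Y
        ; ... body of the routine, free to use A, X and Y ...
        PLY             ; restore in the reverse order of the pushes
        PLX
        PLA
        RTS             ; stack is as it was on entry, so the return address is on top
```

On an NMOS 6502 the same idea would probably go through the accumulator instead (TXA PHA, TYA PHA on entry, and PLA TAY, PLA TAX on exit).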
Note, however, that the linked page, even though it says "Rockwell," covers only the NMOS instructions and leaves out the extra CMOS ones like PHY and PLY.
As enso alluded, the stack doesn't care where it got its contents. So for example if you do a PHA PHX PHY, in that order, and then do a PLX, X will take on the value that Y had, because Y was pushed last, and the new top-of-stack will now be the value that was originally in X.
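To make that concrete, here is a small sketch with made-up values ($11, $22, $33), again assuming a 65C02 for PHX, PHY and PLX:

```asm
        LDA #$11
        LDX #$22
        LDY #$33
        PHA             ; stack (top first): $11
        PHX             ; stack: $22 $11
        PHY             ; stack: $33 $22 $11
        PLX             ; X = $33, the value Y had
                        ; top of stack is now $22, the original X
```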
All push operations first decrement the SP, then store the value at $0100,SP.
All pull (pop) operations read the value at $0100,SP, then increment the SP.
There are no "peek at the stack without popping" instructions. To do that, you need to copy the stack pointer into X with TSX and use indexed addressing.
The stack pointer points to the most recently pushed element on the stack, so TSX then LDA $0100,X looks at the head of the stack. LDA $0101,X looks at the 2nd byte on the stack.
Actually when you push something, the byte is stored at the stack pointer S, and then the stack pointer is decremented, so it ends up pointing to the next available location. When you pull something, it's the reverse, so the stack pointer is incremented before the byte is read.
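Given that behavior, a peek at the top of the stack without pulling might be sketched as follows. It assumes a 6502 or 65C02 (or the 65816 in emulation mode) with the stack in page 1, and that at least two bytes have been pushed:

```asm
        TSX             ; X = S, which points at the next free location (old X is lost)
        LDA $0101,X     ; most recently pushed byte (top of stack)
        LDA $0102,X     ; the byte pushed just before it
```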
A nice addition on the 65816 is stack-relative addressing, which lets you read, for example, the fifth byte on the stack without first doing TSX, LDA $0105,X or pulling anything off the stack.
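A rough sketch of what that looks like on the 65816; the exact operand syntax may differ slightly between assemblers:

```asm
        LDA 5,S         ; fifth byte on the stack (address S+5); X is left alone
        LDA 1,S         ; most recently pushed byte
```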
Pushes post-decrement SP and pulls pre-increment SP.
In native mode, the 65C816's stack can be anywhere in the range $000000-$00FFFF, so the notation of $0100,SP wouldn't apply. The actual address in SP is used. As Garth noted, the stack can be treated as indexed RAM with any of the very useful ...,S instructions. I make extensive use of that capability in my POC unit's firmware.
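For illustration only, moving the stack out of page 1 in native mode might look something like this. The address $BFFF is an arbitrary, hypothetical choice, and most assemblers also need a directive (which varies by assembler) to tell them the index registers are 16 bits wide:

```asm
        CLC
        XCE             ; clear the emulation bit: switch to native mode
        REP #$10        ; clear the x flag: 16-bit index registers
        LDX #$BFFF      ; hypothetical top-of-stack address in bank 0
        TXS             ; S = $BFFF; pushes now go to $00BFFF and downward
```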