There are a lot of steps: receiving, verifying checksums, decoding packet types, capturing the actual data payloads, configuring responses, forming outgoing packets, etc. If you are only running a single application, then using the 6502/65816 to do all of the manipulation won't be a problem, as you'll be waiting for the data payload either way.
Remember that IP was invented at a time when processing power was *SO* expensive that even the logical ANDing of netmasks was considered too much overhead (hence localhost being assigned an entire class-A network for itself).
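For the curious, that netmask AND is essentially the whole routing decision at this level. A minimal sketch in C (the function name is mine, not from any particular stack):

```c
#include <stdint.h>

/* Two addresses are on the same subnet iff they agree under the
   netmask -- one AND per address and a compare. */
int same_subnet(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & mask) == (b & mask);
}
```

Under the class-A loopback mask 0xFF000000, every 127.x.x.x address compares equal to 127.0.0.1 -- which is exactly the "entire class-A" point above.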
IP isn't hard to manage if you factor your code correctly. BSD sockets is not an example of what I'd call "well factored." See the uIP stack for an example of a non-sockets-derived TCP/IP stack that also supports UDP, and does so quite efficiently.
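To illustrate the non-sockets style (this is only a sketch of the general shape -- all names are invented for illustration, not uIP's real API): the stack owns a single packet buffer and calls the application when a packet arrives, so there's no per-connection blocking, no read()/write(), and almost no buffering.

```c
#include <string.h>

#define PKT_MAX 1500

static unsigned char pkt_buf[PKT_MAX];  /* the one shared buffer */
static size_t pkt_len;

/* Application hook: inspect pkt_buf/pkt_len, leave any reply in
   place.  Here, a trivial echo service: the reply is the request. */
static void app_event(void)
{
    /* pkt_buf already holds the payload; echoing means do nothing. */
}

/* Stack side: deliver an incoming payload, fire the application
   callback, and return how many bytes it queued for transmission. */
size_t stack_input(const unsigned char *data, size_t len)
{
    if (len > PKT_MAX)
        len = PKT_MAX;
    memcpy(pkt_buf, data, len);
    pkt_len = len;
    app_event();
    return pkt_len;   /* the driver would now transmit pkt_buf */
}
```

The whole "connection" lives in a few bytes of state rather than in kernel buffers, which is what makes this approach tractable on an 8-bit machine.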
However, if you are trying to run multiple tasks, and only one is working with the Ethernet, then you'll be slowing down the other tasks and/or throttling the data throughput by using the 6502 for it.
If you implement a sockets-like API that lives in the core OS and runs as a "kernel thread," yes. But that's a lot of conditions!!! If you run the TCP/IP management code as an application-layer package, it can be task-switched along with everything else, preserving prompt real-time response to external stimuli. The tradeoff is that instead of 20ms ping times, you might now get 40ms. Big deal.
Personally, I hope to have a system that can focus on the task(s) at hand and not get buried in the low-level manipulation of I/O devices.
It sounds to me like you've already made up your mind then, and I have to question why you bothered asking in the first place.
For example: creating the "SBC web server." Now, on top of processing Ethernet I/O, you'll have to access a good-sized file system. If we have the 6502/65816 do all of the low-level Ethernet I/O and the low-level storage I/O, will there be any time left to process the external stimuli that might be used in our web pages?
Several things.
1) Ethernet has the capacity to swamp the processor no matter what -- EVEN IF you use an external, dedicated controller. So, whatever happens, your CPU is going to be too sluggish to keep up with a solid 10Mbps transfer rate. Packets WILL get dropped. Guaranteed.
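The arithmetic behind that claim is worth spelling out. Assuming a 2MHz 6502 (the clock rate is my assumption, for illustration):

```c
/* Back-of-envelope budget: CPU cycles available per received byte
   at full line rate. */
long cycles_per_byte(long cpu_hz, long link_bits_per_sec)
{
    long bytes_per_sec = link_bits_per_sec / 8;  /* 1,250,000 at 10Mbps */
    return cpu_hz / bytes_per_sec;
}
/* cycles_per_byte(2000000L, 10000000L) -> 1: barely one cycle per
   byte, and the fastest 6502 instructions take two cycles. */
```

Even a 14MHz 65816 only gets about 11 cycles per byte at full line rate -- nowhere near enough to touch every byte, let alone checksum it.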
2) You can cache frequently accessed files in a RAM disk, significantly dropping your filesystem I/O costs. Look at how Forth manages its screens for an example -- it very closely mimics how 32-bit processors with paging MMUs work, only it does it in software. I'd also note that, beyond relying on the native OS's file caching, many web servers do their own caching of content as well. Modern web servers are more I/O-bound than you'd think!
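The caching itself doesn't have to be elaborate to pay off. A toy sketch (fixed slots, invented names; a real cache would track sizes and evict, e.g. least-recently-used):

```c
#include <string.h>

#define SLOTS 4
#define NAME_MAX_LEN 32
#define BODY_MAX_LEN 256

struct slot {
    char name[NAME_MAX_LEN];
    char body[BODY_MAX_LEN];
    int  used;
};

static struct slot cache[SLOTS];

/* Returns the cached body, or NULL on a miss (caller hits disk). */
const char *cache_get(const char *name)
{
    for (int i = 0; i < SLOTS; i++)
        if (cache[i].used && strcmp(cache[i].name, name) == 0)
            return cache[i].body;
    return 0;
}

/* Store in the first free slot; silently drops the entry when the
   cache is full (a real cache would evict something instead). */
void cache_put(const char *name, const char *body)
{
    for (int i = 0; i < SLOTS; i++) {
        if (!cache[i].used) {
            strncpy(cache[i].name, name, NAME_MAX_LEN - 1);
            strncpy(cache[i].body, body, BODY_MAX_LEN - 1);
            cache[i].used = 1;
            return;
        }
    }
}
```

With a handful of hot pages pinned in RAM like this, most requests never touch the storage controller at all.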
Anyhow... I am currently working on the Ethernet I/O and FAT16/32 support for CompactFlash storage. By using an AVR to handle the low-level I/O, I can let the 6502/65816 spend more time processing the data collected from the various I/O sources.
Although FAT is pretty simple, you ARE aware that you're allowed to use other filesystems on the storage media, right? If you're going to spend the time optimizing the software and/or a dedicated controller, you might as well optimize the choice of filesystem too. Or create your own, as the case may be.
The AVRs provide resources such as UART/SPI/I2C/ADC/PWM/general-purpose I/O that make them ideal for data collection. The 6502 is better suited to manipulating the data and controlling the flow.
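That division of labor implies some command protocol between the two chips. Purely as a sketch (the commands and function names are invented for illustration, not any existing firmware): the host sends a one-byte command, and the AVR replies with data, so the 6502 never touches the peripheral registers itself.

```c
#include <stdint.h>

enum { CMD_NOP = 0x00, CMD_READ_ADC = 0x01, CMD_READ_UART = 0x02 };

/* Stand-ins for the real peripheral reads on the AVR side. */
static uint8_t adc_sample(void) { return 0x7F; }  /* fake reading */
static uint8_t uart_byte(void)  { return 'A';  }  /* fake byte    */

/* Dispatch one command byte from the host; return the reply byte. */
uint8_t handle_command(uint8_t cmd)
{
    switch (cmd) {
    case CMD_READ_ADC:  return adc_sample();
    case CMD_READ_UART: return uart_byte();
    default:            return 0xFF;   /* unknown command */
    }
}
```

On real hardware this dispatch would sit in the AVR's SPI or UART interrupt handler, with the 6502 clocking commands across the link.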
Well, what kind of manipulation are you talking about here? Overall, I find the capabilities of the AVR core comparable to the 6502's raw capabilities. Yeah, there are some warts I wish would disappear, but that's true of the 6502 and 65816 as well. And when you boil it down, the ATmega cores, at least, are on average twice as fast as a comparably clocked 6502, thanks to their one-cycle-per-instruction execution rate.
Remember too that the application you're designing doesn't sound like the kind that's likely to get slashdotted. That is, it sounds like only 2, maybe 3, people at most would ever be interested in the data it provides. Hence, 2 to 3 concurrent accesses ought to be trivial for the system to handle, with or without TCP/IP header management in software.