Rob Finch wrote:
ASCII is an older code which is great when characters fit into six or eight bits. But for any apps that need to be internationalized a wide code like Unicode is required.
Actually, there is only one form of ASCII, called "US-ASCII," and it is seven bits per datum. ASCII does not define any meaning for values in the range $80 to $FF, inclusive. ASCII was strictly a product of the forerunner of the American National Standards Institute, hence the "US-ASCII" moniker; INCITS uses that name to avoid confusion with informal extensions to the ASCII set, as well as ASCII-compatible supersets such as Unicode.
Unicode produces a bulkier data stream than ASCII, and in situations where the alphanumeric set plus punctuation and control codes is all that is needed, ASCII will be substantially more efficient and economical of bandwidth. For example, binary data can be transmitted in Intel or Motorola hex form using only seven-bit ASCII: numerals, the uppercase letters A-F, a record-mark character (':' for Intel hex, 'S' for Motorola S-records) and a few control codes (typically <CR>, <LF> and <EOT>). Western languages that use only the Latin alphabet are transmittable in ASCII and, even when some diacritical marks are dropped, are usually intelligible to native speakers. Unicode was developed primarily to handle Latin characters with diacritical marks, such as ü and å, localized characters such as Æ, as well as the characters found in non-Latin alphabets, e.g., Cyrillic.
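To make that concrete, here is a small C sketch (my own illustration, with made-up sample bytes, not any canonical implementation) that emits one Intel hex data record followed by the end-of-file record. Every character it writes - ':', the hex digits 0-9 and A-F, <CR> and <LF> - falls within the seven-bit ASCII set:

```c
#include <stdio.h>

/* Emit one Intel hex data record for up to 255 bytes at a 16-bit address.
 * The checksum is the two's complement of the sum of the count, the two
 * address bytes, the record type (0 for data) and the data bytes.       */
static void emit_ihex_record(FILE *out, unsigned addr,
                             const unsigned char *data, unsigned len)
{
    unsigned sum = len + ((addr >> 8) & 0xFF) + (addr & 0xFF); /* type 0 adds nothing */
    fprintf(out, ":%02X%04X00", len & 0xFF, addr & 0xFFFF);
    for (unsigned i = 0; i < len; i++) {
        fprintf(out, "%02X", data[i]);
        sum += data[i];
    }
    fprintf(out, "%02X\r\n", (0x100 - (sum & 0xFF)) & 0xFF);
}

int main(void)
{
    /* Arbitrary example bytes, placed at an arbitrary address.          */
    const unsigned char code[] = { 0xA9, 0x00, 0x8D, 0x00, 0x02 };
    emit_ihex_record(stdout, 0x8000, code, sizeof code);
    fputs(":00000001FF\r\n", stdout);   /* standard end-of-file record   */
    return 0;
}
```

Nothing in the output needs the eighth bit, which is why hex loaders work fine over seven-bit serial links.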
Incidentally, one of the reasons the control codes in the ASCII set occupy $00-$1F is that the mechanism in Teletypes was arranged to recognize the low bit patterns as control functions, not printing characters. As ASCII evolved, this characteristic was accommodated so Teletypes could be used as computer I/O devices. The spread between "0-9", "A-Z" and "a-z" exists because two bits decide whether the character falls among the numerals, the uppercase letters or the lowercase letters. It all makes sense in the context of doing case conversion or determining whether a user typed a numeral or a letter of either case.
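To illustrate that column layout, here is a short C sketch (again just my own example) that prints the two deciding bits - bits 6 and 5 - for each character of a sample string, and does case conversion by flipping bit 5 ($20). Note the digit/punctuation column still needs a range check, since punctuation shares it:

```c
#include <stdio.h>

/* Bits 6 and 5 of a seven-bit ASCII code select its "column":
 * 00 = control codes, 01 = digits and punctuation,
 * 10 = uppercase (and some punctuation), 11 = lowercase.
 * Upper and lower case letters differ only in bit 5 ($20).   */
int main(void)
{
    for (const char *p = "Hello, 6502!"; *p; p++) {
        unsigned c   = (unsigned char)*p;
        unsigned col = (c >> 5) & 3;          /* the two deciding bits */

        if (c >= '0' && c <= '9')
            printf("'%c'  col=%u  numeral\n", c, col);
        else if (c >= 'A' && c <= 'Z')
            printf("'%c'  col=%u  upper -> '%c'\n", c, col, c | 0x20);
        else if (c >= 'a' && c <= 'z')
            printf("'%c'  col=%u  lower -> '%c'\n", c, col, c & ~0x20u);
        else
            printf("'%c'  col=%u  other\n", c, col);
    }
    return 0;
}
```

Case conversion being a single bit flip is exactly the property that made software on small machines so compact.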