S. Berliner, III's sbiii.com Infinity Printer Page keywords = Infinity Printer Gamow wheel language character code

Updated:   30 Dec 2019; 15:15  ET
    [original AT&T Worldnet Website begun 30 May 1996.]

URL:  http://sbiii.com/infprntr.html

S. Berliner, III
Consultant in Ultrasonic Processing
"changing materials with high-intensity sound"

[consultation is on a fee basis]

Technical and Historical Writer, Oral Historian
Popularizer of Science and Technology
Rail, Auto, Air, Ordnance, and Model Enthusiast
Light-weight Linguist, Lay Minister, and Putative Philosopher



note - The vast bulk of my massive Web presence (over 485 pages) had been hosted by AT&T's WorldNet service since 30 May 1996; they dropped WorldNet effective 31 Mar 2010 and I have been scrambling to transfer everything.  Everything's saved but all the links have to be changed, mostly by hand.  See my sbiii.com Transfer Page for any updates on this tedious process.


S. Berliner, III's

sbiii.com

Infinity Printer Page

PAGE INDEX:

On this Infinity Printer page:
  George Gamow's One Two Three Infinity.
  Going Gamow One Better
  W3C Unicode Charactersets

On the Language page:
  ENGLISH FIRST* - English as the official (and ONLY) language of the United States.
    (moved to its own page 08 Apr 03)
  FRA (Federal Railroad Administration) Terminology.
  LIMERICKS.
  TONGUE TWISTERS.
  GERMAN GEMS.
  LONG WORDS.
  OXYMORONS.
  ARABIC NUMERALS.
  U. S. ASCII CHARACTERS.
  AMERICAN ENGLISH KEYBOARDS.
  LINGUISTIC MISCELLANY
    (including Palindromes).

On the Language Continuation Page 1 page:
  Weird Usages.

On the DENGLISH page:
    DENGLISH - Neutered or Degenderized English
    Original Proposal of 11 May 1990.


INFINITY PRINTER

This page is concerned with an imaginary printer, as envisioned in 1947 by scientist and science writer George Gamow, and with languages that can be printed, only; please visit my LANGUAGE page, et seq., my DENGLISH - Neutered or Degenderized English page, and my CULTURE page (so-called) for literature and such and my fun page for humor (again, so-called).

- - - * - - -

George Gamow's One Two Three ... Infinity

In my later childhood, my mother, a blazing Hungarian-born intellect, introduced me to the (then-)new 1947 book, ONE TWO THREE . . . INFINITY, Facts and Speculations of Science, by George Gamow (first published by The Viking Press, Inc., N.Y.).  It was originally dedicated "TO MY SON IGOR WHO WANTS TO BE A COWBOY"*.

Gamow's book spanned almost all aspects of science but what has always stuck with me was his explanation and depiction of what I now term an "Infinity Printer", which is what this page is all about.

* - in later editions, that dedication was revised to read "TO MY SON IGOR WHO WANTED TO BE A COWBOY"
(Igor became a professor of microbiology and inventor).

Gamow delved into BIG numbers in PART I. PLAYING WITH NUMBERS.  After relating various amusing legends about counting and numeration, he stated that "probably the largest number ever mentioned in literature pertains to the famous 'Problem of a Printed Line'."  He proposed building a "printing press that would continuously print one line after another, automatically selecting for each line a different combination of the letters of the alphabet and other typographical signs".  He envisioned a machine which "would consist of a number of separate discs with the letters and signs all along the rim", with the discs "geared to one another in the same way as the number discs in the mileage indicator of your car, so that a full rotation of each disc would move the next one forward one place".  Next, he postulated what amounted to what would later be known as a "crash (or line) printer" but with those rotary discs instead of vertical type bars.

Here, then, is Gamow's fantastic illustration of the printer as he envisioned it:

GamowPrntr
(G. Gamow image © 1947 G. Gamow, One Two Three Infinity)

Gamow's title: "An automatic printing press that has just printed correctly a line from Shakespeare".

The chain drives the drums and the linkage below left slams the paper against the typewheels at every incremental movement of the primary typewheel; one is left to imagine the inking process.
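The geared-disc arrangement behaves like the odometer in a car, so it can be sketched in a few lines of Python. This is only an illustrative sketch, not anything from Gamow's book: the function name odometer_lines and the tiny two-letter "alphabet" are my own assumptions, chosen so that the whole output fits on a screen.

```python
def odometer_lines(symbols, wheels):
    """Yield every line an odometer-style wheel printer would print.

    symbols: the characters around the rim of each wheel (assumed identical
             on every wheel).
    wheels:  the number of wheels, i.e. the number of places in a line.
    """
    positions = [0] * wheels          # every wheel starts at its first symbol
    while True:
        # "Slam the paper against the typewheels": print the current line.
        yield "".join(symbols[p] for p in positions)

        # Advance the primary (rightmost) wheel one place; a full rotation
        # resets that wheel and carries to the next wheel on its left.
        i = wheels - 1
        while i >= 0:
            positions[i] += 1
            if positions[i] < len(symbols):
                break                 # no full rotation, so no carry
            positions[i] = 0
            i -= 1
        if i < 0:
            return                    # every wheel has made a full rotation


if __name__ == "__main__":
    for line in odometer_lines("ab", 3):
        print(line)
```

With Gamow's 50 symbols and 65 wheels the same loop would still be correct, but it would never finish in practice, which is the whole point of the count he goes on to make.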

Gamow further elucidated that "if we have only seven discs the number of necessary moves is:

1 + 2^1 + 2^2 + 2^3 + etc., or
2^7 - 1 = 2 x 2 x 2 x 2 x 2 x 2 x 2 - 1 = 127.

If you moved the discs rapidly without making any mistakes it would take you about an hour to complete the task. With 64 disks the total number of moves necessary is:

2^64 - 1 = 18,446,744,073,709,551,615

Let us set the machine in action and inspect the endless sequence of different printed lines that come from the press.  Most of the lines make no sense at all.  They look like this:

"aaaaaaaaaaa . . "

or

"boobooboobooboo . . ."

or again:

"zawkporpkossscilm . . "

But since the machine prints all possible combinations of letters and signs, we find among the senseless trash various sentences that have meaning.  There are, of course, a lot of useless sentences such as:

"horse has six legs and . . ."

or

"I like apples cooked in terpentin. . . "

But a search will reveal also every line written by Shakespeare, even those from the sheets that he himself threw into the wastepaper basket!

In fact such an automatic press would print everything that was ever written from the time people learned to write: every line of prose and poetry, every editorial and advertisement from newspapers, every ponderous volume of scientific treatises, every love letter, every note to a milkman. . . . @

[@ - in English, that is! - SBIII]

Moreover the machine would print everything that is to be printed in centuries to come.  On the paper coming from the rotating cylinder we should find the poetry of the thirtieth century, scientific discoveries of the future, speeches to be made in the 500th Congress of the United States, and accounts of intraplanetary traffic accidents of the year 2344.  There would be pages and pages of short stories and long novels, never yet written by human hand, and publishers having such machines in their basements would have only to select and edit good pieces from a lot of trash - which they are doing now anyway.

Why cannot this be done?

Well, let us count the number of lines that would be printed by the machine in order to present all possible combinations of letters and other typographical signs.

There are 26 letters in the English alphabet, ten figures (0, 1, 2 ... 9) and 14 common signs (blank space, period, comma, colon, semicolon, question mark, exclamation mark, dash, hyphen, quotation mark, apostrophe, brackets, parentheses, braces); altogether 50 symbols.  Let us also assume that the machine has 65 wheels corresponding to 65 places in an average printed line.  The printed line can begin with any of these signs so that we have here 50 possibilities.  For each of these 50 possibilities there are 50 possibilities for the second place in the line; that is, altogether 50 x 50 = 2500 possibilities.  But for each given combination of the first two letters we have the choice between 50 possible signs in the third place, and so forth. Altogether the number of possible arrangements in the entire line may be expressed as:

65 times
————^————
50x50x50x . . . x50
or 50^65

which is equal to 10^110.

To feel the immensity of that number assume that each atom in the universe represents a separate printing press, so that we have 3 x 10^74 machines working simultaneously. Assume further that all these machines have been working continuously since the creation of the universe, that is for the period of 3 billion years or 10^17 seconds, printing at the rate of atomic vibrations, that is, 10^15 lines per second. By now they would have printed about 3 x 10^74 x 10^17 x 10^15 = 3 x 10^106 lines - which is only about one thirtieth of 1 per cent of the total number required.

Yes, it would take a very long time indeed to make any kind of selection among all this automatically printed material!"
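Gamow's arithmetic is easy to check with Python's arbitrary-precision integers. The sketch below is only a check of the figures as quoted (26 letters, 10 figures, and 14 signs on 65 wheels; 3 x 10^74 machines running for 10^17 seconds at 10^15 lines per second); the variable names are my own.

```python
# Symbols on each wheel, as listed in the quoted passage.
symbol_count = 26 + 10 + 14            # letters + figures + common signs
wheels = 65                            # places in an average printed line

# Total number of distinct lines: 50 x 50 x ... x 50, 65 times over.
total_lines = symbol_count ** wheels
digits = len(str(total_lines))         # number of decimal digits in 50^65

# Lines printed by every atom-sized press since the creation of the universe.
machines = 3 * 10**74
seconds = 10**17
lines_per_second = 10**15
printed = machines * seconds * lines_per_second

# Fraction of the job done, against the rounded 10^110 and the exact 50^65.
fraction_rounded = printed / 10**110
fraction_exact = printed / total_lines

# The two smaller counts quoted earlier in the same passage.
seven_discs = 2**7 - 1
sixty_four_discs = 2**64 - 1

if __name__ == "__main__":
    print("symbols per wheel:", symbol_count)
    print("digits in 50^65:  ", digits)
    print("lines printed:     3 x 10^106?", printed == 3 * 10**106)
    print("fraction (rounded):", fraction_rounded)
    print("fraction (exact):  ", fraction_exact)
    print("2^7 - 1 =", seven_discs, " 2^64 - 1 =", f"{sixty_four_discs:,}")
```

Note that 50^65 is really about 2.7 x 10^110, so the 10^110 in the text is a rounding; measured against the exact figure, the presses would have completed roughly one hundredth of 1 per cent rather than one thirtieth.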

- - - * - - -

Going Gamow One Better

So, now you have seen what so impressed me at the tender age of 13 or 14.  Let me develop this theme far further than Gamow.  Personal computers weren't even a dream yet, let alone virtually every household having its own dot matrix, daisy wheel, or typeball (or even LED or ink jet or laser) printer!

101- or 103-key keyboards didn't exist yet; American typewriter keyboards were the standard QWERTY style:

qwerty.gif

and that is the character set with which Gamow worked.  Hmm - I count 92 keys, not just Gamow's 50!

Oh, gosh; no wonder there are too many keys; that's a computer keyboard!  Here's a fairly-typical American typewriter keyboard:

qwertype

I'm not so sure about that backslash, \, and there are still more than 50 characters, but you get the general idea.

By the way, we simply must have our old English long "ess" "ſ" so we can write: "in Congreſs aſsembled", except for one little detail; the phrase generally used was actually "in Congress aſſembled" (the long "ess" is not normally used at the end of a word).

However, let us simply step across our borders and we get in trouble right away.

To the south, we immediately encounter Spanish/español, with its upside-down exclamation point (¡) and question mark (¿), plus the five acute-accented vowels (á, é, í, ó, and ú), not to mention the tilde on the letter n (ñ) and, lastly, the diaeresis/umlaut (ü) used in the sequences "güe" and "güi"; and these don't even consider capital letters.

Oh, my; our wheel printer is getting complicated!

Now, let us journey northward, across the Canadian border, into la belle province de Québec.  You start to get the idea, eh?  On top of the accent aigu on the "e", we also have to deal with the other unique diacritical marks of français!  Eheu!  Then we must contend with the accent grave (`, accent grave), the circumflex (^, accent circonflexe), the diaeresis (¨, tréma), and the cedilla (¸, cédille).

French also occasionally uses the tilde diacritical mark (~) above n for words and names of Spanish origin that have been incorporated into the language (e.g., cañon, El Niño) but, happily, we already have that one.

The two ligatures Œ and œ have to be allocated their place.  What about Æ and æ, as in mæstro and pælla, hein?

Then there are French digraphs and trigraphs - NO! - I flatly refuse to go there!

I completely forgot to add my own mother's native tongue, Magyar (Hungarian), with its many "normal" accent marks PLUS its rather unique double-acute-accented vowels, most notably Ő and ő!

Somewhere deep in my cortical folds resides a clear recollection of having seen an inverted "V" in some orthography or other: "Λ".  I wonder from whence that may have sprung; please let me know if you happen to know.

Whooie!  Look how our print wheels keep growing!  Oops!  Canada?  That means we ought to consider the First Nation and Inuktitut languages like the Algonquian, Inuit, and Athabaskan language families, as well - and that then spills back into Alaska and Aleut in the U. S. and Native American scripts like Cherokee in the lower 48 and from that on to Hawai'i and Polynesian and Melanesian orthographies!

Well, our now-grossly-complex printer covers much of the world's literature - or does it?  German (Deutsch) is already covered by the "u" with an umlaut (diaeresis) from the Spanish usage noted above.  Well, not quite.  We need all the vowels so accented: Ä/ä, Ö/ö, Ü/ü, plus we need ß[1], the Eszett or scharfes (sharp) "S", and then there's that old long "s" (ſ).

[1 - By the way, that Eszett is actually just a ligature of the old long "s" (ſ) and a regular "Z" - ſZ!]

Then, there are all the Celtiberian scripts!

The world?  How about the Latin of the Romans, with, at a minimum, its "apex", similar to but not quite the same as an acute accent?  There is no apex in Unicode but an approximation can be had by typing ᷄, ᷄ COMBINING MACRON-ACUTE, as in the Middle Vietnamese "da᷄u sóng".  The apex or da᷄u sóng is definitely NOT a tilde NOR an acute accent.

Scandinavian orthographies, especially Icelandic, are just LOADED with interesting letters and accent marks, such as:

, , , , , , , , , , , and
and
, , , , , , , , , , , and

and these are just in the Latin alphabet.  We haven't even touched on runes or futhark; they can be a real þ (thorn) in your side!

Then there is Polish literature with its accent marks and Russian literature in Cyrillic, with its many non-Latin characters (not to mention Glagolitic, Bulgarian, Serbian, and Ukrainian), and related Greek, and so to classical Greek, and thus to biblical and modern Hebrew, and from that to Aramaic and Arabic.  Moving eastward, we come to Armenian, Persian, Brahmic [including Bengali, Dhivehi, Khmer, Thai, Lao, Sinhala, and Tibetan (with its Chinese and Dzongkha/Bhutan variants)], and more.

Heading even further east, we get into the many seemingly-infinite (20,000-plus) Kan/Chinese logograms, plus the Japanese Kana syllabaries, Katakana and Hiragana, and Korean Hangul (Dubeolsik and Sebeolsik), with all their variations.

Whoops!  We missed Egyptian hieroglyphics and other African writing systems, not to mention Vietnamese and the endless other writing systems of all humanity.

Did I just write "hieroglyphics"?  Another "whoops"!  While the Inca seem not to have had written records, the Maya used hieroglyphics and the Aztec/Nahuatl used a pre-Columbian writing system that combined ideographs with Nahuatl-specific phonetic logograms and syllabic signs.  Whoo!

Somehow, just somehow, I don't think we will get very far with an international, cross-cultural version of ye goode Prof. Gamow's mechanical printer.


W3C Unicode Charactersets

W3C, the World Wide Web Consortium, "created to provide access to a 'universe of documents'", has made an outstanding effort to document and support Unicode, the Unicode Consortium's standard that categorizes and encodes virtually all writing systems in use today, including some "extinct" systems used only by historians and other scholars.  Unicode "provides a large, single character set that aims to include all the characters needed for any writing system in the world, including ancient scripts (such as Cuneiform, Gothic, and Egyptian Hieroglyphs)".  "It is now fundamental to the architecture of the Web and operating systems, and is supported by all major web browsers and applications".  "The first 65,536* code point positions in the Unicode character set are said to constitute the Basic Multilingual Plane (BMP)."  "The BMP includes most of the more commonly used characters."

[* - "The number 65,536 is 2 to the power of 16 {2^16}; in other words, the maximum number of bit permutations you can get in two bytes."]

    (Quoted material is taken from the W3C website.)

So, through the Unicode system, your inexpensive little home printer now has the capability to be that "Infinity Printer" of George Gamow's fertile imagination!
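To make that concrete, here is a small Python sketch that looks up a few of the characters that gave our wheel printer so much trouble above. It relies only on the standard unicodedata module; the sample list and the helper name describe are my own choices for illustration, not anything taken from the W3C material.

```python
import unicodedata

# A handful of the troublesome characters mentioned on this page.
SAMPLES = ["ñ", "¿", "é", "œ", "ő", "ß", "ſ", "þ"]

# The Basic Multilingual Plane holds the first 2^16 code points.
BMP_SIZE = 2 ** 16


def describe(ch):
    """Return (code point label, official Unicode name, in-BMP flag) for ch."""
    code_point = ord(ch)
    label = f"U+{code_point:04X}"
    name = unicodedata.name(ch, "<unnamed>")
    return label, name, code_point < BMP_SIZE


if __name__ == "__main__":
    for ch in SAMPLES:
        label, name, in_bmp = describe(ch)
        print(f"{ch}  {label}  {name}  (BMP: {in_bmp})")
```

Every one of these samples sits comfortably inside the Basic Multilingual Plane, so each needs no more than one 16-bit code point.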

HTML5 UTF-8 Character Codes

Below is an edited list of some of the UTF-8 character codes supported by HTML5:
  C0 Controls and Basic Latin > 127
  C1 Controls and Latin-1 Supplement > 127
  Latin Extended-A > 127
  Latin Extended-B > 207
  Spacing Modifiers > 79
  Diacritical Marks > 111
  Greek and Coptic > 143
  Cyrillic Basic > 255
  Cyrillic Supplement > 353
  Cyrillic Extended-A > 31
  Cyrillic Extended-B > 95

That's over 20,000 characters (23,939 by my count)!

I found a tabulation of all 23,939 characters.

That number includes these:
  General Punctuation > 111
  Currency Symbols > 47
  Letterlike Symbols > 79
  Arrows > 111

and these non-language characters:
  Mathematical Operators > 255
  Box Drawings > 127
  Block Elements > 31
  Geometric Shapes > 95
  Miscellaneous Symbols > 255
  Dingbats > 191

Then there are the Kan (regular, simplified, and traditional) and Kana and Korean characters:
  Chinese characters > {~25,000}
  Japanese characters > 10,909 {???}
  Korean characters > 10,202
plus Mongolian, a true alphabet having its own codeblocks.

Chinese characters are graphemes, representations of a syllable.  There are more than 85,000 Chinese characters, but only 3,000 of them are essential.  Unicode has code points for roughly 25,000 CJK Chinese-Japanese-Korean characters, and they can be used in documents and web pages coded in charset UTF-8.

UTF-8 is a variable width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes.  The name is derived from Unicode Transformation Format - 8-bit. (from Wikipedia)
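The "variable width" part is easy to see for yourself. The following Python sketch encodes a few characters and counts the bytes, then rederives the 1,112,064 figure as the full code space less the surrogate range, which is not valid on its own. The sample characters and the names utf8_length and VALID_CODE_POINTS are my own assumptions for illustration.

```python
def utf8_length(ch):
    """Number of 8-bit bytes UTF-8 needs to encode the single character ch."""
    return len(ch.encode("utf-8"))


# One sample for each possible width (plus the long "s" for old times' sake).
SAMPLES = ["A", "ñ", "ſ", "中", "😀"]

# The Unicode code space runs from U+0000 through U+10FFFF ...
ALL_CODE_POINTS = 0x110000
# ... but U+D800 through U+DFFF are reserved as surrogates.
SURROGATES = 0xE000 - 0xD800
VALID_CODE_POINTS = ALL_CODE_POINTS - SURROGATES

if __name__ == "__main__":
    for ch in SAMPLES:
        print(f"{ch}  U+{ord(ch):04X}  {utf8_length(ch)} byte(s)")
    print(f"valid code points: {VALID_CODE_POINTS:,}")
```

Plain ASCII letters still take a single byte, which is why old U. S. ASCII text is already valid UTF-8.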


Now, just in case you are interested, the original U. S. ASCII Characterset is on my Character Set Page, as well as many later sets.



LEGACY

  What happens to all this when I DIE or (heaven forfend!) lose interest?  See LEGACY.

COPYRIGHT NOTICE

See Copyright Notice on primary home page.




THUMBS UP!  -  Support your local police, fire, and emergency personnel!


Contact S. Berliner, III

(Junk and unsigned e-mail and blind telephone messages will NOT be answered)


© Copyright S. Berliner, III - 2019  - all rights reserved.

