Tuesday, October 1, 2013

Why does Java use Unicode?

I like this description, so I am quoting it here. It answers the question of why Java uses Unicode.

The Java programming language uses the Unicode character set for managing text.

A character set is simply an ordered list of characters, each corresponding to a particular numeric value.

Unicode is an international character set that contains letters, symbols, and ideograms for languages all over the world. Each character is represented as a 16-bit unsigned numeric value. Unicode, therefore, can support over 65,000 unique characters.

Only about half of those values have characters assigned to them at this point. The Unicode character set continues to be refined as characters from various languages are included.

Many programming languages still use the ASCII character set. ASCII stands for the American Standard Code for Information Interchange.

The 8-bit extended ASCII set is quite small, so the developers of Java opted to use Unicode in order to support international users.

However, ASCII is essentially a subset of Unicode, including corresponding numeric values, so programmers used to ASCII should have no problems with Unicode.

John Lewis. Java Software Solutions: Foundations of Program Design.
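
To check the main points of the quote for myself, here is a minimal sketch in Java. The class name UnicodeBasics and its three helper methods are my own illustration, not something from the book. It only looks at the 16-bit char type that the quote describes; newer versions of Unicode also define characters beyond that range, which Java stores as a pair of char values, and this sketch does not cover them.

```java
import java.nio.charset.StandardCharsets;

public class UnicodeBasics {

    // Returns the numeric value that Java stores for a character.
    static int codeOf(char c) {
        return c; // a char widens to int without a cast
    }

    // Counts how many distinct values a 16-bit char can hold.
    static int distinctCharValues() {
        return Character.MAX_VALUE - Character.MIN_VALUE + 1;
    }

    // Checks whether a character has the same numeric value in ASCII
    // as it has in Java's char type. Characters outside ASCII are
    // encoded as '?' by US_ASCII, so the comparison fails for them.
    static boolean sameInAscii(char c) {
        byte[] ascii = String.valueOf(c).getBytes(StandardCharsets.US_ASCII);
        return ascii.length == 1 && ascii[0] == c;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('A'));            // 65
        System.out.println(Character.SIZE);         // 16
        System.out.println(distinctCharValues());   // 65536
        System.out.println(sameInAscii('A'));       // true
        System.out.println(sameInAscii('\u00e9'));  // false (e with acute accent)
    }
}
```

The letter 'A' has the value 65 in both ASCII and Unicode, which is what the quote means when it says ASCII is essentially a subset of Unicode.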

It is the best computer programming book for beginners.

I recommend this book to Java beginners because it is written in simple language. When I took the course "Introduction to Computer Programming" at university, we used it as a textbook. I think it was a good choice. I read the entire book, and it did not take much time.

But! If you are not a beginner, you definitely don't want to read this book, because it was written for students, so it covers only the basics. And those basics are not explained in much depth.