The security concerns raised in this book apply equally to both Java users and Java developers. Using Java is as easy as surfing the Web. The simple use of Netscape Navigator, Internet Explorer, or any other Java-enabled browser to run Java applets is a risky activity. In order to really understand these risks, it is important to gain a deeper understanding of how Java really works. Here is a short but thorough introduction to the Java language.
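The Java development environment comprises three major components: the Java programming language itself, the Java Virtual Machine (VM) that runs compiled Java byte code, and a set of predefined classes that implement basic functionality.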
Because Java byte code runs on the Java Virtual Machine, it is possible to run Java code on any platform to which the JVM has been ported. Some Web browsers, such as Netscape and Internet Explorer, include an encapsulated version of the JVM. Using their built-in VMs, such Java-ready browsers can automatically download and execute Java applets when a user accesses an HTML Web page including the <APPLET> tag.
One of the first public introductions to Java came in the form of a whitepaper released by Sun (and since updated many times) [Sun Microsystems, 1995]. An especially pithy sentence from that document attempts to describe the fundamental aspects of Java all at once. It reads:
Java: A simple, object-oriented, distributed, interpreted, robust, secure, architecture neutral, portable, high-performance, multi-threaded, and dynamic language.
Quite a collection of buzzwords. In fact, some people joke that Java is "buzzword compliant." This book is concerned mostly with the security claim, of course, but in order to understand the implications of Java for computer security, you need to grasp the other important characteristics of the language first.
As the quote claims, Java has many interesting features. They will be briefly introduced here. Pointers to more information on Java can be found on page 31. The Java language is:
Object-oriented: Unlike C++, which is an objectivized version of C, Java is intrinsically object-oriented. This changes the focus of programming from the old procedural way of thinking (as in C and Pascal) to a new data-centric model. In this new model, data objects possess associated methods. Methods perform actions on data objects. Every Java program is composed of one or more classes. Classes are collections of data objects and the methods that manipulate these data objects. Each class is one kind of object. Classes are arranged in a hierarchy such that a subclass inherits behavior and structure from its superclass. Object-oriented languages were designed using the physical world as a metaphor. Classes communicate with each other in much the same way that real physical objects in the world interact.
Java has other important characteristics adapted from modern programming languages such as Scheme (a popular dialect of Lisp) and ML. In particular, Java uses:
Garbage collection: Memory management is usually handled in one of two ways. The old-fashioned approach is to have a program allocate and deallocate memory itself. This approach allows all sorts of insidious errors and hard-to-squash bugs. C, for instance, uses this method. By contrast, Lisp introduced the modern concept of garbage collection in 1959! Garbage collection requires the system (rather than the programmer) to keep track of memory usage, providing a way to reference objects. When items are no longer needed, the memory where they live is automatically freed so it is available for other uses. Java provides a garbage collector that uses a low-priority thread to run silently in the background. Java's memory management approach has important implications for the security model since it prevents problems associated with dangling pointers.
Though it has more than doubled in size since its original introduction, Java is still a relatively simple language. This is especially apparent when Java is compared with C and C++ [Daconta, 1996]. In C, there are often many possible ways in which to do the same thing. Java tries to provide only one language mechanism with which to perform a particular task. Also, Java provides no macro support. Although some programmers like using macros, macros often end up making programs much harder to read and debug.
The designers of Java made their language simple by removing a number of features that can be found in C and C++. Things that were removed include the goto statement, the use of header files, the struct and union constructs, operator overloading, and multiple inheritance. Together with the elimination of pointers, removal of these aspects of C and C++ makes Java easier to use. This should result in more reliable code.
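The object-oriented model described above, in which classes bundle data with methods and each subclass inherits from a single superclass, can be illustrated with a minimal sketch. The class and interface names below (Describable, Shape, Rectangle, ShapeDemo) are hypothetical and chosen only for illustration.

```java
// Hypothetical types for illustration; none of these names come from the text.
interface Describable {
    String describe();
}

class Shape implements Describable {
    private final String name;   // data held by each object

    Shape(String name) {
        this.name = name;
    }

    // Default behavior that subclasses are expected to override.
    double area() {
        return 0.0;
    }

    @Override
    public String describe() {
        return name + " with area " + area();
    }
}

// A class has exactly one superclass (no multiple inheritance of classes),
// although it may implement any number of interfaces.
class Rectangle extends Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        super("rectangle");
        this.width = width;
        this.height = height;
    }

    @Override
    double area() {
        return width * height;
    }
}

class ShapeDemo {
    public static void main(String[] args) {
        Shape s = new Rectangle(2.0, 3.0);
        // describe() is inherited from Shape, but it calls Rectangle's area().
        System.out.println(s.describe());   // prints "rectangle with area 6.0"
    }
}
```

Rectangle inherits describe() unchanged from Shape and overrides only area(), which is the sense in which a subclass inherits behavior and structure from its superclass.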
We will revisit the impact that Java's features as a language have on security in Chapter 2.
The second major component of the Java development environment is the Java Virtual Machine. The VM makes Java's cross-platform capabilities possible. In order to run Java byte code on a new platform, all that is required is a working VM. Once the VM has been ported to a platform, all Java byte code should run properly.
Making a byte code/VM pair that works well on many varied platforms involves setting a few things in stone. Java has variables that are of fixed size and of fixed format. An integer in Java is always 32 bits, no matter what the word size of the machine running Java. Making data formats machine independent and compiler independent is crucial to making Java truly portable. The very different way in which variables are managed on different C platforms causes no end of portability problems for C programmers.
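The fixed-size guarantee can be checked directly from a program. The following is a small sketch (the class name FixedWidthDemo is ours) that relies only on constants defined in java.lang.

```java
class FixedWidthDemo {
    // Adding 1 to the largest int wraps around to the smallest int,
    // because int arithmetic is defined as 32-bit two's complement.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.SIZE);        // number of bits in an int: 32
        System.out.println(Long.SIZE);           // number of bits in a long: 64
        System.out.println(Integer.MAX_VALUE);   // 2147483647
        System.out.println(wrapAround() == Integer.MIN_VALUE);   // true
    }
}
```

The same values are produced on any platform with a conforming VM, whatever the word size of the underlying machine.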
The VM also makes use of symbolic data stored inside of Java byte code files. Java byte code contains enough symbolic information to allow some analysis of the byte code before it is run. This is one way the Java environment ensures that Java's language rules have been followed by the compiler, something critical to security. The rules checked include type safety rules, such as ensuring that anything claiming to be of a certain type actually is of that type. Since the Java byte code Verifier is a critical part of the security model, it is discussed in detail in Chapter 2.
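One way to observe that compiled classes retain symbolic information is through reflection, which reads names and types back out of a loaded class. This sketch is not the Verifier and does not show how verification works; it only illustrates that field and method names and their declared types survive compilation. The Account class and the helper names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical class used only for this sketch.
class Account {
    int balance;

    void deposit(int amount) {
        balance += amount;
    }
}

class SymbolicInfoDemo {
    // Look up a field of Account by name and report its declared type.
    static String fieldType(String fieldName) {
        try {
            Field f = Account.class.getDeclaredField(fieldName);
            return f.getType().getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Look up a one-int-argument method of Account by name and report its return type.
    static String returnType(String methodName) {
        try {
            Method m = Account.class.getDeclaredMethod(methodName, int.class);
            return m.getReturnType().getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("balance has type " + fieldType("balance"));   // int
        System.out.println("deposit returns " + returnType("deposit"));   // void
    }
}
```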
Using a Virtual Machine has obvious important repercussions for the Java approach. The VM makes portability possible, and it helps to ensure some of Java's security features. Since Java is often implemented using an interpreter, speed can be an issue. Interpreted languages are inherently slow because each command must be translated to native machine code before it can be run. With a compiler, this work is all done ahead of time, when an executable is created for some particular platform. Without just-in-time (JIT) and hotspot compilers, Java's interpreted code is about 20 times slower than native C code. When this new technology is used, Java speeds begin to approach native C.
The third part of the Java development environment is a set of predefined classes that implement basic functionality. The "personal" version of the JDK includes, for example, an Abstract Windowing Toolkit (AWT). These classes provide a set of graphical user interface (GUI) tools for creating windows, dialogue boxes, scrollbars, buttons, and so forth. Java also includes classes for full network support that provide application program interfaces (APIs) for sockets, streams, URLs, and datagrams. A POSIX-like I/O system with APIs for files, streams, and pipes makes the environment comfortable for experienced Unix programmers. Classes are grouped together into packages according to their functionality. Table 1.1 lists the packages included in the Java Developers' Kit (JDK) version 1.1. Note that Java's core classes have grown significantly in the last few years.
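As a brief illustration of the predefined classes, the sketch below uses java.net to take apart a URL and java.io to read from a stream. It assumes a modern JDK rather than JDK 1.1, and the class name, method names, and example address are ours. No network connection is made.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class CoreClassesDemo {
    // Parse an address with java.net and return its host component.
    static String hostOf(String address) {
        try {
            URL url = URI.create(address).toURL();
            return url.getHost();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Read the first line of text from any InputStream using java.io.
    static String firstLine(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOf("http://www.example.com:8080/index.html"));
        InputStream in =
            new ByteArrayInputStream("hello\nworld".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(in));
    }
}
```

The same stream-reading method works whether the bytes come from memory, a file, or a socket, which is part of what makes the stream classes useful as primitives.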
The predefined Java classes provide enough functionality to write full-fledged programs in Java. Using the predefined classes as primitives, it is possible to construct higher-level classes and packages. Many such homegrown packages are available both commercially and for free on the Net.
In the early days of Java's popularity, most Java programs took the form of applets, small programs that were attached to Web pages and loaded and run in Web browsers. As Java developed, people began to write substantial applications in Java, using it simply as an improved version of traditional languages such as C.
Java has always been good for more than writing applets, and the world is now catching on to that fact. Java is really a good platform for any application that needs to be extended or customized, perhaps across the network, after it is deployed. A browser is only one example of such an application.
Another increasingly popular use of Java is in Web servers. Many servers have extension mechanisms, but the Java Servlet API provides a particularly flexible and compelling vehicle for extending a server with new application-specific or site-specific functions. Most major Web servers now support the Java Servlet API. Compared to browsers, servers present more difficult security challenges, since servers have more stringent reliability requirements and store more valuable data.
Java's features also make it a good platform for creating new server-type applications. With natural support for multithreading, database access, and networking, Java gives developers a natural leg up in designing such applications. For these reasons, Java is being used increasingly in enterprise computing.
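The language-level support for multithreading can be sketched with a shared counter updated by several threads. The class names and the thread and iteration counts below are arbitrary choices for illustration.

```java
class Counter {
    private int count;

    // synchronized ensures that only one thread at a time updates the count.
    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }
}

class ThreadDemo {
    // Start the given number of threads, have each increment a shared counter,
    // wait for all of them to finish, and return the final count.
    static int countWithThreads(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000));   // prints 4000
    }
}
```

Without the synchronized keyword on increment(), concurrent updates could be lost and the final count could come out lower than expected.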
One common structure for such systems uses a "three-tier" architecture. A traditional database server acts as the "back end" tier, storing and managing the data needed to support a business application. The middle tier is a Java-enabled specialized server that interacts with the database and implements the "business logic" needed to manage client interactions with the system. The "front end" tier is a Java applet that runs in the client's Web browser and provides a convenient user interface so that users can interact naturally with the system. Three-tier systems put together several uses of Java and, as a result, face a wide array of security issues.
In addition to all of these applications in traditional computers, Java is being deployed in embedded devices such as smart cards, key rings, and pagers. Embedded applications are often involved in electronic commerce systems, adding yet another series of twists to our security story.
The growing variety of applications is reflected in the subject matter of this book. While the first edition focused almost exclusively on applet security issues, this edition encompasses the full breadth of today's Java applications. We want to provide you with the information you need to know to maintain security while building, deploying, managing, and using up-to-date, Java-based systems. As Java has gotten down to business, so has this book.
Java is much more than simply a language for creating applets. In the early days of Java (less than a handful of years ago), it was important to distinguish applet code (which was typically treated as untrusted and relegated to the sandbox) and application code (which was typically treated as fully trusted built-in code). This distinction is no longer a useful one.
An alternative way to carve up the Java program space is to think about code in terms of levels of trust. Programs that are more trusted can be allowed to carry out potentially dangerous acts (like writing files). Programs that are less trusted will have their powers and permissions curtailed.
If we think about Java programs this way, it is still possible to make sense of the old distinction between applets and applications. Java applets are usually, though not necessarily, small programs meant to be run in the context of a Web browser. Obviously, applets involve the most client-side (or user) security concerns of any Java programs. In fact, Java's security policies originally existed in order to make applets feasible. The Java runtime enforces severe limitations on the things that applet classes may do [McGraw and Felten, 1996]. See www.javasoft.com/sfaq and Chapter 2 for details. In terms of the new trust-based distinction, applets are clearly treated as untrusted. This makes sense, since the origin of an applet is often unknown or unfamiliar.
In the early days of Java, Java applications had no such restrictions. In terms of our trust distinction, applications in Java before Java 2 were treated as completely trusted code. That meant applications could use the complete power of Java, including potentially dangerous functionality.
The reason the old distinction between applets and applications no longer makes sense is that today, applets can be fully trusted and applications can be completely untrusted. (Note the use of the word can in the previous sentence; we don't mean to say that applets are always trusted or that applications are never trusted.) In fact, depending on the situation, each and every Java program can be trusted, partially trusted, or untrusted. Sound complicated? That's because it is.
With the introduction of Java 2, Java includes the ability to create and manage security policies that treat programs according to their trust level. The mechanisms making up the base sandbox are still under there somewhere, but they serve merely as a default situation to handle code that warrants no trust. The interesting thing is that code that is partially trusted can be placed in a specially constructed custom sandbox. That means a partially trusted applet can be allowed to, say, read and write a particular file or make a network connection to a particular server. This is good news for Java developers who were chafing under the constraints of the restrictive original sandbox.
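The idea of a custom sandbox can be sketched with the permission classes that arrived with Java 2. The example below builds a set of granted permissions and asks whether a requested action falls inside it. It shows only the permission-matching logic, not how the runtime installs or enforces a policy, and the file paths, class name, and method names are assumptions made for illustration.

```java
import java.io.FilePermission;
import java.security.Permissions;

class CustomSandboxDemo {
    // A hypothetical custom sandbox: partially trusted code may read and
    // write one particular file and nothing else.
    static Permissions partiallyTrusted() {
        Permissions granted = new Permissions();
        granted.add(new FilePermission("/tmp/scratch.txt", "read,write"));
        return granted;
    }

    // Ask whether the granted set covers the requested file access.
    static boolean allowed(Permissions granted, String path, String actions) {
        return granted.implies(new FilePermission(path, actions));
    }

    public static void main(String[] args) {
        Permissions sandbox = partiallyTrusted();
        System.out.println(allowed(sandbox, "/tmp/scratch.txt", "read"));     // true
        System.out.println(allowed(sandbox, "/tmp/scratch.txt", "delete"));   // false
        System.out.println(allowed(sandbox, "/etc/passwd", "read"));          // false
        // Code that warrants no trust gets an empty set, so everything is refused.
        System.out.println(allowed(new Permissions(), "/tmp/scratch.txt", "read"));   // false
    }
}
```

A request is allowed only when some granted permission implies it, so widening or narrowing a program's powers amounts to changing the contents of the granted set.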
Figure 1.6 illustrates the way in which the old applet/application distinction can be recast in terms of black-and-white trust. It also shows the impact that Java 2 has on the black-and-white trust model, transforming it into a shades-of-gray trust model.
Currently, a large and growing number of Java systems are running the gamut from Java gizmos (including Java rings), through smart cards with built-in Java interpreters, to complete Java Development Kits and IDEs. As with any platform meant to interact in a networked world, there are security concerns with each flavor of Java. This book discusses security risks that apply to all flavors of Java, but will focus on Java 2 and Card Java 2.0.
Counterintuitively, Java is both growing and shrinking at the same time. The JDK, now up to Java 2, is doubling in size with each major release. At the same time, embedded Java systems like Card Java 2.0 are stripping Java functionality down to bare bones. Both of these moves have important security implications. Java 2 involves fundamental changes to the Java security model as the Java sandbox is metamorphosing into a trust-based system built on code signing. Card Java 2.0 removes much of the sandbox, leaving smart card applets more room to misbehave.
All of Java's built-in security functionality, including the recently added authentication and encryption features (which began to appear with JDK 1.1), is available to Java application developers. This functionality makes it possible for an application to establish its own security policy. In fact, Java-enabled browsers do just that, determining the security policy by which all applets that run inside them must abide. For obvious reasons, an applet is not allowed to change the browser's (or for that matter, any application's) security model!
Copyright ©1999 Gary McGraw and Edward Felten.