Recall that when a Java program is compiled, it compiles down to platform-independent Java byte code. As Figure 2.3 shows, Java byte code is verified before it can run. This verification scheme is meant to ensure that the byte code, which may or may not have been created by a Java compiler, plays by the rules. After all, byte code could well have been created by a "hostile compiler'' that assembled byte code meant to violate the rules of the Java VM; created directly with an editor like emacs; created almost directly with a Java byte code assembler like Jasmin; or compiled from another source language like C++, Scheme, or Ada into Java byte code. The important thing is not what the source code looked like or even what language it was written in, but what the byte code (the code that actually ends up running) does. In this sense, the Verifier makes mystery code a bit less mysterious.
Verifying class files containing byte code is one way in which Java automatically checks untrusted code before it is allowed to run. Once Java code has been verified, it can execute in uninterrupted fashion on a VM (with much less need to make security-critical checks while the code runs). This strategy leads to improvements in the efficiency of Java execution, which offset the speed concerns raised by Java's security checking. The Verifier is built in to the VM and cannot be accessed by Java programmers or Java users. In most Java implementations, when Java code arrives at the VM and is formed into a Class by the Class Loader, the Verifier automatically examines it. The Verifier checks byte code at a number of different levels. The simplest test makes sure that the format of a code fragment is correct. On a less-basic level, a built-in theorem prover is applied to each code fragment. The theorem prover helps to make sure that byte code does not forge pointers, violate access restrictions, or access objects using incorrect type information. If the Verifier discovers a problem with a class file, it throws an exception, loading ceases, and the class file never executes. The verification process, in concert with the security features built into the language and checked at runtime, helps to establish a base set of security guarantees.3
The Verifier also ensures that class files that refer to each other preserve binary compatibility. Because of Java's ability to dynamically load classes, there is the possibility that a class file being dynamically linked may not be compatible with a referring class. Binary incompatibility problems could occur when a library of Java classes is updated or when a class from a large Java program is not recompiled during development. There are rules of compatibility that govern the ability to change use of classes and methods without breaking binary compatibility [Venners, 1998]. For example, it is okay to add a method to a class that is used by other classes, but not okay to delete methods from a class used by other classes. Compatibility rules are enforced by the Verifier.
Binary incompatibility also has security implications. If a malicious programmer can get your VM to accept a set of mutually incompatible classes, the hostile code will probably be able to break out of the sandbox. This problem happened several times with early VM implementations. (See, for example, You're Not My Type, in Chapter 5.)
Java users and business people investigating the use of Java in their commercial enterprises often complain about the length of time it takes for a Java applet to get started running in a browser. E-commerce system designers paint startup delay as a business-side show stopper, citing the fact that consumers do not react well even to a 20-second delay in their shopping experience. Many people believe falsely that the main delay in starting applets is the time it takes to download the applet itself. But given a reasonably fast connection to the Internet, what takes the longest is not downloading the code, but verifying it. The inherent costs of verification fit the classic tradeoff between functionality and security to a tee. As security researchers, we believe the security that byte code verification provides is well worth the slight delay. We also think it is possible to speed up the verification process so that its execution time is acceptable.
In order to work, the Verifier reconstructs type state information by looking through the byte code. The types of all parameters of all byte code instructions must be checked, since the byte code may have come from an untrustworthy source. Because of this possibility, the Verifier provides a first line of defense against external code that may try to break the VM. Only code that passes the Verifier's tests will be run.
The process of Verification in Java is defined to allow different implementations of the Java VM a fair amount of flexibility. The VM specification lists what must be checked and what exceptions and errors may result from failing a check, but it does not specify exactly when and how to verify types. Nevertheless, most Java implementations (especially the most widely used commercial VMs) take a similar approach to verification. The process is broken into two major steps: internal checks that check everything that can be checked by looking only at the class file itself and runtime checks that confirm the existence and compatibility of symbolically referenced classes, fields, and methods.
Through the two kinds of checks, the Verifier assures a number of important properties. Once byte code passes through verification, the following things are guaranteed:
When Java source code is compiled, the results of the compilation are put into Java class files, whose names typically end with the .class or .cls extension. Java class files are made up of streams of 8-bit bytes. Larger values requiring 16 or 32 bits are composed of multiple 8-bit bytes. Class files contain several pieces of information in a particular format. Included in a class file are:
The byte code inside a class file is made up of instructions that can be divided into several categories. Among other things, byte code instructions, called opcodes, implement:
Opcodes and their associated operands represent the fundamental operations of the VM. Every method invocation in Java gets its own stack frame to use as local storage for variable values and intermediate results. As discussed earlier, the intermediate storage area part of a frame is called the operand stack. Opcodes refer to data stored either on the operand stack or in the local variables of a method's frame. The VM uses these values as well as direct operand values from an instruction as execution data.
In practice, the internal checks performed by the Verifier occur very soon after the VM loads a class file. Loading all possible classes that might be called in a particular execution of a class is not the most efficient approach. Instead, Java loads each class only when it is actually needed at runtime.
In order to verify a class, the VM must load in the definition of any not-yet-loaded class that is referenced by the class being verified. More precisely, the class being verified refers to some other classes by name, and the VM needs to decide exactly which class each name refers to, so it can replace the reference-by-name with a reference to a specific class object. This process is known as dynamic linking, and has proven to be a persistent source of security problems.
As in the Verifier's internal checking, dynamic linking can throw an error should it fail. An invalid reference happens when a referenced class does not exist, or when a referenced class exists but does not contain some referenced field or method. Once an error of this sort is thrown, the class file doing the reference is no longer considered valid.
The main benefit that the Verifier provides is that it speeds up execution by removing much of the checking that would otherwise have to occur at runtime. For example, there is no runtime check for stack overflow since it has already been done by the Verifier.
The Verifier disallows many obvious approaches to byte code manipulation. Nonetheless, a number of researchers have succeeded in creating byte code that should be illegal, but nevertheless passes the Verifier. The Princeton Secure Internet Programming team was the first to sneak attacks involving illegal byte code past the Verifiers. Mark LaDue, creator of the Hostile Applet Home Page (www.rstcorp.com/hostile-applets), performs a number of interesting experiments in which he creates byte code that does not play by the rules and yet passes verification. Other Java security researchers, most notably the Kimera group at the University of Washington, have discovered problems in commercial Verifiers. Their work places special emphasis on correct verification. We have more to say about the Verifier and Java holes in later chapters.
The Verifier acts as the primary gatekeeper in the Java security model. It ensures that each piece of byte code downloaded from the outside plays by the rules.4 That way, the Java VM can safely execute byte code that may not have been created by a Java compiler. When the Verifier finds a problem in a class, it rejects the malformed class and throws an exception. This is obviously a much more reasonable behavior than running buggy or malicious code that crashes the VM.
In order for the Verifier to succeed in its role as gatekeeper, the Java runtime system must be correctly implemented. Bugs in the runtime system will make byte code verification useless. Most of the available Java implementations appear to be mostly correct (see Chapter 5), but behavior may vary from one implementation to another.
It would be nice for Sun or other interested parties to create comprehensive validation and verification test suites for the entire Java development environment and publish the results of testing the various Java implementations. Existing test suites have neither been completely developed nor verified by outside experts. From a security perspective, this certainly presents a problem. It is especially critical to verify any third-party Java environment to ensure that it properly implements the Java run time. Without a guarantee of bug-free run time, Java security falls to pieces. Java users should think carefully about the Java run time product that they use.
We should note, though, that one cannot construct a test suite that will find all security problems. Although testing technology is improving, it can never be perfect-not even in theory. All we can hope for is that security testing technology will find most of the bugs in new code before it is released.
Copyright ©1999 Gary McGraw and Edward Felten.