Sunday, August 28, 2011

Consider static factory methods instead of constructors

The normal way for a class to allow a client to obtain an instance of itself is to provide a public constructor. There is another technique that should be a part of every programmer’s toolkit. A class can provide a public static factory method, which is simply a static method that returns an instance of the class. Here’s a simple example from Boolean (the boxed primitive class for the primitive type boolean). This method translates a boolean primitive value into a Boolean object reference: 

public static Boolean valueOf(boolean b) { 
     return b ? Boolean.TRUE : Boolean.FALSE; 
}

Note that a static factory method is not the same as the Factory Method pattern from Design Patterns. The static factory method described in this item has no direct equivalent in Design Patterns. 

A class can provide its clients with static factory methods instead of, or in addition to, constructors. Providing a static factory method instead of a public constructor has both advantages and disadvantages. 

One advantage of static factory methods is that, unlike constructors, they have names. If the parameters to a constructor do not, in and of themselves, describe the object being returned, a static factory with a well-chosen name is easier to use and the resulting client code easier to read. For example, the constructor BigInteger(int, int, Random), which returns a BigInteger that is probably prime, would have been better expressed as a static factory method named BigInteger.probablePrime. (This method was eventually added in the 1.4 release.) 

A second advantage of static factory methods is that, unlike constructors, they are not required to create a new object each time they’re invoked. This allows immutable classes to use pre-constructed instances, or to cache instances as they’re constructed, and dispense them repeatedly to avoid creating unnecessary duplicate objects. The Boolean.valueOf(boolean) method illustrates this technique: it never creates an object. This technique is similar to the Flyweight pattern. It can greatly improve performance if equivalent objects are requested often, especially if they are expensive to create. 

The ability of static factory methods to return the same object from repeated invocations allows classes to maintain strict control over what instances exist at any time. Classes that do this are said to be instance-controlled. There are several reasons to write instance-controlled classes. Instance control allows a class to guarantee that it is a singleton or noninstantiable. Also, it allows an immutable class to make the guarantee that no two equal instances exist: a.equals(b) if and only if a==b. If a class makes this guarantee, then its clients can use the == operator instead of the equals(Object) method, which may result in improved performance. Enum types provide this guarantee. 

A third advantage of static factory methods is that, unlike constructors, they can return an object of any subtype of their return type. This gives you great flexibility in choosing the class of the returned object. 

One application of this flexibility is that an API can return objects without making their classes public. Hiding implementation classes in this fashion leads to a very compact API. This technique lends itself to interface-based frameworks, where interfaces provide natural return types for static factory methods. Interfaces can’t have static methods, so by convention, static factory methods for an interface named Type are put in a noninstantiable class named Types. 

Not only can the class of an object returned by a public static factory method be nonpublic, but the class can vary from invocation to invocation depending on the values of the parameters to the static factory. Any class that is a subtype of the declared return type is permissible. The class of the returned object can also vary from release to release for enhanced software maintainability and performance. 

The class java.util.EnumSet, introduced in release 1.5, has no public constructors, only static factories. They return one of two implementations, depending on the size of the underlying enum type: if it has sixty-four or fewer elements, as most enum types do, the static factories return a RegularEnumSet instance, which is backed by a single long; if the enum type has sixty-five or more elements, the factories return a JumboEnumSet instance, backed by a long array. 

A fourth advantage of static factory methods is that they reduce the verbosity of creating parameterized type instances. Unfortunately, you must specify the type parameters when you invoke the constructor of a parameterized class even if they’re obvious from context. This typically requires you to provide the type parameters twice in quick succession: 

Map<String, List<String>> m = new HashMap<String, List<String>>();

This redundant specification quickly becomes painful as the length and complexity of the type parameters increase. With static factories, however, the compiler can figure out the type parameters for you. This is known as type inference. For example, suppose that HashMap provided this static factory: 

public static HashMap<K, V> newInstance() { 
     return new HashMap<K, V>();
}


Then you could replace the wordy declaration above with this succinct alternative:

Map<String, List<String>> m = HashMap.newInstance();

Someday the language may perform this sort of type inference on constructor invocations as well as method invocations, but as of release 1.6, it does not. 

Unfortunately, the standard collection implementations such as HashMap do not have factory methods as of release 1.6, but you can put these methods in your own utility class. More importantly, you can provide such static factories in your own parameterized classes. 

The main disadvantage of providing only static factory methods is that classes without public or protected constructors cannot be subclassed. The same is true for nonpublic classes returned by public static factories. For example, it is impossible to subclass any of the convenience implementation classes in the Collections Framework. Arguably this can be a blessing in disguise, as it encourages programmers to use composition instead of inheritance. 

A second disadvantage of static factory methods is that they are not readily distinguishable from other static methods. They do not stand out in API documentation in the way that constructors do, so it can be difficult to figure out how to instantiate a class that provides static factory methods instead of constructors. The Javadoc tool may someday draw attention to static factory methods. In the meantime, you can reduce this disadvantage by drawing attention to static factories in class or interface comments, and by adhering to common naming conventions. Here are some common names for static factory methods: 

  • valueOf—Returns an instance that has, loosely speaking, the same value as its parameters. Such static factories are effectively type-conversion methods. 
  • of—A concise alternative to valueOf, popularized by EnumSet. 
  • getInstance—Returns an instance that is described by the parameters but cannot be said to have the same value. In the case of a singleton, getInstance takes no parameters and returns the sole instance. 
  • newInstance—Like getInstance, except that newInstance guarantees that each instance returned is distinct from all others. 
  • getType—Like getInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method. 
  • newType—Like newInstance, but used when the factory method is in a different class. Type indicates the type of object returned by the factory method. 
In summary, static factory methods and public constructors both have their uses, and it pays to understand their relative merits. Often static factories are preferable, so avoid the reflex to provide public constructors without first considering static factories.

No comments:

Post a Comment