Saturday, April 23, 2011

Using Guice to Retrieve Parameterized Types at Runtime

At some point in every Java coder's life, they have tried to write a class that does something like the following:

Of course, E.class is not going to fly with Java, because the type parameter E is only checked at compile time and is erased before runtime.
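A minimal sketch of the kind of class meant here, with hypothetical names (`Container`, `getElementType`) that are assumptions rather than the original listing. The `E.class` line is left as a comment because it does not compile; the sketch falls back on the usual plain-Java workaround of passing a `Class<E>` token explicitly.

```java
// Hypothetical class illustrating the problem; names are assumptions.
class Container<E> {
    private final Class<E> elementType;

    Container(Class<E> elementType) {
        // What we would like to write, but which the compiler rejects:
        //   this.elementType = E.class;
        // Workaround: make the caller hand over the class token.
        this.elementType = elementType;
    }

    Class<E> getElementType() {
        return elementType;
    }
}
```

The cost of the workaround is that every caller has to repeat the type, as in `new Container<>(String.class)`.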

Let's look at a more real world example. Let's say I want to define a Function type which is parameterized by an Input (I) and Output (O) type:

This type of thing turns out to be very useful, as functions aren't first class citizens in Java. Once you have a class like this it's natural to want to do things such as filter a collection of Function objects based on either Input or Output type, automatically construct processing chains of Functions, etc. To do this we'd probably need to change our Function definition to look like the following, so we can retrieve the Input/Output types at runtime:

The problem we immediately see with this is that we can't implement those methods at this level, for the same reason as in my initial simple example. One option is to make these two methods abstract, but just by looking at an example Function implementation we see how sub-optimal this is:

The additional methods we've forced on the users of our abstract class are even worse if we think about creating anonymous inner Functions from this class, with all the extra code clutter. We also see that they return trivial values and are pretty redundant, given that these types are already specified in the extends clause (if only we could get that information).
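To make the clutter concrete, here is a rough sketch of the abstract-method variant. The method names (`apply`, `getInputType`, `getOutputType`) are assumptions, and `Double2String` is modeled on the implementation named in the output further down.

```java
// Sketch of the abstract-method variant; method names are assumptions.
abstract class Function<I, O> {
    abstract O apply(I input);

    abstract Class<I> getInputType();

    abstract Class<O> getOutputType();
}

class Double2String extends Function<Double, String> {
    @Override
    String apply(Double input) {
        return String.valueOf(input);
    }

    // Both methods below merely restate what the extends clause already says.
    @Override
    Class<Double> getInputType() {
        return Double.class;
    }

    @Override
    Class<String> getOutputType() {
        return String.class;
    }
}
```

Only `apply` carries any real logic; the other two overrides are the boilerplate every implementer would be forced to write.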

Thankfully Guice provides us a nice clean way around this problem, allowing us to move those methods back into the abstract Function class, where we wanted them all along, by injecting TypeLiteral objects. This makes our new Function class look like this:

Now if we use this class in the following way:

Then the printed output is, as expected:

Double2String: class java.lang.Double --> class java.lang.String
String2Object: class java.lang.String --> class java.lang.Object

The beauty of this is it is very simple to inject the types you want when you want them, and this can be applied anywhere you need the runtime type information the compiler erases.
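For readers without Guice at hand, a related plain-reflection technique gives a feel for where this information comes from: a class that extends `Function<Double, String>` records those type arguments in its class file, and they can be read back from the generic superclass. The sketch below is a dependency-free analog, not the Guice-based class described above. It assumes every concrete class directly extends `Function` with concrete type arguments, and the method names are hypothetical.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

abstract class Function<I, O> {
    private final Type inputType;
    private final Type outputType;

    protected Function() {
        // Assumption: the concrete class directly extends Function<X, Y>.
        // A raw or indirect subclass makes this cast fail at construction time.
        ParameterizedType superType =
                (ParameterizedType) getClass().getGenericSuperclass();
        Type[] typeArguments = superType.getActualTypeArguments();
        this.inputType = typeArguments[0];
        this.outputType = typeArguments[1];
    }

    abstract O apply(I input);

    Type getInputType() {
        return inputType;
    }

    Type getOutputType() {
        return outputType;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + ": " + inputType + " --> " + outputType;
    }
}

class Double2String extends Function<Double, String> {
    @Override
    String apply(Double input) {
        return String.valueOf(input);
    }
}

class String2Object extends Function<String, Object> {
    @Override
    Object apply(String input) {
        return input;
    }
}
```

With these definitions, `new Double2String().toString()` yields the same `Double2String: class java.lang.Double --> class java.lang.String` line shown above. The injected-TypeLiteral approach is less brittle, since it does not depend on the shape of the class hierarchy in this way.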

Disclaimer: The above code is intended to be illustrative and was not thoroughly tested.

Sunday, April 3, 2011

Immutable Instance Variables with Method Injection in Guice

In my previous post I showed one of the downsides to method injection in Guice, which is that you can't make the fields that are set via those methods immutable by marking them final. This is a shame because method injection can be very handy, and given it is also inherited by classes extending your class, it is useful in abstract classes up the hierarchy. One way around this is using an ImmutableHolder class. The class would look something like this and impose simple write-once, read-only-after-initialization semantics:
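A minimal sketch of such a holder, assuming null is never a legal value. The method names and the `UninitializedException` type are hypothetical stand-ins rather than the original listing.

```java
// Thrown when the holder is read before a value has been set (hypothetical name).
class UninitializedException extends RuntimeException {
    UninitializedException(String message) {
        super(message);
    }
}

// Write-once container: set() succeeds exactly once, get() fails until then.
final class ImmutableHolder<T> {
    private T value;

    synchronized void set(T newValue) {
        if (newValue == null) {
            throw new NullPointerException("newValue must not be null");
        }
        if (value != null) {
            throw new IllegalStateException("value has already been set");
        }
        value = newValue;
    }

    synchronized T get() {
        if (value == null) {
            throw new UninitializedException("value has not been set yet");
        }
        return value;
    }
}
```

Both methods are synchronized so that a value set on one thread is visible to readers on another; a single-threaded application could drop that.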

Our configurable class would be slightly more complicated, as the setter/getter for this data would have to get or set the underlying data from our Holder (very similar to how you might wrap some data with a SoftReference or WeakReference):

But its use is now transparent to the outside users of this class. Unfortunately, we still have the ability to create our ConfigurableClass without initializing it since we're using an optionally injected setter, though you don't have to do this. However, instead of getting a null value back and eventually throwing a NullPointerException when we try to use the Configuration object (which could be in a completely different context if we pass it along to other functions), we will get an UninitializedException as early as possible. This technique could very well be a viable tradeoff in the cases where we want to fail earlier and with a more obvious message, as well as offer write-once semantics while using method injection.
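The fail-early behavior can be sketched without Guice on the classpath. Everything below is an assumption for illustration: the holder is repeated in compact form so the block compiles on its own, `Configuration` is a placeholder type, and the Guice annotation appears only in a comment.

```java
import java.util.Objects;

class UninitializedException extends RuntimeException {
    UninitializedException(String message) {
        super(message);
    }
}

// Compact write-once holder, repeated here so the block is self-contained.
final class ImmutableHolder<T> {
    private T value;

    synchronized void set(T newValue) {
        if (value != null) {
            throw new IllegalStateException("already set");
        }
        value = Objects.requireNonNull(newValue);
    }

    synchronized T get() {
        if (value == null) {
            throw new UninitializedException("not set yet");
        }
        return value;
    }
}

// Placeholder for whatever the real Configuration carries.
class Configuration {
    final String name;

    Configuration(String name) {
        this.name = name;
    }
}

class ConfigurableClass {
    // The holder reference is final even though its content arrives later.
    private final ImmutableHolder<Configuration> config = new ImmutableHolder<>();

    // With Guice this setter would carry @Inject(optional = true).
    void setConfiguration(Configuration configuration) {
        config.set(configuration);
    }

    // Throws UninitializedException here instead of handing out null.
    Configuration getConfiguration() {
        return config.get();
    }
}
```

Callers see an ordinary getter and setter; the holder only shows up when the setter is called twice or the getter is called too soon.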

Disclaimer: The above code is intended to be illustrative and was not thoroughly tested.

Saturday, April 2, 2011

Configured Classes in Guice

When using dependency injection most of the time the dependencies of the classes you write fall into one of two categories:
(1) Single configuration, where there is only one class which will satisfy that dependency in the system at once. These are the classes in which we'd simply bind our Interface to our concrete implementation and forget about it. An example would be an Environment object - in different executions of our application we may want to bind to different Environments, but while it's running we'd only need one concrete implementation of that interface.
(2) Multiple configurations, where we wish to create an object with a bunch of different implementations or variations all living in the same execution of our application. An example of this would be a DataPoint, where we'd likely want to create DataPoints from a variety of sources or initializations. In this case the most straightforward way to handle this would be to simply create a factory, and use that factory to create instances of our DataPoints (likely using Assisted Injection), passing in parameters that change and letting Guice inject those that don't.

The interesting situation is when we'd like to support both of these use cases in a single object. I have run into this case when designing library APIs that make use of Google Guice, where the typical case may be single configuration but for generality it is necessary to support the multiple configuration case fairly easily as well. There are a number of ways Guice provides to let you achieve this goal, and I'm going to outline the major ones I have used, along with discussing some of their advantages and disadvantages.

Option 1: Optional Setters


The first approach is to use optional setter injection on our ConfigurableClass as follows:

The single configuration use case can then be achieved through a binding in a module either in configure(), as a Provider, or a Provider method. This allows the ConfigurableClass object to be fully initialized as we would like when we Inject it somewhere, as we see in our Driver below, without any additional work.

The multiple use case scenario starts by not binding the Configuration class in a module (at least not without an annotation), and then using the setter manually to create various configurations:

If you didn't need the Configuration itself to be injected, you could even create these dynamically wherever they were needed and have no bindings in the module related to the configurations.

The benefits to this approach are its relative simplicity from the design of the ConfigurableClass object, and if the typical use case is single configuration per project it works as expected after the binding is created for the Configuration. In addition, in the multi-configuration use case, having the setter to initialize the object means you are free to dynamically create as many configurations as you want on the fly relatively easily (either by creating them with new or by creating/initializing them from a Provider, the Injector, etc.).

The main drawback to this approach is that determining whether the object is initialized when it is injected can only be answered by inspecting the bindings of the injector creating it. Therefore, you have to be careful about the situations in which you do/don't have to manually call the setter, and make sure to document this behavior if this is a library provided to downstream users. This approach is not a fail-fast solution, because you may not realize the object wasn't properly initialized until much later in your application when you try to use it.
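A dependency-free sketch of the class shape Option 1 implies shows both the flexibility and the silent failure mode. The `Configuration` placeholder and the method names are assumptions, and the Guice annotation is mentioned only in a comment.

```java
// Placeholder for whatever the real Configuration carries.
class Configuration {
    final String name;

    Configuration(String name) {
        this.name = name;
    }
}

class ConfigurableClass {
    // Cannot be final: it is assigned after construction.
    private Configuration config;

    // With Guice this setter would carry @Inject(optional = true), so a
    // missing Configuration binding leaves the field null without complaint.
    void setConfiguration(Configuration config) {
        this.config = config;
    }

    Configuration getConfiguration() {
        return config;
    }
}
```

Calling `setConfiguration` by hand on several instances covers the multiple configuration case, while an instance nobody configured quietly returns null from `getConfiguration()` until some later caller dereferences it.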

Option 2: Private Modules


Another possible option is by using Guice's PrivateModule feature. In this case, we can simply use Constructor Injection in our ConfigurableClass as follows:

The single configuration use case now works the same as before, where we simply set the binding in a module and let the Injector inject the instance into our ConfigurableClass instance.

There are a few benefits of this approach over the previously laid out design. The first is that we can now make the config instance variable in ConfigurableClass immutable if we desire by setting it to final. This wasn't possible before since we were using a post-construction set method even in the single configuration case. Another thing to note about the single configuration case is that now if we forget to bind the Configuration, Guice will throw an Exception on object creation from the Injector, instead of leaving us (silently) with an uninitialized object as in the previous method, making this a better fail-fast solution.

However, we now see that the multiple configurations situation is more difficult since we can no longer directly create these objects and initialize them with our setter. This is where Guice's PrivateModule capability can come in handy, by letting you set a group of bindings that is specific to a certain, in this case annotated, binding.


This is actually significantly more powerful than the previous approach, in that you can specify an entire tree of bindings that are private to the classes which are exposed (hence the name PrivateModule). Therefore, you could also modify the bindings of the things injected into MyConfiguration and MyOtherConfiguration in Configure1 and Configure2, something that is not really possible in the optional setter solution above. The downside is we've upped the complexity of our modules now, and we're limited to those configurations which we can enumerate upfront for the most part (though we could make a PrivateModule with constructor arguments). For a more detailed example of using PrivateModules see this example.

Option 3: Factories


Another option for passing in parameters to objects created by the Injector is by creating a factory and using Assisted Injection. Our ConfigurableClass looks similar to Option 2, but with an Assisted annotation on the constructor argument. To complement this class we will also have a Factory interface for creating our objects:

We can have Guice build a factory that creates instances of our class and use our factory by doing the following:


However, we see that for the single configuration case we've now made things more complicated since the user can no longer just bind a Configuration and must Inject both a Configuration and a ConfigurableClassFactory into the places they wish to have a ConfigurableClass instance. We do still retain the benefits of Option 2 with respect to immutability and an inability to create an uninitialized instance, and we have the flexibility of Option 1 to create on the fly as many configurations as we want with our factory without enumerating them upfront.

One possible way around this problem for single configuration cases is to stack another factory on top of our Guice-generated one which will resolve the @Assisted parameters from the Injector:

This means that we can use this new extended factory in the same way as our previous one for the multiple configurations case, but we can also use it without passing in our Configuration object in the single configuration case:

The real downside here is having to create two factories when really we only wanted one. We could also forgo having Guice create the factory for us with the FactoryModuleBuilder and simply write one that has all the functionality we want. It may also be possible to generalize this boilerplate second factory for use in multiple places, using reflection to find and resolve the necessary instances: in the single configuration case we would assume that every object marked as Assisted should simply be resolved against the Injector as if it weren't. I haven't tried that yet myself, though.
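The hand-written alternative mentioned above might be sketched as follows, with no Guice dependency. All names are assumptions, and the `Supplier<Configuration>` stands in for whatever would resolve the default in a real application (a Guice Provider or an Injector lookup, for instance).

```java
import java.util.Objects;
import java.util.function.Supplier;

// Placeholder for whatever the real Configuration carries.
class Configuration {
    final String name;

    Configuration(String name) {
        this.name = name;
    }
}

class ConfigurableClass {
    // Constructor initialization lets the field be final, as in Option 2.
    private final Configuration config;

    ConfigurableClass(Configuration config) {
        this.config = Objects.requireNonNull(config, "config");
    }

    Configuration getConfiguration() {
        return config;
    }
}

// One hand-written factory covering both use cases.
class ConfigurableClassFactory {
    private final Supplier<Configuration> defaultConfiguration;

    ConfigurableClassFactory(Supplier<Configuration> defaultConfiguration) {
        this.defaultConfiguration = Objects.requireNonNull(defaultConfiguration);
    }

    // Multiple configuration case: the caller supplies the Configuration.
    ConfigurableClass create(Configuration config) {
        return new ConfigurableClass(config);
    }

    // Single configuration case: fall back to the default, resolved lazily.
    ConfigurableClass create() {
        return create(defaultConfiguration.get());
    }
}
```

A caller with one system-wide configuration uses `create()` and never touches a `Configuration`, while a caller with many passes each one to `create(Configuration)`.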

Option 4: Child Injectors


The last major option is the use of Child Injectors. This is fairly straightforward, but involves direct manipulation of the Injector:

It seems like this would be best suited at the very top levels of your application and not much deeper, as it requires creation of new Injectors and Injection of the current parent Injector (something I try to avoid doing unless absolutely necessary). A better alternative to this approach would be to use the PrivateModules approach seen earlier, as it achieves largely the same effect, except the end result is a single Injector rather than a group of them.

Conclusions


It is quite common to want to create objects with different configurations or injected types, and I've shown four different approaches to solving this in Guice. They each have their tradeoffs in terms of complexity, ease of use for end users of a library, extensibility, and their ability to throw exceptions early and fail fast. The situations where you want to use one versus another may depend on how likely it is that the user will want multiple configurations of the given class in question, or which tradeoffs seem the most acceptable. In addition, there are of course variations to all these approaches, and they can be tailored in various ways to the individual situations that may arise in a given project.

Disclaimer: The above code is intended to be illustrative and was not thoroughly tested.