The Dependency Finder Developer Guide

by Jean Tessier


Table of Contents


Introduction


Installation

Java Application

JAVA_HOME

DEPENDENCYFINDER_HOME

DEPENDENCYFINDER_OPTS

DEPENDENCYFINDER_CONSOLE

Web Application

web.xml


History Behind Dependency Finder

Class Files

Dependencies

Metrics

API Differences


Contributing to Dependency Finder

I spend a lot of my personal time on Dependency Finder and I have grown very attached to it. I work very hard to keep the code quality as high as I can, and in order to do that, I must retain strict control over what goes into Dependency Finder. I take great pride in the work I do on Dependency Finder and that is part of why all the package names start with "com.jeantessier".

If you have built some great addition or enhancement to Dependency Finder, there are two ways you can share it with the world and help Dependency Finder (and yourself too).

Separate Project

The best way to share your addition/enhancement with the world is for you to create your own open source project. SourceForge can help you with this and there are other alternatives. With your own project, you can take all the credit for your work and manage it the way you want to. You are free to redistribute Dependency Finder with your code. You don't have to make your project open source if you don't want to; you can charge for it or release it to the public domain. The Dependency Finder license is very lax: you can do pretty much what you want with it as long as my name remains on the Dependency Finder code.

By pointing to the Dependency Finder project from your project, you help it show up on web search engines like Yahoo! and Google. This raises awareness of Dependency Finder and helps it gain popularity. I will, of course, return the favor and mention your project on the Dependency Finder website. This way, we both help each other.

Assimilation

Another, less desirable, option is for you to surrender your code to me for inclusion in Dependency Finder. I will review your code thoroughly and I may modify it extensively to make sure that it meets my standards of quality (I'm not necessarily claiming that my standards are better, but they are mine) and that it fits well with the rest of Dependency Finder. In the end, your code will end up in a com.jeantessier package and will bear the standard license header. Your name and the nature of your contribution will be listed on the Dependency Finder website and in some of the documentation, but most likely nowhere in the code itself.

Money

At this time, I do not accept any monetary contributions or any other form of compensation. The only thing I get out of Dependency Finder is the joy of knowing that people are using my stuff and actually finding it useful. I don't even accept donations, as some people might construe that a donation entitles them to something. If you want to help the project, simply tell a friend, or two, or three ...


Class Files

You use the com.jeantessier.classreader package to parse compiled Java code. You feed .class files to a ClassfileLoader and you get instances of Classfile back.

ClassfileLoader

The ClassfileLoader classes you will use most often derive from ClassfileLoaderEventSource. These loaders are the ones that actually create instances of Classfile. They also keep track of LoadListener objects and drive the event model associated with parsing class files.

ClassfileLoader has only one public method for loading classes that takes a collection of names. This defines a session. Listeners will receive a BeginSession event at the beginning of the method and an EndSession event just before the method returns. Each name in the collection defines a group. This can be a JAR file, a Zip file, a directory hierarchy containing .class files, or even a single .class file. Each group starts with a BeginGroup and ends with an EndGroup event. Finally, the loader sends a BeginClassfile event before processing every class file, and an EndClassfile event afterwards.
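
To make the nesting of these events concrete, here is a minimal sketch that simulates one session. It is not Dependency Finder's code: LoadEventOrderSketch, its Listener interface, and the plain string event names are hypothetical stand-ins for the real loader, LoadListener, and event objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the loader's event model; not the real API.
final class LoadEventOrderSketch {
    /** Hypothetical listener: receives an event name and what it is about. */
    interface Listener {
        void onEvent(String eventName, String subject);
    }

    /**
     * Simulates one load session. Each map entry is a group name (JAR file,
     * Zip file, directory, or single .class file) with the class files in it.
     */
    static void load(Map<String, List<String>> groups, Listener listener) {
        listener.onEvent("BeginSession", "");
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            listener.onEvent("BeginGroup", group.getKey());
            for (String classfile : group.getValue()) {
                listener.onEvent("BeginClassfile", classfile);
                // A real loader would parse the class file here.
                listener.onEvent("EndClassfile", classfile);
            }
            listener.onEvent("EndGroup", group.getKey());
        }
        listener.onEvent("EndSession", "");
    }

    /** Runs a session and returns the event names in the order they fired. */
    static List<String> trace(Map<String, List<String>> groups) {
        List<String> names = new ArrayList<>();
        load(groups, (eventName, subject) -> names.add(eventName));
        return names;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("lib.jar", Arrays.asList("A.class", "B.class"));
        groups.put("C.class", Arrays.asList("C.class"));
        System.out.println(trace(groups));
    }
}
```

Running main() prints the event names in the order a listener would see them: the session events bracket the group events, which in turn bracket the class file events.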

There are two concrete subclasses of ClassfileLoaderEventSource:

They differ in whether or not they keep track of the classes they parse.

You can get instances of Classfile either from EndClassfile events or by querying the loader. You can traverse them with objects that implement com.jeantessier.classreader.Visitor. They have callback methods that get called by the various parts of Classfile objects.

Depending on the nature of a given name for a group, the loader uses a decorator loader to actually open input streams to individual class files. These decorators are subclasses of ClassfileLoaderDecorator. They are:

A decorator opens input streams from the data sources and passes them to an underlying ClassfileLoader.

The distinction exists because the first kind controls how long the Classfile instances remain in memory, while the second kind handles the various input file types.

Visitor

You can implement the Visitor interface to traverse Classfile structures. It has callback methods that get called by various parts of the structure.

Look at the code of ClassMetrics for an example of using the Visitor pattern to traverse Classfile instances.


Dependencies

You create a dependency graph with a NodeFactory. The factory keeps track of the package nodes at the top of the graph and all their subordinate nodes. Individual nodes keep track of their outbound and inbound dependencies.

You extract dependencies from Classfile instances with a CodeDependencyCollector. You can either give it your own NodeFactory or you can let it create one from scratch for you. Now the interesting part is that CodeDependencyCollector is a LoadListener, so you only have to register it with a ClassfileLoader and it will visit each Classfile as it is loaded.

CodeDependencyCollector fires its own set of events during processing. If you are interested in those events, simply implement DependencyListener and register yourself with it. You will receive a DependencyEvent at the start and end of each Classfile, and a separate one for each and every dependency. It is important to note that the dependency graph does not have duplicates of dependencies, but a dependency that is discovered more than once will fire an event each time. It is also impossible to determine the order of calls from the dependency graph, but the events arrive in the order the dependencies are discovered.

For example, the following method:

    public void f() {
        try {
            out.print("abcd");
        } catch (java.io.IOException ex) {
            out.close();
        }
    }

will trigger the following events:

  1. f() --> out
  2. f() --> java.io.Writer.print(java.lang.String)
  3. f() --> java.lang.String
  4. f() --> java.io.Writer.close()
  5. f() --> java.io.IOException

The third one comes from analyzing the signature of print(), not from "abcd". The last one comes from looking at exception handlers.
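
The difference between the events and the graph can be sketched as follows. This is only an illustration, not the real CodeDependencyCollector: the class DependencyEventSketch, its Listener interface, and the discover() method are hypothetical, and dependencies are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the real CodeDependencyCollector.
final class DependencyEventSketch {
    /** Hypothetical stand-in for a dependency listener. */
    interface Listener {
        void dependency(String dependent, String dependable);
    }

    // A set keeps each dependency once, however often it is discovered.
    private final Set<String> graph = new LinkedHashSet<>();
    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener listener) {
        listeners.add(listener);
    }

    /** Stores the dependency once, but fires an event on every discovery. */
    void discover(String dependent, String dependable) {
        graph.add(dependent + " --> " + dependable);
        for (Listener listener : listeners) {
            listener.dependency(dependent, dependable);
        }
    }

    Set<String> graph() {
        return graph;
    }

    public static void main(String[] args) {
        DependencyEventSketch collector = new DependencyEventSketch();
        List<String> events = new ArrayList<>();
        collector.addListener((from, to) -> events.add(from + " --> " + to));

        // A method that makes the same call twice is discovered twice.
        collector.discover("f()", "java.io.Writer.close()");
        collector.discover("f()", "java.io.Writer.close()");

        System.out.println("events: " + events.size());
        System.out.println("dependencies in graph: " + collector.graph().size());
    }
}
```

The listener sees every discovery in order, while the graph only remembers that the dependency exists.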

Visitor

You can implement the Visitor interface to traverse Node structures. It has callback methods that get called by various types of nodes.

Look at the code of DependencyReporter and DependencyMetrics for examples of using the Visitor pattern to traverse Node instances.

If you traverse a standard dependency graph, such as the ones produced by CodeDependencyCollector, you will visit each dependency twice. If we take the dependency A --> B as an example, a visitor will see it once during the call sequence:

    A.Accept(visitor)
        visitor.Visit[Package|Class|Feature]Node(A)
            visitor.VisitOutbound(A.Outbound())
                B.AcceptOutbound(visitor)
                    visitor.VisitOutbound[Package|Class|Feature]Node(B)

AND a second time during:

    B.Accept(visitor)
        visitor.Visit[Package|Class|Feature]Node(B)
            visitor.VisitInbound(B.Inbound())
                A.AcceptInbound(visitor)
                    visitor.VisitInbound[Package|Class|Feature]Node(A)

In some cases, you can easily limit your processing to calls to VisitOutbound[Package|Class|Feature]Node() and cover everything.
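
Here is a small sketch of why the outbound callbacks alone are enough. It does not use the real Node or Visitor types: DoubleVisitSketch and everything inside it are hypothetical, simplified stand-ins, and the Java-style method names are mine.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified node and visitor types; not the real API.
final class DoubleVisitSketch {
    static final class Node {
        final String name;
        final Set<Node> outbound = new LinkedHashSet<>();
        final Set<Node> inbound = new LinkedHashSet<>();

        Node(String name) {
            this.name = name;
        }

        /** Records this --> target on both ends of the dependency. */
        void addDependency(Node target) {
            outbound.add(target);
            target.inbound.add(this);
        }

        /** Shows the visitor this node's outbound, then inbound, dependencies. */
        void accept(Visitor visitor) {
            for (Node to : outbound) {
                visitor.visitOutbound(this, to);
            }
            for (Node from : inbound) {
                visitor.visitInbound(this, from);
            }
        }
    }

    interface Visitor {
        void visitOutbound(Node from, Node to);
        void visitInbound(Node to, Node from);
    }

    /** Records each dependency it is shown, per direction. */
    static final class Recorder implements Visitor {
        final List<String> outboundSeen = new ArrayList<>();
        final List<String> inboundSeen = new ArrayList<>();

        public void visitOutbound(Node from, Node to) {
            outboundSeen.add(from.name + " --> " + to.name);
        }

        public void visitInbound(Node to, Node from) {
            inboundSeen.add(from.name + " --> " + to.name);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A");
        Node b = new Node("B");
        a.addDependency(b);

        Recorder recorder = new Recorder();
        a.accept(recorder);
        b.accept(recorder);

        System.out.println("outbound: " + recorder.outboundSeen);
        System.out.println("inbound:  " + recorder.inboundSeen);
    }
}
```

Both lists end up holding the same dependency, so a visitor that ignores the inbound callbacks still sees every dependency in the graph once.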

Visitor and TraversalStrategy

Here is an example showing the sequence of calls between a visitor implementation based on VisitorBase, its TraversalStrategy, and a sample dependency graph.

Here is the sample graph, magnified so you can see what is going on. The focus will be placed on package P1, class C1, and feature f1. Even-numbered elements depend on them, and they in turn depend on odd-numbered elements. This example illustrates the traversal order and processing that occurs on child nodes, outbound dependencies, and inbound dependencies.


Sample Dependency Graph

For the sake of this example, we use a plain SelectiveTraversalStrategy. It will dictate the traversal of a node's outbound and inbound dependencies before the traversal moves on to the subnodes. We will decorate it with a SortedTraversalStrategy that will sort groups of nodes in alphabetical order.

    visitor = new SomeVisitor(new SortedTraversalStrategy(new SelectiveTraversalStrategy()))
    visitor.TraverseNodes({P2, P1, P3})
    strategy.Order({P2, P1, P3})   ==>  {P1, P2, P3}

    P1.Accept(visitor)
    visitor.VisitPackageNode(P1)
    strategy.InScope(P1)
        visitor.PreprocessPackageNode(P1)
        strategy.PreOutboundTraversal()                        ==>  true
            visitor.TraverseOutbound(P1.Outbound())            // empty
        strategy.PreInboundTraversal()                         ==>  true
            visitor.TraverseInbound(P1.Inbound())              // empty
        visitor.PreprocessAfterDependenciesPackageNode(P1)

    visitor.TraverseNodes(P1.Classes())
    strategy.Order({C2, C1, C3})   ==>  {C1, C2, C3}

    C1.Accept(visitor)
    visitor.VisitClassNode(C1)
    strategy.InScope(C1)
        visitor.PreprocessClassNode(C1)

    strategy.PreOutboundTraversal()   ==>  true
        visitor.TraverseOutbound(C1.Outbound())
        strategy.Order({C3, C5})   ==>  {C3, C5}
        C3.AcceptOutbound(visitor)
        visitor.VisitOutboundClassNode(C3)
        C5.AcceptOutbound(visitor)
        visitor.VisitOutboundClassNode(C5)

    strategy.PreInboundTraversal()   ==>  true
        visitor.TraverseInbound(C1.Inbound())
        strategy.Order({C4, C2, f6, f4})   ==>  {C2, C4, f4, f6}
        C2.AcceptInbound(visitor)
        visitor.VisitInboundClassNode(C2)
        C4.AcceptInbound(visitor)
        visitor.VisitInboundClassNode(C4)
        f4.AcceptInbound(visitor)
        visitor.VisitInboundFeatureNode(f4)
        f6.AcceptInbound(visitor)
        visitor.VisitInboundFeatureNode(f6)

    visitor.PreprocessAfterDependenciesClassNode(C1)
    visitor.TraverseNodes(C1.Features())
    strategy.Order({f2, f1, f3})   ==>  {f1, f2, f3}

    f1.Accept(visitor)
    visitor.VisitFeatureNode(f1)
    strategy.InScope(f1)
        visitor.PreprocessFeatureNode(f1)

    strategy.PreOutboundTraversal()   ==>  true
        visitor.TraverseOutbound(f1.Outbound())
        strategy.Order({C3, C5, f3, f5, f7})   ==>  {C3, C5, f3, f5, f7}
        C3.AcceptOutbound(visitor)
        visitor.VisitOutboundClassNode(C3)
        C5.AcceptOutbound(visitor)
        visitor.VisitOutboundClassNode(C5)
        f3.AcceptOutbound(visitor)
        visitor.VisitOutboundFeatureNode(f3)
        f5.AcceptOutbound(visitor)
        visitor.VisitOutboundFeatureNode(f5)
        f7.AcceptOutbound(visitor)
        visitor.VisitOutboundFeatureNode(f7)

    strategy.PreInboundTraversal()   ==>  true
        visitor.TraverseInbound(f1.Inbound())
        strategy.Order({f6, f4, f2})   ==>  {f2, f4, f6}
        f2.AcceptInbound(visitor)
        visitor.VisitInboundFeatureNode(f2)
        f4.AcceptInbound(visitor)
        visitor.VisitInboundFeatureNode(f4)
        f6.AcceptInbound(visitor)
        visitor.VisitInboundFeatureNode(f6)

    strategy.PostOutboundTraversal()   ==>  false
    strategy.PostInboundTraversal()    ==>  false
    visitor.PostProcessFeatureNode(f1)

    f2.Accept(visitor)
        ...
    f3.Accept(visitor)
        ...
    visitor.PostProcessBeforeDependenciesClassNode(C1)
    strategy.PostOutboundTraversal()   ==>  false
    strategy.PostInboundTraversal()    ==>  false
    visitor.PostProcessClassNode(C1)

    C2.Accept(visitor)
        ...
    C3.Accept(visitor)
        ...
    visitor.PostProcessBeforeDependenciesPackageNode(P1)
    strategy.PostOutboundTraversal()   ==>  false
    strategy.PostInboundTraversal()    ==>  false
    visitor.PostProcessPackageNode(P1)

    P2.Accept(visitor)
        ...
    P3.Accept(visitor)
        ...
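
The way the sorting strategy wraps another strategy in the trace above can be sketched like this. The types are hypothetical and much smaller than the real TraversalStrategy: nodes are plain strings, and the Plain class simply puts everything in scope, which is an assumption made for the sake of the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical, reduced strategy types; not the real TraversalStrategy.
final class TraversalStrategySketch {
    interface Strategy {
        boolean inScope(String node);
        List<String> order(Collection<String> nodes);
    }

    /** Keeps nodes in the order given and treats every node as in scope. */
    static final class Plain implements Strategy {
        public boolean inScope(String node) {
            return true;
        }

        public List<String> order(Collection<String> nodes) {
            return new ArrayList<>(nodes);
        }
    }

    /** Decorator: delegates scoping, then sorts what the delegate returns. */
    static final class Sorted implements Strategy {
        private final Strategy delegate;

        Sorted(Strategy delegate) {
            this.delegate = delegate;
        }

        public boolean inScope(String node) {
            return delegate.inScope(node);
        }

        public List<String> order(Collection<String> nodes) {
            List<String> result = new ArrayList<>(delegate.order(nodes));
            Collections.sort(result);
            return result;
        }
    }

    public static void main(String[] args) {
        Strategy strategy = new Sorted(new Plain());
        System.out.println(strategy.order(Arrays.asList("P2", "P1", "P3")));
        System.out.println(strategy.order(Arrays.asList("C4", "C2", "f6", "f4")));
    }
}
```

Because the sorting lives in its own decorator, the visitor never needs to know whether the groups of nodes it receives were sorted or not.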

OO Metrics

You use a com.jeantessier.metrics.MetricsGatherer instance to read class files and compute the metrics. It is a com.jeantessier.classreader.Visitor and will traverse the complete structure rooted at the Classfile instance and compute various metrics.

The MetricsGatherer uses a MetricsFactory to create the various Metrics instances. The factory uses a MetricsConfiguration instance to decide what measurements make up a given set of metrics. The configuration is loaded at runtime from an XML file.

By default, the value of each measurement is computed only the first time it is requested and then cached for subsequent requests. You can refresh the caches through the API and you can turn off caching of individual measurements through their descriptor and in the configuration file.
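
The caching behavior can be pictured with the following sketch. It is an illustration only: CachedMeasurementSketch, its constructor flag, and the value() and refresh() methods are hypothetical names, not the actual measurement classes or descriptor settings.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of a lazily computed, optionally cached measurement.
final class CachedMeasurementSketch {
    private final IntSupplier computation;
    private final boolean cachingEnabled;
    private boolean cached = false;
    private int value;

    CachedMeasurementSketch(IntSupplier computation, boolean cachingEnabled) {
        this.computation = computation;
        this.cachingEnabled = cachingEnabled;
    }

    /** Computes on first request; later requests reuse the cached value. */
    int value() {
        if (!cachingEnabled || !cached) {
            value = computation.getAsInt();
            cached = true;
        }
        return value;
    }

    /** Forgets the cached value so the next request recomputes it. */
    void refresh() {
        cached = false;
    }

    public static void main(String[] args) {
        int[] computations = {0};
        CachedMeasurementSketch measurement =
                new CachedMeasurementSketch(() -> ++computations[0], true);

        measurement.value();
        measurement.value();
        System.out.println("computations before refresh: " + computations[0]);

        measurement.refresh();
        measurement.value();
        System.out.println("computations after refresh: " + computations[0]);
    }
}
```

Passing false for the caching flag makes every request recompute the value, which is the trade-off you accept when you turn caching off for a measurement.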

Data Structure




API Differences

You can use com.jeantessier.classreader.ClassfileLoader classes to examine the baseline of your codebase, whether it is in JAR files, loose class files, or a combination of both. You can apply the same treatment to your latest codebase. You now have two sets of Classfile instances.

You can use com.jeantessier.dependency.NodeFactory to create a tree of packages, classes, and features from each codebase. You can then start to compare them to each other. If a package is in the old codebase but not in the new one, you can mark it as having been removed. If it is not in the old codebase but it is in the new one, then you can mark it as having been recently added. For packages that are present in both codebases, you can repeat this analysis at the class level, and then at the feature level.
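
At any one level, the comparison boils down to set arithmetic on names. Here is a minimal sketch, assuming each codebase has already been reduced to a set of package (or class, or feature) names; ApiDifferenceSketch and its methods are hypothetical and not part of Dependency Finder.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper that compares two sets of names at one level.
final class ApiDifferenceSketch {
    /** Names in the old codebase that are gone from the new one. */
    static Set<String> removed(Set<String> oldNames, Set<String> newNames) {
        Set<String> result = new TreeSet<>(oldNames);
        result.removeAll(newNames);
        return result;
    }

    /** Names in the new codebase that were not in the old one. */
    static Set<String> added(Set<String> oldNames, Set<String> newNames) {
        Set<String> result = new TreeSet<>(newNames);
        result.removeAll(oldNames);
        return result;
    }

    /** Names in both codebases; these need comparing at the next level down. */
    static Set<String> common(Set<String> oldNames, Set<String> newNames) {
        Set<String> result = new TreeSet<>(oldNames);
        result.retainAll(newNames);
        return result;
    }

    public static void main(String[] args) {
        Set<String> oldPackages = new TreeSet<>(Arrays.asList("a", "b"));
        Set<String> newPackages = new TreeSet<>(Arrays.asList("b", "c"));

        System.out.println("removed: " + removed(oldPackages, newPackages));
        System.out.println("added:   " + added(oldPackages, newPackages));
        System.out.println("common:  " + common(oldPackages, newPackages));
    }
}
```

You would apply the same three operations to the classes of each common package, and then to the features of each common class.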


Command-Line Parsing

The com.jeantessier.commandline package gives you the tools you need to parse the command-line to your program, validate switches and parameters, and even print a summary usage statement when your program is not called properly. Switches start with a dash ("-") and usually have specific semantics attached to them. Parameters are just strung out on the command-line and usually don't have individual specific semantics, besides those they share as a group.

You create a CommandLine instance to parse your command-line. At creation time, you can supply a specific ParameterStrategy. Here are the ones that ship with Dependency Finder.

AnyParameterStrategy
No restrictions: the command-line can include any number of parameters, including none at all. This is the default strategy if you do not specify one.
AtLeastParameterStrategy
The command-line must include at least a certain number of parameters or the framework will find the command-line invalid.
AtMostParameterStrategy
The command-line can include at most a certain number of parameters or the framework will find the command-line invalid.
ExactlyParameterStrategy
The command-line must include an exact number of parameters or the framework will find the command-line invalid.
NullParameterStrategy
The command-line cannot include any parameters or the framework will find the command-line invalid.
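
Each of these strategies amounts to a rule about the number of parameters. The sketch below expresses the five rules as predicates on a parameter count. It is not the real ParameterStrategy interface: ParameterStrategySketch and its factory methods are hypothetical names.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: each strategy is a rule on the parameter count.
final class ParameterStrategySketch {
    /** Any number of parameters is valid, including none. */
    static IntPredicate any() {
        return count -> true;
    }

    /** Valid only with at least min parameters. */
    static IntPredicate atLeast(int min) {
        return count -> count >= min;
    }

    /** Valid only with at most max parameters. */
    static IntPredicate atMost(int max) {
        return count -> count <= max;
    }

    /** Valid only with exactly the expected number of parameters. */
    static IntPredicate exactly(int expected) {
        return count -> count == expected;
    }

    /** Valid only when there are no parameters at all. */
    static IntPredicate none() {
        return count -> count == 0;
    }

    public static void main(String[] args) {
        // Validates this program's own argument count against one rule.
        IntPredicate strategy = atLeast(1);
        System.out.println("valid: " + strategy.test(args.length));
    }
}
```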

Once you have a parser, you can add switch definitions to it. There are four types of switches described below.

MultipleValuesSwitch
The switch must be followed by a value, but it can occur multiple times on the command-line. The values are accumulated in the same order as on the command-line and you retrieve them as a single List.
OptionalValueSwitch
The switch can appear by itself or followed by a value. It can only appear once on the command-line.
SingleValueSwitch
The switch must be followed by a value. It can only appear once on the command-line.
ToggleSwitch
The switch cannot be followed by a value. It acts as a boolean, false if absent or true if present on the command-line.

You add switches with the matching AddSwitch() methods on CommandLine. You can supply switches with default values and specify if they are mandatory (must appear on the command-line) or not.

When you create the parser, you can also specify if the parser will be strict or not. Strict parsers will only accept switches that are explicitly specified. Non-strict parsers treat an unknown switch as an OptionalValueSwitch.
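
To show how the four switch types and the strict setting interact, here is a small, self-contained sketch of a parser. It is not the com.jeantessier.commandline implementation: SwitchParsingSketch, its Kind enum, and its methods are hypothetical. It also assumes that any token starting with a dash is a switch, so a value can never start with a dash.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of switch parsing; not the real CommandLine class.
final class SwitchParsingSketch {
    enum Kind { TOGGLE, SINGLE_VALUE, OPTIONAL_VALUE, MULTIPLE_VALUES }

    private final boolean strict;
    private final Map<String, Kind> definitions = new HashMap<>();
    final Map<String, List<String>> values = new HashMap<>();
    final List<String> parameters = new ArrayList<>();

    SwitchParsingSketch(boolean strict) {
        this.strict = strict;
    }

    SwitchParsingSketch define(String name, Kind kind) {
        definitions.put(name, kind);
        return this;
    }

    boolean isPresent(String name) {
        return values.containsKey(name);
    }

    void parse(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-")) {
                parameters.add(args[i]);   // not a switch: a plain parameter
                continue;
            }
            String name = args[i].substring(1);
            Kind kind = definitions.get(name);
            if (kind == null) {
                if (strict) {
                    throw new IllegalArgumentException("Unknown switch: " + args[i]);
                }
                kind = Kind.OPTIONAL_VALUE;   // non-strict fallback
            }
            boolean onceOnly = kind == Kind.SINGLE_VALUE || kind == Kind.OPTIONAL_VALUE;
            if (onceOnly && values.containsKey(name)) {
                throw new IllegalArgumentException("Repeated switch: " + args[i]);
            }
            List<String> collected = values.computeIfAbsent(name, k -> new ArrayList<>());
            boolean nextIsValue = i + 1 < args.length && !args[i + 1].startsWith("-");
            switch (kind) {
                case TOGGLE:
                    break;   // presence alone is the information
                case OPTIONAL_VALUE:
                    if (nextIsValue) {
                        collected.add(args[++i]);
                    }
                    break;
                default:     // SINGLE_VALUE and MULTIPLE_VALUES need a value
                    if (!nextIsValue) {
                        throw new IllegalArgumentException("Missing value for " + args[i]);
                    }
                    collected.add(args[++i]);
            }
        }
    }

    public static void main(String[] args) {
        SwitchParsingSketch parser = new SwitchParsingSketch(true)
                .define("verbose", Kind.TOGGLE)
                .define("out", Kind.SINGLE_VALUE)
                .define("include", Kind.MULTIPLE_VALUES);
        parser.parse(new String[] {"-verbose", "-include", "a", "-include", "b", "-out", "x", "file"});
        System.out.println(parser.values + " " + parser.parameters);
    }
}
```

In this sketch, a switch with an optional value takes the next token as its value whenever that token is not itself a switch. That is one simple way to resolve the ambiguity with parameters, and the real parser may well decide differently.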

Along with your CommandLine parser, you can create a CommandLineUsage that will create a summary description of your command-line specification. You can use this summary in error messages for invalid command-lines to help users figure out what they did wrong.

To actually parse your command-line, just call CommandLine's parse() method and pass it the string array that is main()'s sole parameter. The parser will throw an exception if anything went wrong. After parsing, you can check for the presence of specific switches with the IsPresent() method and get the value(s) of specific switches with one of the Switch() methods. You can retrieve parameters, if any, with the Parameters() method.


Building Dependency Finder

    C:\>ant

Compiling a Build

    C:\>ant jar
    C:\>ant clean
    C:\>ant docs
    C:\>ant dist
    C:\>ant src
    C:\>ant war

or

    C:\>ant jar clean docs dist src war

Testing the Build

    C:\>ant tests
    C:\>textjunit TestAll

Making a Release

  1. cvs tag release-20030101
  2. ant realclean docs dist war realclean src
  3. ant ftp
  4. cvs log -rrelease-20020711:release-20030101
  5. Create new release on SourceForge.net
  6. Notify monitoring people
  7. Close bugs
  8. Close feature requests
  9. New news item
  10. Post to news groups (comp.lang.java, comp.lang.java.announce, comp.lang.java.softwaretools, comp.software.measurement, comp.software-eng)
  11. Generate sample files