Nathan Zumwalt's Blog
My Links
Blog Stats
  • Posts - 6
  • Stories - 3
  • Comments - 2
  • Trackbacks - 6
Archives

 I love data structures.  In college, it was the only class that taught pertinent information (as least in my humble opinion), and it’s something programmers use every day (even if they don’t realize it). 

 

Many programmers don’t like creating their own data structures for some reason.  I don’t know if it’s because it’s intimidating, or they’re just lazy.  What the short-sighted programmer doesn’t realize is that exclusively using Java’s built in data structures and creating work around code to implement requirements makes code less intuitive and harder to maintain.  Instead, if a requirement isn’t easily implemented using one of Java’s many built in data structures, then a custom data structure should be created.

 

There are several ways to implement your custom data structure. The Collections Framework defines a series of interfaces and abstract concrete classes.  The “traditional” way of creating a new data structure would be to implement one of these interfaces, or extending one of these concrete classes.  The problem with this approach is that there are generally many methods that need to be implemented.  Unless all those methods are required for the customer data structure, much of that code is useless.

 

A second approach is to directly extend one of the Collections Framework classes.  For example, you extend the ArrayList class.  The problem here is that you end up with methods implemented in the super class that aren’t relevant to your custom data structure.  A perfect example is the Stack class in the Collections Framework.  It extends the Vector class, which includes methods such as “add(int index, Object value)”.  This would allow the user to insert values into the middle of the Stack.  In essence, this defeats the point of a stack.

 

The final approach is to use composition.  Here, a base class is a private property of the custom data structure, and methods from the custom data structure operate on the underlying Collections class.  This allows the data structure to “hide” methods that we don’t want to expose to the data structure user, but we’re not reinventing the wheel by rebuilding a Collections class from the ground up. 

 

Code is the easiest way to illustrate this concept.  The following example is a HistogramList.  In this data structure, the number of occurrences of an Object is just as important as the Object itself.

 

import java.util.ArrayList;

import java.util.Collections;

 

/**

 * @author Nathan Zumwalt, Pariveda Solutions

 * @version $Id: $

 */

public class HistogramList {

 

   private ArrayList list = new ArrayList();

 

   public boolean add(Object o) {

 

      boolean found = false;

      for (int x = 0; x < list.size() && !found; x++) {

         HistogramListItem listItem = (HistogramListItem) list.get(x);

 

         if ( listItem.getItem().equals(o) ) {

            listItem.incrementOccurances();

            found = true;

         }

      }

 

      if ( !found ) {

         HistogramListItem listItem = new HistogramListItem(o);

         list.add(listItem);

      }

 

      Collections.sort(list);

 

      return true;

   }

 

   public int size() {

      return list.size();

   }

 

   public Object getItem(int index) {

      return ((HistogramListItem) list.get(index)).getItem();

   }

 

   public int getOccurances(int index) {

      return ((HistogramListItem) list.get(index)).getOccurances();

   }

 

   public String toString() {

      String returnString = "";

      for (int i = 0; i < list.size(); i++) {

         HistogramListItem listItem = (HistogramListItem) list.get(i);

returnString += listItem.getOccurances() + ":" + listItem.getItem() + ", ";

      }

      return returnString.substring(0, returnString.length() - 2);

   }

 

   private class HistogramListItem implements Comparable {

      private Object item = null;

      private int occurances = 0;

 

      public int compareTo(Object o) {

         HistogramListItem listItem = (HistogramListItem) o;

 

         if ( listItem.getOccurances() > this.getOccurances() )

            return 1;

         else if ( listItem.getOccurances() < this.getOccurances() )

            return -1;

         else

            return 0;

      }

 

      public HistogramListItem(Object item) {

         this.item = item;

         occurances = 1;

      }

 

      public Object getItem() {

         return item;

      }

 

      public int getOccurances() {

         return occurances;

      }

 

      public void incrementOccurances() {

         this.occurances++;

      }

   }

 

   public static void main(String[] args) {

      HistogramList list = new HistogramList();

      list.add("asdf2");

      list.add("asdf");

      list.add("asdf");

      list.add("asdf");

      list.add("asdf1");

      System.out.println(list);

 

   }

}

 

Note that the HistogramList doesn’t implement any interfaces, or extend any classes.  I do implement some standard list methods, such as “size”, “add” and “getItem” for clarity, though I could name these methods what ever I want. 

 

The bulk of the code for HistogramList is in the add method.  Note that I also keep this list sorted for easier processing later.  Before the performance zealots start rioting, it may seem inefficient to sort on every insert, but it isn’t that bad.  The Collections.sort method uses a modified Merge sort, which is very quick for nearly sorted lists, and the underlying list will always be sorted except for the item we just inserted.  If this still makes you cringe, the call to sort can be done in the “getItem” method or explicitly done with a “sort” method.

 

Generally, I’m not a fan on inner classes, but I think this data structure is an example of a good use of private inner classes.  Here, we want to completely hide the underlying storage mechanism from the data structure user.  This would be impossible (and very annoying) if we required the user to create a HistogramListItem in order to insert into our HistogramList.  Note that the HistogramListItem implements Comparable so that our list is properly sorted when a new item is inserted. 

 

Another benefit of using composition is that items can be stored in separate underlying data structures for accessing the same data in different ways.  To keep the mind limber and show this isn’t Java-specific, here is some C# code to illustrate this concept:

 

using System;

using System.Collections;

 

namespace DataStructures

{

   public class CustomerCollection : ICollection, IEnumerable

   {

      public CustomerCollection()

      {

      }

 

      private ArrayList _arrayList = new ArrayList();

      private Hashtable _hashtable = new Hashtable();

 

      public void Add(Customer customer)

      {

         _arrayList.Add(customer);

         _hashtable.Add(customer.Name, customer);

      }

 

      public Customer this[int index]

      {

         get { return (Customer) _arrayList[index]; }

      }

 

      public Customer this[string name]

      {

         get { return (Customer) _hashtable[name]; }

      }

 

      public void Remove(int index)

      {

         object o = _arrayList[index];

         _hashtable.Remove(o);

         _arrayList.Remove(o);

      }

 

      public void CopyTo(Array array, int index)

      {

         _arrayList.CopyTo(array, index);

      }

 

      public int Count

      {

         get { return _arrayList.Count; }

      }

 

      public object SyncRoot

      {

         get { return _arrayList.SyncRoot; }

      }

 

      public bool IsSynchronized

      {

         get { return _arrayList.IsSynchronized; }

      }

 

      public IEnumerator GetEnumerator()

      {

         return _arrayList.GetEnumerator();

      }

   }

}

 

Here, we have an ArrayList for retrieving a list of Customer objects sequentially, and a Hashtable for quick retrievals of customers based on name.  Since we’ve already implemented most of the methods anyway, the IEnumerator and ICollection interfaces are extended.  Note that this code also only allows objects of type Customer to be inserted into the collection.  This is the best way to enforce type use until generics are supported in C#.

 

As we’ve seen, composition is a powerful tool in creating custom data structures.  Many times, it’s more efficient from a code hiding and maintenance perspective than extending existing classes or interfaces.  While many would argue this approach doesn’t conform to the strictures of OO design, I believe it conforms to the spirit of OO design, and that’s all that matters in the end.

 

posted on Tuesday, February 01, 2005 5:55 PM
Feedback
Title  
Name  
Url
Comments   
Protected by Clearscreen.SharpHIPIn order to prevent spam, please enter the code to post a comment.:
Nathan Zumwalt