#ifndef HASHTABLE_H #define HASHTABLE_H ///////////////////////////////////////////////////////////////////// // HashTable.h - HashTable Container // // ver 2.5 - Holds nodes with (key, value) pairs. // // // // Language: Visual Studio.Net, using standard C++ // // Platform: Dell Dimension 8100, Windows 2000, SP 2 // // Application: Demonstrates class templates and functors // // CSE687, Spring 2003 // // Author: Jim Fawcett, CST 2-187, Syracuse University // // (315) 443-3948, jfawcett@twcny.rr.com // ///////////////////////////////////////////////////////////////////// /* Module Operation: ================= This module provides five classes: - node holds a single lookup key and a value - HashTable holds nodes in a table of pointers to nodes. Some nodes may point to other nodes, although this is usually rare. - HashBase, a base class for deriving Hash function objects. - HashString, a hash function object for strings. - HashIterator provides smart pointers that iterate through a HashTable, element-by-element. HashTables provide an iterator trait so that iterators can be declared as HashTable::iterator. They exhibit fast lookup and insertion by using a hash function object to compute table addresses. If two unique keys hash to the same address, then the keys are arranged in a linked list of nodes with the most recent first in the list. - HashTable does not allow duplicate keys. - If an Insert(dupKey,aValue) is attempted where dupKey already exists in the table, the original value associated with dupKey will be replaced by aValue. - If the Value(key) function is called for a key that is not in the HashTable a range_error exception is thrown. - Only one derived Hash function is provided by this module, for string keys. To store any other type you will need to provide your own hash function. - Any hash function that derives from HashBase can be used by this HashTable. - All that a new hash function is required to provide is a function: long int operator()(const key& k) const; Note that, unlike the std::map, HastTable does not hold elements in order sorted by key. Element order is determinded by addresses generated by the hash function. The nodes and HashIterator classes provide key_type and value_type traits so that functions may use objects of these classes without knowing their template parameter types. // node class interface: ===================== node n1("one",w1), n3("two",w2), n4("three",w3); // creates three nodes node n2("four",w4,&n1); // n1 is linked to n2 n3 = n2->next(); // assigns n1 to n3 key_type k = n2.Key(); // returns "four" value_type v = n2.Value(); // returns w4 n2.successor() = &n4; // links n4 to n2, losing n1 string s = n2.show(); // s holds display string HashTable class interface: ========================== HashTable ht(10); // creates a HashTable with // a 10 element table HashTable ht2(ht); // ht2 is copy of ht ht2 = ht3; // assigns ht3 state to h2 ht.Insert("aardvark",1); // inserts node("aardvark",1) ht["mongoose"] = 12; // inserts node("mongoose",12) // both Insert and operator[] add new node if key does not exist // otherwise they overwrite the old value with their new values bool b = ht.Contains("arrdvark"); // is true int v = ht.Value("aardvark"); // is 1 long int size = ht.Size(); // physical size of table long int numElements(); // number of table entries ht.Verbose(true); // announces inserts & location HashTable::iterator it // declares an iterator it = ht.begin(); // assigns iterator pointing // to first node it = ht.end(); // assigns iterator pointing // one past last node HashIterator class interface: ============================= HashIterator it1, it2; // default ctor HashIterator it3(it1); //copy ctor it2 = it3; // assignment node n = *it2; // contents pointed to string s = it2->Key(); // selection of Key() (++it2)->Value(); // increments then selects (it2++)->Value(); // selects then increments (--it2)->Value(); // decrements then selects (it2--)->Value(); // selects then decrements bool b = it2==it3; // test for equality bool b = it2!=it3; // test for inequality long int = it2.CurrentIndex(); // returns current table index The HashIterator class provides a nice demonstration of how to write an iterator. */ // ///////////////////////////////////////////////////////////////////// // Build Process // ///////////////////////////////////////////////////////////////////// // Required Files: // // HashTable.h, HashTable.cpp // // // // Build Process: // // Set preprocessor definition TEST_HASHTABLE in properties. // // Set HashTable as the startup project. // // Compile and run in Visual Studio. // // or: // // cl /DTEST_HASHTABLE HashTable.cpp // ///////////////////////////////////////////////////////////////////// /* Maintenance History: ==================== ver 2.5 : 08 Jun 10 - added std:: qualifier to range_error exception identifiers so compilation will succeed with VS2010 - found by Dusan Palider. Thanks Dusan. ver 2.4 : 06 Feb 06 - fixed bug in hash function for doubles in HashTable.cpp ver 2.3 : 11 Feb 05 - fixed memory leaks in destructor by delete [] table and in assignment operator by deleting node chains. Bug reported by Karan Sachar. Thanks Karan. ver 2.2 : 01 Feb 04 - added typename qualifer to global sum funtion defined in HashTable.cpp. ver 2.1 : 28 Feb 03 Thanks to Ryan Nugent, Ed Friers, and Eliot Dudley for bug reports. - moved HashString::operator() definition to HashTable.cpp, as it is not templatized. Eliminates multiple definition link errors. - Modified HashString::operator() which was badly flawed in that it return negative table addresses under some conditions. - Added a more stressful Hashing test to the test stub that shows the modified version does a fairly good job of hashing arbitrary strings. - Provided a constructor for HashString that allows you to enter a size for standalone testing. Has same default construction as in previous version. ver 2.0 : 05 Feb 03 - fixed problem with HashBase, spotted in class on 02/03/03 - changed the HashTable interface to match a subset of the std::map interface, e.g., changed Add to insert, and added some additional members. - provided support for iteration, via HashIterator class - added demonstration HashDouble functor ver 1.0 : 02 Feb 03 - first release */ // #include #include #include #include /////////////////////////////////////////////////////////// // node class /////////////////////////////////////////////////////////// // - holds data in HashTable // - compiler generated copy ctor is correct // - assignment operator provided to allow assigning // key and value without changing the node // linkage // - nodes link through node* successor /////////////////////////////////////////////////////////// template class node { public: typedef key key_type; typedef value value_type; node(const key& k, const value& v, node* successor=0); ~node(); node& operator=(const node& nd); const key& Key() const; value& Value(); node* next(); node*& successor(); std::string show(); private: node* _successor; key _key; value _value; }; //----< constructor: key, value, pointer to successor >---------- template node ::node(const key &k, const value &v, node* successor) : _successor(successor),_key(k),_value(v) {} //----< destructor >--------------------------------------------- template node::~node() { } // //----< assignment operator >------------------------------------ template node& node::operator=(const node& nd) { if(this == &nd) return *this; _key = nd._key; _value = nd._value; // intentionally keeping the old successor return *this; } //----< returns value of node >---------------------------------- template value& node::Value() { return _value; } //----< returns key of node >------------------------------------ template const key& node::Key() const { return _key; } //----< returns pointer to successor node >---------------------- template node* node::next() { return _successor; // can't modify } //----< returns reference to pointer to successor node >--------- template node*& node::successor() { return _successor; // can modify } //----< return string showing node contents >-------------------- template std::string node::show() { std::ostringstream temp; temp << "{ " << _key << ", " << _value << " }"; return temp.str(); } // ///////////////////////////////////////////////////// // Hash functor class ///////////////////////////////////////////////////// // - accepts size of HashTable // - returns a table address from operator() // - HashType ht ==> // ht.operator()(key) <==> ht(key) // - table addresses: loc = ht(key) // // Functors are more powerful than functions. // Here, HashBase allows HashTable to tell it // the table size so its derived classes can // properly compute table addresses. ///////////////////////////////////////////////////// template class HashBase { public: HashBase(long int size=0) : _size(size) {}; void Size(long int size) { _size = size; } virtual unsigned long operator() (const T& t) const = 0; protected: long int _size; }; class HashString : public HashBase { public: HashString(long int size=0) : HashBase(size) {} unsigned long operator() (const std::string& s) const; }; // ///////////////////////////////////////////////////////// // HashTable class ///////////////////////////////////////////////////////// // - node* table, array of node pointers // - nodes can link through node* successor ///////////////////////////////////////////////////////// template < typename key, typename value, typename Hash > class HashIterator; template < typename key, typename value, typename Hash > class HashTable { friend class HashIterator; public: typedef key key_type; typedef value value_type; typedef HashIterator< key,value,Hash > iterator; HashTable(long int size); HashTable(const HashTable& ht); ~HashTable(); HashTable& operator=(const HashTable& ht); HashIterator find(const key& k); HashIterator insert(const key& k, const value& v); value& operator[](const key& k); bool Contains(const key& k) const; value& Value(const key& k); value Value(const key& k) const; void Verbose(bool v); long int Size(); long int numberOfElements(); typedef node* PointerToNode; iterator begin(); iterator end(); private: void deleteNodeChain(node* pNode); PointerToNode* table; // array of pointers to node long int tableSize; long int numElements; Hash _hash; bool _verbose; }; //----< constructor takes table size >--------------------------- template < typename key, typename value, typename Hash > HashTable< key,value,Hash >::HashTable(long int size) : tableSize(size), _verbose(false), numElements(0) { table = new PointerToNode[size]; for(long int i=0; i------- template < typename key, typename value, typename Hash > void HashTable< key,value,Hash >::deleteNodeChain(node* pNode) { if(pNode == 0) return; if(pNode->next() !=0) deleteNodeChain(pNode->next()); delete pNode; pNode = 0; } //----< destructor uses helper function >------------------------ template < typename key, typename value, typename Hash > HashTable< key,value,Hash >::~HashTable() { for(long int i=0; i-------- template < typename key, typename value, typename Hash > HashIterator HashTable< key,value,Hash >::insert(const key& k, const value& v) { unsigned long loc = _hash(k); if(Contains(k)) // don't store duplicate keys { if(_verbose) std::cout << "\n setting { " << k << ", " << Value(k) << " } to {" << k << ", " << v << " }"; Value(k) = v; return find(k); } ++numElements; node* pNode = new node(k,v,table[loc]); table[loc] = pNode; if(_verbose) std::cout << "\n loc: " << loc << " - { " << k << ", " << v << " }"; return HashIterator(*this,pNode,loc); } //----< Contains checks for containment of given key >----------- template < typename key, typename value, typename Hash > bool HashTable< key,value,Hash >::Contains(const key& k) const { unsigned long loc = _hash(k); node* pNode = table[loc]; if(pNode == 0) return false; do { if(pNode->Key() == k) return true; pNode = pNode->next(); } while(pNode != 0); return false; } // //---< return iterator pointing to node with key, else end() >--- template < typename key, typename value, typename Hash > HashIterator HashTable< key,value,Hash >::find(const key& k) { unsigned long loc = _hash.operator()(k); node* pNode = table[loc]; if(pNode == 0) return end(); do { if(pNode->Key() == k) return HashIterator(*this,pNode); pNode = pNode->next(); } while(pNode != 0); return end(); } //----< Value returns reference to key's value >----------------- template < typename key, typename value, typename Hash > value& HashTable< key,value,Hash >::Value(const key& k) { unsigned long loc = _hash.operator()(k); node* pNode = table[loc]; do { if(pNode->Key() == k) return pNode->Value(); pNode = pNode->next(); } while(pNode != 0); throw std::range_error("can't find key"); } //----< Value returns copy of key's value >---------------------- template < typename key, typename value, typename Hash > value HashTable< key,value,Hash >::Value(const key& k) const { unsigned long loc = _hash.operator()(k); node* pNode = table[loc]; do { if(pNode->Key() == k) return pNode->Value(); pNode = pNode->next(); } while(pNode != 0); throw std::range_error("can't find key"); } //----< puts HashTable into verbose mode >----------------------- template < typename key, typename value, typename Hash > void HashTable< key,value,Hash >::Verbose(bool v) { _verbose = v; } //----< returns table size >------------------------------------- template < typename key, typename value, typename Hash > long int HashTable< key,value,Hash >::Size() { return tableSize; } // //----< returns number of elements in table >-------------------- template < typename key, typename value, typename Hash > long int HashTable< key,value,Hash >::numberOfElements() { return numElements; } //----< another syntax for insertion >--------------------------- // // ht["aKey"] = aValue // - if aKey not in table a node("aKey",aValue) // is created // - if aKey is in table, its value is overwritten with aValue // template < typename key, typename value, typename Hash > value& HashTable::operator[](const key& k) { value v = value(); if(Contains(k)) v = Value(k); iterator it = insert(k,v); return it->Value(); } //----< returns iterator pointing to first node >---------------- template < typename key, typename value, typename Hash > HashIterator HashTable< key,value,Hash >::begin() { for(int i=0; i--------------- template < typename key, typename value, typename Hash > HashIterator HashTable< key,value,Hash >::end() { for(int i=tableSize-1; i>=0; --i) { if(table[i] != 0) { return iterator(*this,0,i+1); } } return iterator(*this); } // //----< copy constructor>---------------------------------------- template < typename key, typename value, typename Hash > HashTable< key,value,Hash >::HashTable(const HashTable& ht) : tableSize(ht.tableSize), _verbose(false) { table = new PointerToNode[tableSize]; for(long int i=0; i it; // we know we won't change ht - just reading its values // but compiler doesn't know that so we need const_cast HashIterator itBeg = const_cast< HashTable< key,value,Hash>* >(&ht)->begin(); HashIterator itEnd = const_cast< HashTable< key,value,Hash>* >(&ht)->end(); it = itBeg; while(it != itEnd) { key k = it->Key(); value v = it->Value(); this->insert(k,v); ++it; } } // //----< assignment operator >------------------------------------ template < typename key, typename value, typename Hash > HashTable& HashTable< key,value,Hash >::operator=(const HashTable& ht) { if(this == &ht) return *this; for(long int i=0; i it; // see note in copy constructor HashIterator itBeg = const_cast< HashTable< key,value,Hash>* >(&ht)->begin(); HashIterator itEnd = const_cast< HashTable< key,value,Hash>* >(&ht)->end(); it = itBeg; while(it != itEnd) { key k = it->Key(); value v = it->Value(); this->insert(k,v); ++it; } return *this; } // ///////////////////////////////////////////////////////////////////// // HashIterator class ///////////////////////////////////////////////////////////////////// // - Iterator is a smart "pointer" // - has pointer methods *it, it->, ++it, it++, --it, it-- // - has copy, assignment, and comparison methods // - walks table AND node chains in table, so ++ and -- methods // are complicated // - needs to use both table index: CurrentIndex // and node pointer: pCurrentNode ///////////////////////////////////////////////////////////////////// template class HashIterator : public std::iterator< std::bidirectional_iterator_tag, node > { public: typedef key key_type; typedef value value_type; typedef HashIterator iterator; HashIterator(); HashIterator(const HashIterator& hi); HashIterator( HashTable& ht, node* pNode = 0, long int index=0 ); HashIterator& operator=(const HashIterator& hi); node& operator*(); node* operator->(); iterator& operator++(); iterator operator++(int); iterator& operator--(); iterator operator--(int); bool operator==(const HashIterator& hi) const; bool operator!=(const HashIterator& hi) const; long int CurrentIndex(); private: HashTable* pHashTable; node* pCurrentNode; long int _CurrentIndex; }; template inline long int HashIterator::CurrentIndex() { return _CurrentIndex; } // //----< default constructor >------------------------------------ template HashIterator::HashIterator() : pHashTable(0), pCurrentNode(0) {} //----< copy constructor >--------------------------------------- template HashIterator:: HashIterator(const HashIterator& hi) { pHashTable = hi.pHashTable; // iterator pointing to same table pCurrentNode = hi.pCurrentNode; _CurrentIndex = hi._CurrentIndex; } //----< ctor takes a HashTable, pointer to node, and index >----- // // used only in find(), begin(), and end() // template HashIterator:: HashIterator( HashTable& ht, node* pNode, long int index ) : pHashTable(&ht), pCurrentNode(pNode), _CurrentIndex(index) {} //----< assignment operator >------------------------------------ template HashIterator& HashIterator:: operator=(const HashIterator& hi) { if(this == &hi) return *this; pHashTable = hi.pHashTable; // points to same table pCurrentNode = hi.pCurrentNode; _CurrentIndex = hi._CurrentIndex; return *this; } // //----< de-reference operator* >--------------------------------- template node& HashIterator::operator*() { return *pCurrentNode; } //----< selection operator-> >----------------------------------- template node* HashIterator::operator->() { return pCurrentNode; } //----< pre-increment operator++ >------------------------------- // // Has to walk both table and node chains // template HashIterator& HashIterator::operator++() { if(pCurrentNode != 0 && (pCurrentNode = pCurrentNode->next()) != 0) return *this; // next node in chain if(_CurrentIndex < pHashTable->tableSize-1) { long int Index = _CurrentIndex; while(pHashTable->table[++Index] == 0) { if(Index == pHashTable->tableSize-1) { pCurrentNode = 0; ++_CurrentIndex; return *this; // no more nodes } } _CurrentIndex = Index; pCurrentNode = pHashTable->table[_CurrentIndex]; return *this; // first node in next chain } pCurrentNode = 0; _CurrentIndex = pHashTable->tableSize; return *this; } //----< post-increment operator++ >------------------------------ template HashIterator HashIterator::operator++(int) { HashIterator temp(*this); // save current iterator operator++(); // increment internal state return temp; // return temp of prior state } // //----< pre-decrement operator-- >------------------------------- template HashIterator& HashIterator::operator--() { // are we at first node in chain or nodeless table index? if(pCurrentNode == pHashTable->table[_CurrentIndex] || pCurrentNode == 0) { if(_CurrentIndex == 0) // at table beginning so return return *this; // backup until we find a node pointer while(--_CurrentIndex && pHashTable->table[_CurrentIndex] == 0) if(_CurrentIndex == 0) return *this; // walk chain until we get to the last element pCurrentNode = pHashTable->table[_CurrentIndex]; while(pCurrentNode->next() != 0) pCurrentNode = pCurrentNode->next(); return *this; } // not at first node in chain or nodeless index node* temp = pHashTable->table[_CurrentIndex]; while(temp->next() != pCurrentNode && temp->next() != 0) temp = temp->next(); pCurrentNode = temp; return *this; } //----< post-decrement operator-- >------------------------------ template HashIterator HashIterator::operator--(int) { HashIterator temp(*this); operator--(); return temp; } // //----< equality comparison >------------------------------------ template bool HashIterator::operator==(const HashIterator& hi) const { if( pHashTable == hi.pHashTable && pCurrentNode == hi.pCurrentNode && _CurrentIndex == hi._CurrentIndex ) return true; return false; } //----< inequality comparison >---------------------------------- template bool HashIterator::operator!=(const HashIterator& hi) const { if(*this == hi) return false; return true; } #endif