C/C++中的哈希表实现
C/C++(关联数组)中的哈希表是一种将键映射到值的数据结构。
这使用哈希函数来计算键的索引。
基于哈希表索引,我们可以将值存储在适当的位置。
如果两个不同的键获得相同的索引,则需要使用其他数据结构(存储桶)来解决这些冲突。
使用哈希表的全部好处在于访问时间非常快。
虽然可能会发生冲突,但是如果我们选择一个很好的哈希函数,则该机会几乎为零。
因此,平均而言,时间复杂度是常数O(1)访问时间。
这称为摊销时间复杂度。
C++ STL(标准模板库)具有" std :: unordered_map()"数据结构,该数据结构实现了所有这些哈希表功能。
但是,了解如何从头开始构建哈希表是一项至关重要的技能,而这正是我们旨在向您展示的内容。
任何哈希表实现都具有以下三个组成部分:
良好的哈希函数,可将键映射到值
支持插入,搜索和删除操作的哈希表数据结构。
解释键冲突的数据结构
选择哈希函数
第一步是选择冲突可能性较低的合理的哈希函数。
但是,由于这只是为了说明,所以我会做相反的事情!逆向心理学,是吗?
在本文中,我们将仅处理字符串(或者C中的字符数组)。
我将使用一个非常简单的哈希函数,该函数简单地将字符串的ASCII值求和。
我正在使用它来向您展示我们将如何处理冲突案例。
#define CAPACITY 50000 //Size of the Hash Table unsigned long hash_function(char* str) { unsigned long i = 0; for (int j=0; str[j]; j++) i += str[j]; return i % CAPACITY; }
您可以针对不同的字符串进行测试,并检查它们是否冲突。
例如,字符串" Hel"和" Cau"将冲突,因为它们具有相同的ASCII值。
注意:我们必须返回哈希表容量之内的数字。
否则,我们可能会访问未绑定的内存位置,从而导致错误。
定义哈希表数据结构
哈希表是一组项目,它们本身是{key:value}对。
让我们先定义项目结构。
typedef struct Ht_item Ht_item; //Define the Hash Table Item here struct Ht_item { char* key; char* value; };
现在,哈希表具有一个指向" Ht_item"的指针数组,因此它是一个双指针。
除此之外,我们还将使用count来跟踪Hash表中的元素数量,并将表的大小存储在size中。
typedef struct HashTable HashTable; //Define the Hash Table here struct HashTable { //Contains an array of pointers //to items Ht_item** items; int size; int count; };
创建哈希表及其项
我们需要函数来在内存中创建一个新的哈希表,并创建其项。
首先创建一个项目。
这非常简单,因为我们只需要为其键和值分配内存并返回指向该项的指针。
Ht_item* create_item(char* key, char* value) { //Creates a pointer to a new hash table item Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item)); item->key = (char*) malloc (strlen(key) + 1); item->value = (char*) malloc (strlen(value) + 1); strcpy(item->key, key); strcpy(item->value, value); return item; }
现在,让我们编写用于创建表格的代码。
这将为包装器结构" HashTable"分配内存,并将其所有项目设置为" NULL"(因为未使用它们)。
HashTable* create_table(int size) { //Creates a new HashTable HashTable* table = (HashTable*) malloc (sizeof(HashTable)); table->size = size; table->count = 0; table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*)); for (int i=0; i<table->size; i++) table->items[i] = NULL; return table; }
现在,我们几乎完成了这部分。
作为C/C++程序员,您有责任使用malloc()
,calloc()
释放已在堆上分配的内存。
因此,让我们编写一些函数,这些函数可以释放表项以及整个表。
void free_item(Ht_item* item) { //Frees an item free(item->key); free(item->value); free(item); } void free_table(HashTable* table) { //Frees the table for (int i=0; i<table->size; i++) { Ht_item* item = table->items[i]; if (item != NULL) free_item(item); } free(table->items); free(table); }
现在,我们已经完成了构建功能哈希表的基础工作。
现在让我们开始编写我们的insert()
,search()
和delete()
方法。
插入哈希表
我们将创建一个函数ht_insert()来为我们执行此任务。
这需要一个"哈希表"指针,一个"键"和一个"值"作为参数。
void ht_insert(HashTable* table, char* key, char* value);
现在,插入功能涉及某些步骤。
- 根据{key:value}对创建项目
- 基于哈希函数计算索引
- 通过比较键检查索引是否已被占用,如果没有被占用。
我们可以直接将其插入index
- 否则,这是一次碰撞,我们需要处理它
我将说明创建初始模型后如何处理碰撞,因此让我们稍等一下。
第一步很简单。
我们直接调用create_item(key,value)
。
int index = hash_function(key);
第二步和第三步使用" hash_function(key)"来获取索引。
如果是第一次插入密钥,则该项必须为" NULL"。
否则,确切的key:value对已经存在,或者是冲突。
在这种情况下,我们将定义另一个函数handle_collision()
,顾名思义,它将为我们处理这种潜在的碰撞。
//Create the item Ht_item* item = create_item(key, value); //Compute the index int index = hash_function(key); Ht_item* current_item = table->items[index]; if (current_item == NULL) { //Key does not exist. if (table->count == table->size) { //Hash Table Full printf("Insert Error: Hash Table is full\n"); free_item(item); return; } //Insert directly table->items[index] = item; table->count++; }
让我们考虑第一种情况,其中存在key:value对(即,之前已插入相同的项目)。
在这种情况下,我们必须仅将项目值更新为新值。
if (current_item == NULL) { .... .... } else { //Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; } else { //Scenario 2: Collision //We will handle case this a bit later handle_collision(table, item); return; } }
好的,所以我们的插入函数(没有冲突)现在看起来像这样:
void handle_collision(HashTable* table, Ht_item* item) { } void ht_insert(HashTable* table, char* key, char* value) { //Create the item Ht_item* item = create_item(key, value); Ht_item* current_item = table->items[index]; if (current_item == NULL) { //Key does not exist. if (table->count == table->size) { //Hash Table Full printf("Insert Error: Hash Table is full\n"); return; } //Insert directly table->items[index] = item; table->count++; } else { //Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; } else { //Scenario 2: Collision //We will handle case this a bit later handle_collision(table, item); return; } } }
在哈希表中搜索项目
如果要检查插入是否正确完成,则还需要定义一个搜索功能,该功能检查键是否存在,如果存在则返回相应的值。
char* ht_search(HastTable* table, char* key);
逻辑很简单。
它只是移至非NULL项并比较键。
否则,我们将返回NULL
char* ht_search(HashTable* table, char* key) { //Searches the key in the hashtable //and returns NULL if it doesn't exist int index = hash_function(key); Ht_item* item = table->items[index]; //Ensure that we move to a non NULL item if (item != NULL) { if (strcmp(item->key, key) == 0) return item->value; } return NULL; }
测试我们的基本模型
让我们通过编写一个" main()"驱动程序来检查我们编写的内容是否正确。
我们将添加另一个函数print_table()
来打印哈希表,以供说明。
#include <stdio.h> #include <stdlib.h> #include <string.h> #define CAPACITY 50000 //Size of the Hash Table unsigned long hash_function(char* str) { unsigned long i = 0; for (int j=0; str[j]; j++) i += str[j]; return i % CAPACITY; } typedef struct Ht_item Ht_item; //Define the Hash Table Item here struct Ht_item { char* key; char* value; }; typedef struct HashTable HashTable; //Define the Hash Table here struct HashTable { //Contains an array of pointers //to items Ht_item** items; int size; int count; }; Ht_item* create_item(char* key, char* value) { //Creates a pointer to a new hash table item Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item)); item->key = (char*) malloc (strlen(key) + 1); item->value = (char*) malloc (strlen(value) + 1); strcpy(item->key, key); strcpy(item->value, value); return item; } HashTable* create_table(int size) { //Creates a new HashTable HashTable* table = (HashTable*) malloc (sizeof(HashTable)); table->size = size; table->count = 0; table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*)); for (int i=0; i<table->size; i++) table->items[i] = NULL; return table; } void free_item(Ht_item* item) { //Frees an item free(item->key); free(item->value); free(item); } void free_table(HashTable* table) { //Frees the table for (int i=0; i<table->size; i++) { Ht_item* item = table->items[i]; if (item != NULL) free_item(item); } free(table->items); free(table); } void handle_collision(HashTable* table, unsigned long index, Ht_item* item) { } void ht_insert(HashTable* table, char* key, char* value) { //Create the item Ht_item* item = create_item(key, value); //Compute the index unsigned long index = hash_function(key); Ht_item* current_item = table->items[index]; if (current_item == NULL) { //Key does not exist. if (table->count == table->size) { //Hash Table Full printf("Insert Error: Hash Table is full\n"); //Remove the create item free_item(item); return; } //Insert directly table->items[index] = item; table->count++; } else { //Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; } else { //Scenario 2: Collision //We will handle case this a bit later handle_collision(table, index, item); return; } } } char* ht_search(HashTable* table, char* key) { //Searches the key in the hashtable //and returns NULL if it doesn't exist int index = hash_function(key); Ht_item* item = table->items[index]; //Ensure that we move to a non NULL item if (item != NULL) { if (strcmp(item->key, key) == 0) return item->value; } return NULL; } void print_search(HashTable* table, char* key) { char* val; if ((val = ht_search(table, key)) == NULL) { printf("Key:%s does not exist\n", key); return; } else { printf("Key:%s, Value:%s\n", key, val); } } void print_table(HashTable* table) { printf("\nHash Table\n-------------------\n"); for (int i=0; i<table->size; i++) { if (table->items[i]) { printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value); } } printf("-------------------\n\n"); } int main() { HashTable* ht = create_table(CAPACITY); ht_insert(ht, "1", "First address"); ht_insert(ht, "2", "Second address"); print_search(ht, "1"); print_search(ht, "2"); print_search(ht, "3"); print_table(ht); free_table(ht); return 0; }
输出
Key:1, Value:First address Key:2, Value:Second address Key:3 does not exist Hash Table ------------------ Index:49, Key:1, Value:First address Index:50, Key:2, Value:Second address ------------------
这似乎按预期工作,所以现在让我们继续处理冲突。
处理碰撞
有多种解决冲突的方法。
我们将研究一种称为"独立链接"的方法,该方法旨在为具有相同哈希索引的所有项目创建独立的链。
我们将使用链接列表创建这些链。
每当发生碰撞时,我们都会添加其他项目,这些项目会在"溢出存储桶列表"的同一索引上发生碰撞。
因此,我们将不必删除哈希表上的任何现有记录。
由于链表在插入,搜索和删除时具有O(n)时间复杂度,因此在发生冲突的情况下,我们也将具有O(n)的最坏情况访问时间。
此方法的优点是,如果您的哈希表的容量较低,则是一个不错的选择。
使用链表进行单独链接-维基百科
涵盖了这一点,让我们开始实施我们的旧链接列表!
typedef struct LinkedList LinkedList; //Define the Linkedlist here struct LinkedList { Ht_item* item; LinkedList* next; }; LinkedList* allocate_list () { //Allocates memory for a Linkedlist pointer LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList)); return list; } LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) { //Inserts the item onto the Linked List if (!list) { LinkedList* head = allocate_list(); head->item = item; head->next = NULL; list = head; return list; } else if (list->next == NULL) { LinkedList* node = allocate_list(); node->item = item; node->next = NULL; list->next = node; return list; } LinkedList* temp = list; while (temp->next->next) { temp = temp->next; } LinkedList* node = allocate_list(); node->item = item; node->next = NULL; temp->next = node; return list; } Ht_item* linkedlist_remove(LinkedList* list) { //Removes the head from the linked list //and returns the item of the popped element if (!list) return NULL; if (!list->next) return NULL; LinkedList* node = list->next; LinkedList* temp = list; temp->next = NULL; list = node; Ht_item* it = NULL; memcpy(temp->item, it, sizeof(Ht_item)); free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); return it; } void free_linkedlist(LinkedList* list) { LinkedList* temp = list; while (list) { temp = list; list = list->next; free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); } }
现在,我们需要将这些"溢出存储桶"列表添加到哈希表中。
我们希望每个项目都有一个这样的链,因此对于整个表,它是LinkedList指针的数组。
typedef struct HashTable HashTable; //Define the Hash Table here struct HashTable { //Contains an array of pointers //to items Ht_item** items; LinkedList** overflow_buckets; int size; int count; };
现在,我们定义了overflow_buckets
,我们将添加函数来创建和删除它们。
我们还需要在旧的create_table()和free_table()函数中考虑它们。
LinkedList** create_overflow_buckets(HashTable* table) { //Create the overflow buckets; an array of linkedlists LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*)); for (int i=0; i<table->size; i++) buckets[i] = NULL; return buckets; } void free_overflow_buckets(HashTable* table) { //Free all the overflow bucket lists LinkedList** buckets = table->overflow_buckets; for (int i=0; i<table->size; i++) free_linkedlist(buckets[i]); free(buckets); } HashTable* create_table(int size) { //Creates a new HashTable HashTable* table = (HashTable*) malloc (sizeof(HashTable)); table->size = size; table->count = 0; table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*)); for (int i=0; i<table->size; i++) table->items[i] = NULL; table->overflow_buckets = create_overflow_buckets(table); return table; } void free_table(HashTable* table) { //Frees the table for (int i=0; i<table->size; i++) { Ht_item* item = table->items[i]; if (item != NULL) free_item(item); } //Free the overflow bucket linked linkedlist and it's items free_overflow_buckets(table); free(table->items); free(table); }
!我们还有工作要做。
现在,转到" handle_collision()"函数。
这里有两种情况。
如果该项目的溢出存储区列表不存在,我们需要创建一个这样的列表并将该项目添加到其中。
否则,我们可以简单地将项目插入列表
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) { LinkedList* head = table->overflow_buckets[index]; if (head == NULL) { //We need to create the list head = allocate_list(); head->item = item; table->overflow_buckets[index] = head; return; } else { //Insert to the list table->overflow_buckets[index] = linkedlist_insert(head, item); return; } }
插入完成了,但是现在,我们还需要更新搜索功能,因为我们可能还需要查看溢出存储桶。
char* ht_search(HashTable* table, char* key) { //Searches the key in the hashtable //and returns NULL if it doesn't exist int index = hash_function(key); Ht_item* item = table->items[index]; LinkedList* head = table->overflow_buckets[index]; //Ensure that we move to items which are not NULL while (item != NULL) { if (strcmp(item->key, key) == 0) return item->value; if (head == NULL) return NULL; item = head->item; head = head->next; } return NULL; }
最后,我们考虑了" insert()"和" search()"中的冲突!
为了提醒您代码的当前状态,我将完整的程序发布到现在。
#include <stdio.h> #include <stdlib.h> #include <string.h> #define CAPACITY 50000 //Size of the Hash Table unsigned long hash_function(char* str) { unsigned long i = 0; for (int j=0; str[j]; j++) i += str[j]; return i % CAPACITY; } typedef struct Ht_item Ht_item; //Define the Hash Table Item here struct Ht_item { char* key; char* value; }; typedef struct LinkedList LinkedList; //Define the Linkedlist here struct LinkedList { Ht_item* item; LinkedList* next; }; typedef struct HashTable HashTable; //Define the Hash Table here struct HashTable { //Contains an array of pointers //to items Ht_item** items; LinkedList** overflow_buckets; int size; int count; }; static LinkedList* allocate_list () { //Allocates memory for a Linkedlist pointer LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList)); return list; } static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) { //Inserts the item onto the Linked List if (!list) { LinkedList* head = allocate_list(); head->item = item; head->next = NULL; list = head; return list; } else if (list->next == NULL) { LinkedList* node = allocate_list(); node->item = item; node->next = NULL; list->next = node; return list; } LinkedList* temp = list; while (temp->next->next) { temp = temp->next; } LinkedList* node = allocate_list(); node->item = item; node->next = NULL; temp->next = node; return list; } static Ht_item* linkedlist_remove(LinkedList* list) { //Removes the head from the linked list //and returns the item of the popped element if (!list) return NULL; if (!list->next) return NULL; LinkedList* node = list->next; LinkedList* temp = list; temp->next = NULL; list = node; Ht_item* it = NULL; memcpy(temp->item, it, sizeof(Ht_item)); free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); return it; } static void free_linkedlist(LinkedList* list) { LinkedList* temp = list; while (list) { temp = list; list = list->next; free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); } } static LinkedList** create_overflow_buckets(HashTable* table) { //Create the overflow buckets; an array of linkedlists LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*)); for (int i=0; i<table->size; i++) buckets[i] = NULL; return buckets; } static void free_overflow_buckets(HashTable* table) { //Free all the overflow bucket lists LinkedList** buckets = table->overflow_buckets; for (int i=0; i<table->size; i++) free_linkedlist(buckets[i]); free(buckets); } Ht_item* create_item(char* key, char* value) { //Creates a pointer to a new hash table item Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item)); item->key = (char*) malloc (strlen(key) + 1); item->value = (char*) malloc (strlen(value) + 1); strcpy(item->key, key); strcpy(item->value, value); return item; } HashTable* create_table(int size) { //Creates a new HashTable HashTable* table = (HashTable*) malloc (sizeof(HashTable)); table->size = size; table->count = 0; table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*)); for (int i=0; i<table->size; i++) table->items[i] = NULL; table->overflow_buckets = create_overflow_buckets(table); return table; } void free_item(Ht_item* item) { //Frees an item free(item->key); free(item->value); free(item); } void free_table(HashTable* table) { //Frees the table for (int i=0; i<table->size; i++) { Ht_item* item = table->items[i]; if (item != NULL) free_item(item); } free_overflow_buckets(table); free(table->items); free(table); } void handle_collision(HashTable* table, unsigned long index, Ht_item* item) { LinkedList* head = table->overflow_buckets[index]; if (head == NULL) { //We need to create the list head = allocate_list(); head->item = item; table->overflow_buckets[index] = head; return; } else { //Insert to the list table->overflow_buckets[index] = linkedlist_insert(head, item); return; } } void ht_insert(HashTable* table, char* key, char* value) { //Create the item Ht_item* item = create_item(key, value); //Compute the index unsigned long index = hash_function(key); Ht_item* current_item = table->items[index]; if (current_item == NULL) { //Key does not exist. if (table->count == table->size) { //Hash Table Full printf("Insert Error: Hash Table is full\n"); //Remove the create item free_item(item); return; } //Insert directly table->items[index] = item; table->count++; } else { //Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; } else { //Scenario 2: Collision handle_collision(table, index, item); return; } } } char* ht_search(HashTable* table, char* key) { //Searches the key in the hashtable //and returns NULL if it doesn't exist int index = hash_function(key); Ht_item* item = table->items[index]; LinkedList* head = table->overflow_buckets[index]; //Ensure that we move to items which are not NULL while (item != NULL) { if (strcmp(item->key, key) == 0) return item->value; if (head == NULL) return NULL; item = head->item; head = head->next; } return NULL; } void print_search(HashTable* table, char* key) { char* val; if ((val = ht_search(table, key)) == NULL) { printf("%s does not exist\n", key); return; } else { printf("Key:%s, Value:%s\n", key, val); } } void print_table(HashTable* table) { printf("\n-------------------\n"); for (int i=0; i<table->size; i++) { if (table->items[i]) { printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value); if (table->overflow_buckets[i]) { printf(" => Overflow Bucket => "); LinkedList* head = table->overflow_buckets[i]; while (head) { printf("Key:%s, Value:%s ", head->item->key, head->item->value); head = head->next; } } printf("\n"); } } printf("-------------------\n"); } int main() { HashTable* ht = create_table(CAPACITY); ht_insert(ht, "1", "First address"); ht_insert(ht, "2", "Second address"); ht_insert(ht, "Hel", "Third address"); ht_insert(ht, "Cau", "Fourth address"); print_search(ht, "1"); print_search(ht, "2"); print_search(ht, "3"); print_search(ht, "Hel"); print_search(ht, "Cau"); print_table(ht); free_table(ht); return 0; }
从哈希表中删除
现在,让我们最后看看删除功能:
void ht_delete(HashTable* table, char* key);
同样,该方法类似于插入。
- 计算哈希索引并获取项目
- 如果为NULL,则无需执行任何操作
- 否则,在比较键之后,该索引没有碰撞链,只需从表中删除该项目
- 如果存在碰撞链,我们必须删除该元素并相应地移动链接
我没有为您提供太多细节,因为这仅涉及更新标题项和释放内存。
我的建议是尝试自己实施。
我将为您提供一个工作版本以进行比较。
void ht_delete(HashTable* table, char* key) { //Deletes an item from the table int index = hash_function(key); Ht_item* item = table->items[index]; LinkedList* head = table->overflow_buckets[index]; if (item == NULL) { //Does not exist. Return return; } else { if (head == NULL && strcmp(item->key, key) == 0) { //No collision chain. Remove the item //and set table index to NULL table->items[index] = NULL; free_item(item); table->count--; return; } else if (head != NULL) { //Collision Chain exists if (strcmp(item->key, key) == 0) { //Remove this item and set the head of the list //as the new item free_item(item); LinkedList* node = head; head = head->next; node->next = NULL; table->items[index] = create_item(node->item->key, node->item->value); free_linkedlist(node); table->overflow_buckets[index] = head; return; } LinkedList* curr = head; LinkedList* prev = NULL; while (curr) { if (strcmp(curr->item->key, key) == 0) { if (prev == NULL) { //First element of the chain. Remove the chain free_linkedlist(head); table->overflow_buckets[index] = NULL; return; } else { //This is somewhere in the chain prev->next = curr->next; curr->next = NULL; free_linkedlist(curr); table->overflow_buckets[index] = head; return; } } curr = curr->next; prev = curr; } } } }
全部放在一起
现在,最后,我将显示哈希表的整个程序。
#include <stdio.h> #include <stdlib.h> #include <string.h> #define CAPACITY 50000 //Size of the Hash Table unsigned long hash_function(char* str) { unsigned long i = 0; for (int j=0; str[j]; j++) i += str[j]; return i % CAPACITY; } typedef struct Ht_item Ht_item; //Define the Hash Table Item here struct Ht_item { char* key; char* value; }; typedef struct LinkedList LinkedList; //Define the Linkedlist here struct LinkedList { Ht_item* item; LinkedList* next; }; typedef struct HashTable HashTable; //Define the Hash Table here struct HashTable { //Contains an array of pointers //to items Ht_item** items; LinkedList** overflow_buckets; int size; int count; }; static LinkedList* allocate_list () { //Allocates memory for a Linkedlist pointer LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList)); return list; } static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) { //Inserts the item onto the Linked List if (!list) { LinkedList* head = allocate_list(); head->item = item; head->next = NULL; list = head; return list; } else if (list->next == NULL) { LinkedList* node = allocate_list(); node->item = item; node->next = NULL; list->next = node; return list; } LinkedList* temp = list; while (temp->next->next) { temp = temp->next; } LinkedList* node = allocate_list(); node->item = item; node->next = NULL; temp->next = node; return list; } static Ht_item* linkedlist_remove(LinkedList* list) { //Removes the head from the linked list //and returns the item of the popped element if (!list) return NULL; if (!list->next) return NULL; LinkedList* node = list->next; LinkedList* temp = list; temp->next = NULL; list = node; Ht_item* it = NULL; memcpy(temp->item, it, sizeof(Ht_item)); free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); return it; } static void free_linkedlist(LinkedList* list) { LinkedList* temp = list; while (list) { temp = list; list = list->next; free(temp->item->key); free(temp->item->value); free(temp->item); free(temp); } } static LinkedList** create_overflow_buckets(HashTable* table) { //Create the overflow buckets; an array of linkedlists LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*)); for (int i=0; i<table->size; i++) buckets[i] = NULL; return buckets; } static void free_overflow_buckets(HashTable* table) { //Free all the overflow bucket lists LinkedList** buckets = table->overflow_buckets; for (int i=0; i<table->size; i++) free_linkedlist(buckets[i]); free(buckets); } Ht_item* create_item(char* key, char* value) { //Creates a pointer to a new hash table item Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item)); item->key = (char*) malloc (strlen(key) + 1); item->value = (char*) malloc (strlen(value) + 1); strcpy(item->key, key); strcpy(item->value, value); return item; } HashTable* create_table(int size) { //Creates a new HashTable HashTable* table = (HashTable*) malloc (sizeof(HashTable)); table->size = size; table->count = 0; table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*)); for (int i=0; i<table->size; i++) table->items[i] = NULL; table->overflow_buckets = create_overflow_buckets(table); return table; } void free_item(Ht_item* item) { //Frees an item free(item->key); free(item->value); free(item); } void free_table(HashTable* table) { //Frees the table for (int i=0; i<table->size; i++) { Ht_item* item = table->items[i]; if (item != NULL) free_item(item); } free_overflow_buckets(table); free(table->items); free(table); } void handle_collision(HashTable* table, unsigned long index, Ht_item* item) { LinkedList* head = table->overflow_buckets[index]; if (head == NULL) { //We need to create the list head = allocate_list(); head->item = item; table->overflow_buckets[index] = head; return; } else { //Insert to the list table->overflow_buckets[index] = linkedlist_insert(head, item); return; } } void ht_insert(HashTable* table, char* key, char* value) { //Create the item Ht_item* item = create_item(key, value); //Compute the index unsigned long index = hash_function(key); Ht_item* current_item = table->items[index]; if (current_item == NULL) { //Key does not exist. if (table->count == table->size) { //Hash Table Full printf("Insert Error: Hash Table is full\n"); //Remove the create item free_item(item); return; } //Insert directly table->items[index] = item; table->count++; } else { //Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; } else { //Scenario 2: Collision handle_collision(table, index, item); return; } } } char* ht_search(HashTable* table, char* key) { //Searches the key in the hashtable //and returns NULL if it doesn't exist int index = hash_function(key); Ht_item* item = table->items[index]; LinkedList* head = table->overflow_buckets[index]; //Ensure that we move to items which are not NULL while (item != NULL) { if (strcmp(item->key, key) == 0) return item->value; if (head == NULL) return NULL; item = head->item; head = head->next; } return NULL; } void ht_delete(HashTable* table, char* key) { //Deletes an item from the table int index = hash_function(key); Ht_item* item = table->items[index]; LinkedList* head = table->overflow_buckets[index]; if (item == NULL) { //Does not exist. Return return; } else { if (head == NULL && strcmp(item->key, key) == 0) { //No collision chain. Remove the item //and set table index to NULL table->items[index] = NULL; free_item(item); table->count--; return; } else if (head != NULL) { //Collision Chain exists if (strcmp(item->key, key) == 0) { //Remove this item and set the head of the list //as the new item free_item(item); LinkedList* node = head; head = head->next; node->next = NULL; table->items[index] = create_item(node->item->key, node->item->value); free_linkedlist(node); table->overflow_buckets[index] = head; return; } LinkedList* curr = head; LinkedList* prev = NULL; while (curr) { if (strcmp(curr->item->key, key) == 0) { if (prev == NULL) { //First element of the chain. Remove the chain free_linkedlist(head); table->overflow_buckets[index] = NULL; return; } else { //This is somewhere in the chain prev->next = curr->next; curr->next = NULL; free_linkedlist(curr); table->overflow_buckets[index] = head; return; } } curr = curr->next; prev = curr; } } } } void print_search(HashTable* table, char* key) { char* val; if ((val = ht_search(table, key)) == NULL) { printf("%s does not exist\n", key); return; } else { printf("Key:%s, Value:%s\n", key, val); } } void print_table(HashTable* table) { printf("\n-------------------\n"); for (int i=0; i<table->size; i++) { if (table->items[i]) { printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value); if (table->overflow_buckets[i]) { printf(" => Overflow Bucket => "); LinkedList* head = table->overflow_buckets[i]; while (head) { printf("Key:%s, Value:%s ", head->item->key, head->item->value); head = head->next; } } printf("\n"); } } printf("-------------------\n"); } int main() { HashTable* ht = create_table(CAPACITY); ht_insert(ht, "1", "First address"); ht_insert(ht, "2", "Second address"); ht_insert(ht, "Hel", "Third address"); ht_insert(ht, "Cau", "Fourth address"); print_search(ht, "1"); print_search(ht, "2"); print_search(ht, "3"); print_search(ht, "Hel"); print_search(ht, "Cau"); //Collision! print_table(ht); ht_delete(ht, "1"); ht_delete(ht, "Cau"); print_table(ht); free_table(ht); return 0; }
输出
Key:1, Value:First address Key:2, Value:Second address 3 does not exist Key:Hel, Value:Third address Key:Cau, Value:Fourth address ------------------ Index:49, Key:1, Value:First address Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address => Overflow Bucket => Key:Cau, Value:Fourth address ------------------ ------------------ Index:50, Key:2, Value:Second address Index:281, Key:Hel, Value:Third address ------------------