C/C++中的哈希表实现

时间:2020-02-23 14:29:58  来源:igfitidea点击:

C/C++(关联数组)中的哈希表是一种将键映射到值的数据结构。
这使用哈希函数来计算键的索引。

基于哈希表索引,我们可以将值存储在适当的位置。

如果两个不同的键获得相同的索引,则需要使用其他数据结构(存储桶)来解决这些冲突。

使用哈希表的全部好处在于访问时间非常快。
虽然可能会发生冲突,但是如果我们选择一个很好的哈希函数,则该机会几乎为零。

因此,平均而言,时间复杂度是常数O(1)访问时间。
这称为摊销时间复杂度。

C++ STL(标准模板库)具有" std :: unordered_map()"数据结构,该数据结构实现了所有这些哈希表功能。

但是,了解如何从头开始构建哈希表是一项至关重要的技能,而这正是我们旨在向您展示的内容。

任何哈希表实现都具有以下三个组成部分:

  • 良好的哈希函数,可将键映射到值

  • 支持插入,搜索和删除操作的哈希表数据结构。

  • 解释键冲突的数据结构

选择哈希函数

第一步是选择冲突可能性较低的合理的哈希函数。
但是,由于这只是为了说明,所以我会做相反的事情!逆向心理学,是吗?

在本文中,我们将仅处理字符串(或者C中的字符数组)。

我将使用一个非常简单的哈希函数,该函数简单地将字符串的ASCII值求和。
我正在使用它来向您展示我们将如何处理冲突案例。

#define CAPACITY 50000 //Size of the Hash Table

unsigned long hash_function(char* str) {
  unsigned long i = 0;
  for (int j=0; str[j]; j++)
      i += str[j];
  return i % CAPACITY;
}

您可以针对不同的字符串进行测试,并检查它们是否冲突。
例如,字符串" Hel"和" Cau"将冲突,因为它们具有相同的ASCII值。

注意:我们必须返回哈希表容量之内的数字。
否则,我们可能会访问未绑定的内存位置,从而导致错误。

定义哈希表数据结构

哈希表是一组项目,它们本身是{key:value}对。

让我们先定义项目结构。

typedef struct Ht_item Ht_item;

//Define the Hash Table Item here
struct Ht_item {
  char* key;
  char* value;
};

现在,哈希表具有一个指向" Ht_item"的指针数组,因此它是一个双指针。

除此之外,我们还将使用count来跟踪Hash表中的元素数量,并将表的大小存储在size中。

typedef struct HashTable HashTable;

//Define the Hash Table here
struct HashTable {
  //Contains an array of pointers
  //to items
  Ht_item** items;
  int size;
  int count;
};

创建哈希表及其项

我们需要函数来在内存中创建一个新的哈希表,并创建其项。

首先创建一个项目。
这非常简单,因为我们只需要为其键和值分配内存并返回指向该项的指针。

Ht_item* create_item(char* key, char* value) {
  //Creates a pointer to a new hash table item
  Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
  item->key = (char*) malloc (strlen(key) + 1);
  item->value = (char*) malloc (strlen(value) + 1);
  
  strcpy(item->key, key);
  strcpy(item->value, value);

  return item;
}

现在,让我们编写用于创建表格的代码。
这将为包装器结构" HashTable"分配内存,并将其所有项目设置为" NULL"(因为未使用它们)。

HashTable* create_table(int size) {
  //Creates a new HashTable
  HashTable* table = (HashTable*) malloc (sizeof(HashTable));
  table->size = size;
  table->count = 0;
  table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
  for (int i=0; i<table->size; i++)
      table->items[i] = NULL;

  return table;
}

现在,我们几乎完成了这部分。
作为C/C++程序员,您有责任使用malloc()calloc()释放已在堆上分配的内存。

因此,让我们编写一些函数,这些函数可以释放表项以及整个表。

void free_item(Ht_item* item) {
  //Frees an item
  free(item->key);
  free(item->value);
  free(item);
}

void free_table(HashTable* table) {
  //Frees the table
  for (int i=0; i<table->size; i++) {
      Ht_item* item = table->items[i];
      if (item != NULL)
          free_item(item);
  }

  free(table->items);
  free(table);
}

现在,我们已经完成了构建功能哈希表的基础工作。
现在让我们开始编写我们的insert()search()delete()方法。

插入哈希表

我们将创建一个函数ht_insert()来为我们执行此任务。

这需要一个"哈希表"指针,一个"键"和一个"值"作为参数。

void ht_insert(HashTable* table, char* key, char* value);

现在,插入功能涉及某些步骤。

  • 根据{key:value}对创建项目
  • 基于哈希函数计算索引
  • 通过比较键检查索引是否已被占用,如果没有被占用。
    我们可以直接将其插入index
  • 否则,这是一次碰撞,我们需要处理它

我将说明创建初始模型后如何处理碰撞,因此让我们稍等一下。

第一步很简单。
我们直接调用create_item(key,value)

int index = hash_function(key);

第二步和第三步使用" hash_function(key)"来获取索引。
如果是第一次插入密钥,则该项必须为" NULL"。
否则,确切的key:value对已经存在,或者是冲突。

在这种情况下,我们将定义另一个函数handle_collision(),顾名思义,它将为我们处理这种潜在的碰撞。

//Create the item
Ht_item* item = create_item(key, value);

//Compute the index
int index = hash_function(key);

Ht_item* current_item = table->items[index];
if (current_item == NULL) {
  //Key does not exist.
  if (table->count == table->size) {
      //Hash Table Full
      printf("Insert Error: Hash Table is full\n");
      free_item(item);
      return;
  }
  
  //Insert directly
  table->items[index] = item; 
  table->count++;
}

让我们考虑第一种情况,其中存在key:value对(即,之前已插入相同的项目)。
在这种情况下,我们必须仅将项目值更新为新值。

if (current_item == NULL) {
  ....
  ....
}
else {
      //Scenario 1: We only need to update value
      if (strcmp(current_item->key, key) == 0) {
          strcpy(table->items[index]->value, value);
          return;
      }
  else {
      //Scenario 2: Collision
      //We will handle case this a bit later
      handle_collision(table, item);
      return;
  }
}

好的,所以我们的插入函数(没有冲突)现在看起来像这样:

void handle_collision(HashTable* table, Ht_item* item) {
}

void ht_insert(HashTable* table, char* key, char* value) {
	//Create the item
	Ht_item* item = create_item(key, value);

	Ht_item* current_item = table->items[index];
	
	if (current_item == NULL) {
	    //Key does not exist.
	    if (table->count == table->size) {
	        //Hash Table Full
	        printf("Insert Error: Hash Table is full\n");
	        return;
	    }
	    
	    //Insert directly
	    table->items[index] = item; 
	    table->count++;
	}

	else {
  		//Scenario 1: We only need to update value
  		if (strcmp(current_item->key, key) == 0) {
      		strcpy(table->items[index]->value, value);
      		return;
  		}
  
  	else {
			//Scenario 2: Collision
		  	//We will handle case this a bit later
        	handle_collision(table, item);
        	return;
  	}
	}
}

在哈希表中搜索项目

如果要检查插入是否正确完成,则还需要定义一个搜索功能,该功能检查键是否存在,如果存在则返回相应的值。

char* ht_search(HastTable* table, char* key);

逻辑很简单。
它只是移至非NULL项并比较键。
否则,我们将返回NULL

char* ht_search(HashTable* table, char* key) {
  //Searches the key in the hashtable
  //and returns NULL if it doesn't exist
  int index = hash_function(key);
  Ht_item* item = table->items[index];

  //Ensure that we move to a non NULL item
  if (item != NULL) {
      if (strcmp(item->key, key) == 0)
          return item->value;
  }
  return NULL;
}

测试我们的基本模型

让我们通过编写一个" main()"驱动程序来检查我们编写的内容是否正确。

我们将添加另一个函数print_table()来打印哈希表,以供说明。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAPACITY 50000 //Size of the Hash Table

unsigned long hash_function(char* str) {
  unsigned long i = 0;
  for (int j=0; str[j]; j++)
      i += str[j];
  return i % CAPACITY;
}

typedef struct Ht_item Ht_item;

//Define the Hash Table Item here
struct Ht_item {
  char* key;
  char* value;
};

typedef struct HashTable HashTable;

//Define the Hash Table here
struct HashTable {
  //Contains an array of pointers
  //to items
  Ht_item** items;
  int size;
  int count;
};

Ht_item* create_item(char* key, char* value) {
  //Creates a pointer to a new hash table item
  Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
  item->key = (char*) malloc (strlen(key) + 1);
  item->value = (char*) malloc (strlen(value) + 1);
  
  strcpy(item->key, key);
  strcpy(item->value, value);

  return item;
}

HashTable* create_table(int size) {
  //Creates a new HashTable
  HashTable* table = (HashTable*) malloc (sizeof(HashTable));
  table->size = size;
  table->count = 0;
  table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
  for (int i=0; i<table->size; i++)
      table->items[i] = NULL;

  return table;
}

void free_item(Ht_item* item) {
  //Frees an item
  free(item->key);
  free(item->value);
  free(item);
}

void free_table(HashTable* table) {
  //Frees the table
  for (int i=0; i<table->size; i++) {
      Ht_item* item = table->items[i];
      if (item != NULL)
          free_item(item);
  }

  free(table->items);
  free(table);
}

void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
}

void ht_insert(HashTable* table, char* key, char* value) {
  //Create the item
  Ht_item* item = create_item(key, value);

  //Compute the index
  unsigned long index = hash_function(key);

  Ht_item* current_item = table->items[index];
  
  if (current_item == NULL) {
      //Key does not exist.
      if (table->count == table->size) {
          //Hash Table Full
          printf("Insert Error: Hash Table is full\n");
          //Remove the create item
          free_item(item);
          return;
      }
      
      //Insert directly
      table->items[index] = item; 
      table->count++;
  }

  else {
          //Scenario 1: We only need to update value
          if (strcmp(current_item->key, key) == 0) {
              strcpy(table->items[index]->value, value);
              return;
          }
  
      else {
          //Scenario 2: Collision
          //We will handle case this a bit later
          handle_collision(table, index, item);
          return;
      }
  }
}

char* ht_search(HashTable* table, char* key) {
  //Searches the key in the hashtable
  //and returns NULL if it doesn't exist
  int index = hash_function(key);
  Ht_item* item = table->items[index];

  //Ensure that we move to a non NULL item
  if (item != NULL) {
      if (strcmp(item->key, key) == 0)
          return item->value;
  }
  return NULL;
}

void print_search(HashTable* table, char* key) {
  char* val;
  if ((val = ht_search(table, key)) == NULL) {
      printf("Key:%s does not exist\n", key);
      return;
  }
  else {
      printf("Key:%s, Value:%s\n", key, val);
  }
}

void print_table(HashTable* table) {
  printf("\nHash Table\n-------------------\n");
  for (int i=0; i<table->size; i++) {
      if (table->items[i]) {
          printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value);
      }
  }
  printf("-------------------\n\n");
}

int main() {
  HashTable* ht = create_table(CAPACITY);
  ht_insert(ht, "1", "First address");
  ht_insert(ht, "2", "Second address");
  print_search(ht, "1");
  print_search(ht, "2");
  print_search(ht, "3");
  print_table(ht);
  free_table(ht);
  return 0;
}

输出

Key:1, Value:First address
Key:2, Value:Second address
Key:3 does not exist

Hash Table
------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
------------------

这似乎按预期工作,所以现在让我们继续处理冲突。

处理碰撞

有多种解决冲突的方法。
我们将研究一种称为"独立链接"的方法,该方法旨在为具有相同哈希索引的所有项目创建独立的链。

我们将使用链接列表创建这些链。

每当发生碰撞时,我们都会添加其他项目,这些项目会在"溢出存储桶列表"的同一索引上发生碰撞。
因此,我们将不必删除哈希表上的任何现有记录。

由于链表在插入,搜索和删除时具有O(n)时间复杂度,因此在发生冲突的情况下,我们也将具有O(n)的最坏情况访问时间。
此方法的优点是,如果您的哈希表的容量较低,则是一个不错的选择。

使用链表进行单独链接-维基百科

涵盖了这一点,让我们开始实施我们的旧链接列表!

typedef struct LinkedList LinkedList;

//Define the Linkedlist here
struct LinkedList {
  Ht_item* item; 
  LinkedList* next;
};

LinkedList* allocate_list () {
  //Allocates memory for a Linkedlist pointer
  LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
  return list;
}

LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
  //Inserts the item onto the Linked List
  if (!list) {
      LinkedList* head = allocate_list();
      head->item = item;
      head->next = NULL;
      list = head;
      return list;
  } 
  
  else if (list->next == NULL) {
      LinkedList* node = allocate_list();
      node->item = item;
      node->next = NULL;
      list->next = node;
      return list;
  }

  LinkedList* temp = list;
  while (temp->next->next) {
      temp = temp->next;
  }
  
  LinkedList* node = allocate_list();
  node->item = item;
  node->next = NULL;
  temp->next = node;
  
  return list;
}

Ht_item* linkedlist_remove(LinkedList* list) {
  //Removes the head from the linked list
  //and returns the item of the popped element
  if (!list)
      return NULL;
  if (!list->next)
      return NULL;
  LinkedList* node = list->next;
  LinkedList* temp = list;
  temp->next = NULL;
  list = node;
  Ht_item* it = NULL;
  memcpy(temp->item, it, sizeof(Ht_item));
  free(temp->item->key);
  free(temp->item->value);
  free(temp->item);
  free(temp);
  return it;
}

void free_linkedlist(LinkedList* list) {
  LinkedList* temp = list;
  while (list) {
      temp = list;
      list = list->next;
      free(temp->item->key);
      free(temp->item->value);
      free(temp->item);
      free(temp);
  }
}

现在,我们需要将这些"溢出存储桶"列表添加到哈希表中。
我们希望每个项目都有一个这样的链,因此对于整个表,它是LinkedList指针的数组。

typedef struct HashTable HashTable;

//Define the Hash Table here
struct HashTable {
  //Contains an array of pointers
  //to items
  Ht_item** items;
  LinkedList** overflow_buckets;
  int size;
  int count;
};

现在,我们定义了overflow_buckets,我们将添加函数来创建和删除它们。
我们还需要在旧的create_table()和free_table()函数中考虑它们。

LinkedList** create_overflow_buckets(HashTable* table) {
  //Create the overflow buckets; an array of linkedlists
  LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
  for (int i=0; i<table->size; i++)
                 buckets[i] = NULL;
  return buckets;
}

void free_overflow_buckets(HashTable* table) {
  //Free all the overflow bucket lists
  LinkedList** buckets = table->overflow_buckets;
  for (int i=0; i<table->size; i++)
        free_linkedlist(buckets[i]);
  free(buckets);
}

HashTable* create_table(int size) {
  //Creates a new HashTable
  HashTable* table = (HashTable*) malloc (sizeof(HashTable));
  table->size = size;
  table->count = 0;
  table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
  for (int i=0; i<table->size; i++)
      table->items[i] = NULL;
  table->overflow_buckets = create_overflow_buckets(table);

  return table;
}

void free_table(HashTable* table) {
  //Frees the table
  for (int i=0; i<table->size; i++) {
      Ht_item* item = table->items[i];
      if (item != NULL)
          free_item(item);
  }

  //Free the overflow bucket linked linkedlist and it's items
  free_overflow_buckets(table);
  free(table->items);
  free(table);
}

!我们还有工作要做。
现在,转到" handle_collision()"函数。

这里有两种情况。
如果该项目的溢出存储区列表不存在,我们需要创建一个这样的列表并将该项目添加到其中。

否则,我们可以简单地将项目插入列表

void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
  LinkedList* head = table->overflow_buckets[index];

  if (head == NULL) {
      //We need to create the list
      head = allocate_list();
      head->item = item;
      table->overflow_buckets[index] = head;
      return;
  }
  else {
      //Insert to the list
      table->overflow_buckets[index] = linkedlist_insert(head, item);
      return;
  }
 }

插入完成了,但是现在,我们还需要更新搜索功能,因为我们可能还需要查看溢出存储桶。

char* ht_search(HashTable* table, char* key) {
  //Searches the key in the hashtable
  //and returns NULL if it doesn't exist
  int index = hash_function(key);
  Ht_item* item = table->items[index];
  LinkedList* head = table->overflow_buckets[index];

  //Ensure that we move to items which are not NULL
  while (item != NULL) {
      if (strcmp(item->key, key) == 0)
          return item->value;
      if (head == NULL)
          return NULL;
      item = head->item;
      head = head->next;
  }
  return NULL;
}

最后,我们考虑了" insert()"和" search()"中的冲突!

为了提醒您代码的当前状态,我将完整的程序发布到现在。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAPACITY 50000 //Size of the Hash Table

unsigned long hash_function(char* str) {
  unsigned long i = 0;
  for (int j=0; str[j]; j++)
      i += str[j];
  return i % CAPACITY;
}

typedef struct Ht_item Ht_item;

//Define the Hash Table Item here
struct Ht_item {
  char* key;
  char* value;
};

typedef struct LinkedList LinkedList;

//Define the Linkedlist here
struct LinkedList {
  Ht_item* item; 
  LinkedList* next;
};

typedef struct HashTable HashTable;

//Define the Hash Table here
struct HashTable {
  //Contains an array of pointers
  //to items
  Ht_item** items;
  LinkedList** overflow_buckets;
  int size;
  int count;
};

static LinkedList* allocate_list () {
  //Allocates memory for a Linkedlist pointer
  LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
  return list;
}

static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
  //Inserts the item onto the Linked List
  if (!list) {
      LinkedList* head = allocate_list();
      head->item = item;
      head->next = NULL;
      list = head;
      return list;
  } 
  
  else if (list->next == NULL) {
      LinkedList* node = allocate_list();
      node->item = item;
      node->next = NULL;
      list->next = node;
      return list;
  }

  LinkedList* temp = list;
  while (temp->next->next) {
      temp = temp->next;
  }
  
  LinkedList* node = allocate_list();
  node->item = item;
  node->next = NULL;
  temp->next = node;
  
  return list;
}

static Ht_item* linkedlist_remove(LinkedList* list) {
  //Removes the head from the linked list
  //and returns the item of the popped element
  if (!list)
      return NULL;
  if (!list->next)
      return NULL;
  LinkedList* node = list->next;
  LinkedList* temp = list;
  temp->next = NULL;
  list = node;
  Ht_item* it = NULL;
  memcpy(temp->item, it, sizeof(Ht_item));
  free(temp->item->key);
  free(temp->item->value);
  free(temp->item);
  free(temp);
  return it;
}

static void free_linkedlist(LinkedList* list) {
  LinkedList* temp = list;
  while (list) {
      temp = list;
      list = list->next;
      free(temp->item->key);
      free(temp->item->value);
      free(temp->item);
      free(temp);
  }
}

static LinkedList** create_overflow_buckets(HashTable* table) {
  //Create the overflow buckets; an array of linkedlists
  LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
  for (int i=0; i<table->size; i++)
      buckets[i] = NULL;
  return buckets;
}

static void free_overflow_buckets(HashTable* table) {
  //Free all the overflow bucket lists
  LinkedList** buckets = table->overflow_buckets;
  for (int i=0; i<table->size; i++)
      free_linkedlist(buckets[i]);
  free(buckets);
}

Ht_item* create_item(char* key, char* value) {
  //Creates a pointer to a new hash table item
  Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
  item->key = (char*) malloc (strlen(key) + 1);
  item->value = (char*) malloc (strlen(value) + 1);
  
  strcpy(item->key, key);
  strcpy(item->value, value);

  return item;
}

HashTable* create_table(int size) {
  //Creates a new HashTable
  HashTable* table = (HashTable*) malloc (sizeof(HashTable));
  table->size = size;
  table->count = 0;
  table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
  for (int i=0; i<table->size; i++)
      table->items[i] = NULL;
  table->overflow_buckets = create_overflow_buckets(table);

  return table;
}

void free_item(Ht_item* item) {
  //Frees an item
  free(item->key);
  free(item->value);
  free(item);
}

void free_table(HashTable* table) {
  //Frees the table
  for (int i=0; i<table->size; i++) {
      Ht_item* item = table->items[i];
      if (item != NULL)
          free_item(item);
  }

  free_overflow_buckets(table);
  free(table->items);
  free(table);
}

void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
  LinkedList* head = table->overflow_buckets[index];

  if (head == NULL) {
      //We need to create the list
      head = allocate_list();
      head->item = item;
      table->overflow_buckets[index] = head;
      return;
  }
  else {
      //Insert to the list
      table->overflow_buckets[index] = linkedlist_insert(head, item);
      return;
  }
 }

void ht_insert(HashTable* table, char* key, char* value) {
  //Create the item
  Ht_item* item = create_item(key, value);

  //Compute the index
  unsigned long index = hash_function(key);

  Ht_item* current_item = table->items[index];
  
  if (current_item == NULL) {
      //Key does not exist.
      if (table->count == table->size) {
          //Hash Table Full
          printf("Insert Error: Hash Table is full\n");
          //Remove the create item
          free_item(item);
          return;
      }
      
      //Insert directly
      table->items[index] = item; 
      table->count++;
  }

  else {
          //Scenario 1: We only need to update value
          if (strcmp(current_item->key, key) == 0) {
              strcpy(table->items[index]->value, value);
              return;
          }
  
      else {
          //Scenario 2: Collision
          handle_collision(table, index, item);
          return;
      }
  }
}

char* ht_search(HashTable* table, char* key) {
  //Searches the key in the hashtable
  //and returns NULL if it doesn't exist
  int index = hash_function(key);
  Ht_item* item = table->items[index];
  LinkedList* head = table->overflow_buckets[index];

  //Ensure that we move to items which are not NULL
  while (item != NULL) {
      if (strcmp(item->key, key) == 0)
          return item->value;
      if (head == NULL)
          return NULL;
      item = head->item;
      head = head->next;
  }
  return NULL;
}

void print_search(HashTable* table, char* key) {
  char* val;
  if ((val = ht_search(table, key)) == NULL) {
      printf("%s does not exist\n", key);
      return;
  }
  else {
      printf("Key:%s, Value:%s\n", key, val);
  }
}

void print_table(HashTable* table) {
  printf("\n-------------------\n");
  for (int i=0; i<table->size; i++) {
      if (table->items[i]) {
          printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
          if (table->overflow_buckets[i]) {
              printf(" => Overflow Bucket => ");
              LinkedList* head = table->overflow_buckets[i];
              while (head) {
                  printf("Key:%s, Value:%s ", head->item->key, head->item->value);
                  head = head->next;
              }
          }
          printf("\n");
      }
  }
  printf("-------------------\n");
}

int main() {
  HashTable* ht = create_table(CAPACITY);
  ht_insert(ht, "1", "First address");
  ht_insert(ht, "2", "Second address");
  ht_insert(ht, "Hel", "Third address");
  ht_insert(ht, "Cau", "Fourth address");
  print_search(ht, "1");
  print_search(ht, "2");
  print_search(ht, "3");
  print_search(ht, "Hel");
  print_search(ht, "Cau");
  print_table(ht);
  free_table(ht);
  return 0;
}

从哈希表中删除

现在,让我们最后看看删除功能:

void ht_delete(HashTable* table, char* key);

同样,该方法类似于插入。

  • 计算哈希索引并获取项目
  • 如果为NULL,则无需执行任何操作
  • 否则,在比较键之后,该索引没有碰撞链,只需从表中删除该项目
  • 如果存在碰撞链,我们必须删除该元素并相应地移动链接

我没有为您提供太多细节,因为这仅涉及更新标题项和释放内存。
我的建议是尝试自己实施。

我将为您提供一个工作版本以进行比较。

void ht_delete(HashTable* table, char* key) {
  //Deletes an item from the table
  int index = hash_function(key);
  Ht_item* item = table->items[index];
  LinkedList* head = table->overflow_buckets[index];

  if (item == NULL) {
      //Does not exist. Return
      return;
  }
  else {
      if (head == NULL && strcmp(item->key, key) == 0) {
          //No collision chain. Remove the item
          //and set table index to NULL
          table->items[index] = NULL;
          free_item(item);
          table->count--;
          return;
      }
      else if (head != NULL) {
          //Collision Chain exists
          if (strcmp(item->key, key) == 0) {
              //Remove this item and set the head of the list
              //as the new item
              
              free_item(item);
              LinkedList* node = head;
              head = head->next;
              node->next = NULL;
              table->items[index] = create_item(node->item->key, node->item->value);
              free_linkedlist(node);
              table->overflow_buckets[index] = head;
              return;
          }

          LinkedList* curr = head;
          LinkedList* prev = NULL;
          
          while (curr) {
              if (strcmp(curr->item->key, key) == 0) {
                  if (prev == NULL) {
                      //First element of the chain. Remove the chain
                      free_linkedlist(head);
                      table->overflow_buckets[index] = NULL;
                      return;
                  }
                  else {
                      //This is somewhere in the chain
                      prev->next = curr->next;
                      curr->next = NULL;
                      free_linkedlist(curr);
                      table->overflow_buckets[index] = head;
                      return;
                  }
              }
              curr = curr->next;
              prev = curr;
          }

      }
  }
}

全部放在一起

现在,最后,我将显示哈希表的整个程序。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CAPACITY 50000 //Size of the Hash Table

unsigned long hash_function(char* str) {
  unsigned long i = 0;
  for (int j=0; str[j]; j++)
      i += str[j];
  return i % CAPACITY;
}

typedef struct Ht_item Ht_item;

//Define the Hash Table Item here
struct Ht_item {
  char* key;
  char* value;
};

typedef struct LinkedList LinkedList;

//Define the Linkedlist here
struct LinkedList {
  Ht_item* item; 
  LinkedList* next;
};

typedef struct HashTable HashTable;

//Define the Hash Table here
struct HashTable {
  //Contains an array of pointers
  //to items
  Ht_item** items;
  LinkedList** overflow_buckets;
  int size;
  int count;
};

static LinkedList* allocate_list () {
  //Allocates memory for a Linkedlist pointer
  LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
  return list;
}

static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
  //Inserts the item onto the Linked List
  if (!list) {
      LinkedList* head = allocate_list();
      head->item = item;
      head->next = NULL;
      list = head;
      return list;
  } 
  
  else if (list->next == NULL) {
      LinkedList* node = allocate_list();
      node->item = item;
      node->next = NULL;
      list->next = node;
      return list;
  }

  LinkedList* temp = list;
  while (temp->next->next) {
      temp = temp->next;
  }
  
  LinkedList* node = allocate_list();
  node->item = item;
  node->next = NULL;
  temp->next = node;
  
  return list;
}

static Ht_item* linkedlist_remove(LinkedList* list) {
  //Removes the head from the linked list
  //and returns the item of the popped element
  if (!list)
      return NULL;
  if (!list->next)
      return NULL;
  LinkedList* node = list->next;
  LinkedList* temp = list;
  temp->next = NULL;
  list = node;
  Ht_item* it = NULL;
  memcpy(temp->item, it, sizeof(Ht_item));
  free(temp->item->key);
  free(temp->item->value);
  free(temp->item);
  free(temp);
  return it;
}

static void free_linkedlist(LinkedList* list) {
  LinkedList* temp = list;
  while (list) {
      temp = list;
      list = list->next;
      free(temp->item->key);
      free(temp->item->value);
      free(temp->item);
      free(temp);
  }
}

static LinkedList** create_overflow_buckets(HashTable* table) {
  //Create the overflow buckets; an array of linkedlists
  LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
  for (int i=0; i<table->size; i++)
      buckets[i] = NULL;
  return buckets;
}

static void free_overflow_buckets(HashTable* table) {
  //Free all the overflow bucket lists
  LinkedList** buckets = table->overflow_buckets;
  for (int i=0; i<table->size; i++)
      free_linkedlist(buckets[i]);
  free(buckets);
}

Ht_item* create_item(char* key, char* value) {
  //Creates a pointer to a new hash table item
  Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
  item->key = (char*) malloc (strlen(key) + 1);
  item->value = (char*) malloc (strlen(value) + 1);
  
  strcpy(item->key, key);
  strcpy(item->value, value);

  return item;
}

HashTable* create_table(int size) {
  //Creates a new HashTable
  HashTable* table = (HashTable*) malloc (sizeof(HashTable));
  table->size = size;
  table->count = 0;
  table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
  for (int i=0; i<table->size; i++)
      table->items[i] = NULL;
  table->overflow_buckets = create_overflow_buckets(table);

  return table;
}

void free_item(Ht_item* item) {
  //Frees an item
  free(item->key);
  free(item->value);
  free(item);
}

void free_table(HashTable* table) {
  //Frees the table
  for (int i=0; i<table->size; i++) {
      Ht_item* item = table->items[i];
      if (item != NULL)
          free_item(item);
  }

  free_overflow_buckets(table);
  free(table->items);
  free(table);
}

void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
  LinkedList* head = table->overflow_buckets[index];

  if (head == NULL) {
      //We need to create the list
      head = allocate_list();
      head->item = item;
      table->overflow_buckets[index] = head;
      return;
  }
  else {
      //Insert to the list
      table->overflow_buckets[index] = linkedlist_insert(head, item);
      return;
  }
 }

void ht_insert(HashTable* table, char* key, char* value) {
  //Create the item
  Ht_item* item = create_item(key, value);

  //Compute the index
  unsigned long index = hash_function(key);

  Ht_item* current_item = table->items[index];
  
  if (current_item == NULL) {
      //Key does not exist.
      if (table->count == table->size) {
          //Hash Table Full
          printf("Insert Error: Hash Table is full\n");
          //Remove the create item
          free_item(item);
          return;
      }
      
      //Insert directly
      table->items[index] = item; 
      table->count++;
  }

  else {
          //Scenario 1: We only need to update value
          if (strcmp(current_item->key, key) == 0) {
              strcpy(table->items[index]->value, value);
              return;
          }
  
      else {
          //Scenario 2: Collision
          handle_collision(table, index, item);
          return;
      }
  }
}

char* ht_search(HashTable* table, char* key) {
  //Searches the key in the hashtable
  //and returns NULL if it doesn't exist
  int index = hash_function(key);
  Ht_item* item = table->items[index];
  LinkedList* head = table->overflow_buckets[index];

  //Ensure that we move to items which are not NULL
  while (item != NULL) {
      if (strcmp(item->key, key) == 0)
          return item->value;
      if (head == NULL)
          return NULL;
      item = head->item;
      head = head->next;
  }
  return NULL;
}

void ht_delete(HashTable* table, char* key) {
  //Deletes an item from the table
  int index = hash_function(key);
  Ht_item* item = table->items[index];
  LinkedList* head = table->overflow_buckets[index];

  if (item == NULL) {
      //Does not exist. Return
      return;
  }
  else {
      if (head == NULL && strcmp(item->key, key) == 0) {
          //No collision chain. Remove the item
          //and set table index to NULL
          table->items[index] = NULL;
          free_item(item);
          table->count--;
          return;
      }
      else if (head != NULL) {
          //Collision Chain exists
          if (strcmp(item->key, key) == 0) {
              //Remove this item and set the head of the list
              //as the new item
              
              free_item(item);
              LinkedList* node = head;
              head = head->next;
              node->next = NULL;
              table->items[index] = create_item(node->item->key, node->item->value);
              free_linkedlist(node);
              table->overflow_buckets[index] = head;
              return;
          }

          LinkedList* curr = head;
          LinkedList* prev = NULL;
          
          while (curr) {
              if (strcmp(curr->item->key, key) == 0) {
                  if (prev == NULL) {
                      //First element of the chain. Remove the chain
                      free_linkedlist(head);
                      table->overflow_buckets[index] = NULL;
                      return;
                  }
                  else {
                      //This is somewhere in the chain
                      prev->next = curr->next;
                      curr->next = NULL;
                      free_linkedlist(curr);
                      table->overflow_buckets[index] = head;
                      return;
                  }
              }
              curr = curr->next;
              prev = curr;
          }

      }
  }
}

void print_search(HashTable* table, char* key) {
  char* val;
  if ((val = ht_search(table, key)) == NULL) {
      printf("%s does not exist\n", key);
      return;
  }
  else {
      printf("Key:%s, Value:%s\n", key, val);
  }
}

void print_table(HashTable* table) {
  printf("\n-------------------\n");
  for (int i=0; i<table->size; i++) {
      if (table->items[i]) {
          printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
          if (table->overflow_buckets[i]) {
              printf(" => Overflow Bucket => ");
              LinkedList* head = table->overflow_buckets[i];
              while (head) {
                  printf("Key:%s, Value:%s ", head->item->key, head->item->value);
                  head = head->next;
              }
          }
          printf("\n");
      }
  }
  printf("-------------------\n");
}

int main() {
  HashTable* ht = create_table(CAPACITY);
  ht_insert(ht, "1", "First address");
  ht_insert(ht, "2", "Second address");
  ht_insert(ht, "Hel", "Third address");
  ht_insert(ht, "Cau", "Fourth address");
  print_search(ht, "1");
  print_search(ht, "2");
  print_search(ht, "3");
  print_search(ht, "Hel");
  print_search(ht, "Cau");  //Collision!
  print_table(ht);
  ht_delete(ht, "1");
  ht_delete(ht, "Cau");
  print_table(ht);
  free_table(ht);
  return 0;
}

输出

Key:1, Value:First address
Key:2, Value:Second address
3 does not exist
Key:Hel, Value:Third address
Key:Cau, Value:Fourth address

------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address => Overflow Bucket => Key:Cau, Value:Fourth address 
------------------

------------------
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address
------------------