unordered_map的哈希函数如何自定义？

八股文_C++STL面试题 0 68

参考回答

在 C++ 中，std::unordered_map 是一个基于哈希表的数据结构，它使用哈希函数来决定如何存储和查找键值对。如果你需要使用自定义的类型作为键，通常需要自定义哈希函数。自定义哈希函数的方式是通过特化 std::hash 模板，或者直接在 unordered_map 中提供一个哈希函数对象。

详细讲解与拓展

1. 自定义哈希函数的基本方法

假设我们有一个自定义的类型 MyKey，我们想要在 unordered_map 中使用它作为键。为了使哈希表能够正确地处理这种类型，我们需要提供一个哈希函数。哈希函数必须是一个接受键类型并返回 size_t 的函数。

有两种常见的方式来为自定义类型提供哈希函数：

方法 1：特化 std::hash 模板
这是最常见的方式，通过特化 std::hash 模板来定义哈希函数。
方法 2：直接传递哈希函数对象
另一种方式是在创建 unordered_map 时直接传入一个自定义的哈希函数对象。

2. 方法 1：特化 `std::hash` 模板

如果你希望在整个程序中对某个类型使用自定义的哈希函数，可以通过特化 std::hash 模板来实现。例如，假设我们有一个自定义的 MyKey 类型，定义如下：

#include <iostream>
#include <unordered_map>

struct MyKey {
    int x;
    int y;

    bool operator==(const MyKey& other) const {
        return x == other.x && y == other.y;
    }
};

// 特化 std::hash
namespace std {
    template <>
    struct hash<MyKey> {
        size_t operator()(const MyKey& key) const {
            // 一个简单的哈希函数，结合了 x 和 y 的值
            return std::hash<int>{}(key.x) ^ std::hash<int>{}(key.y);
        }
    };
}

int main() {
    std::unordered_map<MyKey, std::string> map;

    MyKey key1 = {1, 2};
    map[key1] = "Point (1, 2)";

    MyKey key2 = {3, 4};
    map[key2] = "Point (3, 4)";

    std::cout << map[key1] << std::endl; // 输出 "Point (1, 2)"
    std::cout << map[key2] << std::endl; // 输出 "Point (3, 4)"

    return 0;
}

在这个例子中，我们通过特化 std::hash 为 MyKey 类型提供了自定义的哈希函数。我们将 x 和 y 的哈希值组合起来，形成一个新的哈希值。注意，组合哈希值的方式通常是通过位运算（如异或）来实现。

3. 方法 2：传递哈希函数对象

如果你只希望在某个特定的 unordered_map 中使用自定义哈希函数，而不想改变全局的 std::hash 模板，可以在创建 unordered_map 时直接传入一个自定义的哈希函数对象。例如：

#include <iostream>
#include <unordered_map>

struct MyKey {
    int x;
    int y;

    bool operator==(const MyKey& other) const {
        return x == other.x && y == other.y;
    }
};

// 自定义哈希函数
struct MyKeyHash {
    size_t operator()(const MyKey& key) const {
        return std::hash<int>{}(key.x) ^ std::hash<int>{}(key.y);
    }
};

int main() {
    std::unordered_map<MyKey, std::string, MyKeyHash> map;

    MyKey key1 = {1, 2};
    map[key1] = "Point (1, 2)";

    MyKey key2 = {3, 4};
    map[key2] = "Point (3, 4)";

    std::cout << map[key1] << std::endl; // 输出 "Point (1, 2)"
    std::cout << map[key2] << std::endl; // 输出 "Point (3, 4)"

    return 0;
}

在这个例子中，我们创建了一个 MyKeyHash 类型的哈希函数对象，并在 unordered_map 的声明中传入该对象。这种方式只影响当前 unordered_map 实例，其他地方仍然使用默认的哈希函数。

4. 哈希函数设计

设计一个好的哈希函数是确保哈希表性能的关键。理想的哈希函数应该满足以下条件：

均匀分布：哈希函数应该将不同的键均匀地映射到哈希表的桶中，避免哈希冲突。
计算简单：哈希函数应尽量简单，避免复杂的计算，以提高性能。
冲突最小：冲突越少，哈希表的性能越好。通常，哈希函数会结合多个字段的哈希值（如上面的 x 和 y），并进行位运算来减少冲突。

5. 总结

方法 1：通过特化 std::hash 来为全局范围内的自定义类型提供哈希函数。
方法 2：通过自定义哈希函数对象，灵活地为特定的 unordered_map 提供哈希函数。
设计好的哈希函数是确保哈希表性能的关键，它应该保证键的均匀分布并尽量避免冲突。