
Designing a Custom Index with Redis Hash + RediSearch

news 2025/9/28 17:50:40

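I. Core Design: Structure Definition for Replacing Set with Hash
- 1. Structure design principles
- 2. Why can this structure replace a Set?
II. Key Operations: Uniqueness Checks with Hash
- 1. Dependency injection (basic configuration)
- 2. Service-layer usage example (Customer table)
III. Integrating RediSearch: Index the Hash Directly, No Conversion
- 1. Creating the RediSearch index
- 2. Fuzzy search implementation (Java)
IV. Hash vs. Set: Key Advantages
V. Caveats
Summary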

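When Redis serves as an external custom-index store for MySQL, a Hash can fully replace a Set, and the "Set to Hash" conversion step disappears. A Hash covers both core requirements, multi-tenant uniqueness checks and later RediSearch integration, so there is no need to maintain a Set alongside it.

The key point is that a Set's core value is guaranteeing element uniqueness, and a Hash can reproduce that through field uniqueness plus atomic operations, while also being a data type that RediSearch indexes natively. The detailed design, implementation, and comparison follow.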

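I. Core Design: Structure Definition for Replacing Set with Hash

1. Structure design principles

Keep the "tenant-table-field combination" isolation of the original Set scheme, but change the storage carrier from a Set to a Hash and use the Hash's fields to hold the unique value combinations. The structure is: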

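Dimension	Hash design	Purpose
Hash key	`unique:{tenantId}:{tableName}:{fields}`	Identical to the original Set key; isolates tenants, tables, and field combinations (e.g. `unique:t1:customer:a,b`).
Hash field	`a=value&b=value` (e.g. `a=1&b=2`)	Stores the field-value combination. Field names are unique within a Hash key, so the same tenant-table-field combination cannot hold a duplicate value combination.
Hash value	Placeholder (e.g. `1` or `exists`)	No business meaning; present only because a Hash maps each field to a value and cannot store a field alone.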

Example:
For tenant t1's customer table with an index on fields (a,b), storing the two unique values a=1&b=2 and a=3&b=4 gives this Hash:

Hash key: unique:t1:customer:a,b
Hash content: {"a=1&b=2": "1", "a=3&b=4": "1"}
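
As a minimal sketch of this layout, the key and field strings could be assembled as shown below. The class name `UniqueIndexKeys` and its methods are hypothetical. Percent-encoding the values is an assumption added here so that a value containing `&` or `=` cannot be confused with another combination; the scheme described above concatenates raw values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.StringJoiner;

final class UniqueIndexKeys {
    /** Builds the Hash key, e.g. unique:t1:customer:a,b */
    static String hashKey(String tenantId, String tableName, String... fields) {
        return "unique:" + tenantId + ":" + tableName + ":" + String.join(",", fields);
    }

    /**
     * Builds the Hash field, e.g. a=1&b=2. The map must iterate in the same
     * order as the indexed fields (a LinkedHashMap does), otherwise the same
     * values could produce two different strings.
     */
    static String hashField(Map<String, String> valuesInFieldOrder) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> entry : valuesInFieldOrder.entrySet()) {
            // Assumption: encode each value so '&' and '=' inside it stay unambiguous.
            String encoded = URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8);
            joiner.add(entry.getKey() + "=" + encoded);
        }
        return joiner.toString();
    }
}
```

Whatever encoding is chosen, it has to be applied identically on every write, check, and delete, since the Hash compares field names byte for byte.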

2. Why can this structure replace a Set?

For the core requirement of uniqueness checking, Set and Hash offer equivalent capabilities:

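Requirement	Set	Hash	Conclusion
Atomic add	`SADD key value`: adds the value and returns 1 if absent; returns 0 without adding if present.	`HSETNX key field value`: adds the field and returns 1 if absent; returns 0 without adding if present (`NX` = Not eXists).	Equivalent; both check and add in one atomic step.
Existence check	`SISMEMBER key value`: tests whether the value is in the Set, O(1).	`HEXISTS key field`: tests whether the field is in the Hash, O(1).	Equivalent; both are O(1).
Deduplication	A Set deduplicates by nature and never holds a repeated value.	A Hash deduplicates by nature and never holds a repeated field under one key.	Equivalent; uniqueness follows from the data structure itself.
List all values	`SMEMBERS key`: returns all values.	`HKEYS key`: returns all fields (the value combinations).	Functionally equivalent; only the command differs.
Delete a value	`SREM key value`: removes the given value.	`HDEL key field`: removes the given field.	Functionally equivalent; only the command differs.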

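In short, the Hash field takes the place of the Set member, and HSETNX/HEXISTS take the place of SADD/SISMEMBER, so a Hash provides the full uniqueness-check capability while being closer to what RediSearch needs.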
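
The "first writer wins" behaviour of `HSETNX` can be illustrated without a Redis server. The sketch below is only an in-memory analogy: `ConcurrentHashMap.putIfAbsent` stands in for `HSETNX`, and the class name `HsetnxAnalogy` is hypothetical. Several threads try to register the same combination at once, and only the one that actually creates the entry sees success.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class HsetnxAnalogy {
    private final ConcurrentHashMap<String, String> hash = new ConcurrentHashMap<>();

    /** Mirrors HSETNX: true only for the caller that actually created the field. */
    boolean addIfAbsent(String field) {
        return hash.putIfAbsent(field, "1") == null;
    }

    /** Lets `threads` workers race to add the same field and counts how many succeed. */
    static int raceForField(int threads, String field) {
        HsetnxAnalogy index = new HsetnxAnalogy();
        AtomicInteger winners = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // hold every worker until all are queued
                    if (index.addIfAbsent(field)) {
                        winners.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return winners.get();
    }
}
```

A separate "exists?" call followed by a plain write would not give this guarantee, because two callers can both pass the check before either one writes.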

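II. Key Operations: Uniqueness Checks with Hash

The core business operations (add, check, delete) on top of a Hash are implemented below, using Java + Spring Data Redis:

1. Dependency injection (basic configuration)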

import java.util.ArrayList;
import java.util.List;
import org.springframework.data.redis.core.HashOperations;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class HashUniqueIndexUtil {

    // Typed view of the Hash commands. The injected RedisTemplate must be
    // configured with String serializers (e.g. StringRedisSerializer).
    private final HashOperations<String, String, String> hashOps;

    public HashUniqueIndexUtil(RedisTemplate<String, String> redisTemplate) {
        this.hashOps = redisTemplate.opsForHash();
    }

    // 1. Build the Hash key (same format as the original Set key)
    private String buildHashKey(String tenantId, String tableName, String fieldList) {
        return String.format("unique:%s:%s:%s", tenantId, tableName, fieldList);
    }

    // 2. Atomically add a unique value (check + add in one step)
    public boolean addUniqueValue(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HSETNX: adds the field and returns true if absent; returns false if it already exists
        return Boolean.TRUE.equals(hashOps.putIfAbsent(hashKey, valueStr, "1"));
    }

    // 3. Uniqueness check (does the value already exist?)
    public boolean isUniqueValueExists(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HEXISTS: O(1) field lookup
        return Boolean.TRUE.equals(hashOps.hasKey(hashKey, valueStr));
    }

    // 4. Delete a unique value
    public void deleteUniqueValue(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HDEL: removes the given field
        hashOps.delete(hashKey, valueStr);
    }

    // 5. Fetch every unique value for a tenant-table-field combination
    public List<String> getAllUniqueValues(String tenantId, String tableName, String fieldList) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HKEYS: returns all fields (the value combinations), like SMEMBERS on a Set
        return new ArrayList<>(hashOps.keys(hashKey));
    }
}

2. Service-layer usage example (Customer table)

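import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Customer and CustomerRepository are the application's own entity and repository.
@Service
public class CustomerService {

    private final HashUniqueIndexUtil hashUniqueIndexUtil;
    private final CustomerRepository customerRepository;

    // Constructor injection
    public CustomerService(HashUniqueIndexUtil hashUniqueIndexUtil, CustomerRepository customerRepository) {
        this.hashUniqueIndexUtil = hashUniqueIndexUtil;
        this.customerRepository = customerRepository;
    }

    // Create a customer, enforcing uniqueness of (a, b) within the tenant
    @Transactional
    public Customer createCustomer(String tenantId, Customer customer) {
        // 1. Build the field-value combination, e.g. a=1&b=2
        String valueStr = String.format("a=%s&b=%s", customer.getA(), customer.getB());
        // 2. Reserve the combination atomically with HSETNX. A separate "exists" check
        //    followed by a later add would let two concurrent requests both pass.
        if (!hashUniqueIndexUtil.addUniqueValue(tenantId, "customer", "a,b", valueStr)) {
            throw new RuntimeException("Tenant " + tenantId + ": a customer with a=" + customer.getA()
                    + "&b=" + customer.getB() + " already exists");
        }
        try {
            // 3. Save to the database
            customer.setTenantId(tenantId);
            return customerRepository.save(customer);
        } catch (RuntimeException e) {
            // 4. Release the reservation if the save fails, so Redis does not keep
            //    a value that the database never stored
            hashUniqueIndexUtil.deleteUniqueValue(tenantId, "customer", "a,b", valueStr);
            throw e;
        }
    }

    // Delete a customer and remove its entry from the Redis Hash
    @Transactional
    public void deleteCustomer(String tenantId, Long customerId) {
        // 1. Load the customer to delete
        Customer customer = customerRepository.findByIdAndTenantId(customerId, tenantId)
                .orElseThrow(() -> new RuntimeException("Customer not found"));
        // 2. Build the field-value combination
        String valueStr = String.format("a=%s&b=%s", customer.getA(), customer.getB());
        // 3. Delete the database record first; if this throws, Redis is left untouched
        customerRepository.delete(customer);
        // 4. Then remove the field from the Redis Hash
        hashUniqueIndexUtil.deleteUniqueValue(tenantId, "customer", "a,b", valueStr);
    }
}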

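III. Integrating RediSearch: Index the Hash Directly, No Conversion

This is the main advantage of the Hash approach: Hash is a document type that RediSearch indexes natively, so no Set-to-Hash conversion is needed before the field-value combinations can be searched.

1. Creating the RediSearch index

RediSearch treats each Hash key as one document and indexes the values of the Hash fields named in the schema. It does not index Hash field names. The combination string (such as a=1&b=2) therefore has to be stored as a field value to be searchable: each document Hash stores it under a value_str field, together with tenant_id, table_name, and field_list fields used for filtering.

Index creation command (Redis CLI):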

# Create index idx_unique_hash over Hash documents whose key starts with "unique:".
# Enter the FT.CREATE line as a single command, without these comment lines.
# value_str is declared TAG rather than TEXT so that "a=1&b=2" stays one token
# (TEXT would split it on "=" and "&"); TAG fields still support prefix matching.
# SEPARATOR "|" replaces the default "," so that "a,b" is not split into two tags;
# this assumes "|" never occurs in the stored values.
# tenant_id, table_name, field_list: metadata fields stored in each document Hash.
FT.CREATE idx_unique_hash ON HASH PREFIX 1 "unique:" SCHEMA value_str TAG SEPARATOR "|" tenant_id TAG table_name TAG field_list TAG SEPARATOR "|"

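Note:
Since RediSearch cannot index a Hash's field names, store the combination as a value as well: for the combination a=1&b=2, write value_str="a=1&b=2" and index the value_str field. Because one Hash key is one search document, each combination needs its own Hash key under the unique: prefix (for instance, a hypothetical key such as unique:t1:customer:a,b:a=1&b=2). An HSETNX on that key's value_str field can still act as the atomic uniqueness check, so the core logic stays the same.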
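
To make the document layout concrete, the sketch below builds the key and the field map that an `HSET` for one combination would write. It does not talk to Redis. The class name `SearchDocumentLayout` and the key scheme (the combination appended as a fifth segment) are assumptions for illustration; the field names follow the index schema discussed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SearchDocumentLayout {
    /** Key of the per-combination document Hash; it stays under the "unique:" prefix. */
    static String documentKey(String tenantId, String tableName, String fieldList, String valueStr) {
        return String.join(":", "unique", tenantId, tableName, fieldList, valueStr);
    }

    /** Field map for that Hash: the searchable value plus the metadata used as filters. */
    static Map<String, String> documentFields(String tenantId, String tableName,
                                              String fieldList, String valueStr) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("value_str", valueStr);
        fields.put("tenant_id", tenantId);
        fields.put("table_name", tableName);
        fields.put("field_list", fieldList);
        return fields;
    }
}
```

With Spring Data Redis, such a map could be written with `opsForHash().putAll(key, fields)`; RediSearch then picks the document up through the index prefix.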

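2. Fuzzy search implementation (Java)

With the index above, run the search with filters on tenant, table name, and field combination so that tenants stay isolated: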

import java.util.List;
import java.util.stream.Collectors;
import org.springframework.stereotype.Component;
import redis.clients.jedis.UnifiedJedis;
import redis.clients.jedis.search.Query;
import redis.clients.jedis.search.SearchResult;

// RedisTemplate does not expose RediSearch commands, so this version assumes a
// Jedis UnifiedJedis bean (Jedis 4.x or later) is available for FT.SEARCH.
@Component
public class HashUniqueSearchUtil {

    private final UnifiedJedis jedis;

    public HashUniqueSearchUtil(UnifiedJedis jedis) {
        this.jedis = jedis;
    }

    // Search unique values; a trailing '*' in keyword requests a prefix match (e.g. "a=1*")
    public List<String> searchUniqueValues(String tenantId, String tableName, String fieldList, String keyword) {
        // 1. Build the query: value match + tenant/table/field-combination filters.
        //    All four attributes are TAG fields, so punctuation must be backslash-escaped.
        boolean prefix = keyword.endsWith("*");
        String term = prefix ? keyword.substring(0, keyword.length() - 1) : keyword;
        String queryStr = String.format(
                "@value_str:{%s%s} @tenant_id:{%s} @table_name:{%s} @field_list:{%s}",
                escapeTag(term), prefix ? "*" : "",
                escapeTag(tenantId), escapeTag(tableName), escapeTag(fieldList));
        // 2. Return only value_str, paged: start at offset 0, at most 50 results
        Query query = new Query(queryStr).returnFields("value_str").limit(0, 50);
        // 3. Run the search against index idx_unique_hash
        SearchResult result = jedis.ftSearch("idx_unique_hash", query);
        // 4. Extract the combinations (a=value&b=value)
        return result.getDocuments().stream()
                .map(doc -> doc.getString("value_str"))
                .collect(Collectors.toList());
    }

    // Conservatively escapes every character that is not a letter, digit, or underscore
    private static String escapeTag(String raw) {
        return raw.replaceAll("([^A-Za-z0-9_])", "\\\\$1");
    }
}

Usage example:
Find all unique value combinations for fields (a,b) of tenant t1's customer table where a starts with 1:

List<String> results = hashUniqueSearchUtil.searchUniqueValues("t1", "customer", "a,b", "a=1*");
// Possible result: ["a=1&b=2", "a=10&b=3", "a=11&b=5"]

IV. Hash vs. Set: Key Advantages

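Dimension	Set approach	Hash approach	Conclusion
Structural complexity	Must maintain a Set plus Hashes (for search); data is duplicated across two types.	Only Hashes are maintained; no Set copy.	Hash is simpler and cheaper to maintain.
RediSearch integration	The Set must first be converted to Hashes before it can be indexed.	Hashes are indexed directly; no conversion.	Hash integrates with fewer intermediate steps.
Extensibility	Extra metadata (e.g. creation time) requires a new structure.	Extra fields (e.g. create_time) can be added to the Hash without changing the core logic.	Hash is more extensible.
Operational consistency	Set and Hash must be updated together, which risks inconsistency.	Only Hashes are written; no Set/Hash synchronization.	Hash is more consistent.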

V. Caveats

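Hash field length limit:
A Hash field name in Redis can be up to 512 MB. Real field-value combinations (such as a=1&b=2) are usually very short, so length is not a practical concern.
Atomicity:
Always add fields with HSETNX (putIfAbsent) rather than HSET, so that "check absent + add" is a single atomic step and concurrent requests cannot insert duplicates.
Data consistency repair:
If the database and Redis diverge after a failure (such as a network interruption), repair them with a full comparison: iterate over all database records for a tenant-table-field combination, build each valueStr, and add it to the Redis Hash if it is missing; then delete any Hash field that has no matching database record.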
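RediSearch version compatibility:
RediSearch indexes the values of Hash fields, not their names, so store the combination in a value_str field and index that field. Supported schema options differ between RediSearch versions, so check the FT.CREATE documentation for the version you run.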
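
The full comparison described under "Data consistency repair" comes down to two set differences. The sketch below shows only that core step on in-memory sets; the class `IndexReconciler` and its method names are hypothetical. Loading the database combinations and the `HKEYS` result, and then applying the additions and deletions, is left to the caller.

```java
import java.util.HashSet;
import java.util.Set;

final class IndexReconciler {
    /** Combinations the database has but the Redis Hash lacks: these should be added. */
    static Set<String> missingInRedis(Set<String> dbValues, Set<String> redisFields) {
        Set<String> missing = new HashSet<>(dbValues);
        missing.removeAll(redisFields);
        return missing;
    }

    /** Fields the Redis Hash has but the database lacks: these should be deleted. */
    static Set<String> staleInRedis(Set<String> dbValues, Set<String> redisFields) {
        Set<String> stale = new HashSet<>(redisFields);
        stale.removeAll(dbValues);
        return stale;
    }
}
```

For large tenants it may be preferable to page through the database and scan the Hash incrementally instead of loading both sides fully into memory; that trade-off is outside this sketch.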