Validating Cached Data and Handling Missing Keys in Spring Boot Caching

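This article looks at how to implement a common requirement with Spring Boot caching: check the data already in the cache first, query the database only for what is missing, and then update the cache. It analyzes the "all-or-nothing" nature of the Spring Cache Abstraction and where its default behavior falls short when a request is served partly from the cache and partly from the database. It then presents a manually managed caching implementation and discusses its performance considerations and alternatives, to help developers build more flexible caching logic.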

1. Default Behavior and Limitations of the Spring Cache Abstraction

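Spring Boot offers declarative caching through the Cache Abstraction of the core Spring Framework, usually via annotations such as @Cacheable. For a fine-grained requirement like "check part of the data in the cache, then fetch the missing part from the database", however, the out-of-the-box @Cacheable mechanism has some limitations.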

1.1 The "All-or-Nothing" Caching Strategy

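Spring's cache abstraction essentially decorates expensive service methods or data access calls (through AOP). As a result, a method annotated with @Cacheable behaves in an all-or-nothing way: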

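- If the method's whole cache key (usually built from all method parameters) is present in the cache, the method is not executed and the cached value is returned directly.
- If the cache key is absent, the method is executed in full and its return value is cached.

This is similar to Map.computeIfAbsent(key, mappingFunction). The mechanism cannot check, before the method runs, whether only some of the keys are in the cache, nor can it adjust the database query based on that check.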
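The analogy with Map.computeIfAbsent can be made concrete with a plain-Java sketch. It uses no Spring classes: `cache`, `expensiveLookup`, and `cachedLookup` are hypothetical names introduced only for illustration, and a HashMap stands in for the real cache.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

class AllOrNothingDemo {
    // Counts how often the "expensive" loader actually runs.
    static final AtomicInteger loaderCalls = new AtomicInteger();
    // Stand-in for the cache; not a Spring Cache.
    static final Map<String, String> cache = new HashMap<>();

    // Hypothetical expensive method, for illustration only.
    static String expensiveLookup(String key) {
        loaderCalls.incrementAndGet();
        return "value-for-" + key;
    }

    // Mirrors the @Cacheable decision: on a hit the method is skipped,
    // on a miss it runs in full and its result is stored.
    static String cachedLookup(String key) {
        return cache.computeIfAbsent(key, AllOrNothingDemo::expensiveLookup);
    }

    public static void main(String[] args) {
        cachedLookup("a"); // miss: loader runs
        cachedLookup("a"); // hit: loader skipped
        cachedLookup("b"); // miss: loader runs
        System.out.println(loaderCalls.get()); // prints 2
    }
}
```

There is no middle ground per call: the key is either found as a whole or the loader runs as a whole, which is the limitation the rest of this article works around.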

1.2 The Default Cache Key Generation Strategy

By default, the Spring Cache Abstraction derives the cache key from all of the method's parameters. For example, for the method findByIds(Set<Integer> ids):

@Cacheable("Students")
List<Student> findByIds(Set<Integer> ids) {
  // ...
  return repository.findByIds(ids);
}

In this case the cache will contain:

Cache key       | Cached value
----------------|--------------
Set<Integer>    | List<Student>

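The whole Set therefore acts as a single key, and the cached value is the whole List. That does not match the granularity we want, which is one cache entry per Student ID (for example: ID 1 -> Student A, ID 2 -> Student B). A custom key generation strategy is possible, but the default annotations still cannot directly express "query in batch, cache individually".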
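The granularity mismatch can be illustrated with a small plain-Java sketch. A HashMap keyed by the whole Set stands in for the "Students" cache here; the class name and the `isCached` helper are hypothetical, and student values are simplified to strings.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SetKeyGranularityDemo {
    // Stand-in for the cache when the whole Set argument is used as the key.
    static final Map<Set<Integer>, List<String>> cache = new HashMap<>();

    static boolean isCached(Set<Integer> ids) {
        return cache.containsKey(ids);
    }

    public static void main(String[] args) {
        cache.put(Set.of(1, 2), List.of("Student A", "Student B"));

        System.out.println(isCached(Set.of(2, 1)));    // true: same elements, same key
        System.out.println(isCached(Set.of(1)));       // false: a subset is a different key
        System.out.println(isCached(Set.of(1, 2, 3))); // false: a superset is a different key
    }
}
```

Even though students 1 and 2 are already stored, a request for ID 1 alone, or for IDs 1, 2, and 3, misses the cache entirely under this keying scheme.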

1.3 The Single-Key Access Limitation of the Cache Interface

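Spring's org.springframework.cache.Cache interface is an adapter over the underlying cache provider (EhCache, Redis, Hazelcast, and so on). It mainly exposes single-key operations such as get, put, and evict. So even when querying the cache manually, each ID needs its own get call: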

Cache cache = cacheManager.getCache("Students");
Student student = cache.get(id, Student.class); // one lookup per ID

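When many IDs have to be looked up, this one-by-one access can become a performance problem, especially with a distributed cache. Cache.getNativeCache() does give access to the provider's native API (for example Hazelcast's IMap.getAll(Set)), but using it couples the code tightly to a specific cache provider and makes it less portable.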

2. A Custom Solution for "Partly Cache, Partly Database" Queries

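Given these limitations of the Spring Cache Abstraction, checking the cache for existing data, querying the database for the missing data, and caching the new results usually requires managing the cache logic by hand. The following is a detailed example implementation: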

import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.stereotype.Service;

import java.util.*;
import java.util.stream.Collectors;

@Service
public class StudentService {

    private final StudentRepository studentRepository;
    private final Cache studentsCache; // the specific cache instance, obtained directly

    public StudentService(StudentRepository studentRepository, CacheManager cacheManager) {
        this.studentRepository = studentRepository;
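        // getCache returns null if no cache with this name is configured, so fail fast here.
        // The cache name "Students" is assumed.
        this.studentsCache = Objects.requireNonNull(
                cacheManager.getCache("Students"), "Cache 'Students' is not configured");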
    }

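    /**
     * Fetches students for the given set of student IDs from the cache or the database.
     * The cache is consulted first; whatever is missing is loaded from the database
     * and then written back to the cache.
     *
     * @param studentIds the student IDs to look up
     * @return the matching students
     */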
    public List<Student> findStudentsByIds(Set<Integer> studentIds) {
        List<Student> result = new ArrayList<>();
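        Set<Integer> idsToQueryDb = new HashSet<>(studentIds); // IDs that still need a database query

        // 1. Try to load the students from the cache
        // Note: the cache is queried one ID at a time, which can become a bottleneck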
        for (Integer id : studentIds) {
            Cache.ValueWrapper valueWrapper = studentsCache.get(id);
            if (valueWrapper != null) {
                Object cachedObject = valueWrapper.get();
                if (cachedObject instanceof Student) {
                    result.add((Student) cachedObject);
                    idsToQueryDb.remove(id); // already cached, no database query needed for this ID
                }
            }
        }

        // 2. Query the database for the students missing from the cache
        if (!idsToQueryDb.isEmpty()) {
            List<Student> studentsFromDb = studentRepository.findByIdIn(idsToQueryDb);

            // 3. Store the database results in the cache and add them to the result list
            for (Student student : studentsFromDb) {
                studentsCache.put(student.getId(), student); // cache each student individually
                result.add(student);
            }
        }

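        // Keep only the requested IDs; the order of the result is not guaranteed.
        // If a specific order is needed, extra handling is required, e.g. sorting by the original studentIds
        return result.stream()
                .filter(s -> studentIds.contains(s.getId())) // drop any unexpected entries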
                .collect(Collectors.toList());
    }

    // Example: the StudentRepository interface
    public interface StudentRepository {
        List<Student> findByIdIn(Set<Integer> ids);
    }

    // Example: the Student entity class
    public static class Student {
        private int id;
        private String name;

        public Student(int id, String name) {
            this.id = id;
            this.name = name;
        }

        public int getId() {
            return id;
        }

        public void setId(int id) {
            this.id = id;
        }

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            Student student = (Student) o;
            return id == student.id;
        }

        @Override
        public int hashCode() {
            return Objects.hash(id);
        }
    }
}

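2.1 Code Walkthrough and Notes

Obtaining the Cache instance manually: the Cache instance with the given name (studentsCache) is obtained from the CacheManager so that get and put can be called on it directly.
Cache-first lookup:
- Iterate over all requested studentIds.
- For each id, call studentsCache.get(id) to try to read the data from the cache.
- On a cache hit, add the data to the result list and remove that id from idsToQueryDb, since it no longer needs a database query.
Database query:
- Once the cache lookup is done, idsToQueryDb contains only the IDs that are not in the cache.
- Run a single batch database query for idsToQueryDb (studentRepository.findByIdIn(idsToQueryDb)) to fetch the missing data.
Updating the cache and merging results:
- Store each Student in studentsFromDb in the cache, one by one, via studentsCache.put(student.getId(), student).
- Add the newly fetched data to the result list as well.
Returning the result: return the merged result list.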
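The steps above, plus the ordering concern mentioned in the code comments, can be sketched without any Spring dependency. In this minimal sketch a plain Map stands in for the per-ID cache and a Function stands in for the batch database query; `PartialCacheLookup`, `findAll`, and `bulkLoader` are hypothetical names, not part of Spring or of the service above. Unlike the service method, it returns results in the order of the requested IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class PartialCacheLookup {
    // cache: stand-in for a per-ID cache; bulkLoader: one batch load for all missing IDs.
    static <K, V> List<V> findAll(List<K> ids, Map<K, V> cache,
                                  Function<Set<K>, Map<K, V>> bulkLoader) {
        Map<K, V> found = new HashMap<>();
        Set<K> missing = new LinkedHashSet<>();

        // 1. Cache-first lookup: split the IDs into hits and misses.
        for (K id : ids) {
            V cached = cache.get(id);
            if (cached != null) {
                found.put(id, cached);
            } else {
                missing.add(id);
            }
        }

        // 2. One batch load for the misses, 3. written back to the cache.
        if (!missing.isEmpty()) {
            Map<K, V> loaded = bulkLoader.apply(missing);
            cache.putAll(loaded);
            found.putAll(loaded);
        }

        // 4. Assemble the result in request order; IDs unknown to the loader are skipped.
        List<V> result = new ArrayList<>();
        for (K id : ids) {
            V value = found.get(id);
            if (value != null) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<>(Map.of(1, "Student A"));
        Map<Integer, String> database = Map.of(2, "Student B", 3, "Student C");

        Function<Set<Integer>, Map<Integer, String>> bulkLoader = missing -> {
            Map<Integer, String> loaded = new HashMap<>();
            for (Integer id : missing) {
                String student = database.get(id);
                if (student != null) {
                    loaded.put(id, student);
                }
            }
            return loaded;
        };

        // prints [Student C, Student A, Student B]
        System.out.println(findAll(List.of(3, 1, 2), cache, bulkLoader));
    }
}
```

Collecting hits and loaded values into one map before building the list is what makes the request-order output cheap; the service method above instead appends cache hits first and database results second.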

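2.2 Performance Considerations

Multiple cache get calls: the solution above issues one studentsCache.get(id) call per requested ID. If studentIds is large (hundreds or even thousands of IDs), this can mean a large number of cache network round trips (for a distributed cache) and hurt performance.
Optimizing with the native cache API: if performance becomes a bottleneck and coupling to a specific cache provider is acceptable, studentsCache.getNativeCache() gives access to the underlying cache's native API. With Hazelcast, for example, the native cache can be cast to IMap and IMap.getAll(idsToQuery) used for a batch fetch, which reduces the number of network round trips.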
```
// Example: batch fetch through the Hazelcast native API
// Assumes the underlying implementation of studentsCache is backed by a Hazelcast IMap
if (studentsCache.getNativeCache() instanceof IMap) {
    IMap<Integer, Student> nativeMap = (IMap<Integer, Student>) studentsCache.getNativeCache();
    Map<Integer, Student> cachedStudentsMap = nativeMap.getAll(studentIds); // batch fetch
    // ... then process cachedStudentsMap
}
```
Note, however, that this approach gives up the flexibility and portability that the cache abstraction provides.