
Task 2-1: In-Memory Cache for Machine State

Phase: 2 Priority: Medium Module: RedisMachineRepository Depends on: task-1-1 Reference: docs/BountyHunter-ControlServer/details/feature-machine-management/SPEC.md

Background

With 500+ machines sending a heartbeat every 5-10 seconds, RedisMachineRepository.save() must sustain roughly 50-100 Redis writes per second. Each write consists of HSET + EXPIRE: 2 round-trips to the Redis server (or 1 if pipelined). In burst scenarios (for example, 500 machines reconnecting after a network outage), Redis can be overwhelmed. PLAN.md Task 1 mentions adding a local cache to reduce pressure on Redis. Caffeine is a high-performance in-memory cache library that Spring Boot supports as a cache provider.
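
The write-rate figures above follow from simple division. The sketch below reproduces them and contrasts them with the rate of batched flushes proposed in this task. The class and method names are illustrative only and are not part of the codebase.

```java
// Back-of-the-envelope estimate. The machine count and intervals come from the
// Background paragraph; the class and method names are hypothetical.
final class WriteRateEstimate {

    /** Redis writes per second when every heartbeat triggers one write. */
    static double writesPerSecond(int machines, double heartbeatIntervalSeconds) {
        return machines / heartbeatIntervalSeconds;
    }

    /** Pipelined flushes per second when writes are batched once per flush interval. */
    static double flushesPerSecond(double flushIntervalSeconds) {
        return 1.0 / flushIntervalSeconds;
    }

    public static void main(String[] args) {
        System.out.println(writesPerSecond(500, 10.0)); // 50.0
        System.out.println(writesPerSecond(500, 5.0));  // 100.0
        System.out.println(flushesPerSecond(5.0));      // 0.2
    }
}
```

With a 5-second flush, Redis sees one pipelined batch every 5 seconds instead of 50-100 individual writes per second, although each batch may still carry up to one HSET + EXPIRE pair per machine.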

Tasks

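Note: A local cache is appropriate only when there is a single ControlServer instance per deployment. With multiple load-balanced instances, the local cache becomes outdated when another instance updates the state, so an invalidation strategy is needed or the cache must be disabled. Document this limitation clearly. Keep Caffeine's expireAfterWrite only slightly greater than the Redis TTL (task-1-1). Because the Redis write is deferred to the next flush, the Redis key's TTL starts up to one flush interval after the cache entry is written; a margin of about one flush interval keeps the two expiries aligned and limits how long the cache can serve an entry whose Redis key has already expired. The async flush scheduler must handle JedisConnectionException gracefully and must not crash the scheduler thread.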
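
The requirement that a failed flush must not kill the scheduler matters most if the flush is driven by a plain java.util.concurrent.ScheduledExecutorService rather than Spring's @Scheduled: the JDK documents that an exception escaping a periodic task suppresses all subsequent executions. The following is a minimal sketch of a guard wrapper under that assumption. SafeFlushScheduler, guarded, and the simulated failure are hypothetical names, and a real flush would catch JedisConnectionException and re-queue its dirty keys.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SafeFlushScheduler {

    /** Wraps a flush so that no runtime exception can escape into the scheduler. */
    static Runnable guarded(Runnable flush, AtomicInteger failures) {
        return () -> {
            try {
                flush.run();
            } catch (RuntimeException e) {
                // Record the failure and return normally; the next cycle retries.
                failures.incrementAndGet();
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger runs = new AtomicInteger();
        AtomicInteger failures = new AtomicInteger();
        // Hypothetical flush that fails on its first run, standing in for a Redis outage.
        Runnable flaky = () -> {
            if (runs.incrementAndGet() == 1) {
                throw new IllegalStateException("simulated outage");
            }
        };
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(guarded(flaky, failures), 0, 50, TimeUnit.MILLISECONDS);
        Thread.sleep(300);
        scheduler.shutdownNow();
        // Without the guard, the task would never have run a second time.
        System.out.println("runs=" + runs.get() + ", failures=" + failures.get());
    }
}
```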

  • [ ] Add the Caffeine dependency in build.gradle (verify version compatibility with the project's Java version; Caffeine 3.x requires Java 11 or later):
// build.gradle
implementation 'com.github.ben-manes.caffeine:caffeine:3.1.8'
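  • [ ] Add a Caffeine Cache to RedisMachineRepository:
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import redis.clients.jedis.JedisPool;

import java.time.Duration;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class RedisMachineRepository {
    private final JedisPool jedisPool;
    private final int ttlSeconds;

    // Local cache: key = macIp, value = status integer
    // Expires after write to prevent stale data
    private final Cache<String, Integer> localCache;

    // Dirty set: machines whose state needs to be flushed to Redis
    private final Set<String> dirtyKeys = ConcurrentHashMap.newKeySet();

    // Constructor signature is assumed here; adapt it to the existing one.
    public RedisMachineRepository(JedisPool jedisPool, int ttlSeconds) {
        this.jedisPool = jedisPool;
        this.ttlSeconds = ttlSeconds;
        // Built here because a field initializer cannot read the final
        // ttlSeconds field before the constructor has assigned it.
        this.localCache = Caffeine.newBuilder()
            .expireAfterWrite(Duration.ofSeconds(ttlSeconds + 5))  // slightly longer than Redis TTL
            .maximumSize(1000)  // max machines
            .build();
    }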
  • [ ] save() method: update cache immediately, mark dirty, defer Redis write:
public void save(int status, String macIp) {
    // Update local cache immediately (fast, no I/O)
    localCache.put(macIp, status);
    dirtyKeys.add(macIp);
    // Redis write happens in scheduled flush (every 5s)
}
  • [ ] getStatus() method: cache-first, Redis fallback:
public Integer getStatus(String macIp) {
    Integer cached = localCache.getIfPresent(macIp);
    if (cached != null) return cached;  // cache hit

    // Cache miss: read from Redis
    String key = "RCraneMachineStatus:" + macIp;
    try (Jedis jedis = jedisPool.getResource()) {
        String value = jedis.hget(key, "status");
        if (value != null) {
            int status = Integer.parseInt(value);
            localCache.put(macIp, status);  // warm cache
            return status;
        }
    }
    return null; // machine not found
}
  • [ ] Scheduled flush task (every 5s): batch-write dirty keys to Redis:
@Scheduled(fixedDelay = 5000)  // if using Spring Scheduling
public void flushDirtyKeysToRedis() {
    Set<String> toFlush = new HashSet<>(dirtyKeys);
    dirtyKeys.removeAll(toFlush);

    if (toFlush.isEmpty()) return;

    try (Jedis jedis = jedisPool.getResource()) {
        Pipeline pipeline = jedis.pipelined();
        for (String macIp : toFlush) {
            Integer status = localCache.getIfPresent(macIp);
            if (status != null) {
                String key = "RCraneMachineStatus:" + macIp;
                pipeline.hset(key, "status", String.valueOf(status));
                pipeline.expire(key, ttlSeconds);
            }
        }
        pipeline.sync();
        log.debug("Flushed {} machine states to Redis", toFlush.size());
    } catch (JedisConnectionException e) {
        log.error("Failed to flush to Redis — will retry next cycle: {}", e.getMessage());
        dirtyKeys.addAll(toFlush); // re-add for next flush
    }
}
  • [ ] Document limitation: // NOTE: Local cache is only correct with single ControlServer instance per deployment.
  • [ ] Add config:
# Local cache flush interval (ms). Must be < redis.machine.ttl-seconds * 1000
cache.flush-interval-ms=5000
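
The constraint in the config comment (flush interval below the Redis TTL in milliseconds) can be enforced at startup instead of being left as a comment. The following is a minimal sketch, assuming both keys live in the same properties file. CacheConfigCheck and its method are hypothetical names, and the TTL value of 30 in the usage example is an assumption, not a value taken from task-1-1.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

final class CacheConfigCheck {

    /** Returns the flush interval in ms, or throws if it is not positive and below the Redis TTL. */
    static long validatedFlushIntervalMs(Properties props) {
        long flushMs = Long.parseLong(props.getProperty("cache.flush-interval-ms", "5000"));
        // No default for the TTL: a missing key fails fast with NumberFormatException.
        long ttlSeconds = Long.parseLong(props.getProperty("redis.machine.ttl-seconds"));
        if (flushMs <= 0 || flushMs >= ttlSeconds * 1000) {
            throw new IllegalArgumentException(
                "cache.flush-interval-ms=" + flushMs
                    + " must be > 0 and < redis.machine.ttl-seconds * 1000 = " + (ttlSeconds * 1000));
        }
        return flushMs;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Inline stand-in for config.properties; the TTL of 30 seconds is an assumed value.
        props.load(new StringReader(
            "cache.flush-interval-ms=5000\nredis.machine.ttl-seconds=30\n"));
        System.out.println(validatedFlushIntervalMs(props)); // 5000
    }
}
```

Failing fast here avoids a configuration in which Redis keys expire between two flushes even though the machines are still sending heartbeats.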

Verification / Acceptance Criteria

  • [ ] 100 rapid save() calls within one 5-second flush interval → only 1 Redis write batch (verify via mock JedisPool.getResource() call count)
  • [ ] getStatus() after save() returns the correct value immediately (cache hit, no Redis round-trip)
  • [ ] Cache expiry after ttlSeconds + 5 seconds → getStatus() falls back to Redis (cache miss)
  • [ ] JedisConnectionException during flush → dirty keys are re-queued and the scheduler thread does not crash
  • [ ] Unit test: mock Redis, call save() 100 times, verify jedisPool.getResource() is called ≤ 5 times within 25 seconds (5 flush cycles)
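
The coalescing and re-queue criteria above can be exercised without Redis by replacing the pipeline with a counting stub. The sketch below is a simplified stand-in for the repository, not the real class: CoalescingBuffer, batchSink, and dirtyCount are hypothetical names, a ConcurrentHashMap replaces the Caffeine cache, and a Consumer replaces the Jedis pipeline.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

final class CoalescingBuffer {
    private final Map<String, Integer> latest = new ConcurrentHashMap<>();
    private final Set<String> dirtyKeys = ConcurrentHashMap.newKeySet();
    private final Consumer<Map<String, Integer>> batchSink; // stand-in for the Redis pipeline

    CoalescingBuffer(Consumer<Map<String, Integer>> batchSink) {
        this.batchSink = batchSink;
    }

    void save(int status, String macIp) {
        latest.put(macIp, status); // the last write per key wins
        dirtyKeys.add(macIp);
    }

    /** Sends one batch with the latest value per dirty key; re-queues the keys if the sink throws. */
    void flush() {
        Set<String> toFlush = new HashSet<>(dirtyKeys);
        dirtyKeys.removeAll(toFlush);
        if (toFlush.isEmpty()) return;
        Map<String, Integer> batch = new HashMap<>();
        for (String macIp : toFlush) {
            Integer status = latest.get(macIp);
            if (status != null) batch.put(macIp, status);
        }
        try {
            batchSink.accept(batch);
        } catch (RuntimeException e) {
            dirtyKeys.addAll(toFlush); // retry on the next cycle
        }
    }

    int dirtyCount() {
        return dirtyKeys.size();
    }

    public static void main(String[] args) {
        int[] batches = {0};
        CoalescingBuffer buffer = new CoalescingBuffer(batch -> batches[0]++);
        for (int i = 0; i < 100; i++) {
            buffer.save(i % 3, "machine-" + (i % 10));
        }
        buffer.flush();
        buffer.flush(); // nothing is dirty, so no second batch is sent
        System.out.println(batches[0]); // 1
    }
}
```

In the real test, the same assertions would be made against the mocked jedisPool.getResource() call count rather than a lambda counter.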

Files to Modify

  • src/main/java/.../RedisMachineRepository.java
  • src/main/resources/config.properties
  • build.gradle or pom.xml (add the Caffeine dependency)