Task 3-1: Profile Job Execution Times

Phase: 3 - Performance
Priority: Medium
Depends on: task-1-1
Reference: docs/BountyHunter-Backend/details/feature-batch-async-processing/SPEC.md

Background

More than 40 scheduled jobs run, but there are no metrics on their execution time. We do not know which jobs run slowly, which jobs overlap, or how much total CPU time the batch module consumes.

Tasks

1. Add execution time logging to critical jobs

Template to apply to each job:

DI Note: Each job class that uses this template must have MeterRegistry injected. Add to the job class:

private final MeterRegistry meterRegistry;
Use @RequiredArgsConstructor (if Lombok is present) or an explicit constructor to populate the field. Alternatively, if an AbstractTimedJob base class is created (see Files to Modify), inject MeterRegistry once through the base class constructor so that all subclasses share it.

Required imports:

  • io.micrometer.core.instrument.MeterRegistry
  • io.micrometer.core.instrument.Tags
  • java.time.Duration
  • java.time.Instant
  • java.util.concurrent.TimeUnit

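@Scheduled(...)
public void runJob() {
    // Resolve the class name once; it is used in both log lines and the metric tag.
    String jobName = getClass().getSimpleName();
    Instant start = Instant.now();
    LOGGER.info("[BATCH] {} started", jobName);
    try {
        doRun(); // original job logic, unchanged
    } finally {
        // Runs on success and on exception, so failed runs are timed as well.
        long ms = Duration.between(start, Instant.now()).toMillis();
        LOGGER.info("[BATCH] {} completed in {}ms", jobName, ms);
        // Wall-clock duration, tagged per job so each job gets its own time series.
        meterRegistry.timer("batch.job.execution_time",
            Tags.of("job", jobName)
        ).record(ms, TimeUnit.MILLISECONDS);
    }
}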
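If the shared base class route is taken, the template above could be lifted into AbstractTimedJob roughly as follows. This is a minimal sketch, not the project's implementation: to keep it runnable without Spring, SLF4J, or Micrometer on the classpath, java.util.logging replaces LOGGER and an ObjLongConsumer stands in for the meterRegistry.timer(...).record(...) call. The runTimed method name and the ExampleJob class are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.ObjLongConsumer;
import java.util.logging.Logger;

/**
 * Sketch of a shared base class for timed jobs (template-method pattern).
 * The ObjLongConsumer is an assumed stand-in for the Micrometer call
 * meterRegistry.timer("batch.job.execution_time", Tags.of("job", name))
 *              .record(ms, TimeUnit.MILLISECONDS).
 */
abstract class AbstractTimedJob {
    private static final Logger LOGGER = Logger.getLogger(AbstractTimedJob.class.getName());

    // Receives (jobName, elapsedMs) after every run.
    private final ObjLongConsumer<String> durationRecorder;

    protected AbstractTimedJob(ObjLongConsumer<String> durationRecorder) {
        this.durationRecorder = durationRecorder;
    }

    /** Subclasses keep their original job logic here, unchanged. */
    protected abstract void doRun();

    /** Wraps doRun() with start/completed logging and duration recording. */
    public final void runTimed() {
        String name = getClass().getSimpleName();
        Instant start = Instant.now();
        LOGGER.info("[BATCH] " + name + " started");
        try {
            doRun();
        } finally {
            // Executed on success and on exception, so failed runs are timed too.
            long ms = Duration.between(start, Instant.now()).toMillis();
            LOGGER.info("[BATCH] " + name + " completed in " + ms + "ms");
            durationRecorder.accept(name, ms);
        }
    }
}

/** Hypothetical job used only to exercise the base class. */
class ExampleJob extends AbstractTimedJob {
    int runs = 0;

    ExampleJob(ObjLongConsumer<String> recorder) {
        super(recorder);
    }

    @Override
    protected void doRun() {
        runs++;
    }
}
```

In a real job class, the existing @Scheduled method would presumably just call runTimed(), and the constructor would pass a recorder that delegates to the injected MeterRegistry.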

2. Priority jobs to instrument first

  • [ ] MatchMakingForAllGameModeConfig (game critical)
  • [ ] NftRentalExpirationCheckBatch (money involved)
  • [ ] GiftAggregationJob (realtime sensitive)
  • [ ] CardCheckoutExpiredConfig, CryptoCheckoutExpiredConfig (payment)
  • [ ] DataArchiveBatch (potentially slow)

3. Alert if a job runs too long

// Place inside the finally block of runJob(), after ms has been computed.
if (ms > JOB_SLOW_THRESHOLD_MS) {
    LOGGER.warn("[BATCH_SLOW] {} took {}ms (threshold={}ms)",
        getClass().getSimpleName(), ms, JOB_SLOW_THRESHOLD_MS);
}
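One way to make the slow-job check easy to verify is to inject both the threshold and the clock, so a test can force the warning without relying on real elapsed time. The sketch below assumes this design. SlowJobCheck, its constructor parameters, and the warnings list (a stand-in for LOGGER.warn) are hypothetical names, not part of the existing codebase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/**
 * Sketch of the slow-job check with the threshold and clock injected,
 * so a test can trigger the warning deterministically.
 */
class SlowJobCheck {
    private final long thresholdMs;
    private final LongSupplier nanoClock; // e.g. System::nanoTime in production

    // Stand-in for LOGGER.warn so the emitted messages can be inspected.
    final List<String> warnings = new ArrayList<>();

    SlowJobCheck(long thresholdMs, LongSupplier nanoClock) {
        this.thresholdMs = thresholdMs;
        this.nanoClock = nanoClock;
    }

    /** Runs the job body, then records a warning if it exceeded the threshold. */
    void runTimed(String jobName, Runnable body) {
        long startNanos = nanoClock.getAsLong();
        try {
            body.run();
        } finally {
            long ms = (nanoClock.getAsLong() - startNanos) / 1_000_000L;
            if (ms > thresholdMs) {
                warnings.add(String.format("[BATCH_SLOW] %s took %dms (threshold=%dms)",
                        jobName, ms, thresholdMs));
            }
        }
    }
}
```

With a fake clock that returns 0 and then 50,000,000 nanoseconds, a threshold of 10 yields the single warning `[BATCH_SLOW] ExampleJob took 50ms (threshold=10ms)`, while a threshold of 100 yields none.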

Verification / Acceptance Criteria

  • [ ] Each priority job listed above logs [BATCH] <ClassName> started and [BATCH] <ClassName> completed in <N>ms on every execution
  • [ ] Micrometer timer batch.job.execution_time with tag job=<ClassName> is registered and readable via /actuator/metrics/batch.job.execution_time after at least one run
  • [ ] When a job exceeds JOB_SLOW_THRESHOLD_MS, a WARN-level log [BATCH_SLOW] is emitted (verifiable by setting a very low threshold in test)
  • [ ] Instrumentation does not alter job logic — jobs still perform their original work even when the timer/logger is added
  • [ ] If an AbstractTimedJob base class is created, all priority job classes listed above extend it and compile without errors

Files to Modify

  • Critical job files listed above (six classes across the five checklist items)
  • Consider creating a base AbstractTimedJob class