Detecting Similar Photos in Flutter: Perceptual Hashing in Practice

Every photo library fills up with near-duplicates: burst shots, screenshots, repeated takes of the same scene. This article walks through a complete similar-photo detection system for Flutter that uses perceptual hashing to identify similar images efficiently, entirely on-device.

What Is a Perceptual Hash?

A traditional file hash (MD5, SHA) changes completely on even the smallest edit. A perceptual hash, by contrast, is derived from the image's visual features, so similar images produce similar hashes.

Traditional hash:
  original MD5: a1b2c3d4e5f6...
  resized MD5:  9x8y7z6w5v4... (completely different)

Perceptual hash:
  original pHash: 8f373714acfcf4d0
  resized pHash:  8f373714acfcf4d0 (identical or very close)

Choosing an Algorithm

The common perceptual-hash algorithms:

Algorithm | Principle                      | Strength          | Weakness
aHash     | compares to average brightness | fast to compute   | sensitive to brightness changes
dHash     | horizontal gradient differences| robust to scaling | sensitive to rotation
pHash     | DCT frequency-domain analysis  | most accurate     | slower to compute

Each algorithm has blind spots on its own, so we combine several hashes:

combined hash = aHash + dHash + vHash + cHash + pHash
              = 64 bits + 64 bits + 64 bits + 64 bits + 64 bits
              = 320 bits (80 hex characters)
  • aHash: overall brightness profile
  • dHash: horizontal edge features
  • vHash: vertical edge features (the vertical counterpart of dHash)
  • cHash: color distribution features
  • pHash: DCT frequency-domain structure (the most important)

Core Implementation

1. The Hash Computation Core

import 'dart:math';
import 'dart:typed_data';
import 'package:image/image.dart' as img;

/// Core hash-computation class
class HashComputeCore {
  /// Computes the combined perceptual hash.
  /// Returns an 80-character hexadecimal hash string.
  static String? computeHash(Uint8List imageData) {
    try {
      final image = img.decodeImage(imageData);
      if (image == null) return null;

      final grayscale = img.grayscale(image);

      // Compute each hash
      final aHashBits = _computeAHash(grayscale);  // 64 bits
      final dHashBits = _computeDHash(grayscale);  // 64 bits
      final vHashBits = _computeVHash(grayscale);  // 64 bits
      final cHashBits = _computeCHash(image);      // 64 bits
      final pHashBits = _computePHash(grayscale);  // 64 bits

      // Combine: 320 bits = 80 hex characters
      final allBits = [...aHashBits, ...dHashBits, ...vHashBits, ...cHashBits, ...pHashBits];
      return _bitsToHex(allBits);
    } catch (e) {
      return null;
    }
  }
}
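
The `_bitsToHex` helper used above is referenced but not listed in the article. A minimal sketch, assuming MSB-first packing of 4 bits per hex character (320 bits become 80 characters), could look like:

```dart
/// Packs a bit list (MSB first) into a lowercase hex string.
/// Assumes bits.length is a multiple of 4.
static String _bitsToHex(List<int> bits) {
  final buffer = StringBuffer();
  for (var i = 0; i < bits.length; i += 4) {
    var nibble = 0;
    for (var j = 0; j < 4; j++) {
      // Shift the previous bits up and append the next one
      nibble = (nibble << 1) | bits[i + j];
    }
    buffer.write(nibble.toRadixString(16));
  }
  return buffer.toString();
}
```

Any consistent packing works, as long as hashing and comparison use the same one.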

2. aHash: Average Hash

The simplest perceptual hash, based on average brightness:

/// Computes aHash (Average Hash): 64 bits
static List<int> _computeAHash(img.Image grayscale) {
  // 1. Resize to 8x8
  final small8x8 = img.copyResize(grayscale, width: 8, height: 8);

  // 2. Compute the average luminance
  double totalLuminance = 0;
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      totalLuminance += small8x8.getPixel(x, y).luminance;
    }
  }
  final avgLuminance = totalLuminance / 64;

  // 3. Build the hash: 1 if above average, 0 if below
  final bits = <int>[];
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      bits.add(small8x8.getPixel(x, y).luminance > avgLuminance ? 1 : 0);
    }
  }
  return bits;
}

3. dHash: Difference Hash

Based on luminance differences between adjacent pixels, which makes it more robust to scaling:

/// Computes dHash (Difference Hash): 64 bits
static List<int> _computeDHash(img.Image grayscale) {
  // Resize to 9x8 (one extra column of width for the comparison)
  final small9x8 = img.copyResize(grayscale, width: 9, height: 8);

  final bits = <int>[];
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      // Compare horizontally adjacent pixels
      final left = small9x8.getPixel(x, y).luminance;
      final right = small9x8.getPixel(x + 1, y).luminance;
      bits.add(left > right ? 1 : 0);
    }
  }
  return bits;
}

/// Computes vHash (Vertical Difference Hash): 64 bits
/// The vertical counterpart of dHash; captures vertical edge features
static List<int> _computeVHash(img.Image grayscale) {
  final small8x9 = img.copyResize(grayscale, width: 8, height: 9);

  final bits = <int>[];
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      // Compare vertically adjacent pixels
      final top = small8x9.getPixel(x, y).luminance;
      final bottom = small8x9.getPixel(x, y + 1).luminance;
      bits.add(top > bottom ? 1 : 0);
    }
  }
  return bits;
}

4. cHash: Color Hash

Captures color-distribution features:

/// Computes cHash (Color Histogram Hash): 64 bits
static List<int> _computeCHash(img.Image image) {
  final bits = <int>[];
  final small16x16 = img.copyResize(image, width: 16, height: 16);

  // Average saturation and brightness over the whole image
  double totalSaturation = 0, totalBrightness = 0;
  for (var y = 0; y < 16; y++) {
    for (var x = 0; x < 16; x++) {
      final pixel = small16x16.getPixel(x, y);
      final hsv = _rgbToHsv(pixel.r.toInt(), pixel.g.toInt(), pixel.b.toInt());
      totalSaturation += hsv[1];
      totalBrightness += hsv[2];
    }
  }
  final avgSaturation = totalSaturation / 256;
  final avgBrightness = totalBrightness / 256;

  // Split the image into 4x4 = 16 blocks
  for (var blockY = 0; blockY < 4; blockY++) {
    for (var blockX = 0; blockX < 4; blockX++) {
      // Hue histogram for this block
      final hueHistogram = List<int>.filled(4, 0);
      double blockSaturation = 0, blockBrightness = 0;

      for (var y = blockY * 4; y < (blockY + 1) * 4; y++) {
        for (var x = blockX * 4; x < (blockX + 1) * 4; x++) {
          final pixel = small16x16.getPixel(x, y);
          final hsv = _rgbToHsv(pixel.r.toInt(), pixel.g.toInt(), pixel.b.toInt());

          // Quantize hue into 4 bins
          final hueIndex = (hsv[0] / 90).floor().clamp(0, 3);
          hueHistogram[hueIndex]++;
          blockSaturation += hsv[1];
          blockBrightness += hsv[2];
        }
      }

      blockSaturation /= 16;
      blockBrightness /= 16;

      // Find the dominant hue bin
      int dominantHue = 0, maxCount = hueHistogram[0];
      for (var i = 1; i < 4; i++) {
        if (hueHistogram[i] > maxCount) {
          maxCount = hueHistogram[i];
          dominantHue = i;
        }
      }

      // 4 bits per block: 2 hue bits + 1 saturation bit + 1 brightness bit
      bits.add((dominantHue >> 1) & 1);
      bits.add(dominantHue & 1);
      bits.add(blockSaturation > avgSaturation ? 1 : 0);
      bits.add(blockBrightness > avgBrightness ? 1 : 0);
    }
  }

  return bits;
}
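
`_rgbToHsv` is likewise referenced but not listed. A standard conversion (an assumption: it returns [hue 0-360, saturation 0-1, value 0-1], which is what the `/90` hue binning above implies) might be:

```dart
/// Converts RGB (0-255) to HSV as [hue 0-360, saturation 0-1, value 0-1].
/// Uses max/min from dart:math, already imported in this file.
static List<double> _rgbToHsv(int r, int g, int b) {
  final rd = r / 255.0, gd = g / 255.0, bd = b / 255.0;
  final maxC = [rd, gd, bd].reduce(max);
  final minC = [rd, gd, bd].reduce(min);
  final delta = maxC - minC;

  double hue = 0;
  if (delta > 0) {
    if (maxC == rd) {
      hue = 60 * (((gd - bd) / delta) % 6);
    } else if (maxC == gd) {
      hue = 60 * ((bd - rd) / delta + 2);
    } else {
      hue = 60 * ((rd - gd) / delta + 4);
    }
  }

  final saturation = maxC == 0 ? 0.0 : delta / maxC;
  return [hue, saturation, maxC]; // value is simply the max channel
}
```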

5. pHash: Perceptual Hash (DCT)

The most accurate of the hashes, based on the discrete cosine transform (DCT):

/// Computes pHash (DCT Perceptual Hash): 64 bits
static List<int> _computePHash(img.Image grayscale) {
  // 1. Resize to 32x32
  final small32x32 = img.copyResize(grayscale, width: 32, height: 32);

  // 2. Build the luminance matrix
  final matrix = List.generate(32, (y) {
    return List.generate(32, (x) {
      return small32x32.getPixel(x, y).luminance.toDouble();
    });
  });

  // 3. Compute the 2D DCT
  final dctMatrix = _computeDCT(matrix);

  // 4. Take the top-left 8x8 low-frequency coefficients (excluding the DC component)
  final lowFreq = <double>[];
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      if (x == 0 && y == 0) continue;  // skip the DC component
      lowFreq.add(dctMatrix[y][x]);
    }
  }

  // 5. Take the median
  final sorted = List<double>.from(lowFreq)..sort();
  final median = sorted[sorted.length ~/ 2];

  // 6. Build the hash: 1 if above the median, 0 otherwise
  final bits = <int>[];
  for (var y = 0; y < 8; y++) {
    for (var x = 0; x < 8; x++) {
      if (x == 0 && y == 0) {
        bits.add(0);  // DC component is fixed to 0
      } else {
        bits.add(dctMatrix[y][x] > median ? 1 : 0);
      }
    }
  }

  return bits;
}

/// Computes the 2D discrete cosine transform (DCT-II)
static List<List<double>> _computeDCT(List<List<double>> matrix) {
  final n = matrix.length;
  final result = List.generate(n, (_) => List<double>.filled(n, 0));

  // Precompute the cosine values for performance
  final cosTable = List.generate(n, (i) {
    return List.generate(n, (j) {
      return cos((2 * j + 1) * i * pi / (2 * n));
    });
  });

  for (var u = 0; u < n; u++) {
    for (var v = 0; v < n; v++) {
      double sum = 0;
      for (var x = 0; x < n; x++) {
        for (var y = 0; y < n; y++) {
          sum += matrix[y][x] * cosTable[u][x] * cosTable[v][y];
        }
      }

      final cu = u == 0 ? 1 / sqrt(2) : 1.0;
      final cv = v == 0 ? 1 / sqrt(2) : 1.0;
      result[u][v] = 0.25 * cu * cv * sum;
    }
  }

  return result;
}

Computing Similarity

Hamming Distance

The similarity of two hashes is measured by their Hamming distance: the number of differing bits.

class HashUtils {
  /// Computes the Hamming distance between two hex hash strings
  static int hammingDistance(String hash1, String hash2) {
    int distance = 0;
    for (var i = 0; i < hash1.length; i++) {
      final v1 = int.tryParse(hash1[i], radix: 16) ?? 0;
      final v2 = int.tryParse(hash2[i], radix: 16) ?? 0;
      var xor = v1 ^ v2;
      while (xor > 0) {
        distance += xor & 1;
        xor >>= 1;
      }
    }
    return distance;
  }
}
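
As a quick sanity check (with hypothetical values): two hashes that differ only in the lowest bit of the final hex digit are at distance 1:

```dart
void main() {
  const h1 = '8f373714acfcf4d0';
  const h2 = '8f373714acfcf4d1'; // last hex digit flips one bit
  print(HashUtils.hammingDistance(h1, h2)); // prints 1
}
```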

Weighted Hamming Distance

The five hashes contribute unequally to similarity; pHash is the most accurate, so it deserves the largest weight:

/// Weights for the hash segments (total weight = 10)
static const double _weightAHash = 1.0;  // overall brightness (10%)
static const double _weightDHash = 1.7;  // horizontal edges (17%)
static const double _weightVHash = 1.7;  // vertical edges (17%)
static const double _weightCHash = 0.5;  // color (5%); filters affect it heavily
static const double _weightPHash = 5.1;  // DCT structure (51%); dominant

/// Maximum weighted distance = 64 * 10 = 640
static const double maxWeightedDistance = 640.0;

/// Computes the weighted Hamming distance
static double weightedHammingDistance(String hash1, String hash2) {
  // Per-segment distances (each segment is 16 hex characters = 64 bits);
  // _calculateDistance is the plain Hamming distance over one segment
  final aHashDist = _calculateDistance(hash1.substring(0, 16), hash2.substring(0, 16));
  final dHashDist = _calculateDistance(hash1.substring(16, 32), hash2.substring(16, 32));
  final vHashDist = _calculateDistance(hash1.substring(32, 48), hash2.substring(32, 48));
  final cHashDist = _calculateDistance(hash1.substring(48, 64), hash2.substring(48, 64));
  final pHashDist = _calculateDistance(hash1.substring(64, 80), hash2.substring(64, 80));

  return aHashDist * _weightAHash +
         dHashDist * _weightDHash +
         vHashDist * _weightVHash +
         cHashDist * _weightCHash +
         pHashDist * _weightPHash;
}

/// Computes a similarity percentage (0-100)
static int calculateSimilarityPercent(String hash1, String hash2) {
  final weightedDist = weightedHammingDistance(hash1, hash2);
  return ((maxWeightedDistance - weightedDist) / maxWeightedDistance * 100)
      .round().clamp(0, 100);
}

Early-Exit Optimization

When comparing large numbers of photos, check pHash first to quickly discard pairs that are clearly dissimilar:

/// Similarity test on the weighted distance, with an early exit
static bool isSimilarWeighted(String hash1, String hash2, {double threshold = 64.0}) {
  // Early exit: check pHash first (51% of the weight; the most important segment)
  final pHashDist = _calculateDistance(hash1.substring(64, 80), hash2.substring(64, 80));
  final pHashWeighted = pHashDist * _weightPHash;

  // If the weighted pHash distance alone exceeds 60% of the threshold,
  // the pair is probably dissimilar
  if (pHashWeighted > threshold * 0.6) {
    // Also check the edge features
    final dHashDist = _calculateDistance(hash1.substring(16, 32), hash2.substring(16, 32));
    final vHashDist = _calculateDistance(hash1.substring(32, 48), hash2.substring(32, 48));
    final edgeWeighted = dHashDist * _weightDHash + vHashDist * _weightVHash;

    // If pHash plus the edge features already exceed the threshold, bail out
    if (pHashWeighted + edgeWeighted > threshold) {
      return false;
    }
  }

  return weightedHammingDistance(hash1, hash2) <= threshold;
}

Performance Optimization

1. Parallelism with an Isolate Pool

Hash computation is CPU-bound, so run it in an isolate pool to keep the main thread responsive:

/// Isolate pool manager
class IsolatePool {
  static IsolatePool? _instance;
  static IsolatePool get instance => _instance ??= IsolatePool._();

  final List<_IsolateWorker> _workers = [];
  int _poolSize = 2;
  int _nextWorkerIndex = 0;

  /// Initializes the isolate pool
  Future<void> initialize({int? poolSize}) async {
    // Size the pool by CPU core count: at least 2 workers, at most 6
    _poolSize = poolSize ?? (Platform.numberOfProcessors ~/ 2).clamp(2, 6);

    for (int i = 0; i < _poolSize; i++) {
      final worker = await _IsolateWorker.spawn(i);
      _workers.add(worker);
    }
  }

  /// Computes an image hash (round-robin scheduling)
  Future<String?> computeHash(Uint8List imageData) async {
    final worker = _workers[_nextWorkerIndex];
    _nextWorkerIndex = (_nextWorkerIndex + 1) % _poolSize;
    return worker.computeHash(imageData);
  }

  /// Computes hashes in batch (processed in parallel)
  Future<List<String?>> computeHashBatch(List<Uint8List> imageDataList) async {
    final futures = <Future<String?>>[];
    for (int i = 0; i < imageDataList.length; i++) {
      final workerIndex = i % _poolSize;
      futures.add(_workers[workerIndex].computeHash(imageDataList[i]));
    }
    return Future.wait(futures);
  }
}

/// Isolate worker
class _IsolateWorker {
  final Isolate isolate;
  final SendPort sendPort;

  static Future<_IsolateWorker> spawn(int id) async {
    final receivePort = ReceivePort();
    final broadcastStream = receivePort.asBroadcastStream();

    final isolate = await Isolate.spawn(_isolateEntryPoint, receivePort.sendPort);
    final sendPort = await broadcastStream.first as SendPort;

    return _IsolateWorker._(isolate: isolate, sendPort: sendPort, ...);
  }

  Future<String?> computeHash(Uint8List imageData) async {
    final responsePort = ReceivePort();
    sendPort.send(_HashRequest(imageData, responsePort.sendPort));
    final result = await responsePort.first as String?;
    responsePort.close();
    return result;
  }
}

/// Isolate entry point
void _isolateEntryPoint(SendPort mainSendPort) {
  final receivePort = ReceivePort();
  mainSendPort.send(receivePort.sendPort);

  receivePort.listen((message) {
    if (message is _HashRequest) {
      final result = HashComputeCore.computeHash(message.imageData);
      message.responsePort.send(result);
    }
  });
}

2. Adapting to Device Performance

Tune the concurrency to the device's capabilities:

/// Detects device performance
Future<void> _detectDevicePerformance() async {
  final deviceInfo = DeviceInfoPlugin();

  if (GetPlatform.isAndroid) {
    final androidInfo = await deviceInfo.androidInfo;
    final sdkInt = androidInfo.version.sdkInt;

    if (sdkInt >= 31) {
      performanceLevel.value = DevicePerformanceLevel.high;
      _concurrency = 8;
      _useIsolatePool = true;
    } else if (sdkInt >= 28) {
      performanceLevel.value = DevicePerformanceLevel.medium;
      _concurrency = 4;
    } else {
      performanceLevel.value = DevicePerformanceLevel.low;
      _concurrency = 2;
    }
  }
}

3. Resumable Computation

Let an interrupted run pick up where it left off:

/// Persistence model for hash-computation progress
class HashComputeProgress {
  final int totalPhotos;
  final int computedPhotos;
  final List<String> pendingPhotoIds;
  final int lastUpdateTime;
  final bool isCompleted;
}

/// Saves the computation progress
Future<void> _saveProgress() async {
  final progressData = HashComputeProgress(
    totalPhotos: totalCount.value,
    computedPhotos: computedCount.value,
    pendingPhotoIds: _pendingPhotoIds,
    lastUpdateTime: DateTime.now().millisecondsSinceEpoch,
    isCompleted: status.value == HashComputeStatus.completed,
  );
  await _storage.write(_progressKey, progressData.toJson());
}

/// Resumes an unfinished computation
Future<void> resumePendingCompute(List<AssetEntity> allPhotos) async {
  if (_pendingPhotoIds.isEmpty) return;

  // Keep only the photos that still exist
  final pendingSet = _pendingPhotoIds.toSet();
  final photosToCompute = allPhotos.where((p) => pendingSet.contains(p.id)).toList();

  if (photosToCompute.isEmpty) {
    await _clearSavedProgress();
    return;
  }

  await _startCompute(photosToCompute, isResume: true);
}

4. Database Index Optimization

-- Hash table schema
CREATE TABLE photo_hash (
  photo_id TEXT PRIMARY KEY,
  phash TEXT NOT NULL,
  file_path TEXT,
  create_time INTEGER,
  hash_time INTEGER NOT NULL,
  width INTEGER,
  height INTEGER,
  file_size INTEGER,
  photo_type TEXT,
  burst_id TEXT,
  quality_score REAL
);

-- Indexes
CREATE INDEX idx_phash ON photo_hash(phash);
CREATE INDEX idx_create_time ON photo_hash(create_time);
CREATE INDEX idx_photo_type ON photo_hash(photo_type);
CREATE INDEX idx_quality_score ON photo_hash(quality_score);

-- Composite index for similar-photo queries
CREATE INDEX idx_type_time ON photo_hash(photo_type, create_time);

Grouping Similar Photos

Detecting Similarity Relations

/// Detects similar photos (hash-only mode)
Future<List<List<AssetEntity>>> _detectByHash(List<AssetEntity> photos) async {
  // 1. Fetch all similarity relations
  final allSimilarRelations = await _hashService!.findSimilarRelations(
    threshold: _currentWeightedThreshold.round(),
  );

  if (allSimilarRelations.isEmpty) return [];

  // 2. Filter the relations down to the photos we were given
  final photoIdSet = photos.map((p) => p.id).toSet();
  final filteredRelations = <String, Set<String>>{};

  for (final entry in allSimilarRelations.entries) {
    if (!photoIdSet.contains(entry.key)) continue;

    final filteredSimilarIds = entry.value
        .where((id) => photoIdSet.contains(id))
        .toSet();
    if (filteredSimilarIds.isNotEmpty) {
      filteredRelations[entry.key] = filteredSimilarIds;
    }
  }

  // 3. Build the groups (connected components)
  return _buildGroupsFromRelations(filteredRelations, photos);
}

Connected-Components Algorithm

Turn the similarity relations into groups using BFS:

/// Builds groups from the similarity relations
List<List<AssetEntity>> _buildGroupsFromRelations(
  Map<String, Set<String>> relations, 
  List<AssetEntity> photos
) {
  final groups = <List<AssetEntity>>[];
  final visited = <String>{};
  final photoMap = {for (final photo in photos) photo.id: photo};

  for (final photoId in relations.keys) {
    if (visited.contains(photoId)) continue;
    if (!photoMap.containsKey(photoId)) continue;

    // Build the connected component with BFS
    final group = _buildConnectedComponent(photoId, relations, photoMap, visited);
    if (group.length >= 2) {
      groups.add(group);
    }
  }

  return groups;
}

/// Builds one connected component (a group of similar photos)
List<AssetEntity> _buildConnectedComponent(
  String startPhotoId,
  Map<String, Set<String>> relations,
  Map<String, AssetEntity> photoMap,
  Set<String> visited,
) {
  final group = <AssetEntity>[];
  final queue = <String>[startPhotoId];
  final localVisited = <String>{};

  while (queue.isNotEmpty) {
    final currentId = queue.removeAt(0);

    if (localVisited.contains(currentId) || visited.contains(currentId)) {
      continue;
    }

    final photo = photoMap[currentId];
    if (photo == null) continue;

    group.add(photo);
    localVisited.add(currentId);
    visited.add(currentId);

    // Enqueue the similar photos
    final similarIds = relations[currentId];
    if (similarIds != null) {
      for (final similarId in similarIds) {
        if (!localVisited.contains(similarId) && 
            !visited.contains(similarId) && 
            photoMap.containsKey(similarId)) {
          queue.add(similarId);
        }
      }
    }
  }

  return group;
}

Picking the Best Photo

Select the highest-quality photo from each similar group:

/// Finds the highest-quality photo,
/// weighing resolution, file size, sharpness, and exposure
Future<String> _findBestPhoto(List<AssetEntity> photos) async {
  if (photos.isEmpty) return '';
  if (photos.length == 1) return photos.first.id;

  // Prefer precomputed quality scores from the database
  if (_hashService != null) {
    final photoIds = photos.map((p) => p.id).toList();
    final bestId = await _hashService!.getBestPhotoId(photoIds);
    if (bestId != null) return bestId;
  }

  // Fallback: rank by resolution
  return _findBestPhotoSync(photos);
}

/// Synchronous fallback: picks the best photo by resolution
String _findBestPhotoSync(List<AssetEntity> photos) {
  AssetEntity? best;
  int bestScore = 0;

  for (final photo in photos) {
    // Resolution score
    final resolution = photo.width * photo.height;
    final score = resolution;

    if (score > bestScore) {
      bestScore = score;
      best = photo;
    }
  }

  return best?.id ?? photos.first.id;
}

The Similar-Photo Group Model

/// Model for a group of similar photos
class SimilarPhotoGroup {
  final String groupId;
  final List<AssetEntity> photos;
  final String bestPhotoId;           // ID of the highest-quality photo
  final DateTime createTime;
  final SimilarPhotoType groupType;   // group type: burst / screenshot / similar / duplicate
  final Map<String, int> similarityScores;  // similarity percent per photo
  final int averageSimilarity;        // average similarity within the group

  /// The best photo in the group
  AssetEntity? get bestPhoto => photos.firstWhereOrNull((p) => p.id == bestPhotoId);

  /// The other photos (suggested for deletion)
  List<AssetEntity> get otherPhotos => photos.where((p) => p.id != bestPhotoId).toList();

  /// Similarity percent for a given photo
  int getSimilarityScore(String photoId) {
    if (photoId == bestPhotoId) return 100;
    return similarityScores[photoId] ?? 0;
  }

  /// Human-readable similarity level
  String getSimilarityLevel(String photoId) {
    final score = getSimilarityScore(photoId);
    if (score >= 95) return 'nearly identical';
    if (score >= 85) return 'very similar';
    if (score >= 70) return 'fairly similar';
    if (score >= 50) return 'somewhat similar';
    return 'slightly similar';
  }
}

/// Similar-photo types
enum SimilarPhotoType {
  burst,      // burst shots
  screenshot, // screenshots
  similar,    // similar scenes
  duplicate,  // exact duplicates
  unknown,
}

Adaptive Thresholding

Learning from User Feedback

Adjust the similarity threshold automatically based on user feedback:

/// Records user feedback
Future<void> recordFeedback({required bool isCorrect}) async {
  if (isCorrect) {
    feedbackCorrect.value++;
  } else {
    feedbackIncorrect.value++;
  }
  await _saveSettings();

  // Adapt the threshold
  if (autoThresholdEnabled.value) {
    _adjustThresholdBasedOnFeedback();
  }
}

/// Adapts the threshold based on accumulated feedback
void _adjustThresholdBasedOnFeedback() {
  final total = feedbackCorrect.value + feedbackIncorrect.value;
  if (total < 10) return;  // wait until there is enough feedback

  final correctRate = feedbackCorrect.value / total;

  // Below 70% correct: the threshold is too loose, tighten it
  // Above 90% correct: the threshold is too strict, loosen it
  if (correctRate < 0.7 && similarityThreshold.value > minThreshold) {
    similarityThreshold.value = (similarityThreshold.value - 3).clamp(minThreshold, maxThreshold);
  } else if (correctRate > 0.9 && similarityThreshold.value < maxThreshold) {
    similarityThreshold.value = (similarityThreshold.value + 2).clamp(minThreshold, maxThreshold);
  }

  _saveSettings();
}

Mapping Thresholds to Similarity

// Weighted threshold vs. similarity (from: similarity% = (640 - threshold) / 640 * 100)
// 
// threshold 32-80:   87%+ similar (nearly identical)
// threshold 80-120:  81%+ similar (highly similar)
// threshold 120-160: 75%+ similar (fairly similar)
// threshold 160-192: 70%+ similar (somewhat similar)

static const int defaultThreshold = 145;  // detects photos that are 77%+ similar
static const int minThreshold = 32;       // strictest
static const int maxThreshold = 192;      // loosest
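
To double-check the table, the mapping can be inverted in one line (a hypothetical helper, not part of the original code):

```dart
/// Minimum similarity percent admitted by a weighted-distance threshold.
int thresholdToMinSimilarity(int threshold) =>
    ((640 - threshold) / 640 * 100).round();

// thresholdToMinSimilarity(145) == 77, matching the documented default;
// thresholdToMinSimilarity(192) == 70, the loosest setting.
```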

Memory Management

LRU Caching

/// LRU caches for quality scores and photo types
late final GenericLRUCache<String, double> _qualityScoreCache;
late final GenericLRUCache<String, SimilarPhotoType> _photoTypeCache;

void _initCaches() {
  // Quality-score cache: up to 5000 photos
  _qualityScoreCache = GenericLRUCache<String, double>(
    name: 'photo_quality_scores',
    maxSize: 5000,
  );

  // Photo-type cache
  _photoTypeCache = GenericLRUCache<String, SimilarPhotoType>(
    name: 'photo_types',
    maxSize: 5000,
  );

  // Register with the central cache manager
  _cacheManager.registerCache('photo_quality_scores', _qualityScoreCache, priority: 6);
  _cacheManager.registerCache('photo_types', _photoTypeCache, priority: 5);
}

Responding to Memory Pressure

/// Memory-pressure callback
Future<void> _onMemoryPressure(MemoryPressureLevel level) async {
  switch (level) {
    case MemoryPressureLevel.moderate:
      // Moderate pressure: evict 25% of the caches
      _qualityScoreCache.trimByPercent(0.25);
      _photoTypeCache.trimByPercent(0.25);
      break;
    case MemoryPressureLevel.high:
      // High pressure: evict 50% of the caches
      _qualityScoreCache.trimByPercent(0.50);
      _photoTypeCache.trimByPercent(0.50);
      _photoCache.clear();
      break;
    case MemoryPressureLevel.critical:
      // Critical pressure: clear the auxiliary caches but keep the similar-group data
      _qualityScoreCache.clear();
      _photoTypeCache.clear();
      _photoCache.clear();
      break;
  }
}

Results

Detection Accuracy

Scenario                   | Accuracy | Notes
burst shots                | 98%+     | close timestamps + similar hashes
screenshots                | 95%+     | size heuristics + similar hashes
same scene, multiple shots | 90%+     | relies mainly on pHash
filtered/edited copies     | 85%+     | low cHash weight keeps the impact small

Performance

Device              | Hashing 1000 photos | Similarity detection
high-end (8 cores)  | ~15 s               | <1 s
mid-range (4 cores) | ~30 s               | ~1 s
low-end (2 cores)   | ~60 s               | ~2 s

Summary

The similar-photo detection system built in this article has these characteristics:

  1. Multi-hash combination: aHash + dHash + vHash + cHash + pHash, each covering the others' weaknesses
  2. Weighted similarity: pHash dominates (51%), better reflecting visual similarity
  3. Early-exit optimization: quickly discards clearly dissimilar pairs
  4. Parallel computation with isolates: makes full use of multi-core CPUs
  5. Resumable computation: picks up after an interruption
  6. Adaptive thresholding: self-adjusts from user feedback
  7. Memory-friendly: LRU caches plus a memory-pressure response

In practice the system reliably identifies burst shots, screenshots, and repeated shots of the same scene, helping users clean up their photo libraries efficiently.


This article is based on real-world development experience from the 「回留」photo-album cleanup app.
