Abstract
Hashes are vital in limiting the spread of child sexual abuse material (CSAM) online, yet their use introduces unresolved technical, legal, and ethical challenges. This paper bridges a critical gap by analyzing both cryptographic and perceptual hashing, not only in terms of their detection capabilities but also in terms of their vulnerabilities and implications for privacy governance. Unlike prior work, it reframes CSAM detection as a multidimensional issue at the intersection of cybersecurity, data protection law, and digital ethics. Three key contributions are made: first, a comparative evaluation of hashing techniques that reveals weaknesses such as susceptibility to media edits, collision attacks, hash inversion, and data leakage; second, a call for standardized benchmarks and interoperable evaluation protocols to assess system robustness; and third, a legal argument that perceptual hashes qualify as personal data under EU law, with implications for transparency and accountability. Ethically, the paper underscores the tension service providers face in balancing user privacy with the duty to detect CSAM. It advocates for detection systems that are not only technically sound but also legally defensible and ethically governed. By integrating technical analysis with legal insight, this paper offers a comprehensive framework for evaluating CSAM detection within the broader context of digital safety and privacy.
Keywords: hash functions; data anonymization; domain-specific security and privacy architectures; CSAM database; hash database; GDPR
https://www.mdpi.com/2624-800X/5/4/92#Abstract
This work was performed in its entirety while E. Daskalaki was at FORTH, working on the PreventCSA@EU project; at the time of publication, she is affiliated with the European Union Agency for Cybersecurity (ENISA).