Hierarchical One Permutation Hashing: Efficient Multimedia Near Duplicate Detection
With advances in multimedia technologies and the proliferation of smartphones, digital cameras, and storage devices, a rapidly growing volume of multimedia data is being collected in applications such as multimedia retrieval and management systems, where each data element is composed of text, image, video, and audio. Consequently, multimedia near-duplicate detection has attracted significant attention from both research organizations and commercial communities. The traditional solution, minwise hashing, faces two challenges: expensive preprocessing time and low comparison speed. This work therefore first introduces a hashing method called one permutation hashing to avoid the costly preprocessing. Building on one permutation hashing, a more efficient strategy, group-based one permutation hashing, is developed to reduce the high comparison time. Based on the observation that the similarity of most multimedia data is not very high, this work then designs a new hashing method, hierarchical one permutation hashing, to further improve performance. Comprehensive experiments on real multimedia datasets show that, at similar accuracy, hierarchical one permutation hashing is five to seven times faster than minwise hashing.
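To make the preprocessing saving concrete, the following is a minimal sketch of plain one permutation hashing as it is generally described in the literature: the feature universe is permuted once, split into equal-width bins, and the smallest permuted position in each bin is kept. It is not the group-based or hierarchical variant of this work, whose details the abstract does not give. The function names, the `EMPTY` marker, and the assumption that features are integer ids in `range(universe_size)` are all illustrative choices.

```python
import random

EMPTY = -1  # hypothetical marker for a bin that received no element


def make_permutation(universe_size, seed=0):
    """Return one fixed random permutation of range(universe_size)."""
    rng = random.Random(seed)
    perm = list(range(universe_size))
    rng.shuffle(perm)
    return perm


def oph_sketch(items, perm, num_bins):
    """Sketch a set of integer feature ids using a single permutation.

    Each permuted position falls into one of num_bins equal-width bins;
    the sketch stores the smallest in-bin offset seen in every bin.
    """
    bin_width = -(-len(perm) // num_bins)  # ceiling division
    sketch = [EMPTY] * num_bins
    for item in items:
        pos = perm[item]
        b, offset = divmod(pos, bin_width)
        if sketch[b] == EMPTY or offset < sketch[b]:
            sketch[b] = offset
    return sketch


def estimate_similarity(sketch_a, sketch_b):
    """Estimate Jaccard similarity: matching bins over bins that are
    non-empty in at least one of the two sketches."""
    matches = 0
    jointly_empty = 0
    for a, b in zip(sketch_a, sketch_b):
        if a == EMPTY and b == EMPTY:
            jointly_empty += 1
        elif a == b:
            matches += 1
    used = len(sketch_a) - jointly_empty
    return matches / used if used else 0.0


if __name__ == "__main__":
    perm = make_permutation(1000, seed=42)
    a = set(range(0, 300))
    b = set(range(100, 400))  # true Jaccard similarity with a is 0.5
    print(estimate_similarity(oph_sketch(a, perm, 64), oph_sketch(b, perm, 64)))
```

Classical minwise hashing would apply one permutation per sketch entry (64 here), whereas this sketch touches each element once. The printed value is only an estimate of the true similarity and varies with the seed and the number of bins.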