Large-scale grid computing for content-based image retrieval

C Town, Karl Harrison

Research output: Contribution to journal › Article

6 Citations (Scopus)


Purpose - Content-based image retrieval (CBIR) technologies offer many advantages over purely text-based image search. However, one drawback of CBIR is the increased computational cost arising from tasks such as image processing, feature extraction, image classification, and object detection and recognition. Consequently, CBIR systems have suffered from a lack of scalability, which has greatly hampered their adoption for real-world public and commercial image search. At the same time, paradigms for large-scale heterogeneous distributed computing such as grid computing, cloud computing, and utility-based computing are gaining traction as a way of providing more scalable and efficient solutions to large-scale computing tasks.

Design/methodology/approach - This paper presents an approach in which a large distributed processing grid has been used to apply a range of CBIR methods to a substantial number of images. By massively distributing the required computational task across thousands of grid nodes, very high throughput has been achieved at relatively low overhead.

Findings - This approach has allowed about 25 million high-resolution images to be analysed and indexed so far, while using just two servers for storage and job submission. The CBIR system was developed by Imense Ltd and is based on automated analysis and recognition of image content using a semantic ontology. It features a range of image-processing and analysis modules, including image segmentation, region classification, scene analysis, object detection, and face recognition methods.

Originality/value - In the case of content-based image analysis, the primary performance criterion is the overall throughput achieved by the system, measured as the number of images that can be processed over a given time frame, irrespective of the time taken to process any given image. As such, grid processing has great potential for massively parallel content-based image retrieval and other tasks with similar performance requirements.
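The throughput-oriented job model described in the abstract can be illustrated with a minimal sketch. It is not the system developed by Imense Ltd: the function names (`split_into_jobs`, `analyse_image`, `run_job`, `process_all`) are hypothetical, the per-image analysis is a placeholder, and a local thread pool stands in for grid nodes. The sketch only shows the general idea of splitting an image collection into independent jobs and measuring images processed per second across the whole run, rather than the time taken for any single image.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def split_into_jobs(image_ids, job_size):
    """Partition image IDs into independent jobs of at most job_size images."""
    return [image_ids[i:i + job_size] for i in range(0, len(image_ids), job_size)]


def analyse_image(image_id):
    """Hypothetical stand-in for per-image analysis (segmentation, classification, ...)."""
    return {"id": image_id, "labels": ["placeholder"]}


def run_job(job):
    """Process one job; jobs share no state, so they can run on any worker."""
    return [analyse_image(image_id) for image_id in job]


def process_all(image_ids, job_size=4, workers=8):
    """Run all jobs on a worker pool and report overall throughput (images/second)."""
    jobs = split_into_jobs(image_ids, job_size)
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns job results in submission order.
        results = [r for batch in pool.map(run_job, jobs) for r in batch]
    elapsed = time.perf_counter() - start
    throughput = len(results) / elapsed if elapsed > 0 else float("inf")
    return results, throughput


if __name__ == "__main__":
    results, throughput = process_all(list(range(100)))
    print(f"processed {len(results)} images at {throughput:.0f} images/s")
```

Because each job is independent, adding workers raises overall throughput without making any individual image faster, which is the performance criterion the abstract emphasises.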
Original language: English
Pages (from-to): 438-446
Number of pages: 9
Journal: Aslib Proceedings
Issue number: 4-5
Publication status: Published - 1 Jan 2010


  • Pattern recognition
  • Data handling
  • Virtual work
  • Data analysis

