ITEE seminar: Dr Heng Tao Shen, 10.00AM, Mon 27 Oct 2003
Efficient Database Support for WWW Image Retrieval
Speaker: Dr Heng Tao Shen, National University of Singapore
When: 10.00AM, Monday 27 Oct 2003
Venue: 78-420
Host: Professor Maria Orlowska
Abstract:
The WWW is shaping current research directions. To enhance web page
content, images are increasingly being embedded in web pages. Such
pages essentially provide a rich and interesting collection of images
that users can query. Uniquely, WWW images carry both a high-level
feature (text) and low-level features (color, shape, and texture).
Typically, each feature is represented as a high-dimensional point
(or feature vector).
Unfortunately, most WWW image search engines fail to exploit image
semantics and therefore suffer from low precision. On the other hand,
existing indexing techniques fail to provide more efficient retrieval
than a sequential scan once the dimensionality of the image features
becomes high, owing to the well-known 'dimensionality curse'.
Moreover, the problem of indexing multiple image features has so far
proved too hard to be addressed. To build an image retrieval system,
both effectiveness and efficiency have to be considered.
In this talk, we focus on two novel indexing methods which provide
strong efficiency support for retrieval.
1): One well-known approach to overcoming the 'dimensionality curse' is
to reduce the dimensionality of the original dataset before
constructing the index. We present an adaptive Multi-level
Mahalanobis-based Dimensionality Reduction (MMDR) technique for
high-dimensional indexing. Our MMDR technique has four notable
features compared to existing methods. First, it discovers
elliptical clusters for more effective dimensionality reduction by
using only the low-dimensional subspaces. Second, data points in
the different axis systems are indexed using a single B+-tree.
Third, our technique is highly scalable in terms of data size and
dimension. Finally, it is also dynamic and adaptive to insertions.
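
The role of the Mahalanobis distance in finding elliptical clusters
can be illustrated with a minimal Python sketch. It is not the MMDR
algorithm itself, whose details are not given in this abstract; the
sample cluster, the restriction to two dimensions and all function
names are assumptions made for illustration only. The sketch shows
that two points equally far from a cluster centre in Euclidean terms
can be very different in Mahalanobis terms when the cluster is
elongated.

```python
import math

def mean_2d(points):
    """Centroid of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def covariance_2d(points):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    mx, my = mean_2d(points)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(point, centre, cov):
    """Mahalanobis distance, using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quad)

# Hypothetical cluster, stretched along the x axis (illustrative data only).
cluster = [(-4.0, -0.5), (-2.0, 0.5), (0.0, 0.0), (2.0, -0.5), (4.0, 0.5)]
centre = mean_2d(cluster)
cov = covariance_2d(cluster)

along = (2.0, 0.0)   # lies along the direction in which the cluster is long
across = (0.0, 2.0)  # lies across it; same Euclidean distance from the centre

d_along = mahalanobis_2d(along, centre, cov)    # about 0.667
d_across = mahalanobis_2d(across, centre, cov)  # about 4.216
```

Both test points are at Euclidean distance 2 from the centre, yet only
the first is close to the cluster under the Mahalanobis distance,
which is why that distance suits clusters that are elliptical rather
than spherical.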
2): To further support hyper-dimensional databases, which contain
hundreds or even thousands of dimensions, we introduce a novel
methodology called Local Digital Coding (LDC). LDC extracts a
simple bitmap representation called a Digital Code (DC) for each
point in the database. Pruning during KNN search is performed by
dynamically selecting only a subset of the bits from the DC, on
which subsequent comparisons are based. In doing so, expensive
operations involved in computing L-norm distance functions between
hyper-dimensional data can be avoided.
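
The general idea of pruning a KNN search with a few bits per point
can be sketched as follows. This is a generic filter-and-refine
example, not the LDC method itself, whose encoding and bit-selection
rules are not described in this abstract. The one-bit-per-dimension
code, the fixed thresholds, the rule for choosing which bits to
inspect and all names are assumptions made for illustration.

```python
import heapq
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def encode(point, thresholds):
    """One bit per dimension: 1 if the coordinate is at or above the threshold."""
    return [1 if x >= t else 0 for x, t in zip(point, thresholds)]

def lower_bound(query, code, thresholds, dims):
    """Lower bound on the Euclidean distance, using only the bits in dims.

    If the query and the point fall on opposite sides of a threshold in
    some dimension, they differ there by at least |query - threshold|.
    """
    total = 0.0
    for i in dims:
        query_bit = 1 if query[i] >= thresholds[i] else 0
        if query_bit != code[i]:
            total += (query[i] - thresholds[i]) ** 2
    return math.sqrt(total)

def knn_search(query, points, codes, thresholds, k, num_bits):
    """Exact k nearest neighbours; returns (neighbours, full distance count)."""
    # Assumed heuristic: inspect the bits of the dimensions in which the
    # query lies farthest from the threshold, since they bound the most.
    order = sorted(range(len(query)),
                   key=lambda i: -abs(query[i] - thresholds[i]))
    dims = order[:num_bits]
    best = []  # max-heap of the k best so far, stored as (-distance, index)
    full_count = 0
    for idx, (point, code) in enumerate(zip(points, codes)):
        if len(best) == k:
            kth = -best[0][0]
            if lower_bound(query, code, thresholds, dims) > kth:
                continue  # pruned without computing the full distance
        full_count += 1
        d = euclidean(query, point)
        if len(best) < k:
            heapq.heappush(best, (-d, idx))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, idx))
    neighbours = sorted((-neg_d, idx) for neg_d, idx in best)
    return neighbours, full_count

# Hypothetical data: 200 random points in 16 dimensions.
rng = random.Random(0)
dim, k = 16, 5
points = [[rng.random() for _ in range(dim)] for _ in range(200)]
thresholds = [0.5] * dim
codes = [encode(p, thresholds) for p in points]
query = [rng.random() for _ in range(dim)]

neighbours, full_count = knn_search(query, points, codes, thresholds,
                                    k, num_bits=8)
```

Because the bound never exceeds the true distance, skipping a point is
safe and the result stays exact. How many full distance computations
are actually avoided depends on the data and on which bits are
inspected.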
Biography:
Dr Shen received an Undergraduate Scholarship from the Singapore Ministry of Education in 1996. He obtained his BSc (with 1st class Honors) and PhD from the School of Computing, National University of Singapore, in 2000 and 2003 respectively. His research interests include autonomic computing, databases, multimedia, P2P and internet applications. Heng Tao was supervised by Professor Beng Chin Ooi and received the 2001 Dean's Graduate Award, School of Computing, National University of Singapore, for his meritorious performance. His journal and conference papers have appeared mainly in the database area (TKDE, ICDE 2004, ICDE 2003, etc.) and in multimedia retrieval (ACM Multimedia, etc.). Currently his research also includes autonomic computing, data streams and bioinformatics, and he has several submissions in these areas.
Contact:
Professor Maria Orlowska, seminar host (maria@itee.uq.edu.au)
or Guido Governatori (ITEE seminar co-ordinator)
(guido@itee.uq.edu.au)
ITEE seminar web page: http://www.itee.uq.edu.au/~seminar
