DIMATEX DATA: Fast Image Auto-annotation with Visual Vector Approximation Clusters
Welcome to the README of DIMATEX DATA
25 July 2005
This web page contains the data used and produced by the image auto-annotation model DIMATEX published in:
H. Glotin, S. Tollari, Fast Image Auto-annotation with Visual Vector Approximation Clusters,
International Workshop on Content-Based Multimedia Indexing (CBMI2005),
Riga, Latvia, June 2005 [PDF] [POSTER] [BIB]
# Corpus Copyright:
COREL images in the demo are copyrighted by Wang J. of Penn State University.
Some pre-processed image signals can be downloaded from Kobus Barnard's web page:
http://vision.cs.arizona.edu/kobus/research/data/jmlr_2003/
Below we provide our normalized lexicon and the auto-annotations from the DIMATEX model.
You are free to use the files below for academic purposes;
if you do, please cite our CBMI 2005 paper [BIB] in your publications.
# Lexicon normalization:
The file "vocabulary_adaptation.txt" gives our normalized COREL lexicon of 267 words,
manually generated at the LSIS lab from the 325 original words.
We mostly normalized plurals and generalized some words, e.g. F-16 to plane.
# Auto-annotations from DIMATEX model:
The files TEST_LIST_WORD_REF_EST_FLAB and TEST_LIST_WORD_REF_EST_FLABT provide
auto-annotations from the DIMATEX model.
FILES and their FORMAT:
image_name_fig6_CBMI2005.txt
NOVEL_LIST_WORD_REF
TEST_LIST_WORD_REF_EST_FLAB
TEST_LIST_WORD_REF_EST_FLABT
TRAIN_LIST_WORD_REF
vocabulary_adaptation.txt
words_13911
# File vocabulary_adaptation.txt
Gives the vocabulary adaptation: the 267 normalized words manually generated at LSIS from the 325 original words.
We mostly normalized plurals and generalized some words, e.g. F-16 to plane.
File format:
newword1:oldword1
newword2:oldword2
...
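
A minimal Python sketch of reading this format is given here for convenience. It assumes one "newword:oldword" pair per line with a single colon as separator; the function name and the sample lines are illustrative and are not taken from the distributed file.

```python
def parse_vocabulary_adaptation(lines):
    """Return a dict mapping each original word to its normalized word.

    Assumption: every useful line has the form "newword:oldword".
    Blank lines and lines without a colon are skipped.
    """
    old_to_new = {}
    for raw in lines:
        line = raw.strip()
        if not line or ":" not in line:
            continue
        new_word, old_word = line.split(":", 1)
        old_to_new[old_word.strip()] = new_word.strip()
    return old_to_new


# Illustrative lines only; the real file may spell the words differently.
sample = ["plane:f-16", "plane:jet", "bird:birds"]
mapping = parse_vocabulary_adaptation(sample)
print(mapping["f-16"])
print(len(set(mapping.values())), "normalized words from", len(mapping), "original words")
```

With the real file, the lines would come from open("vocabulary_adaptation.txt"); several old words may map to the same new word, which is why the dictionary is keyed by the old word.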
# File words_13911
The normalized lexicon used in the CBMI 2005 paper.
One word per line; the line number is the word number.
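
As a rough sketch, the lexicon can be loaded into a lookup table in Python as follows. It assumes that numbering starts at 1 (since 0 is used for "no word" in the list files) and that each line holds exactly one word; the helper names and the sample words are hypothetical.

```python
def load_lexicon(lines):
    """Map word number -> word, assuming line 1 holds word number 1."""
    return {number: line.strip() for number, line in enumerate(lines, start=1)}


def decode(word_numbers, lexicon):
    """Turn word numbers into words, ignoring 0 (no word)."""
    return [lexicon[n] for n in word_numbers if n != 0]


# Illustrative lexicon; the real one would be read from the file words_13911.
lexicon = load_lexicon(["sky", "water", "plane"])
print(decode([3, 1, 0, 0, 0], lexicon))
```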
# File TRAIN_LIST_WORD_REF
Columns
1 COREL number of the training-set image.
2:6 Word numbers (in our lexicon) of the COREL reference words for each image. 0 means no word.
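
A possible way to read one row of this file in Python is sketched below. It assumes whitespace-separated columns; values are passed through float() first in case the numbers were written in floating-point notation, which is a guess about the file and not something stated here. The function name and the sample row are made up.

```python
def parse_train_row(line):
    """Split one row into (image number, list of reference word numbers).

    Assumptions: columns are separated by whitespace, column 1 is the image
    number, and columns 2 to 6 are word numbers padded with 0.
    """
    fields = line.split()
    if len(fields) < 6:
        raise ValueError("expected at least 6 columns, got %d" % len(fields))
    image_number = fields[0]  # kept as text so its formatting is preserved
    numbers = [int(float(x)) for x in fields[1:6]]
    reference = [n for n in numbers if n != 0]  # drop the 0 padding
    return image_number, reference


# Made-up row for illustration.
print(parse_train_row("1000 12 45 7 0 0"))
```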
# File TEST_LIST_WORD_REF_EST_FLAB/TEST_LIST_WORD_REF_EST_FLABT
Columns
1 COREL number of the test-set image.
2:6 Word numbers (in our lexicon) of the COREL reference words for each image. 0 means no word.
7:16 The first 10 estimated words, sorted by decreasing probability (method = E1 FLAB, see CBMI05).
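
The column layout above can be read with a short Python sketch like the one below. It assumes whitespace-separated columns and that columns 7 to 16 hold word numbers from the same lexicon. The hit count is only a simple illustrative check, not the evaluation measure of the paper, and all names and the sample row are hypothetical.

```python
def parse_test_row(line):
    """Split one row into (image number, reference words, estimated words)."""
    fields = line.split()
    if len(fields) < 16:
        raise ValueError("expected at least 16 columns, got %d" % len(fields))
    image_number = fields[0]
    # Columns 2:6, with the 0 padding removed.
    reference = [n for n in (int(float(x)) for x in fields[1:6]) if n != 0]
    # Columns 7:16, kept in file order (highest probability first).
    estimated = [int(float(x)) for x in fields[6:16]]
    return image_number, reference, estimated


def hits_in_top_k(reference, estimated, k=5):
    """Count reference words that appear among the k first estimated words."""
    return len(set(reference) & set(estimated[:k]))


# Made-up row: 1 image number, 5 reference columns, 10 estimated columns.
row = "1000 12 45 0 0 0 45 3 12 7 9 101 8 2 6 5"
image_number, reference, estimated = parse_test_row(row)
print(image_number, reference, hits_in_top_k(reference, estimated, k=3))
```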
# File NOVEL_LIST_WORD_REF (other images from Kobus Barnard set)
Columns
1 Novel image number.
2:6 Reference words.
# File image_name_fig6_CBMI2005.txt
Lists the numbers of the images shown in Figure 6 of the paper.
## For any question, contact: http://webia.lip6.fr/~tollaris/, sabrina.tollari (at) lip6.fr or http://glotin.univ-tln.fr, glotin (at) univ-tln.fr