TY - JOUR
T1 - A comparative study of multiple attribute tree and inverted file structures for large bibliographic files
AU - Rao, S. V.Nageswara
AU - Iyengar, S. Sitharama
AU - Madhavan, C. E.Veni
PY - 1985
Y1 - 1985
N2 - A variety of data structures such as inverted file, multi-lists, quad tree, k-d tree, range tree, polygon tree, quintary tree, multidimensional tries, segment tree, doubly chained tree, the grid file, d-fold tree. super B-tree, Multiple Attribute Tree (MAT), etc. have been studied for multidimensional searching and related problems. Physical data base organization, which is an important application of multidimensional searching, is traditionally and mostly handled by employing inverted file. This study proposes MAT data structure for bibliographic file systems, by illustrating the superiority of MAT data structure over inverted file. Both the methods are compared in terms of preprocessing, storage and query costs. Worst-case complexity analysis of both the methods, for a partial match query, is carried out in two cases: (a) when directory resides in main memory, (b) when directory resides in secondary memory. In both cases, MAT data structure is shown to be more efficient than the inverted file method. Arguments are given to illustrate the superiority of MAT data structure in an average case also. An efficient adaptation of MAT data structure, that exploits the special features of MAT structure and bibliographic files, is proposed for bibliographic file systems. In this adaptation, suitable techniques for fixing and ranking of the attributes for MAT data structure are proposed. Conclusions and proposals for future research are presented.
AB - A variety of data structures such as inverted file, multi-lists, quad tree, k-d tree, range tree, polygon tree, quintary tree, multidimensional tries, segment tree, doubly chained tree, the grid file, d-fold tree. super B-tree, Multiple Attribute Tree (MAT), etc. have been studied for multidimensional searching and related problems. Physical data base organization, which is an important application of multidimensional searching, is traditionally and mostly handled by employing inverted file. This study proposes MAT data structure for bibliographic file systems, by illustrating the superiority of MAT data structure over inverted file. Both the methods are compared in terms of preprocessing, storage and query costs. Worst-case complexity analysis of both the methods, for a partial match query, is carried out in two cases: (a) when directory resides in main memory, (b) when directory resides in secondary memory. In both cases, MAT data structure is shown to be more efficient than the inverted file method. Arguments are given to illustrate the superiority of MAT data structure in an average case also. An efficient adaptation of MAT data structure, that exploits the special features of MAT structure and bibliographic files, is proposed for bibliographic file systems. In this adaptation, suitable techniques for fixing and ranking of the attributes for MAT data structure are proposed. Conclusions and proposals for future research are presented.
KW - Design and analysis of algorithms
KW - complexity analysis
KW - multidimensional data structures
UR - http://www.scopus.com/inward/record.url?scp=0021898636&partnerID=8YFLogxK
U2 - 10.1016/0306-4573(85)90089-5
DO - 10.1016/0306-4573(85)90089-5
M3 - Article
AN - SCOPUS:0021898636
SN - 0306-4573
VL - 21
SP - 433
EP - 442
JO - Information Processing and Management
JF - Information Processing and Management
IS - 5
ER -