N-GRAM BASED QUERY STRUCTURING SYSTEM FOR EFFECTIVE XML RETRIEVAL

  • Roko Abubakar Usmanu Danfodiyo University, Sokoto, Nigeria
  • Asma’u Shehu Usmanu Danfodiyo University, Sokoto, Nigeria
  • Aminu Muhammad Bui Usmanu Danfodiyo University, Sokoto, Nigeria
  • Ibrahim Saidu Usmanu Danfodiyo University, Sokoto

Abstract

Query structuring systems are keyword search systems recently used for the effective retrieval of XML documents. Existing systems do not account for keyword query ambiguity during query pre-processing, and they return irrelevant predicate nodes. As a result, these systems return irrelevant results. In this research, an XML keyword search system called the N-gram based XML query structuring system (NBXQSS) is developed to improve the performance of keyword searches. The NBXQSS uses an N-gram Based Query Segmentation (NBQS) method, which interprets a user query as a list of semantic units to help resolve ambiguity. The system also introduces an improved predicate identification algorithm (IPIA) to return relevant predicates. The IPIA uses a proposed function to compute query term proximity and ordering. The effectiveness of the NBXQSS is demonstrated through an experimental performance study on real-world XML documents. The results show that the developed system achieves better precision than the existing system.
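To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' NBQS or IPIA. It assumes a greedy longest-match segmentation over a small, made-up n-gram count table, and a simple pairwise score that rewards query terms appearing close together and in query order. The names NGRAM_COUNTS, segment_query and proximity_order_score, the counts, and the 0.5 penalty for reversed order are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical n-gram counts (illustrative values only, not from the paper).
NGRAM_COUNTS: Dict[Tuple[str, ...], int] = {
    ("xml", "keyword", "search"): 400,
    ("keyword", "search"): 700,
    ("information", "retrieval"): 900,
}


def segment_query(query: str,
                  counts: Dict[Tuple[str, ...], int],
                  max_n: int = 3,
                  min_count: int = 1) -> List[str]:
    """Greedily split a keyword query into the longest known n-grams."""
    tokens = query.lower().split()
    segments: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first; a single token always matches.
        for n in range(min(max_n, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            if n == 1 or counts.get(gram, 0) >= min_count:
                segments.append(" ".join(gram))
                i += n
                break
    return segments


def proximity_order_score(query_terms: List[str],
                          node_tokens: List[str]) -> float:
    """Score adjacent query-term pairs by closeness and order in a node's text.

    Assumption: each pair contributes 1/gap when the terms appear in query
    order and 0.5/gap when reversed, using first occurrences only.
    """
    first_pos: Dict[str, int] = {}
    for idx, tok in enumerate(node_tokens):
        first_pos.setdefault(tok, idx)

    score = 0.0
    for a, b in zip(query_terms, query_terms[1:]):
        if a in first_pos and b in first_pos:
            gap = first_pos[b] - first_pos[a]
            if gap == 0:
                continue  # same token repeated in the query
            weight = 1.0 if gap > 0 else 0.5
            score += weight / abs(gap)
    return score


if __name__ == "__main__":
    print(segment_query("XML keyword search information retrieval systems",
                        NGRAM_COUNTS))
    print(proximity_order_score(["xml", "retrieval"],
                                ["effective", "xml", "retrieval"]))
```

In this sketch, a query is first reduced to semantic units, and candidate nodes whose text contains the query terms close together and in the same order score higher than nodes where the terms are scattered or reversed. The function the paper actually proposes may differ in form and weighting.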

Published
2019-08-25
How to Cite
Abubakar, R., Shehu, A., Muhammad Bui, A., & Saidu, I. (2019). N-GRAM BASED QUERY STRUCTURING SYSTEM FOR EFFECTIVE XML RETRIEVAL. International Journal of Advanced Computer Technology, 8(4), 01-10. Retrieved from http://ijact.org/index.php/ijact/article/view/22