Abstract
Information connectivity and retrieval play a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at a rapid rate, and database technology is improving and having a profound effect. Almost all online applications store and retrieve information from databases. One challenge in giving the public wider access to informational databases is the need for knowledge of database languages such as Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may not be practical to make the public aware of the structure of the database. Novice users need a way to query relational databases in natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP), we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture that uses Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to build an automated query/response system with inputs from Twitter [5], our Cassandra DB, and our log database. Using the Twitter API and NLP, we can give the public the ability to ask questions of our database and get automated responses.
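The core step described in the abstract, turning a natural-language question into a structured search request, can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary, the field names `measurement` and `site`, and the function `question_to_solr_params` are hypothetical, and simple keyword matching stands in for the NLP component. Only `q`, `fq`, and `rows` are standard Solr query parameters.

```python
import re
from urllib.parse import urlencode

# Hypothetical vocabulary and field names, chosen for illustration only;
# they are assumptions and do not reflect the actual ADC schema.
MEASUREMENTS = {"temperature", "humidity", "precipitation", "pressure"}
SITES = {"sgp", "nsa"}
STOPWORDS = {
    "what", "is", "the", "at", "for", "of", "a", "an", "in",
    "there", "any", "do", "you", "have", "data", "available",
}


def question_to_solr_params(question, rows=5):
    """Map a tweet-style question to Solr query parameters (q, fq, rows)."""
    # Drop @mentions so the account handle is not treated as a search term.
    text = re.sub(r"@\w+", " ", question).lower()
    tokens = re.findall(r"[a-z0-9]+", text)

    filters = []    # recognized terms become filter queries (fq)
    free_text = []  # everything else is passed through as the main query (q)
    for tok in tokens:
        if tok in MEASUREMENTS:
            filters.append(f"measurement:{tok}")
        elif tok in SITES:
            filters.append(f"site:{tok}")
        elif tok not in STOPWORDS:
            free_text.append(tok)

    # "*:*" is Solr's match-all query, used when no free-text terms remain.
    return {"q": " ".join(free_text) or "*:*", "fq": filters, "rows": rows}


if __name__ == "__main__":
    params = question_to_solr_params(
        "@example_bot What temperature data is available at SGP?"
    )
    # doseq=True emits one fq=... pair per filter, as Solr expects.
    print(urlencode(params, doseq=True))
```

In a pipeline like the one proposed, a function of this kind would sit between the incoming tweet stream and the search index; the resulting query string would be appended to a Solr `select` endpoint, and the top results would be formatted into a reply.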
Original language | English |
---|---|
Title of host publication | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
Editors | Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 4736-4737 |
Number of pages | 2 |
ISBN (Electronic) | 9781538627143 |
State | Published - Jul 1 2017 |
Event | 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States Duration: Dec 11 2017 → Dec 14 2017 |
Publication series
Name | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Volume | 2018-January |
Conference
Conference | 5th IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Country/Territory | United States |
City | Boston |
Period | 12/11/17 → 12/14/17 |
Funding
Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.
Keywords
- machine learning
- natural language processing
- social media interaction
- stream pipelining