TY - GEN
T1 - One SQL to rule them all - An efficient and syntactically idiomatic approach to management of streams and tables
AU - Begoli, Edmon
AU - Hyde, Julian
AU - Akidau, Tyler
AU - Knight, Kathryn
AU - Hueske, Fabian
AU - Knowles, Kenneth
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2019/6/25
Y1 - 2019/6/25
N2 - Real-time data analysis and management are increasingly critical for today's businesses. SQL is the de facto lingua franca for these endeavors, yet support for robust streaming analysis and management with SQL remains limited. Many approaches restrict semantics to a reduced subset of features and/or require a suite of non-standard constructs. Additionally, use of event timestamps to provide native support for analyzing events according to when they actually occurred is not pervasive, and often comes with important limitations. We present a three-part proposal for integrating robust streaming into the SQL standard, namely: (1) time-varying relations as a foundation for classical tables as well as streaming data, (2) event time semantics, (3) a limited set of optional keyword extensions to control the materialization of time-varying query results. Motivated and illustrated using examples and lessons learned from implementations in Apache Calcite, Apache Flink, and Apache Beam, we show how with these minimal additions it is possible to utilize the complete suite of standard SQL semantics to perform robust stream processing.
KW - Data management
KW - Query processing
KW - Stream processing
UR - http://www.scopus.com/inward/record.url?scp=85069440444&partnerID=8YFLogxK
U2 - 10.1145/3299869.3314040
DO - 10.1145/3299869.3314040
M3 - Conference contribution
AN - SCOPUS:85069440444
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1757
EP - 1772
BT - SIGMOD 2019 - Proceedings of the 2019 International Conference on Management of Data
PB - Association for Computing Machinery
T2 - 2019 International Conference on Management of Data, SIGMOD 2019
Y2 - 30 June 2019 through 5 July 2019
ER -