Abstract
This chapter presents a new approach to Atmospheric Radiation Measurement (ARM) data discovery based on data analysis and visualization services. The ARM program was created to study cloud formation processes and their influence on radiative transfer at various highly instrumented ground and mobile stations. The total volume of ARM data is roughly 1.4 PB. ARM data are currently searched by metadata, such as the site name, instrument name, and date. NoSQL technologies were explored to extend data searching beyond metadata to the measurement values themselves. Two technologies currently being implemented for testing are Apache Cassandra (a NoSQL database) and Apache Spark (an analytics framework). Both were developed to work in a distributed environment and can therefore store and analyze large data volumes. JavaScript-based visualization libraries were used to generate interactive data plots in Web browsers. To assess the performance of NoSQL for ARM data, ARM's widely used atmospheric measurements will serve as test cases for data discovery.
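The distinction the abstract draws between searching by metadata and searching by measurement values can be illustrated with a minimal in-memory sketch. It is not the chapter's Cassandra or Spark implementation: the `Measurement` record, the function names, the variable name `temp_mean`, and the sample values are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Measurement:
    """One observation: metadata fields plus a measured value (hypothetical schema)."""
    site: str
    instrument: str
    day: date
    variable: str
    value: float


# Made-up sample records for illustration only; these are not real ARM data.
RECORDS = [
    Measurement("sgp", "met", date(2020, 1, 1), "temp_mean", -2.5),
    Measurement("sgp", "met", date(2020, 7, 1), "temp_mean", 31.0),
    Measurement("nsa", "met", date(2020, 1, 1), "temp_mean", -28.0),
    Measurement("nsa", "met", date(2020, 7, 1), "temp_mean", 6.5),
]


def search_by_metadata(records, site=None, instrument=None):
    """Metadata search: filter on descriptive fields, ignoring the measured values."""
    return [
        r for r in records
        if (site is None or r.site == site)
        and (instrument is None or r.instrument == instrument)
    ]


def search_by_value(records, variable, low, high):
    """Value-based search: keep records whose measurement lies in [low, high]."""
    return [
        r for r in records
        if r.variable == variable and low <= r.value <= high
    ]


if __name__ == "__main__":
    # Metadata only: everything recorded at one site.
    for r in search_by_metadata(RECORDS, site="sgp"):
        print("metadata hit:", r)
    # Measurement values: sub-zero temperatures, regardless of site.
    for r in search_by_value(RECORDS, "temp_mean", -30.0, 0.0):
        print("value hit:", r)
```

In a distributed store the same two predicates would be pushed down to the database or analytics engine rather than evaluated in a Python loop; the sketch only shows what each kind of query selects.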
Original language | English |
---|---|
Title of host publication | Big Data Analytics in Earth, Atmospheric, and Ocean Sciences |
Publisher | Wiley |
Pages | 237-252 |
Number of pages | 16 |
ISBN (Electronic) | 9781119467557 |
ISBN (Print) | 9781119467571 |
State | Published - Nov 1 2022 |
Keywords
- ARM value added products
- Atmospheric radiation measurement data discovery
- Big data computing framework
- Cassandra Query Language
- LASSO
- MapReduce
- NERSC
- NoSQL technologies
- Web-based visualization techniques