Improving the Estimation of the Atmospheric Water Vapor Pressure Using Interpretable Long Short-Term Memory Networks: Dataset, Python code, and trained models

Dataset

Description

Atmospheric water vapor pressure is an essential meteorological control on land surface and hydrologic processes. It is observed less frequently than other meteorological conditions and is often inferred through the August–Roche–Magnus formula, either by simply assuming that the dew point and daily minimum temperatures are equivalent or by empirically correlating the two temperatures using an aridity correction. The performance of both methods varies considerably across regions and time periods; obtaining consistently accurate estimates across space and time remains a great challenge. We applied an interpretable Long Short-Term Memory (iLSTM) network conditioned on static, location-specific attributes to estimate daily vapor pressure for 83 FLUXNET sites in the United States and Canada. This data package includes all raw data for the 83 FLUXNET sites, input data for model training/validation/testing, trained models and results, and Python code for the manuscript "Improving the Estimation of the Atmospheric Water Vapor Pressure Using an Interpretable Long Short-term Memory Network". Specifically, it consists of five parts:

- First, "1_Daymet_data_83sites.zip" includes raw data downloaded from Daymet for the 83 sites used in the paper, located by their longitude and latitude; of these data, the vapor pressure variable is used. It also includes a pre-processed CSV file combining the data from all 83 sites, which is the file used for the paper.
- Second, "2_Fluxnet2015_data_83sites.zip" includes raw half-hourly data for the 83 sites downloaded from the FLUXNET2015 data portal, pre-processed daily data for the 83 sites, a CSV file combining the pre-processed daily data of all 83 sites, and a CSV file with site information (site ID, site name, latitude, longitude, data availability period) for the 83 sites.
- Third, "3_MODIS_LAI_data_83sites_raw.zip" includes raw leaf area index (LAI) data downloaded from the AppEEARS data portal.
- Fourth, "4_Scripts.zip" includes all scripts related to model training and post-processing of a trained model, and a Jupyter notebook showing an example of model post-processing. Two typos in the files "run2get_args.py" and "postprocess.py" were corrected on March 27, 2024 to avoid confusion.
- Finally, "Trained_models_and_results.zip" includes three folders and three files with the suffix ".npy"; each folder corresponds to the ".npy" file with the same title. Each of the three folders includes all trained models associated with one iLSTM model configuration (35 models per configuration; details are described in the paper). Each ".npy" file includes the post-processed results of the corresponding 35 models under one iLSTM model configuration.
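
For readers unfamiliar with the conventional baseline mentioned in the description, the following is a minimal Python sketch of the simpler of the two approaches: applying the August–Roche–Magnus formula with the dew point assumed equal to the daily minimum temperature. It is not taken from the scripts in this package. The coefficients (6.1094 hPa, 17.625, 243.04 °C) are one commonly cited Magnus parameterization and are an assumption here, as are the function names; the aridity-corrected variant is not shown.

```python
import math

# Assumed Magnus coefficients (one commonly cited parameterization);
# the values used in the paper or in Daymet may differ.
A_HPA = 6.1094    # saturation vapor pressure at 0 degC, in hPa
B = 17.625        # dimensionless
C_DEGC = 243.04   # degC


def saturation_vapor_pressure_hpa(temp_c: float) -> float:
    """Saturation vapor pressure (hPa) over water at temp_c (degC)."""
    return A_HPA * math.exp(B * temp_c / (temp_c + C_DEGC))


def vapor_pressure_from_tmin_hpa(tmin_c: float) -> float:
    """Estimate daily vapor pressure (hPa), assuming dew point == Tmin.

    This is the simple baseline assumption; it tends to be least
    reliable where the air rarely reaches saturation overnight.
    """
    return saturation_vapor_pressure_hpa(tmin_c)


if __name__ == "__main__":
    # Hypothetical daily minimum temperatures in degC.
    for tmin in (-5.0, 0.0, 10.0, 20.0):
        vp = vapor_pressure_from_tmin_hpa(tmin)
        print(f"Tmin = {tmin:5.1f} degC -> vapor pressure ~ {vp:.2f} hPa")
```

Because the estimate depends only on the daily minimum temperature, any systematic gap between the dew point and the minimum temperature carries directly into the vapor pressure estimate.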

Cite this