GitHub - TAMIUCEES/ML-DownScale_SMERGE_CEES

The mission of CEES is to facilitate education, research, and outreach in earth and environmental studies across the TAMIU service area. CEES also takes pride in supporting diverse research endeavors across the TAMIU campus. This beginner-friendly guide aims to simplify working with raster files and related concepts using the os library and the ArcPy package.

➡️ https://www.tamiu.edu/cees/

Explanation of Structure

The user can start with the first file folder, ArcPy_Scripts.

There are four distinct folders in total. After the ArcPy_Scripts folder, users will proceed to the Dataset_Synthesis folder, followed by the Machine_Learning folder, and lastly, the Miscellaneous_Deprecated folder.

Each folder contains subfolders with notebooks and scripts. These tools help organize and present the relevant data so that users can access and manipulate it efficiently, which in turn contributes to the performance and accuracy of the predictive models.

ArcPy Scripts

Included Notebooks and Python Scripts:

MDR_layer_extractor_v1

auto_multiD_subsetting

auto_zonal_stats

multiBand_raster_extractor

multiD_raster_merge

netCDF_layer_extractor

raster_merge

Libraries used:

The programs in this folder consistently use the os library and the ArcPy package. This guide explains the directory structure and describes how each program uses them.

Note

Run the auto_multiD_subsetting program first. Any of the intermediate programs can then be used as needed. Always finish with the auto_zonal_stats program.

auto_multiD_subsetting:

The program facilitates efficient data retrieval and raster processing by subsetting and resampling raster data on an individual layer basis.
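As a conceptual illustration only, the two operations named above can be sketched in plain Python on a small grid. This is not the repository's ArcPy code: the functions `subset` and `resample_nearest`, the nested-list grid, and the integer resampling factor are all assumptions made for this example.

```python
def subset(grid, row_start, row_stop, col_start, col_stop):
    """Clip a 2D grid (list of rows) to the half-open window [start, stop)."""
    return [row[col_start:col_stop] for row in grid[row_start:row_stop]]


def resample_nearest(grid, factor):
    """Upsample a 2D grid by an integer factor using nearest-neighbor assignment."""
    out = []
    for row in grid:
        # Repeat every cell `factor` times along the row ...
        fine_row = [value for value in row for _ in range(factor)]
        # ... then repeat the widened row `factor` times down the column.
        out.extend([list(fine_row) for _ in range(factor)])
    return out


# One hypothetical raster layer, processed on its own.
layer = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
window = subset(layer, 0, 2, 1, 3)   # keeps rows 0-1 and columns 1-2
fine = resample_nearest(window, 2)   # each cell becomes a 2x2 block
```

In the actual program these steps would be carried out by ArcPy tools on each layer of a multidimensional raster; the sketch only shows what subsetting and resampling do to the cell values.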

multiD_raster_merge:

This program is designed to combine multiple multidimensional raster datasets into a single, cohesive raster dataset.

multiBand_raster_extractor:

The program is equipped with efficient band extraction and reordering capabilities, enabling the extraction of one or more bands or reorganizing the bands into a multiband raster layer effectively.

netCDF_layer_extractor:

This program utilizes the find_files function to identify files and construct new file paths, allowing users to generate new files and perform data resampling.
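The find_files function itself is not reproduced in this README. A minimal sketch, assuming it walks a directory tree with the os library and matches files by extension, might look like the following. The signature, the `.nc` extension, the `build_output_path` helper, and the `_resampled.tif` suffix are illustrative assumptions, not the repository's actual code.

```python
import os


def find_files(root_dir, extension=".nc"):
    """Return sorted paths of files under root_dir ending with extension (hypothetical helper)."""
    matches = []
    for folder, _subdirs, filenames in os.walk(root_dir):
        for name in filenames:
            # Case-insensitive match so "data.NC" is found as well.
            if name.lower().endswith(extension):
                matches.append(os.path.join(folder, name))
    return sorted(matches)


def build_output_path(input_path, output_dir, suffix="_resampled.tif"):
    """Construct a new file path in output_dir from the input file's base name."""
    base = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(output_dir, base + suffix)
```

Each path returned by `find_files` could then be passed to `build_output_path` to decide where the resampled output for that file is written.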

raster_merge:

This program enables users to merge multiple raster files, consolidating diverse datasets into a single comprehensive raster.

auto_zonal_stats:

This program allows users to resample files, enabling raster file size conversion.

Additional:

MDR_layer_extractor_v1:

Through data slicing along specific variables and dimensions, the program extracts subsets from numerous multidimensional raster datasets, enabling the generation of a multidimensional raster layer from a given dataset.

Dataset Synthesis

Included Notebooks and Python Scripts:

dataset_breakdown

inst_post_process

primary_dataset_gen

static_var_gen

These notebooks are designed to arrange and display pertinent data methodically. By doing so, users can access and manage the required information.

Note

In the notebook sequence, the dataset_breakdown and static_var_gen notebooks can be run in either order and should come before all other notebooks. After these two notebooks, the primary_dataset_gen notebook can be used. The inst_post_process notebook is optional, but if used, it should be run last.

dataset_breakdown:

The program reads one or more user-supplied CSV files and breaks down the data they contain.

static_var_gen:

The program generates a new CSV file using static variables from the initial dataset.

primary_dataset_gen:

The program enhances prediction accuracy by generating, cleaning, and combining data from static and dynamic files.
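The combining step can be sketched as a keyed join. This is a minimal example, assuming the static and dynamic files are CSVs that share a station identifier column; the column names (`station`, `elevation`, `date`, `soil_moisture`), the sample values, and the functions `read_rows` and `combine` are hypothetical and do not come from the notebook.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def combine(static_rows, dynamic_rows, key="station"):
    """Join each dynamic row with the static row sharing its key; skip incomplete rows."""
    static_by_key = {row[key]: row for row in static_rows}
    combined = []
    for row in dynamic_rows:
        static = static_by_key.get(row[key])
        if static is None or "" in row.values():
            continue  # cleaning step: no matching station, or a missing value
        combined.append({**static, **row})
    return combined


# Made-up sample data standing in for the static and dynamic files.
static_csv = "station,elevation\nA,120\nB,95\n"
dynamic_csv = (
    "station,date,soil_moisture\n"
    "A,2005-01-01,0.21\n"
    "B,2005-01-01,\n"
    "C,2005-01-01,0.30\n"
)
dataset = combine(read_rows(static_csv), read_rows(dynamic_csv))
```

Here only station A survives: station B has a missing value and station C has no static record, so both are dropped during cleaning.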

inst_post_process:

The program organizes data from two files to obtain updated station names and pagenames.

Machine Learning

Included Notebooks and Python Scripts:

GBR_w_verifer

GBR_wout_verifier

RF

XGB_w_verifier

XGB_wout_verifier

These notebooks aim to provide users with a reference to machine-learning techniques, specifically guiding them through the process of constructing and utilizing multiple decision trees. The ultimate objective is to enhance the accuracy and overall predictive capabilities of the constructed models.

Important

The Random Forest (RF) notebook runs ONLY on Linux.

GBR_w_verifier:

The Gradient Boosting Regressor with verifier is a machine learning algorithm that employs a sequence of decision trees to perform predictions.

GBR_wout_verifier:

This notebook runs the same Gradient Boosting Regressor without a verifier. The algorithm uses a sequence of decision trees to make predictions: each tree is trained to correct the errors of the previous trees, which improves the accuracy of the overall prediction.
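The idea of each tree correcting the errors of the previous ones can be sketched in plain Python with one-split trees (stumps) fitted to residuals. This is an illustration of the technique under simplifying assumptions (one feature, squared error, made-up data), not the notebook's implementation; `fit_stump`, `fit_boosted`, and `predict` are hypothetical names.

```python
def fit_stump(xs, residuals):
    """Find the single-threshold split that minimizes squared error on the residuals.

    Assumes xs holds at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and every stump's scaled correction."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )


# Made-up training data with a step between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.0, 3.1, 2.9]
model = fit_boosted(xs, ys)
```

The learning rate scales down each tree's correction, so many small corrections accumulate instead of one tree fitting the data outright.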

RF:

Random forest is a machine learning algorithm that generates a single prediction by collecting the output of multiple decision trees.
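How several trees are combined into one prediction can be sketched as follows. The example trains one-split trees on bootstrap samples and averages their outputs; it is a simplified illustration with made-up data (a full random forest also grows deeper trees and samples features), and the names `fit_stump`, `fit_forest`, and `predict` are hypothetical.

```python
import random


def fit_stump(points):
    """Fit a one-split regression tree to (x, y) pairs; fall back to the mean if no split exists."""
    mean_y = sum(y for _, y in points) / len(points)
    best = (float("inf"), None, mean_y, mean_y)
    for threshold in sorted({x for x, _ in points})[:-1]:
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_forest(points, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    return [
        fit_stump([rng.choice(points) for _ in points])
        for _ in range(n_trees)
    ]


def predict(forest, x):
    """Average the outputs of all trees to produce a single prediction."""
    votes = []
    for threshold, left_mean, right_mean in forest:
        if threshold is None or x <= threshold:
            votes.append(left_mean)
        else:
            votes.append(right_mean)
    return sum(votes) / len(votes)


# Made-up training data with a step between x = 3 and x = 4.
points = [(1, 1.0), (2, 1.2), (3, 0.9), (4, 3.0), (5, 3.1), (6, 2.9)]
forest = fit_forest(points)
```

Because every tree sees a different resampling of the data, the individual trees disagree slightly, and averaging them smooths out those differences.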

XGB_w_verifier:

The eXtreme Gradient Boost with verifier constructs a sequence of decision trees iteratively, each tree contributing to the overall predictive performance.

XGB_wout_verifier:

This notebook runs eXtreme Gradient Boost (XGBoost) without a verifier. XGBoost constructs a series of trees sequentially, where every subsequent tree learns from the errors of its predecessors. This continues until the configured number of trees is reached or additional trees stop improving the fit, yielding the final regression model.

Miscellaneous Deprecated

Included Notebooks and Python Scripts:

Aggregate_MultiPoints

MDR_layer_extractorV2

auto_multipoint_extractor

Note

This folder contains a collection of assorted ArcPy notebooks that, while not currently utilized, possess the potential to serve as valuable resources for the user.

Aggregate_MultiPoints:

The Aggregate_MultiPoints program summarizes point features by creating an area from the points. This area is then used to compute statistical data.

MDR_layer_extractorV2:

The MDR_layer_extractorV2 program extracts subsets from multiple multidimensional raster datasets and saves them to a designated directory.

auto_multipoint_extractor:

The auto_multipoint_extractor program allows users to extract multiple points automatically by modifying the input point features.

Requirements txt

Note

Users may need to modify the 'requirements.txt' file according to their specific Linux version. The file was created using Ubuntu 24.04 LTS and users with different versions may need to make adjustments to ensure compatibility.

Demo Files

The demo files required for working with the primary_dataset_gen and XGB_wout_verifier notebooks can be found at the link below.

https://dustytamiu-my.sharepoint.com/personal/aaron_sanchez_tamiu_edu/_layouts/15/onedrive.aspx?id=%2Fpersonal%2Faaron%5Fsanchez%5Ftamiu%5Fedu%2FDocuments%2FGithub%5FStuff&ct=1719936717028&or=OWA%2DNT%2DMail&cid=9ece2fa8%2Dfefa%2Dc467%2D69ca%2D6a2861e1641e&ga=1&LOF=1

Folders and files:

ArcPy_Scripts

Dataset_Synthesis

Machine_Learning

Miscellaneous Deprecated

ArcGIS_Arcpy Dataset Generation.pdf

README.md

README_ Additional Libraries Documentation.pdf

READ_ME_Procedure_Soil_Texture.pdf

requirements.txt
