Description:
This dataset accompanies the paper "Mapping Agricultural Intensification in the Brazilian Savanna: A Machine Learning Approach and Harmonized Data from Landsat Sentinel-2". The study evaluated the performance of three machine learning algorithms, Random Forest (RF), Artificial Neural Networks (ANN), and Extreme Gradient Boosting (XGBoost), in detecting agricultural intensification (number of crop cycles) and crop types in the municipality of Sorriso, Mato Grosso State, Brazil, during the 2021-2022 crop season. The models were trained on time series of the spectral indices NDVI, NDWI, and SAVI derived from NASA Harmonized Landsat Sentinel-2 (HLS) data, using a hierarchical classification with three levels. At Level 1, the target classes were temporary crops (1), native vegetation and silviculture (2), and pastures (3). At Level 2, the target classes were double cropping (1), single cropping (2), and triple cropping (3). At Level 3, the aim was to identify the second-season crops grown in areas classified as double cropping: beans (1), corn (2), cotton (3), and other crops (4).
The files available in this dataset are:
- Vector files, in shapefile format, with ground samples collected during fieldwork in Sorriso, Mato Grosso, from 6 to 9 June 2022. The files are compressed by level, with the names "Samples_LevelX.zip", in the "Vector" folder.
- Worksheets for modeling, in xlsx format, containing the time-series values of each spectral index, at each classification level, for each sampling point. The files are named "DB_index_LevelX.xlsx" (e.g., "DB_NDVI_Level1"). A PDF file (Order_of_Layers.pdf) is also provided to match each explanatory variable to its layer in the original raster stack (e.g., "NDVI_1" is NDVI from September 3rd, 2021). These files are in the "Dataset" folder, in subfolders named by level (e.g., "Level_1").
- The R scripts for running the models and computing confusion matrices and accuracy metrics. The files are named "ALGORITHM_LevelX.R" (e.g., "ANN_Level1.R" or "RF_Level2.R"). Each script contains the modeling workflow for every spectral index. For example, the file "ANN_Level1.R" contains the models with the variables NDVI, SAVI, NDWI, and the three combined (AllVI).
- The results of each model, in 'rds' format (readable in R with readRDS()). The files are named "ALGORITHM_index_model_LevelX.rds" (e.g., "XGBoost_NDVI_model_Level2.rds") and stored in the "Results" folder.
- The 27 final maps resulting from the spatial predictions, in TIFF format (e.g., "Map_ANN_NDVI_Level3_Final.tif"). The files are in the "Final_Maps" folder.
Each file contains a brief description, and we encourage users to read the associated paper for further processing details.
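
As a convenience for navigating the dataset, the class codes and file-naming patterns described above can be sketched in a few lines of Python. This is only an illustrative helper, not part of the published R scripts: the function names are hypothetical, and it assumes that files sit directly inside the folders named in the list (except the worksheets, which are in per-level subfolders) and that the algorithm and index tokens are spelled as in the examples ("RF", "ANN", "XGBoost"; "NDVI", "NDWI", "SAVI").

```python
from pathlib import PurePosixPath

# Class codes per hierarchical level, copied from the description above.
LEVEL_CLASSES = {
    1: {1: "temporary crops", 2: "native vegetation and silviculture", 3: "pastures"},
    2: {1: "double cropping", 2: "single cropping", 3: "triple cropping"},
    3: {1: "beans", 2: "corn", 3: "cotton", 4: "other crops"},
}


def class_label(level: int, code: int) -> str:
    """Return the class name for a numeric code at a given level."""
    return LEVEL_CLASSES[level][code]


def worksheet_path(index: str, level: int) -> PurePosixPath:
    """Path following the pattern Dataset/Level_X/DB_index_LevelX.xlsx."""
    return PurePosixPath("Dataset") / f"Level_{level}" / f"DB_{index}_Level{level}.xlsx"


def model_path(algorithm: str, index: str, level: int) -> PurePosixPath:
    """Path following the pattern Results/ALGORITHM_index_model_LevelX.rds.

    Assumption: the .rds files are stored directly in "Results",
    without per-level subfolders.
    """
    return PurePosixPath("Results") / f"{algorithm}_{index}_model_Level{level}.rds"


def map_path(algorithm: str, index: str, level: int) -> PurePosixPath:
    """Path following the example Final_Maps/Map_ANN_NDVI_Level3_Final.tif."""
    return PurePosixPath("Final_Maps") / f"Map_{algorithm}_{index}_Level{level}_Final.tif"
```

For instance, `class_label(3, 2)` returns `"corn"`, and `model_path("XGBoost", "NDVI", 2)` reproduces the example name given above, `Results/XGBoost_NDVI_model_Level2.rds`. Check the generated paths against the actual archive contents before relying on them.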