remotior_sensus.tools.band_classification module

Band classification.

This tool allows for the classification of remote sensing images, providing several algorithms such as Minimum Distance, Maximum Likelihood, and Spectral Angle Mapping. Machine learning algorithms are also provided through PyTorch (pytorch_multi_layer_perceptron) and scikit-learn (random_forest, random_forest_ovr, support_vector_machine, multi_layer_perceptron). This module includes tools for training the algorithms using Regions of Interest (ROIs) or spectral signatures.

Typical usage example:

>>> # import Remotior Sensus and start the session
>>> import remotior_sensus
>>> rs = remotior_sensus.Session()
>>> # start the process
>>> classification = rs.band_classification(
... input_bands=['file1.tif', 'file2.tif'], output_path='output.tif',
... algorithm_name=cfg.maximum_likelihood
... )
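In the example above, cfg refers to the configuration object providing the algorithm name constants listed in cfg.classification_algorithms (such as cfg.maximum_likelihood). As a minimal sketch, the algorithm name can also be passed as a string; the value shown here is illustrative, so check cfg.classification_algorithms for the exact names available:

>>> classification = rs.band_classification(
...     input_bands=['file1.tif', 'file2.tif'], output_path='output.tif',
...     algorithm_name='maximum likelihood'  # illustrative string value
... )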
class remotior_sensus.tools.band_classification.Classifier(algorithm_name, spectral_signatures, covariance_matrices, model_classifier, input_normalization, normalization_values)

Bases: object

Manages classifiers.

A classifier is an object that includes the parameters required to perform a classification, as well as the tools to perform the training.

algorithm_name

algorithm name selected from cfg.classification_algorithms.

spectral_signatures

a SpectralSignaturesCatalog containing spectral signatures.

covariance_matrices

dictionary of previously calculated covariance matrices (used in maximum_likelihood).

model_classifier

classifier object.

input_normalization

perform input normalization; options are z_score or linear_scaling.

normalization_values

list of normalization parameters defined for each variable [normalization expressions, mean values, standard deviation values, minimum values, maximum values].

framework_name

name of framework such as classification_framework, scikit_framework, or pytorch_framework.

classification_function

the actual classification function.

function_argument

dictionary of arguments for the classification function, such as model_classifier, covariance_matrices, normalization_values, spectral_signatures_catalog.

Examples

Fit a classifier
>>> # Start a session
>>> import remotior_sensus
>>> rs = remotior_sensus.Session()
>>> # create a BandSet
>>> catalog = rs.bandset_catalog()
>>> file_list = ['file1.tif', 'file2.tif', 'file3.tif']
>>> catalog.create_bandset(file_list, wavelengths=['Landsat 8'])
>>> # set a BandSet reference in signature catalog
>>> signature_catalog = rs.spectral_signatures_catalog(
...     bandset=catalog.get(1))
>>> # import vector in signature catalog
>>> signature_catalog.import_vector(
...     file_path='file.gpkg', macroclass_field='macroclass', class_field='class',
...     macroclass_name_field='macroclass', class_name_field='class',
...     calculate_signature=True)
>>> # train a minimum distance classifier
>>> classifier = Classifier.train(
...     spectral_signatures=signature_catalog,
...     algorithm_name='minimum distance'
... )
__init__(algorithm_name, spectral_signatures, covariance_matrices, model_classifier, input_normalization, normalization_values)

Initializes a classifier.

A classifier is an object that includes the parameters required to perform a classification.

Parameters:
  • algorithm_name – algorithm name selected from cfg.classification_algorithms.

  • spectral_signatures – a SpectralSignaturesCatalog containing spectral signatures.

  • covariance_matrices – dictionary of previously calculated covariance matrices (used in maximum_likelihood).

  • model_classifier – classifier object.

  • input_normalization – perform input normalization; options are z_score or linear_scaling.

  • normalization_values – list of normalization parameters defined for each variable [normalization expressions, mean values, standard deviation values, minimum values, maximum values].

classmethod load_classifier(algorithm_name=None, spectral_signatures=None, covariance_matrices=None, model_classifier=None, input_normalization=None, normalization_values=None)

Loads a classifier.

Creates a classifier from previously defined or saved parameters.

Parameters:
  • algorithm_name – algorithm name selected from cfg.classification_algorithms.

  • spectral_signatures – a SpectralSignaturesCatalog containing spectral signatures.

  • covariance_matrices – dictionary of previously calculated covariance matrices (used in maximum_likelihood).

  • model_classifier – classifier object.

  • input_normalization – perform input normalization; options are z_score or linear_scaling.

  • normalization_values – list of normalization parameters defined for each variable [normalization expressions, mean values, standard deviation values, minimum values, maximum values].

Returns:

Classifier object.

Examples

Load a classifier
>>> classifier = Classifier.load_classifier(
...     algorithm_name=algorithm_name, spectral_signatures=spectral_signatures,
...     covariance_matrices=covariance_matrices, model_classifier=model_classifier,
...     input_normalization=input_normalization, normalization_values=normalization_values)
run_prediction(input_raster_list, output_raster_path, n_processes: None | int = None, available_ram: None | int = None, macroclass: bool | None = True, threshold: bool | None = False, signature_raster: bool | None = False, classification_confidence: bool | None = False, virtual_raster: bool | None = None, min_progress: int | None = 1, max_progress: int | None = 100)

Runs prediction.

Performs multiprocess classification of input bands using a trained classifier.

Parameters:
  • input_raster_list – list of paths of input rasters.

  • output_raster_path – path of output file.

  • n_processes – number of parallel processes.

  • available_ram – number of megabytes of RAM available to processes.

  • macroclass – if True, use macroclass ID from ROIs or spectral signatures; if False use class ID.

  • threshold – if False, classification without threshold; if True, use a single threshold for each signature; if float, use this value as threshold for all the signatures.

  • classification_confidence – if True, also write additional classification confidence rasters as output.

  • signature_raster – if True, write additional rasters for each spectral signature as output.

  • virtual_raster – if True, create virtual raster output.

  • min_progress – minimum progress value for Progress().

  • max_progress – maximum progress value for Progress().

Returns:

OutputManager object with
  • path = [output path]

Examples

Run a prediction with a trained classifier
>>> # given a trained classifier (e.g., from Classifier.train)
>>> prediction = classifier.run_prediction(
...     input_raster_list=['file1.tif', 'file2.tif', 'file3.tif'],
...     output_raster_path='file_path')
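Optional arguments documented above can be combined in the same call; a sketch with placeholder file names, assuming a trained classifier:

>>> prediction = classifier.run_prediction(
...     input_raster_list=['file1.tif', 'file2.tif', 'file3.tif'],
...     output_raster_path='classification.tif',
...     macroclass=True, classification_confidence=True,
...     n_processes=4, available_ram=2048)
>>> output_path = prediction.path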
save_model(output_path: str) OutputManager

Saves classifier model.

Saves a classifier model to file to be loaded later.

Parameters:

output_path – path of output file.

Returns:

OutputManager object with
  • path = [output path]

Examples

Save a trained classifier
>>> # given a trained classifier (e.g., from Classifier.train)
>>> saved_model = classifier.save_model(output_path=output_path)
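The saved model file can later be reused without retraining through the load_classifier argument of band_classification (documented below); a minimal sketch, assuming 'saved_model_file' is the path previously passed to save_model (placeholder name):

>>> classification = rs.band_classification(
...     input_bands=['file1.tif', 'file2.tif', 'file3.tif'],
...     output_path='output.tif', load_classifier='saved_model_file')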
classmethod train(spectral_signatures=None, algorithm_name=None, covariance_matrices=None, svc_classification_confidence=None, n_processes: int | None = None, available_ram: int | None = None, cross_validation=True, x_matrix=None, y=None, class_weight=None, input_normalization=None, normalization_values=None, find_best_estimator=False, rf_max_features=None, rf_number_trees=100, rf_min_samples_split=None, svm_c=None, svm_gamma=None, svm_kernel=None, pytorch_model=None, pytorch_optimizer=None, mlp_training_portion=None, pytorch_loss_function=None, mlp_hidden_layer_sizes=None, mlp_alpha=None, mlp_learning_rate_init=None, mlp_max_iter=None, mlp_batch_size=None, mlp_activation=None, pytorch_optimization_n_iter_no_change=None, pytorch_optimization_tol=None, pytorch_device=None, min_progress=1, max_progress=100)

Trains a classifier.

Trains a classifier using ROIs or spectral signatures.

Parameters:
  • spectral_signatures – a SpectralSignaturesCatalog containing spectral signatures.

  • algorithm_name – algorithm name selected from cfg.classification_algorithms; if None, minimum distance is used.

  • n_processes – number of parallel processes.

  • available_ram – number of megabytes of RAM available to processes.

  • cross_validation – if True, perform cross validation for algorithms provided through scikit-learn (random_forest, random_forest_ovr, support_vector_machine, multi_layer_perceptron).

  • x_matrix – optional previously saved x matrix.

  • y – optional previously saved y matrix.

  • covariance_matrices – dictionary of previously calculated covariance matrices (used in maximum_likelihood).

  • svc_classification_confidence – if True, also write additional classification confidence rasters as output; required information for support_vector_machine.

  • input_normalization – perform input normalization; options are z_score or linear_scaling.

  • normalization_values – list of normalization parameters defined for each variable [normalization expressions, mean values, standard deviation values, minimum values, maximum values].

  • class_weight – specific for random forest and support_vector_machine; if None, each class has equal weight 1; if ‘balanced’, weights are computed inversely proportional to class frequency.

  • find_best_estimator – specific for scikit classifiers; if True, automatically find the best parameters and fit the model; if an integer, the greater the value, the more parameter combinations are tested.

  • rf_max_features – specific for random forest; if None, all features are considered in node splitting; available options are ‘sqrt’ (square root of the number of features), an integer number of features, or a float fraction of the features.

  • rf_number_trees – specific for random forest, number of trees in the forest.

  • rf_min_samples_split – specific for random forest through scikit, sets the minimum number of samples required to split an internal node; default = 2.

  • svm_c – specific for support_vector_machine through scikit, sets the regularization parameter C; default = 1.

  • svm_gamma – specific for support_vector_machine through scikit, sets the kernel coefficient; default = scale.

  • svm_kernel – specific for support_vector_machine through scikit, sets the kernel; default = rbf.

  • mlp_training_portion – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, the proportion of data to be used as training (default = 0.9) and the remaining part as test (default = 0.1).

  • mlp_hidden_layer_sizes – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, list of values where each value defines the number of neurons in a hidden layer (e.g., [200, 100] for two hidden layers of 200 and 100 neurons respectively); default = [100].

  • mlp_alpha – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, weight decay (also L2 regularization term) for Adam optimizer (default = 0.0001).

  • mlp_learning_rate_init – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets initial learning rate (default = 0.001).

  • mlp_max_iter – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the maximum number of iterations (default = 200).

  • mlp_batch_size – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the number of samples per batch for optimizer; if “auto”, the batch is the minimum value between 200 and the number of samples (default = auto).

  • mlp_activation – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the activation function (default relu).

  • pytorch_model – specific for pytorch_multi_layer_perceptron, custom pytorch nn.Module.

  • pytorch_optimizer – specific for pytorch_multi_layer_perceptron, custom pytorch optimizer.

  • pytorch_loss_function – specific for pytorch_multi_layer_perceptron, sets a custom loss function (default CrossEntropyLoss).

  • pytorch_optimization_n_iter_no_change – specific for pytorch_multi_layer_perceptron, sets the maximum number of epochs where the loss is not improving by at least the value pytorch_optimization_tol (default 5).

  • pytorch_optimization_tol – specific for pytorch_multi_layer_perceptron, sets the tolerance of optimization (default = 0.0001).

  • pytorch_device – specific for pytorch_multi_layer_perceptron, processing device ‘cpu’ (default) or ‘cuda’ if available.

  • min_progress – minimum progress value for Progress().

  • max_progress – maximum progress value for Progress().

Returns:

Classifier object.

Examples

Train a classifier
>>> import remotior_sensus
>>> rs = remotior_sensus.Session()
>>> signature_catalog = rs.spectral_signatures_catalog()
>>> classifier = Classifier.train(
...     spectral_signatures=signature_catalog,
...     algorithm_name='minimum distance')
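Algorithm-specific arguments documented above can be passed in the same call. A sketch of training a random forest with cross validation, assuming signature_catalog is a populated SpectralSignaturesCatalog and that the algorithm name string matches the corresponding entry in cfg.classification_algorithms (illustrative value):

>>> classifier = Classifier.train(
...     spectral_signatures=signature_catalog,
...     algorithm_name='random_forest', cross_validation=True,
...     rf_number_trees=200, class_weight='balanced')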
remotior_sensus.tools.band_classification.band_classification(input_bands: list | int | BandSet, output_path: str | None = None, overwrite: bool | None = False, spectral_signatures: str | SpectralSignaturesCatalog | None = None, algorithm_name: str | None = None, bandset_catalog: BandSetCatalog | None = None, macroclass: bool | None = True, threshold: bool | float | None = False, classification_confidence: bool | None = False, signature_raster: bool | None = False, n_processes: int | None = None, available_ram: int | None = None, cross_validation: bool | None = True, x_input: array | None = None, y_input: array | None = None, covariance_matrices: dict | None = None, input_normalization: str | None = None, load_classifier: str | None = None, save_classifier: bool | None = False, only_fit: bool | None = False, class_weight: None | str | dict = None, find_best_estimator=False, rf_max_features=None, rf_number_trees: int | None = 100, rf_min_samples_split: None | int | float = None, svm_c: float | None = None, svm_gamma: str | float | None = None, svm_kernel: str | None = None, mlp_training_portion: None | float = None, mlp_hidden_layer_sizes: None | tuple | list = None, mlp_alpha: float | None = None, mlp_learning_rate_init: float | None = None, mlp_max_iter: float | None = None, mlp_batch_size: None | int | str = None, mlp_activation: None | str = None, pytorch_model: Optional = None, pytorch_optimizer: Optional = None, pytorch_loss_function: Optional = None, pytorch_optimization_n_iter_no_change: None | int = None, pytorch_optimization_tol: None | int = None, pytorch_device: None | str = None, progress_message: bool | None = True) OutputManager

Performs band classification.

This tool allows for classification of raster bands using the selected algorithm.

Parameters:
  • input_bands – list of input raster paths, or a BandSet number, or a previously defined BandSet.

  • output_path – path of output file.

  • overwrite – if True, output overwrites existing files.

  • spectral_signatures – a SpectralSignaturesCatalog containing spectral signatures.

  • algorithm_name – algorithm name selected from cfg.classification_algorithms; if None, minimum distance is used.

  • bandset_catalog – BandSetCatalog object.

  • macroclass – if True, use macroclass ID from ROIs or spectral signatures; if False use class ID.

  • threshold – if False, classification without threshold; if True, use a single threshold for each signature; if float, use this value as threshold for all the signatures.

  • classification_confidence – if True, also write additional classification confidence rasters as output.

  • signature_raster – if True, write additional rasters for each spectral signature as output.

  • n_processes – number of parallel processes.

  • available_ram – number of megabytes of RAM available to processes.

  • cross_validation – if True, perform cross validation for algorithms provided through scikit-learn (random_forest, random_forest_ovr, support_vector_machine, multi_layer_perceptron).

  • load_classifier – path to a previously saved classifier.

  • x_input – optional previously saved x matrix.

  • y_input – optional previously saved y matrix.

  • covariance_matrices – dictionary of previously calculated covariance matrices (used in maximum_likelihood).

  • input_normalization – perform input normalization; options are z_score or linear_scaling.

  • only_fit – perform only classifier fitting.

  • save_classifier – save classifier to file.

  • class_weight – specific for random forest and support_vector_machine; if None, each class has equal weight 1; if ‘balanced’, weights are computed inversely proportional to class frequency.

  • find_best_estimator – specific for scikit classifiers; if True, automatically find the best parameters and fit the model; if an integer, the greater the value, the more parameter combinations are tested.

  • rf_max_features – specific for random forest; if None, all features are considered in node splitting; available options are ‘sqrt’ (square root of the number of features), an integer number of features, or a float fraction of the features.

  • rf_number_trees – specific for random forest, number of trees in the forest.

  • rf_min_samples_split – specific for random forest through scikit, sets the minimum number of samples required to split an internal node; default = 2.

  • svm_c – specific for support_vector_machine through scikit, sets the regularization parameter C; default = 1.

  • svm_gamma – specific for support_vector_machine through scikit, sets the kernel coefficient; default = scale.

  • svm_kernel – specific for support_vector_machine through scikit, sets the kernel; default = rbf.

  • mlp_training_portion – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, the proportion of data to be used as training (default = 0.9) and the remaining part as test (default = 0.1).

  • mlp_hidden_layer_sizes – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, list of values where each value defines the number of neurons in a hidden layer (e.g., [200, 100] for two hidden layers of 200 and 100 neurons respectively); default = [100].

  • mlp_alpha – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, weight decay (also L2 regularization term) for Adam optimizer (default = 0.0001).

  • mlp_learning_rate_init – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets initial learning rate (default = 0.001).

  • mlp_max_iter – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the maximum number of iterations (default = 200).

  • mlp_batch_size – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the number of samples per batch for optimizer; if “auto”, the batch is the minimum value between 200 and the number of samples (default = auto).

  • mlp_activation – specific for multi_layer_perceptron and pytorch_multi_layer_perceptron, sets the activation function (default relu).

  • pytorch_model – specific for pytorch_multi_layer_perceptron, custom pytorch nn.Module.

  • pytorch_optimizer – specific for pytorch_multi_layer_perceptron, custom pytorch optimizer.

  • pytorch_loss_function – specific for pytorch_multi_layer_perceptron, sets a custom loss function (default CrossEntropyLoss).

  • pytorch_optimization_n_iter_no_change – specific for pytorch_multi_layer_perceptron, sets the maximum number of epochs where the loss is not improving by at least the value pytorch_optimization_tol (default 5).

  • pytorch_optimization_tol – specific for pytorch_multi_layer_perceptron, sets the tolerance of optimization (default = 0.0001).

  • pytorch_device – specific for pytorch_multi_layer_perceptron, processing device ‘cpu’ (default) or ‘cuda’ if available.

  • progress_message – if True, display the progress message.

Returns:

If only_fit is True returns OutputManager() object with
  • extra = {‘classifier’: classifier, ‘model_path’: output model path}

If only_fit is False returns OutputManager() object with
  • path = classification path

  • extra = {‘model_path’: output model path}
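
Examples

Classify a BandSet with a spectral signature catalog (a minimal sketch; the BandSet, signature catalog, and file names follow the Classifier example above and are placeholders):
>>> classification = rs.band_classification(
...     input_bands=catalog.get(1), output_path='classification.tif',
...     spectral_signatures=signature_catalog,
...     algorithm_name='minimum distance')
>>> # fit only and save the classifier model for later reuse
>>> fit = rs.band_classification(
...     input_bands=catalog.get(1), spectral_signatures=signature_catalog,
...     algorithm_name='minimum distance', only_fit=True, save_classifier=True)
>>> model_path = fit.extra['model_path']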