attribench.distributed.metrics.Insertion
- class attribench.distributed.metrics.Insertion(model_factory, attributions_dataset, batch_size, maskers, activation_fns, mode='morf', start=0.0, stop=1.0, num_steps=100, address='localhost', port='12355', devices=None)
Bases: Deletion
Compute the Insertion metric for a given AttributionsDataset and model using multiple processes. Insertion can be viewed as the opposite of the Deletion metric.
Insertion is computed by iteratively revealing the top (Most Relevant First, or MoRF) or bottom (Least Relevant First, or LeRF) features of the input samples, leaving the other features masked out, and computing the confidence of the model on the masked samples.
This results in a curve of confidence vs. number of features masked. The area under (or equivalently over) this curve is the Insertion metric.
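The procedure described above can be illustrated with a minimal, single-process sketch. It is not the attribench implementation: the helper names `insertion_curve` and `area_under_curve`, the constant `baseline` used as a stand-in for a masker, and the toy linear model are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def insertion_curve(
    x: Sequence[float],
    attributions: Sequence[float],
    model: Callable[[Sequence[float]], float],
    baseline: float = 0.0,
    mode: str = "morf",
) -> List[float]:
    """Reveal features one at a time and record the model output (hypothetical helper)."""
    if mode not in ("morf", "lerf"):
        raise ValueError("mode must be 'morf' or 'lerf'")
    # MoRF reveals the highest-attribution features first, LeRF the lowest.
    order = sorted(range(len(x)), key=lambda i: attributions[i], reverse=(mode == "morf"))
    masked = [baseline] * len(x)  # start with every feature masked out
    curve = [model(masked)]
    for i in order:
        masked[i] = x[i]  # reveal the next feature in the ranking
        curve.append(model(masked))
    return curve


def area_under_curve(curve: Sequence[float]) -> float:
    """Trapezoidal area with the x-axis normalised to [0, 1]."""
    n = len(curve) - 1
    return sum((curve[k] + curve[k + 1]) / 2 for k in range(n)) / n


# Toy linear "model": a weighted sum standing in for a confidence score.
weights = [0.5, 0.3, 0.2]


def toy_model(v: Sequence[float]) -> float:
    return sum(w * f for w, f in zip(weights, v))


x = [1.0, 1.0, 1.0]
attrs = [0.5, 0.3, 0.2]  # attributions that rank the features correctly
morf = insertion_curve(x, attrs, toy_model, mode="morf")
lerf = insertion_curve(x, attrs, toy_model, mode="lerf")
```

In this toy setting the attributions rank the features correctly, so the MoRF curve rises faster than the LeRF curve and encloses a larger area.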
start, stop, and num_steps determine the range of features to mask. start and stop are given as fractions of the total number of features (between 0 and 1), and num_steps is the number of steps to take between start and stop.
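One way to read these three parameters is as an evenly spaced grid of fractions that is then converted to feature counts. The sketch below assumes that both endpoints are included and that counts are rounded to the nearest integer. Neither detail is stated in this documentation, and `feature_counts` is a hypothetical helper, not part of attribench.

```python
from typing import List


def feature_counts(
    num_features: int,
    start: float = 0.0,
    stop: float = 1.0,
    num_steps: int = 100,
) -> List[int]:
    """Turn a relative [start, stop] range into absolute feature counts (assumed scheme)."""
    if not (0.0 <= start <= 1.0 and 0.0 <= stop <= 1.0):
        raise ValueError("start and stop must be between 0 and 1")
    if num_steps < 2:
        raise ValueError("this sketch needs at least 2 steps")
    # Evenly spaced fractions from start to stop, endpoints included (assumption).
    step = (stop - start) / (num_steps - 1)
    fractions = [start + k * step for k in range(num_steps)]
    # Rounding to the nearest integer count is also an assumption.
    return [round(f * num_features) for f in fractions]


# Example: 784 features, first half of the range, three steps.
counts = feature_counts(784, start=0.0, stop=0.5, num_steps=3)
```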
The Insertion metric is computed for each masker in maskers and for each activation function in activation_fns. The number of processes is determined by the number of devices. If devices is None, then all available devices are used. Samples are distributed evenly across the processes.
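As a rough illustration of the two ideas in this paragraph, the sketch below enumerates every (masker, activation function) pair and splits sample indices across processes. The strided split is only one possible way to distribute samples evenly. The function names, masker names, and activation function names are hypothetical, and the actual distribution strategy used by attribench is not documented here.

```python
from itertools import product
from typing import List, Sequence, Tuple


def shard_indices(num_samples: int, num_processes: int) -> List[List[int]]:
    """Strided split: process r gets samples r, r + P, r + 2P, ... (assumed scheme)."""
    return [list(range(rank, num_samples, num_processes)) for rank in range(num_processes)]


def metric_configurations(
    masker_names: Sequence[str], activation_fns: Sequence[str]
) -> List[Tuple[str, str]]:
    """Every masker is combined with every activation function."""
    return list(product(masker_names, activation_fns))


shards = shard_indices(num_samples=10, num_processes=4)
# Hypothetical masker and activation function names, for illustration only.
configs = metric_configurations(["constant", "blur"], ["linear", "softmax"])
```

With this scheme the shard sizes differ by at most one sample, and each process would evaluate all configurations on its own shard.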
Note that the Insertion metric is equivalent to the Deletion metric with the following changes:
- start and stop become 1 - start and 1 - stop, respectively.
- The mode parameter is swapped.
Note also that, if start and stop are 1 and 0 or vice versa, then Insertion-morf and Deletion-lerf are equal, and Insertion-lerf and Deletion-morf are equal.
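The correspondence stated in these two notes can be written down as a small parameter mapping. `equivalent_deletion_args` is a hypothetical helper for illustration only. It does not call attribench and only restates the rule given above.

```python
from typing import Tuple


def equivalent_deletion_args(mode: str, start: float, stop: float) -> Tuple[str, float, float]:
    """Map Insertion arguments to the Deletion arguments described as equivalent."""
    if mode not in ("morf", "lerf"):
        raise ValueError("mode must be 'morf' or 'lerf'")
    swapped_mode = "lerf" if mode == "morf" else "morf"  # the mode parameter is swapped
    return swapped_mode, 1.0 - start, 1.0 - stop  # the range is mirrored around 0.5


# Insertion-morf over the full range corresponds to Deletion-lerf from 1 down to 0.
deletion_args = equivalent_deletion_args("morf", 0.0, 1.0)
```

Applying the mapping twice returns the original arguments, which matches the symmetry between the two metrics.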
- Parameters:
- model_factory : ModelFactory
ModelFactory instance or callable that returns a model. Used to create a model for each subprocess.
- attributions_dataset : AttributionsDataset
Dataset containing the samples and attributions to compute Insertion on.
- batch_size : int
The batch size to use when computing the metric.
- maskers : Dict[str, Masker]
Dictionary of maskers to use for computing the metric.
- activation_fns : Union[List[str], str]
List of activation functions to use for computing the metric. If a single string is given, it is converted to a single-element list.
- mode : str, optional
Mode to use for computing the metric. Either “morf” or “lerf”. Default: “morf”
- start : float, optional
Relative start of the range of features to mask. Must be between 0 and 1. Default: 0.0
- stop : float, optional
Relative end of the range of features to mask. Must be between 0 and 1. Default: 1.0
- num_steps : int, optional
Number of steps to use for the range of features to mask. Default: 100
- address : str, optional
Address to use for the multiprocessing connection. Default: “localhost”
- port : str, optional
Port to use for the multiprocessing connection. Default: “12355”
- devices : Optional[Tuple], optional
Devices to use. If None, then all available devices are used. Default: None
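The following sketch shows one way to assemble and sanity-check the constructor arguments listed above before handing them to the class. `insertion_kwargs` is a hypothetical helper, not part of attribench. The checks it performs mirror the constraints stated in this parameter list and are not necessarily enforced by the library itself.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple, Union


def insertion_kwargs(
    model_factory: Callable[[], Any],
    attributions_dataset: Any,
    batch_size: int,
    maskers: Dict[str, Any],
    activation_fns: Union[List[str], str],
    mode: str = "morf",
    start: float = 0.0,
    stop: float = 1.0,
    num_steps: int = 100,
    address: str = "localhost",
    port: str = "12355",
    devices: Optional[Tuple] = None,
) -> Dict[str, Any]:
    """Collect the documented arguments into a dict (hypothetical helper)."""
    if mode not in ("morf", "lerf"):
        raise ValueError(f"mode must be 'morf' or 'lerf', got {mode!r}")
    for name, value in (("start", start), ("stop", stop)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    # Mirrors the documented behaviour: a single string becomes a one-element list.
    if isinstance(activation_fns, str):
        activation_fns = [activation_fns]
    return {
        "model_factory": model_factory,
        "attributions_dataset": attributions_dataset,
        "batch_size": batch_size,
        "maskers": maskers,
        "activation_fns": activation_fns,
        "mode": mode,
        "start": start,
        "stop": stop,
        "num_steps": num_steps,
        "address": address,
        "port": port,  # note: the port is documented as a string
        "devices": devices,
    }


# With attribench installed, the dictionary could then be unpacked as follows
# (the result path is a placeholder):
#   from attribench.distributed.metrics import Insertion
#   metric = Insertion(**kwargs)
#   metric.run(result_path="insertion.h5")
```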
Methods
- run: Runs the metric computation and optionally saves the result.
- save_result: Save the result to disk.
Attributes
- result
- run(result_path=None, progress=True)
Runs the metric computation and optionally saves the result. If no result path is given, the result will not be saved to disk. It can still be accessed via the result property.
- Parameters:
- result_path : str, optional
Path to save the result to. If None, the result is not saved to disk.
- progress : bool, optional
Whether to show a progress bar. Defaults to True.
- save_result(path, format='hdf5')
Save the result to disk.
- Parameters:
- path : str
Path to save the result to.
- format : str, optional
Format to save the result in. If "hdf5", the result is saved as an HDF5 file. If "csv", the result is saved as a directory structure containing CSV files. Default: "hdf5".
- Raises:
- ValueError
If the result is None.