datagenerator.pertinent_negatives
Attributes
Classes
PertinentNegativesDataset: A dataset designed to investigate the impact of pertinent negative (PN) features on model predictions.
Module Contents
- class datagenerator.pertinent_negatives.PertinentNegativesDataset(seed: int = 0, n_features: int = 5, n_samples: int = 10, distribution: str = 'normal', weight_range: Tuple[float, float] = (-1.0, 1.0), weights: torch.Tensor | None = None, pn_features: List[int] | None = None, pn_zero_likelihood: float = 0.5, pn_weight_factor: float = 10, baseline: str = 'zero')
Bases:
xaiunits.datagenerator.WeightedFeaturesDataset
A dataset designed to investigate the impact of pertinent negative (PN) features on model predictions by introducing zero values in selected features, which are expected to significantly impact the output.
This dataset is useful for scenarios where the absence of certain features (indicated by zero values) provides important information for model predictions.
- Inherits from:
WeightedFeaturesDataset: Class extending BaseFeaturesDataset with support for weighted features
- pn_features
Indices of features considered as pertinent negatives.
- Type:
list[int]
- pn_zero_likelihood
Likelihood of a pertinent negative feature being set to zero.
- Type:
float
- pn_weight_factor
Weight factor applied to the pertinent negative features to emphasize their impact.
- Type:
float
- cat_features
Categorical features derived from the pertinent negatives.
- Type:
list
- labels
Generated labels with optional noise.
- Type:
torch.Tensor
- features
Name of the attribute representing the input features.
- Type:
str
- ground_truth_attribute
Name of the attribute considered as ground truth for analysis.
- Type:
str
- subset_data
List of attributes to be included in subsets.
- Type:
list[str]
- subset_attribute
Additional attributes to be considered in subsets.
- Type:
list[str]
- pn_zero_likelihood = 0.5
- pn_weight_factor = 10
- pn_features = [0]
- cat_features = [0]
- label_noise
- labels
- features = 'samples'
- ground_truth_attribute = 'ground_truth'
- subset_data = ['samples', 'weighted_samples', 'ground_truth']
- subset_attribute
- _intialize_pn_features(pn_features: List[int] | None) List[int]
Validates and initializes the indices of features to be considered as pertinent negatives (PN).
Ensures that specified pertinent negative features are within the valid range of feature indices. Falls back to the first feature if pn_features is not specified or invalid.
- Parameters:
pn_features (list of int, optional) – Indices of features specified as pertinent negatives.
- Returns:
The validated list of indices for pertinent negative features.
- Return type:
list[int]
- Raises:
ValueError – If any specified pertinent negative feature index is out of the valid range or if the input is not a list.
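The validation described above can be sketched in plain Python. This is a minimal illustration, not the library's code: `validate_pn_features` and its `n_features` argument are hypothetical names, and the sketch assumes one reading of the description, namely that a missing `pn_features` falls back to `[0]` while a non-list input or an out-of-range index raises `ValueError`.

```python
from typing import List, Optional


def validate_pn_features(pn_features: Optional[List[int]], n_features: int) -> List[int]:
    """Hypothetical stand-in for the validation step described above."""
    # Nothing specified: fall back to the first feature.
    if pn_features is None:
        return [0]
    # Anything other than a list is rejected.
    if not isinstance(pn_features, list):
        raise ValueError("pn_features must be a list of feature indices")
    # Every index must address an existing feature column.
    for idx in pn_features:
        if not isinstance(idx, int) or not 0 <= idx < n_features:
            raise ValueError(
                f"PN feature index {idx!r} is outside [0, {n_features - 1}]"
            )
    return pn_features
```

With `n_features=5`, passing `None` yields `[0]`, `[1, 3]` is returned unchanged, and `[7]` raises `ValueError`.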
- _initialize_zeros_for_PN() None
Sets the values of pertinent negative (PN) features to zero with a specified likelihood, across all samples in a vectorized manner.
This modification is performed directly on the samples attribute.
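As a rough illustration of the zeroing step, the sketch below overwrites PN feature values in place with probability `pn_zero_likelihood`. It uses nested Python lists and explicit loops to stay dependency-free, whereas the method itself is described as vectorized; the function name `zero_out_pn_features` and the `seed` argument are assumptions made for this example.

```python
import random
from typing import List


def zero_out_pn_features(
    samples: List[List[float]],
    pn_features: List[int],
    pn_zero_likelihood: float,
    seed: int = 0,
) -> None:
    """Hypothetical sketch: zero PN features in place with a given likelihood."""
    rng = random.Random(seed)  # seeded for reproducibility
    for row in samples:
        for idx in pn_features:
            # rng.random() is uniform on [0, 1), so this branch is taken
            # with probability pn_zero_likelihood.
            if rng.random() < pn_zero_likelihood:
                row[idx] = 0.0
```

A likelihood of `0.0` leaves the samples untouched and a likelihood of `1.0` zeroes every PN entry; features outside `pn_features` are never modified.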
- _get_new_weighted_samples() None
Recalculates the weighted samples considering the introduction of zeros for pertinent negative features in a vectorized manner.
Adjusts the weight of features set to zero to emphasize their impact by using the pn_weight_factor. Updates the weighted_samples attribute with the new calculations.
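The exact adjustment rule is not spelled out here, so the following is only one plausible reading, labeled as an assumption: ordinary entries contribute `value * weight`, while a PN entry that has been set to zero contributes `weight * pn_weight_factor` instead of zero. The function name `weighted_samples_with_pn` is hypothetical, and the loops stand in for the vectorized tensor operations mentioned above.

```python
from typing import List


def weighted_samples_with_pn(
    samples: List[List[float]],
    weights: List[float],
    pn_features: List[int],
    pn_weight_factor: float,
) -> List[List[float]]:
    """Hypothetical sketch of recomputing weighted samples after PN zeroing."""
    pn = set(pn_features)
    result = []
    for row in samples:
        new_row = []
        for j, (value, weight) in enumerate(zip(row, weights)):
            if j in pn and value == 0:
                # ASSUMPTION: a zeroed PN feature gets an emphasized contribution.
                new_row.append(weight * pn_weight_factor)
            else:
                # Ordinary weighted contribution.
                new_row.append(value * weight)
        result.append(new_row)
    return result
```

Under this assumed rule, a zeroed PN feature with weight `0.5` and `pn_weight_factor=10` contributes `5.0`, which dominates typical contributions from features drawn around zero.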
- _create_ground_truth_baseline(baseline: str) None
Creates the ground truth baseline based on the specified baseline type (“zero” or “one”).
- Parameters:
baseline (str) – Specifies the type of baseline to use. Must be either “zero” or “one”.
- Raises:
KeyError – If the specified baseline is not “zero” or “one”.
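The `KeyError` behavior suggests a dictionary lookup on the baseline name. A minimal sketch under that assumption is shown below; `create_baseline` is a hypothetical helper, and representing the baseline as a flat list of length `n_features` is also an assumption, since the real shape and tensor type are not stated here.

```python
from typing import List


def create_baseline(baseline: str, n_features: int) -> List[float]:
    """Hypothetical sketch: build a constant baseline vector by name."""
    fill_values = {"zero": 0.0, "one": 1.0}
    # Any name other than "zero" or "one" raises KeyError on this lookup.
    value = fill_values[baseline]
    return [value] * n_features
```

Calling `create_baseline("mean", 3)` raises `KeyError`, matching the documented failure mode for unsupported baseline names.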
- __getitem__(idx: int, others: List[str] = ['ground_truth_attribute', 'baseline']) Tuple[Any, Ellipsis]
Retrieve a sample and its associated label by index.
- Parameters:
idx (int) – Index of the sample to retrieve.
others (list) – Names of additional items to retrieve. Defaults to ['ground_truth_attribute', 'baseline'].
- Returns:
Tuple containing the sample and its label.
- Return type:
tuple
- generate_model() torch.nn.Module
Generates and returns a neural network model tailored for analyzing the impact of pertinent negatives.
The model is configured to incorporate the weights, pertinent negatives, and the pertinent negative weight factor.
- Returns:
A neural network model designed to work with the dataset’s specific configuration, including the pertinent negatives and their associated weight factor.
- Return type:
torch.nn.Module
- datagenerator.pertinent_negatives.data