datagenerator.conflicting

Attributes

data

Classes

ConflictingDataset

Generic synthetic dataset with feature cancellation capabilities.

Module Contents

class datagenerator.conflicting.ConflictingDataset(seed: int = 0, n_features: int = 2, n_samples: int = 10, distribution: str = 'normal', weight_range: Tuple[float, float] = (-1.0, 1.0), weights: torch.Tensor | None = None, cancellation_features: List[int] | None = None, cancellation_likelihood: float = 0.5)

Bases: xaiunits.datagenerator.WeightedFeaturesDataset

Generic synthetic dataset with feature cancellation capabilities.

Cancellation is probabilistic: each candidate feature in each sample is canceled with probability cancellation_likelihood. If cancellation_features is not provided, every feature is a candidate. The contribution of a canceled feature is negated, which allows analysis of model behavior when features are effectively absent.

Inherits from:

WeightedFeaturesDataset: Class extending BaseFeaturesDataset with support for weighted features

cancellation_features

Indices of features subject to cancellation.

Type:

list of int, optional

cancellation_likelihood

Likelihood of feature cancellation, between 0 and 1.

Type:

float

cancellation_outcomes

Binary tensor indicating whether each feature in each sample is canceled.

Type:

torch.Tensor

cancellation_samples

Concatenation of samples with their cancellation outcomes.

Type:

torch.Tensor

cancellation_attributions

The attribution of each feature considering the cancellation.

Type:

torch.Tensor

cat_features

Categorical features derived from the cancellation samples.

Type:

list

ground_truth_attributions

Combined tensor of weighted samples and cancellation attributions for ground truth analysis.

Type:

torch.Tensor

Initializes a ConflictingDataset object.

Parameters:
  • seed (int) – Seed for random number generation, ensuring reproducibility. Defaults to 0.

  • n_features (int) – Number of features in each sample. Defaults to 2.

  • n_samples (int) – Number of samples to generate. Defaults to 10.

  • distribution (str) – Type of distribution to use for generating samples. Defaults to 'normal'.

  • weight_range (tuple[float, float]) – Range (min, max) for generating random feature weights. Defaults to (-1.0, 1.0).

  • weights (torch.Tensor, optional) – Predefined weights for each feature. Defaults to None.

  • cancellation_features (list[int], optional) – Indices of the features eligible for cancellation. Defaults to None, in which case all features are eligible.

  • cancellation_likelihood (float) – Probability of each feature being canceled. Defaults to 0.5.

cancellation_features = None
cancellation_likelihood = 0.5
cancellation_outcomes
cancellation_samples
labels
cancellation_attributions
cat_features
ground_truth_attributions
features = 'cancellation_samples'
ground_truth_attribute = 'ground_truth_attributions'
subset_data = ['weighted_samples', 'cancellation_outcomes', 'cancellation_samples',...
_initialize_cancellation_features() None

Validates and initializes the list of features subject to cancellation. If no specific features are provided, all features are considered candidates for cancellation.

Raises:

AssertionError – If cancellation_features is not a list, is empty, contains a non-integer element, or has a maximum element greater than the number of features; or if cancellation_likelihood is not a float or lies outside the range [0, 1].
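The checks listed above can be sketched as a standalone function. This is a minimal illustration, not the library's implementation: the name validate_cancellation_args and its signature are hypothetical, and the bound on the largest index assumes zero-based feature indices, which is stricter than the wording above.

```python
from typing import List, Optional


def validate_cancellation_args(
    cancellation_features: Optional[List[int]],
    cancellation_likelihood: float,
    n_features: int,
) -> List[int]:
    """Hypothetical stand-in for the validation described in the docs."""
    # None means every feature is a candidate for cancellation.
    if cancellation_features is None:
        cancellation_features = list(range(n_features))

    assert isinstance(cancellation_features, list), "must be a list"
    assert len(cancellation_features) > 0, "must not be empty"
    assert all(isinstance(i, int) for i in cancellation_features), "indices must be ints"
    # Assumption: indices are zero-based, so the largest valid index is n_features - 1.
    assert max(cancellation_features) < n_features, "index out of range"

    assert isinstance(cancellation_likelihood, float), "likelihood must be a float"
    assert 0.0 <= cancellation_likelihood <= 1.0, "likelihood must be in [0, 1]"
    return cancellation_features
```

Passing None for cancellation_features returns every index, mirroring the documented default of treating all features as candidates.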

_get_cancellations() torch.Tensor

Generates a binary mask indicating whether each feature in each sample is canceled based on the specified likelihood.

This method considers only the features specified in cancellation_features for possible cancellation.

Returns:

An integer tensor of shape (n_samples, n_features) where 1 represents a canceled feature and 0 represents an active feature.

Return type:

torch.Tensor
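One way such a mask could be drawn is sketched below. The helper sample_cancellation_mask is hypothetical, and drawing uniform values and comparing them with the likelihood is an assumption about the sampling scheme, not a description of the library's code.

```python
from typing import List

import torch


def sample_cancellation_mask(
    n_samples: int,
    n_features: int,
    cancellation_features: List[int],
    cancellation_likelihood: float,
    seed: int = 0,
) -> torch.Tensor:
    """Hypothetical helper: draw a 0/1 cancellation mask."""
    generator = torch.Generator().manual_seed(seed)
    # Uniform draws in [0, 1); a draw below the likelihood marks a cancellation.
    draws = torch.rand(n_samples, n_features, generator=generator)
    mask = (draws < cancellation_likelihood).int()

    # Zero out the columns that are not candidates for cancellation.
    candidates = torch.zeros(n_features, dtype=torch.int)
    candidates[cancellation_features] = 1
    return mask * candidates


# Only features 0 and 2 can be canceled; column 1 always stays 0.
mask = sample_cancellation_mask(
    4, 3, cancellation_features=[0, 2], cancellation_likelihood=0.5
)
```

With a likelihood of 0.0 the mask is all zeros, and with 1.0 every candidate column is all ones, since the draws never reach 1.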

_get_cancellation_samples() torch.Tensor

Concatenates the original samples with their cancellation outcomes to form a comprehensive dataset.

This allows for analyzing the impact of feature cancellations directly alongside the original features.

Returns:

A tensor containing the original samples augmented with their corresponding cancellation outcomes.

Return type:

torch.Tensor
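The concatenation can be illustrated with toy tensors. The values are made up, and joining along the feature dimension (dim=1), which doubles the number of columns, is an assumption about the layout.

```python
import torch

# Toy inputs: 2 samples with 3 features each (values are illustrative only).
samples = torch.tensor([[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]])
cancellation_outcomes = torch.tensor([[1, 0, 0], [0, 0, 1]])

# Cast the integer mask to the samples' dtype and append it column-wise,
# giving one row per sample: [features..., cancellation flags...].
cancellation_samples = torch.cat(
    (samples, cancellation_outcomes.to(samples.dtype)), dim=1
)
```

Under this layout the first n_features columns hold the original values and the remaining columns hold the matching 0/1 flags.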

_get_cancellation_attributions() torch.Tensor

Computes the attribution of each feature by negating the effect of canceled features.

This method helps understand the impact of each feature on the model output when certain features are systematically canceled.

Returns:

A tensor of the same shape as the weighted samples, where the values of canceled features are negated to reflect their absence.

Return type:

torch.Tensor
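A literal reading of this description is sketched below with toy values. The text does not say what happens to entries that are not canceled; leaving them unchanged is an assumption of this sketch, and the actual implementation may treat them differently.

```python
import torch

# Toy inputs: weighted feature values and a matching 0/1 cancellation mask.
weighted_samples = torch.tensor([[0.2, -0.4], [1.0, 0.6]])
cancellation_outcomes = torch.tensor([[1, 0], [0, 1]])

# Flip the sign where the mask is 1; other entries are left as-is (assumption).
cancellation_attributions = torch.where(
    cancellation_outcomes.bool(), -weighted_samples, weighted_samples
)
```

The result keeps the shape of weighted_samples, which matches the documented return value.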

generate_model() torch.nn.Module

Instantiates and returns a neural network model for analyzing datasets with conflicting features.

The model is configured to use the specified features and weights, allowing for experimentation with feature cancellations.

Returns:

A neural network model designed to work with the specified features and weights.

Return type:

model.ConflictingFeaturesNN

datagenerator.conflicting.data