In this Quickstart, we learn how to:

  • Install the ObzAI Python package
  • Explain a vision AI model
  • Extract features and detect outliers
  • Connect to the ObzAI SaaS platform
  • Log data features and model explanations

Prerequisite: This quickstart assumes that you have a vision transformer (ViT) model and a data processing pipeline. If you would like to see an end-to-end example, please refer to our tutorials.

Installation

You will install the open-source ObzAI library in the Python environment where your vision AI model is used.

Install a stable version from PyPI with this command:

pip install obzai

If you are using uv for a virtual environment, use:

uv pip install obzai

We are developing the open-source package on GitHub. You may install a development version from GitHub:

pip install "git+https://github.com/obzai/obz"

Explain a vision AI model

This Quickstart assumes that you have a working data processing pipeline and AI system. Therefore, these code snippets are meant to be added to your own Python/PyTorch code. In particular, the ViT model has to be loaded and ready to make predictions.

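import torch

# Example: make a prediction with MODEL on a batch of normalized images
with torch.no_grad():
    logits = MODEL(norm_image_batch)
    # Index of the highest-scoring class for each image
    max_logits_idxs = torch.argmax(logits, dim=1).cpu().tolist()
    # Class probabilities for each image
    probabilities = torch.nn.functional.softmax(logits, dim=1)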

For a complete introduction including small models and datasets, please see our examples using natural images or lung nodules.

Use XAI Tools

To import specific XAI tools from the obzai library:

from obzai.xai.xai_tool import AttentionMap, CDAM, SaliencyMap

Attention maps are inherent to ViT models. AttentionMap makes it easy to obtain attention maps regardless of the ViT architecture. In particular, we support models with and without a [CLS] token. In general, AttentionMap expects a model that returns attention weights, as models from the transformers package can.
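
To make the idea concrete, here is a minimal NumPy sketch of how a [CLS]-based attention map can be derived from raw attention weights. It is an illustration only and not the implementation of AttentionMap: the function name cls_attention_map, the (batch, heads, tokens, tokens) layout, and the assumptions that token 0 is [CLS] and that the patches form a square grid are all choices made for this example.

```python
import numpy as np


def cls_attention_map(attn):
    """Turn attention weights into one patch-level map per image.

    attn: array of shape (batch, heads, tokens, tokens) from one attention
    layer. Token 0 is assumed to be [CLS]; the rest are patch tokens.
    """
    attn = np.asarray(attn, dtype=float)
    head_avg = attn.mean(axis=1)           # average over attention heads
    cls_to_patches = head_avg[:, 0, 1:]    # attention from [CLS] to each patch
    side = int(round(np.sqrt(cls_to_patches.shape[1])))
    if side * side != cls_to_patches.shape[1]:
        raise ValueError("patch tokens do not form a square grid")
    return cls_to_patches.reshape(-1, side, side)


# Random attention weights: 2 images, 3 heads, 1 [CLS] token + 16 patch tokens
rng = np.random.default_rng(0)
raw = rng.random((2, 3, 17, 17))
attn = raw / raw.sum(axis=-1, keepdims=True)  # rows sum to 1, like softmax output
maps = cls_attention_map(attn)
print(maps.shape)  # (2, 4, 4)
```

For a model without a [CLS] token, a different reduction would be needed (for example, averaging the attention each patch receives), which this sketch does not cover.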

CDAM is a gradient-based extension of attention maps that provides class-discriminative feature importances. CDAM has several optional arguments that allow you to apply related XAI methods.

Argument            Options (default: bold)                 Note
mode                "vanilla", "smooth", "integrated"       -
gradient_type       "from_logits", "from_probabilities"     -
gradient_reduction  "dot_product", "average", "sum"         -
activation_type     "softmax", "sigmoid"                    -
noise_level         0.05                                    Only applicable to "smooth"
num_steps           50                                      Only applicable to "integrated"

A saliency map computes the gradient of the class score with respect to the input pixels. SmoothGrad extends this by averaging multiple saliency maps derived from noise-added copies of the image. The SaliencyMap class can be configured with the following arguments:

Argument    Options (default: bold)                   Note
mode        "vanilla", "smooth"                       -
nt_type     "smoothgrad", "smoothgrad_sq", "vargrad"  -
nt_samples  10                                        nt: noise type
Note that before visualizing the resulting XAI maps, it may be a good idea to normalize each of them.

from obzai.xai import xai_utils
norm_cdam_maps = xai_utils.normalize_xai_maps(cdam_maps)
norm_smooth_grad_maps = xai_utils.normalize_xai_maps(smooth_grad_maps)
norm_attention_maps = xai_utils.normalize_xai_maps(attention_maps)
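
As a rough idea of what such a normalization can look like, the sketch below rescales each map in a batch to the range [0, 1] with min-max scaling. This is an assumption for illustration: the source does not state which normalization xai_utils.normalize_xai_maps applies, and the function name min_max_normalize is hypothetical.

```python
import numpy as np


def min_max_normalize(maps, eps=1e-8):
    """Rescale each map in a (batch, height, width) array to the range [0, 1]."""
    maps = np.asarray(maps, dtype=float)
    mins = maps.min(axis=(1, 2), keepdims=True)
    maxs = maps.max(axis=(1, 2), keepdims=True)
    # eps avoids division by zero for constant maps
    return (maps - mins) / (maxs - mins + eps)


example_maps = np.array([[[0.0, 2.0], [4.0, 8.0]],
                         [[-1.0, 0.0], [1.0, 3.0]]])
normalized = min_max_normalize(example_maps)
```

Normalizing each map separately makes maps from different XAI methods comparable on a shared color scale, at the cost of discarding their absolute magnitudes.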

Use Region-based XAI Tools

In some applications, you may need more cohesive, segmentation-like maps from XAI. This approach was introduced in "XRAI: Better Attributions Through Regions". We provide an easy way to wrap any existing XAI tool so that it merges importance scores with pixel regions, producing more visually appealing XAI maps.

Import XAIRegions:

from obzai.xai.xai_regions import XAIRegions

attention_regions_tool = XAIRegions(attention_tool)
cdam_regions_tool = XAIRegions(cdam_tool)

Now, we can use region-based XAI tools:

attention_regions_maps = attention_regions_tool.regionize(raw_images=image_batch, norm_images=norm_image_batch)
cdam_regions_maps = cdam_regions_tool.regionize(raw_images=image_batch, norm_images=norm_image_batch, target_idx=0)
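
The core idea of merging importance scores with pixel regions can be sketched as follows. This is a simplified illustration that assumes a segmentation is already given as an integer label map. It does not reproduce the XRAI algorithm or the internals of XAIRegions, and the function name average_within_regions is hypothetical.

```python
import numpy as np


def average_within_regions(importance, segments):
    """Replace each pixel's score with the mean score of its region.

    importance: (height, width) float array of pixel-level importance scores.
    segments:   (height, width) array of non-negative integer region labels.
    """
    labels = segments.ravel()
    sums = np.bincount(labels, weights=importance.ravel())
    counts = np.bincount(labels)
    means = sums / np.maximum(counts, 1)  # guard against unused label ids
    return means[segments]


importance = np.array([[0.9, 0.7, 0.1, 0.0],
                       [0.8, 0.6, 0.0, 0.1],
                       [0.2, 0.1, 0.3, 0.2],
                       [0.1, 0.0, 0.2, 0.3]])
segments = np.array([[0, 0, 1, 1],
                     [0, 0, 1, 1],
                     [2, 2, 3, 3],
                     [2, 2, 3, 3]])
region_map = average_within_regions(importance, segments)
```

In this toy example, the top-left region receives the highest averaged score, so the resulting map highlights one coherent block instead of scattered pixels.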

Evaluate XAI Tools

There are several Explainable AI (XAI) methods available, each with its own advantages and limitations. Obz AI offers a set of evaluation tools to help assess the quality of XAI methods.

fidelity_tool measures how accurately a given XAI method reflects the model’s true decision process. It does this by systematically perturbing input features based on their importance scores and observing the resulting change in the model performance.

compactness_tool evaluates how sparse and concentrated the importance scores are. A more compact set of importance scores is often easier for humans to interpret, as it highlights the most relevant features in a concise manner.

By using these tools, you can better understand and compare the effectiveness and interpretability of different XAI approaches.

from obzai.xai.eval_tool import Fidelity, Compactness

fidelity_tool = Fidelity(model=MODEL, device=DEVICE)
compactness_tool = Compactness()

Once instantiated, run the evaluation on the XAI method:

scores_fid = fidelity_tool.score(samples, cdam_maps, target_logits=max_logits)
scores_comp = compactness_tool.score(cdam_maps)

In general, we average the fidelity and compactness scores over all samples to obtain an overall score for each. An overall score helps us understand whether the XAI method applied to this particular model and dataset is accurate or compact.

High fidelity scores indicate more accurate XAI methods according to perturbation-based accuracy curves. See Brocki and Chung (2023) for further details.
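
The perturbation idea behind fidelity can be illustrated with a toy example. The sketch below is not the Fidelity implementation. It assumes a hypothetical linear "model" and shows how the model score changes when features are replaced by a baseline value in order of claimed importance. The names perturbation_curve, score_fn, steps, and baseline are made up for this example.

```python
import numpy as np


def perturbation_curve(score_fn, sample, importance, steps=3, baseline=0.0):
    """Model score after progressively replacing features with a baseline,
    starting with the features ranked most important."""
    order = np.argsort(importance.ravel())[::-1]  # most important first
    perturbed = np.asarray(sample, dtype=float).ravel().copy()
    chunk = int(np.ceil(order.size / steps))
    scores = [score_fn(perturbed)]
    for start in range(0, order.size, chunk):
        perturbed[order[start:start + chunk]] = baseline
        scores.append(score_fn(perturbed))
    return np.array(scores)


# Toy linear "model": the class score is a weighted sum of the features
weights = np.array([6.0, 5.0, 4.0, 3.0, 2.0, 1.0])


def score_fn(features):
    return float(weights @ features)


sample = np.ones(6)

# Importance scores that match the model vs. scores that rank features backwards
faithful = perturbation_curve(score_fn, sample, importance=weights * sample)
unfaithful = perturbation_curve(score_fn, sample, importance=-weights * sample)
```

The curve for the faithful importance scores drops faster than the curve for the reversed ranking. A fidelity score can summarize this difference, for instance by comparing the areas under such curves.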

A high compactness score is considered better, as a more compact set of importance scores is often easier for humans to interpret. Such an XAI map highlights the most relevant features in a concise manner.
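
One possible way to quantify how concentrated an XAI map is uses the Gini index of its absolute importance scores. This is an assumption for illustration only: the source does not specify which measure Compactness uses, and the function name gini_index is hypothetical.

```python
import numpy as np


def gini_index(xai_map):
    """Gini index of absolute importance scores.

    Returns 0 when importance is spread evenly and approaches 1 when it is
    concentrated in very few pixels.
    """
    values = np.sort(np.abs(np.asarray(xai_map, dtype=float)).ravel())
    n = values.size
    total = values.sum()
    if total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2.0 * (ranks * values).sum() / (n * total) - (n + 1) / n)


diffuse = np.ones((4, 4))      # importance spread evenly over all pixels
focused = np.zeros((4, 4))
focused[1, 2] = 1.0            # all importance on a single pixel

diffuse_score = gini_index(diffuse)
focused_score = gini_index(focused)
```

Averaging such per-sample scores over a dataset, for example with np.mean, gives the kind of overall score described above.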

Extract data features and detect outliers

Before applying a complex AI model for prediction, classification, and other tasks, it is important to inspect the input data. In particular, by profiling data quality and looking for anomalies, we can identify and mitigate potentially problematic samples that are out of distribution (OOD) compared to the reference data (e.g., the data used to train the model). New samples in production may be OOD because of data drift, concept drift, selection bias, or group attribution bias.

Obz AI streamlines this process and provides several machine learning outlier detection algorithms.

The quality of new data for inference is compared to a reference dataset, which you need to provide. A reference dataset could be training data, public data of the same modality, or private (historical) data. In these code snippets, it is assumed that you provide the reference data through a PyTorch DataLoader, e.g.:

from torch.utils.data import DataLoader

ref_loader = DataLoader(ref_set, batch_size=32, shuffle=False)
inf_loader = DataLoader(inf_set, batch_size=6, shuffle=True)

Let’s first look at a basic outlier detection algorithm based on First Order Features, which are modeled by a Gaussian Mixture Model (GMM).

Load and instantiate outlier detectors:

# Set up the outlier detector
from obzai.data_inspector.extractor import FirstOrderExtractor
from obzai.data_inspector.detector import GMMDetector

# Choose the desired feature extractor. The chosen extractor will be used for monitoring.
first_order_extrc = FirstOrderExtractor()
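
To give a sense of what first-order features are, the sketch below computes a few typical intensity statistics for a single image. Which features FirstOrderExtractor actually computes is not stated here, so the selection below (mean, standard deviation, minimum, maximum, skewness, excess kurtosis) and the function name first_order_features are assumptions for illustration.

```python
import numpy as np


def first_order_features(image):
    """Simple intensity statistics that ignore the spatial layout of pixels."""
    pixels = np.asarray(image, dtype=float).ravel()
    mean = pixels.mean()
    std = pixels.std()
    centered = pixels - mean
    if std > 0:
        skewness = (centered ** 3).mean() / std ** 3
        kurtosis = (centered ** 4).mean() / std ** 4 - 3.0  # excess kurtosis
    else:
        skewness = 0.0
        kurtosis = 0.0
    return np.array([mean, std, pixels.min(), pixels.max(), skewness, kurtosis])


image = np.array([[0.0, 1.0],
                  [0.0, 1.0]])
features = first_order_features(image)
```

Each image is reduced to a short feature vector, which is what makes fitting a density model such as a GMM on the reference data tractable.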

GMMDetector is an outlier detection method that utilizes a Gaussian Mixture Model (GMM). To configure and use the GMMDetector, consider the following arguments:

  • extractors - A sequence of Extractor objects that process your data. Currently, only the FirstOrderExtractor is accepted.
  • n_components - The number of Gaussian components for the mixture model. This controls the complexity of the model and how finely it can separate data clusters.
  • outlier_quantile - The quantile threshold that determines what is considered an outlier. Data points falling below this quantile are classified as outliers.
  • show_progress - If set to True, a progress bar is displayed during feature extraction.

# Pass the chosen extractor(s) to the chosen outlier detector.
gmm_detector = GMMDetector(extractors=[first_order_extrc], n_gaussian_components=3, outlier_quantile=0.01)
# Call the .fit() method with the reference dataloader.
# It extracts the desired image features and fits a GMM.
gmm_detector.fit(ref_loader)
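
The role of outlier_quantile can be illustrated with a small NumPy sketch. To keep it dependency-free, a single diagonal Gaussian stands in for the Gaussian mixture, so this is a simplification and not the GMMDetector implementation. The feature data is synthetic, and the helper names fit_gaussian and log_likelihood are hypothetical.

```python
import numpy as np


def fit_gaussian(features):
    """Fit a diagonal Gaussian (a stand-in for a GMM) to reference features."""
    mean = features.mean(axis=0)
    var = features.var(axis=0) + 1e-6  # small constant for numerical stability
    return mean, var


def log_likelihood(features, mean, var):
    """Log-density of each feature vector under the fitted Gaussian."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (features - mean) ** 2 / var).sum(axis=1)


rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(500, 4))  # synthetic reference features
mean, var = fit_gaussian(reference)

# Threshold: the 1% quantile of the reference log-likelihoods
ref_ll = log_likelihood(reference, mean, var)
threshold = np.quantile(ref_ll, 0.01)
reference_flagged = float((ref_ll < threshold).mean())

# New data: five typical samples and one sample far from the reference
new_features = np.vstack([rng.normal(0.0, 1.0, size=(5, 4)), np.full((1, 4), 8.0)])
is_outlier = log_likelihood(new_features, mean, var) < threshold
```

With a quantile of 0.01, about 1% of the reference samples themselves fall below the threshold, which is the trade-off to keep in mind when choosing this value.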

This outlier detector can be passed directly to the ObzClient for seamless connection and monitoring from the Obz AI dashboard.

Your GMM-based outlier detector is ready to use. Simply pass batches of images from the inference data to the .detect() method. This method returns a named tuple with:

  • img_features - extracted features for each image in the batch.
  • outliers - a boolean vector indicating whether each sample in the batch is an outlier.

# Example code to run inference on a single batch
image_batch, _ = next(iter(inf_loader))
detection_results = gmm_detector.detect(image_batch)

Obz AI also provides more advanced outlier detection using deep learning embeddings modeled by principal component analysis (PCA).
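
One common way to use PCA for outlier detection is to flag samples whose embeddings are poorly reconstructed from the leading principal components. The sketch below illustrates that general technique on synthetic embeddings. How Obz AI applies PCA is not described here, so the reconstruction-error criterion, the 0.99 quantile, and the helper names fit_pca and reconstruction_error are all assumptions.

```python
import numpy as np


def fit_pca(embeddings, n_components):
    """Return the mean and the leading principal directions of the embeddings."""
    mean = embeddings.mean(axis=0)
    _, _, vt = np.linalg.svd(embeddings - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(embeddings, mean, components):
    """Distance between each embedding and its projection onto the PCA subspace."""
    centered = embeddings - mean
    projected = centered @ components.T @ components
    return np.linalg.norm(centered - projected, axis=1)


rng = np.random.default_rng(0)
# Synthetic reference embeddings that vary mostly along the first two dimensions
reference = np.hstack([rng.normal(0.0, 1.0, size=(200, 2)),
                       0.01 * rng.normal(0.0, 1.0, size=(200, 3))])
mean, components = fit_pca(reference, n_components=2)

# Threshold: errors above the 99% quantile of the reference errors are flagged
threshold = np.quantile(reconstruction_error(reference, mean, components), 0.99)

new_embedding = np.array([[0.0, 0.0, 5.0, 5.0, 5.0]])  # lies off the reference subspace
new_error = reconstruction_error(new_embedding, mean, components)
is_outlier = new_error > threshold
```

In practice the embeddings would come from a pretrained network rather than a random generator, and the number of components would be chosen based on the variance they explain.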

Logging to the dashboard with ObzClient

Configure ObzClient

Now that you have a fitted OutlierDetector instance and XAITool instances, you are ready to wrap them in your ObzClient.
It takes care of logging the data to your database and visualizing it in your dashboard.

from obzai.client.obz_client import ObzClient

client = ObzClient(detector=gmm_detector,
                   xai_tools=[cdam_tool, attention_tool],
                   verbose=True
                   )

Authentication

You need to authenticate your client. To do so, pass your api_key to the client’s .login() method.
Your API key is available in your ObzAI dashboard after account creation.

client.login(api_key="your_api_key")

After you log in successfully, your credentials are cached locally in a .netrc file, so you no longer need to provide your API key to authenticate ObzClient.
Now let’s initialize a project with the .init_project() method. This method accepts the following arguments:

  • project_name - The project to which you would like to connect the ObzClient instance. If you provide a new project name, the project will be created.
  • ml_task - The type of your ML task, such as “binary_classification” or “multiclass_classification”. We need it to configure your project.
  • index2name - (Optional) A mapping from class index to class name. We recommend passing it so that class names are displayed correctly in your dashboard.

# Project initialization
client.init_project(project_name="EXAMPLE_PROJECT",
                    ml_task="binary_classification",
                    logit2name=LOGIT2NAME)

Logging

Finally, let’s log all results to your cloud project. client.run_and_log automatically runs the outlier detection and XAI algorithms specified earlier:

for idx, (image_batch, _) in enumerate(inf_loader):
    if idx > 50:
        break

    print(f"Batch: {idx}")

    # STEP 0: Move the batch provided by the loop to the target device:
    image_batch = image_batch.to(DEVICE)

    # STEP 1: Run inference and log, assuming a NORMALIZE function is available:
    res = client.run_and_log(model=MODEL, image_batch=image_batch, transform=NORMALIZE)