Quickstart
Get started with Obz AI
In this Quickstart, we learn how to:
- Install ObzAI Python package
- Explain a vision AI model
- Extract features and detect outliers
- Connect to the ObzAI SaaS platform
- Log data features and model explanations
Prerequisite: This quickstart assumes that you have a vision transformer (ViT) model and a data processing pipeline. If you would like to see an end-to-end example, please refer to our tutorials.
Installation
You will install the open source ObzAI library in a Python environment where your vision AI model is being used.
Install a stable version from PyPI with this command:
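For example, assuming the package is published on PyPI under the name `obzai`:

```bash
pip install obzai
```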
If you are using uv for a virtual environment, use:
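With uv, the equivalent would be (again assuming the `obzai` package name):

```bash
uv pip install obzai
```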
We are developing the open source package on GitHub. You may install a development version from GitHub:
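A sketch with a placeholder repository URL; substitute the actual Obz AI GitHub repository:

```bash
pip install git+https://github.com/<org>/obzai.git
```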
Explain a vision AI model
This Quickstart assumes that you have a working data processing and AI system. Therefore, these code snippets are meant to be added to your Python/PyTorch code. In particular, the ViT model has to be loaded and ready to make predictions.
For a complete introduction including small models and datasets, please see our examples using natural images or lung nodules.
Use XAI Tools
To import specific XAI tools from the `obzai` library, try:
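The module path below is an assumption; check the obzai API reference for the canonical import locations:

```python
# Assumed module layout -- adjust to the actual obzai package structure
from obzai.xai import AttentionMap, CDAM, SaliencyMap
```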
Attention maps are inherent to ViT models. `AttentionMap` makes it easy for you to obtain attention maps regardless of the ViT architecture. In particular, we support models with and without a `[CLS]` token. In general, `AttentionMap` expects a model that returns attention weights, like the models in the transformers package.
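As a minimal sketch, assuming a `transformers`-style ViT and a hypothetical `explain()` method on the tool (the constructor and method names are illustrative, not the confirmed API):

```python
import torch
from transformers import ViTForImageClassification

# A ViT from the transformers package can return attention weights
model = ViTForImageClassification.from_pretrained("google/vit-base-patch16-224")
model.eval()

batch = torch.randn(4, 3, 224, 224)  # stand-in for your preprocessed image batch

attention_tool = AttentionMap(model=model)      # hypothetical constructor signature
attention_maps = attention_tool.explain(batch)  # hypothetical method name
```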
CDAM is a gradient-based extension of the attention map that provides class-discriminative feature importances. Within `CDAM`, there are several optional arguments which allow you to apply related XAI methods.
Argument | Options (default: bold) | Note
---|---|---
`mode` | **"vanilla"**, "smooth", "integrated" |
`gradient_type` | **"from_logits"**, "from_probabilities" |
`gradient_reduction` | **"dot_product"**, "average", "sum" |
`activation_type` | **"softmax"**, "sigmoid" |
`noise_level` | **0.05** | Only applicable to "smooth"
`num_steps` | **50** | Only applicable to "integrated"
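A hypothetical instantiation using the argument names from the table above (the constructor pattern and the `explain()` call are assumptions):

```python
cdam_tool = CDAM(
    model=model,
    mode="smooth",                      # "vanilla", "smooth", or "integrated"
    gradient_type="from_logits",
    gradient_reduction="dot_product",
    activation_type="softmax",
    noise_level=0.05,                   # only used when mode="smooth"
)
cdam_maps = cdam_tool.explain(batch, target=0)  # hypothetical call; target = class index
```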
Saliency maps compute the gradient of the class score with respect to the input pixels.
SmoothGrad extends a saliency map by averaging multiple saliency maps derived from noise-added images.
The `SaliencyMap` class can be configured with the following arguments:
Argument | Options (default: bold) | Note
---|---|---
`mode` | **"vanilla"**, "smooth" |
`nt_type` | **"smoothgrad"**, "smoothgrad_sq", "vargrad" |
`nt_samples` | **10** | nt: noise type
Note that before visualizing the resulting XAI maps, it may be a good idea to normalize each of them.
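A sketch combining a `SaliencyMap` configuration (argument names from the table above) with a simple min-max normalization; the `explain()` call is an assumed API:

```python
import torch

saliency_tool = SaliencyMap(
    model=model,
    mode="smooth",
    nt_type="smoothgrad",
    nt_samples=10,
)
saliency_maps = saliency_tool.explain(batch)  # hypothetical call

# Min-max normalize each XAI map to [0, 1] before plotting
def normalize_map(xai_map: torch.Tensor) -> torch.Tensor:
    xai_map = xai_map - xai_map.min()
    return xai_map / (xai_map.max() + 1e-8)

normalized_maps = [normalize_map(m) for m in saliency_maps]
```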
Use Region-based XAI Tools
In some applications, you may need more cohesive, segmentation-like maps from XAI. This approach was introduced in XRAI: Better Attributions Through Regions. We provide an easy way to wrap any existing XAI tool so that it merges importance scores with pixel regions, yielding more visually appealing XAI maps.
Import `XAIRegions`:
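The module path is an assumption:

```python
from obzai.xai import XAIRegions  # assumed module path
```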
Now, we can use region-based XAI tools:
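For instance, wrapping the CDAM tool from above (the argument and method names are illustrative):

```python
region_tool = XAIRegions(xai_tool=cdam_tool)  # hypothetical constructor
region_maps = region_tool.explain(batch)      # region-merged importance maps
```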
Evaluate XAI Tools
There are several Explainable AI (XAI) methods available, each with its own advantages and limitations. Obz AI offers a set of evaluation tools to help assess the quality of XAI methods.
`fidelity_tool` measures how accurately a given XAI method reflects the model's true decision process. It does this by systematically perturbing input features based on their importance scores and observing the resulting change in model performance.
`compactness_tool` evaluates how sparse and concentrated the importance scores are. A more compact set of importance scores is often easier for humans to interpret, as it highlights the most relevant features in a concise manner.
By using these tools, you can better understand and compare the effectiveness and interpretability of different XAI approaches.
Once instantiated, run the evaluation on the XAI method:
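A sketch of how this might look; the class names, import path, and `evaluate()` signatures are assumptions, while the `fidelity_tool`/`compactness_tool` names follow the text above:

```python
from obzai.xai_eval import FidelityTool, CompactnessTool  # assumed import path

fidelity_tool = FidelityTool(model=model)     # hypothetical constructors
compactness_tool = CompactnessTool()

fidelity_scores = fidelity_tool.evaluate(batch, cdam_maps)   # hypothetical call
compactness_scores = compactness_tool.evaluate(cdam_maps)    # hypothetical call

# Average per-sample scores to obtain overall scores for this model/dataset
overall_fidelity = fidelity_scores.mean()
overall_compactness = compactness_scores.mean()
```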
In general, we would average the fidelity and compactness scores over all samples to obtain overall scores. An overall score helps us understand whether the XAI method applied to this particular model and dataset is accurate or compact.
High fidelity scores indicate more accurate XAI methods according to perturbation-based accuracy curves. See Brocki and Chung (2023) for further details.
A high compactness score is considered better, as a more compact set of importance scores is often easier for humans to interpret. Such an XAI map highlights the most relevant features in a concise manner.
Extract data features and detect outliers
Before applying a complex AI model for prediction, classification, and other tasks, it is important to inspect the input data. In particular, by profiling the data quality and looking for anomalies, we can identify and mitigate potentially problematic samples that are out of distribution (OOD) compared to the reference data (e.g., the data used to train the model). New samples in production that are OOD may be caused by data drift, concept drift, selection bias, and group attribution bias.
Obz AI streamlines this process and provides several machine learning outlier detection algorithms.
The quality of new inference data is compared to a reference dataset, which you need to provide. A reference dataset could be training data, public data of the same modality, or private (historical) data.
In these code snippets, it is assumed that you provide the reference data via a PyTorch `DataLoader`, e.g.:
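A minimal sketch with random tensors standing in for your reference images:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Replace with your actual reference images, e.g. (a subset of) the training set
reference_images = torch.randn(256, 3, 224, 224)
reference_loader = DataLoader(TensorDataset(reference_images), batch_size=32)
```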
Let's first look at a basic outlier detection algorithm based on First Order Features, which are modeled by Gaussian Mixture Models (GMMs).
Load and instantiate outlier detectors:
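A sketch using the argument names described below; the import path and the `.fit()` call are assumptions:

```python
from obzai.data_inspection import FirstOrderExtractor, GMMDetector  # assumed path

first_order_extractor = FirstOrderExtractor()
gmm_detector = GMMDetector(
    extractors=[first_order_extractor],
    n_components=3,
    outlier_quantile=0.01,
    show_progress=True,
)

gmm_detector.fit(reference_loader)  # hypothetical fitting step on the reference data
```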
`GMMDetector` is an outlier detection method that utilizes a Gaussian Mixture Model (GMM). To configure and use the `GMMDetector`, consider the following arguments:
- `extractors` - Sequence of Extractor objects which process your data. Currently, only the `FirstOrderExtractor` is accepted.
- `n_components` - Number of Gaussian components for the mixture model. This controls the complexity of the model and how finely it can separate data clusters.
- `outlier_quantile` - Quantile threshold that determines what is considered an outlier. Data points falling below this quantile are classified as outliers.
- `show_progress` - If set to `True`, a progress bar will be displayed during feature extraction to visualize operation progress.
This outlier detector can be directly supplied to the `ObzClient` for seamless connection and monitoring from the Obz AI dashboard.
Your GMM-based outlier detector is ready to use. Simply pass batches of inference images into the `.detect()` method. This method returns a named tuple with:
- `img_features` - extracted features for each image in the batch.
- `outliers` - boolean vector indicating if samples in the batch are outliers.
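For example (the `.detect()` method and the returned fields follow the description above; the inference batch here is a stand-in):

```python
inference_batch = torch.randn(8, 3, 224, 224)  # replace with data from your pipeline

result = gmm_detector.detect(inference_batch)
features = result.img_features  # extracted features for each image
is_outlier = result.outliers    # boolean vector of outlier flags
```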
Obz AI also provides advanced outlier detection using deep learning embeddings, modeled by principal component analysis (PCA).
Logging to the dashboard with ObzClient
Configure ObzClient
Now that you have your fitted `OutlierDetector` instance and `XAITool` instances, you are ready to wrap them into your `ObzClient`! It will take care of properly logging data to your database and visualizing it in your dashboard.
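A sketch of the wiring; the constructor arguments below are assumptions, so consult the Obz AI documentation for the exact signature:

```python
from obzai.client import ObzClient  # assumed import path

client = ObzClient(
    model=model,                             # hypothetical argument names
    detector=gmm_detector,
    xai_tools=[attention_tool, cdam_tool],
)
```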
Authentication
You need to authenticate your client. To do so, provide your `api_key` to the client's `.login()` method. Your API key is available in your Obz AI dashboard after account creation.
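For example (the key string below is a placeholder):

```python
client.login(api_key="YOUR_API_KEY")  # paste the key from your Obz AI dashboard
```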
After you log in successfully, your credentials are cached locally in the `.netrc` file, so you no longer need to provide your API key to authenticate `ObzClient`!
Now let's initialize a project with the `.init_project()` method. This method accepts the following arguments:
- `project_name` - The project to which you would like to connect the ObzClient instance. If you provide a brand-new project name, it will be created.
- `ml_task` - The type of your ML task, such as "binary_classification" or "multiclass_classification". We need it to configure your project accordingly.
- `index2name` - (Optional) We also recommend passing an index-to-name mapping so that class names are visualized correctly in your dashboard.
Logging
Finally, let's log all results into your cloud project. `client.run_and_log` automatically runs the outlier detection and XAI algorithms specified earlier:
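A sketch of the final call; the exact signature is an assumption:

```python
# Runs the configured outlier detection and XAI methods on the batch
# and logs the results to your Obz AI project (assumed signature).
client.run_and_log(inference_batch)
```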