
Explainable AI Assignment 1: Projection Space Exploration

General Assignment Information

Dimensionality Reduction and Scatterplots

In this assignment, you are challenged to visually analyze patterns emerging from projections of a dataset. For bonus points, you can optionally project and compare solutions of a problem, game, algorithm, model, or anything else that can be represented by sequential states/trajectories. For this, you will:

Tool for selecting color maps by variable type
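As a starting point for the projection step, here is a minimal sketch that reduces a dataset to two dimensions with PCA, implemented directly with NumPy's SVD. PCA is only one possible choice (nonlinear methods such as t-SNE or UMAP are common alternatives), and the function name `pca_project` and the synthetic two-blob data are assumptions made for illustration, not part of the assignment template.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
# Two synthetic 10-dimensional blobs standing in for a real dataset.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 10)),
    rng.normal(5.0, 1.0, size=(50, 10)),
])
labels = np.repeat([0, 1], 50)

embedding = pca_project(X)
print(embedding.shape)
```

The two columns of `embedding` can then be drawn as a scatterplot, for example with matplotlib's `plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, cmap="tab10")`, using a qualitative color map for categorical variables and a sequential one for continuous variables.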

Analysis of Clusters, Outliers, and Other Patterns

Depending on your dataset, the projection scatterplot will reveal patterns like clusters, outliers, and edge bundles. Because these scatterplot patterns alone are typically insufficient for making sense of the high-dimensional information, you will experiment with ways to visually analyze these patterns:


For example, if you choose a tabular dataset and want to analyze which features drive a cluster, you might want to use visualizations like bar charts, histograms, KDE plots, boxplots, or violin plots to visualize its categorical and continuous variables appropriately.

Visual Vocabulary Cheatsheet

There may be too many features in your dataset to analyze plots of every single feature. In this case, you might first determine which features are the most important to visualize, for instance by using a statistical measure to rank how much each feature's distribution within a cluster differs from the overall data distribution, or from that of another cluster. You can then visualize the top k features ranked by this measure.
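One way to sketch such a ranking for continuous features is the two-sample Kolmogorov-Smirnov statistic, which measures the largest gap between two empirical distribution functions. The choice of this statistic, the helper names `ks_statistic` and `rank_features`, and the synthetic data are all assumptions for illustration; other measures may suit your data better, especially for categorical features.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def rank_features(X, cluster_mask, feature_names, k=3):
    """Return the k features whose cluster distribution differs most
    from the distribution over the whole dataset."""
    scores = [
        ks_statistic(X[cluster_mask, j], X[:, j]) for j in range(X.shape[1])
    ]
    order = np.argsort(scores)[::-1][:k]
    return [(feature_names[j], scores[j]) for j in order]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
cluster_mask = np.zeros(200, dtype=bool)
cluster_mask[:40] = True
# Shift one feature inside the hypothetical cluster so it stands out.
X[cluster_mask, 2] += 3.0

top = rank_features(X, cluster_mask, ["f0", "f1", "f2", "f3"], k=2)
print(top)
```

Only the top-ranked features would then need a histogram, KDE plot, or boxplot comparing the cluster with the rest of the data.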

If you are using an image dataset, you might want to think about whether it makes sense to average the images in your clusters or whether a more complex strategy is necessary.
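The simplest variant, pixel-wise averaging per cluster, might look like the sketch below. The function name `cluster_mean_images` and the synthetic 8x8 images are assumptions for illustration. Averaging tends to work when images are aligned (same object position and scale) and can produce an uninformative blur otherwise.

```python
import numpy as np


def cluster_mean_images(images, cluster_ids):
    """Return a dict mapping each cluster id to its pixel-wise mean image."""
    images = np.asarray(images, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    return {
        int(c): images[cluster_ids == c].mean(axis=0)
        for c in np.unique(cluster_ids)
    }


rng = np.random.default_rng(0)
# 20 noisy synthetic 8x8 grayscale images in two hypothetical clusters.
images = rng.normal(0.0, 0.1, size=(20, 8, 8))
cluster_ids = np.repeat([0, 1], 10)
images[cluster_ids == 0, :4, :] += 1.0  # cluster 0: bright top half
images[cluster_ids == 1, 4:, :] += 1.0  # cluster 1: bright bottom half

means = cluster_mean_images(images, cluster_ids)
print(means[0].shape)
```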

For datasets with an inherent 2-D spatial layout, such as a chessboard where each square can take on different states (e.g., occupied by a particular piece), consider how to visualize which regions consistently have the same state and which vary across observations.
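A possible way to quantify this is to compute, for each square, the most frequent state and the fraction of observations that share it. The sketch below assumes that states are encoded as small integers and that the boards of one cluster are stacked into a single array; the function name `square_consistency` and the random 4x4 boards are hypothetical.

```python
import numpy as np


def square_consistency(boards, n_states):
    """boards: integer array of shape (n_observations, rows, cols).

    Returns the most frequent state per square and the fraction of
    observations in which the square is in that state.
    """
    boards = np.asarray(boards)
    # counts[s, r, c] = number of observations with square (r, c) in state s
    counts = np.stack([(boards == s).sum(axis=0) for s in range(n_states)])
    mode_state = counts.argmax(axis=0)
    agreement = counts.max(axis=0) / boards.shape[0]
    return mode_state, agreement


rng = np.random.default_rng(0)
boards = rng.integers(0, 3, size=(50, 4, 4))
boards[:, 0, :] = 1  # the first row is always in state 1

mode_state, agreement = square_consistency(boards, n_states=3)
print(agreement.round(2))
```

The `agreement` array has the same layout as the board, so it can be shown as a heatmap in which values near 1 mark squares that are stable across observations.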

Experiment and choose the types of visualizations that fit your dataset best to explain the most interesting structures in your projections.


Example solutions showing how to perform projections and create scatterplots are provided in the solution_rubik.ipynb and solution_2048.ipynb notebooks.

One place to look for interesting datasets is Kaggle.

Further examples of trajectory datasets to analyze are (board) games and approximation algorithms. The 2048 notebook uses OpenAI Gym to create a game environment and produce state data. There are various first- and third-party environments for Gym that can be used.
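To compare several trajectories in one plot, all of their states need to be projected into the same space. The following sketch is not taken from the solution notebooks: it assumes each trajectory is an array of states, stacks all states, applies PCA via NumPy's SVD, and splits the result back per trajectory. The function name `project_trajectories` and the random-walk data are hypothetical stand-ins for recorded game states.

```python
import numpy as np


def project_trajectories(trajectories, n_components=2):
    """Project the states of several trajectories into one shared PCA space."""
    lengths = [len(t) for t in trajectories]
    # Flatten each state (e.g. a 4x4 board) into a vector and stack them all.
    states = np.vstack([
        np.asarray(t, dtype=float).reshape(len(t), -1) for t in trajectories
    ])
    centered = states - states.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ Vt[:n_components].T
    # Split the stacked coordinates back into one array per trajectory.
    return np.split(coords, np.cumsum(lengths)[:-1])


rng = np.random.default_rng(0)
# Three synthetic random walks over 4x4 "board" states of different lengths.
trajectories = [
    np.cumsum(rng.normal(size=(n, 4, 4)), axis=0) for n in (20, 30, 25)
]

projected = project_trajectories(trajectories)
print([p.shape for p in projected])
```

Each returned array keeps the order of its states, so it can be drawn as a connected line, for example with matplotlib's `plt.plot(p[:, 0], p[:, 1])` for every `p` in `projected`.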

Intermediate Submission

Team Information

Fill out the team-info.json file in this repository and push it to GitHub by 23:59 on 5 November, 2025. (Check Moodle for any updated deadlines.) Make sure that the file contains your team name, each student’s first and last name, student ID, email address, and GitHub username.

Dataset

Please add your dataset to the repository (or provide a link if it is too large) and answer the following questions about it:

Final Submission

Workload Distribution

Please indicate the workload of each student in the workload.json file.

Make sure that the file includes your correct team name, that the names in the workload file match those in team-info.json, and that the workloads sum up to 1.
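If you want to check these two conditions before submitting, a small helper along the following lines may be useful. The structure of the JSON files is not described here, so the sketch assumes you have already extracted a name-to-fraction mapping from workload.json and a list of names from team-info.json; the function name `check_workloads` and the example names are hypothetical.

```python
import math


def check_workloads(workloads, team_names, tol=1e-6):
    """Return a list of problems found; an empty list means the checks pass.

    workloads:  dict mapping student name -> workload fraction (assumed layout)
    team_names: names taken from the team info file (assumed layout)
    """
    problems = []
    mismatched = set(workloads) ^ set(team_names)
    if mismatched:
        problems.append("names do not match: " + ", ".join(sorted(mismatched)))
    total = sum(workloads.values())
    if not math.isclose(total, 1.0, abs_tol=tol):
        problems.append(f"workloads sum to {total}, expected 1")
    return problems


ok = check_workloads(
    {"Ada Lovelace": 0.5, "Alan Turing": 0.5},
    ["Ada Lovelace", "Alan Turing"],
)
bad = check_workloads(
    {"Ada Lovelace": 0.5, "A. Turing": 0.4},
    ["Ada Lovelace", "Alan Turing"],
)
print(ok)
print(bad)
```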

Development Environment

Check out this repo and change into the folder:

git clone https://github.com/jku-icg-classroom/xai_proj_space_2025-<GROUP_NAME>.git
cd xai_proj_space_2025-<GROUP_NAME>

Load the conda environment from the shared environment.yml file:

conda env create -f environment.yml
conda activate xai_proj_space

Hint: For more information on Anaconda and environments, take a look at the README in our tutorial repository.

Then launch JupyterLab:

jupyter lab

Go to http://localhost:8888/ and open the template notebook.

Alternatively, you can work with Binder, Deepnote, Colab, or any other service, as long as the notebook runs in the standard Jupyter environment.