Feature Banks

Each dataset-level H5 bank stores, per domain, the labels and the features of every deterministic view.

A single bank can support many source-target tasks. Training chooses source, entropy, consistency, and evaluation views by key.
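As a rough illustration of why one bank covers many tasks: every ordered pair of distinct domains is a separate source-target task. The sketch below assumes the three standard Office31 domains (amazon, dslr, webcam); the exact domain keys stored in a given bank may differ.

```python
from itertools import permutations

# Standard Office31 domain names; the keys used inside a bank are an assumption.
domains = ["amazon", "dslr", "webcam"]

# Each ordered (source, target) pair with source != target is one task.
tasks = list(permutations(domains, 2))

for source, target in tasks:
    print(f"{source} -> {target}")
```

With three domains this yields six tasks, all served by the same H5 file.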

Download released banks

Released H5 banks are hosted in the Hugging Face dataset repo baogege1995/FPS_H5. The downloader tries $HF_ENDPOINT, Hugging Face, and then https://hf-mirror.com.

PYTHONPATH=src python scripts/download_feature_banks.py all

PYTHONPATH=src python scripts/download_feature_banks.py \
  office31_resnet office_home_vit

PYTHONPATH=src python scripts/download_feature_banks.py all \
  --endpoint hf-mirror
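The endpoint fallback order described above can be sketched as follows. This is not the code in scripts/download_feature_banks.py: the function name candidate_endpoints is hypothetical, and skipping an unset $HF_ENDPOINT and removing duplicates are assumptions about reasonable behavior.

```python
import os

HF_DEFAULT = "https://huggingface.co"
HF_MIRROR = "https://hf-mirror.com"


def candidate_endpoints(env=None):
    """Return the endpoints to try, in order, without duplicates.

    Hypothetical helper: the real downloader may order or filter differently.
    """
    env = os.environ if env is None else env
    ordered = [env.get("HF_ENDPOINT"), HF_DEFAULT, HF_MIRROR]

    seen, result = set(), []
    for url in ordered:
        if not url:  # assumption: an unset or empty $HF_ENDPOINT is skipped
            continue
        url = url.rstrip("/")
        if url not in seen:  # assumption: never try the same endpoint twice
            seen.add(url)
            result.append(url)
    return result


print(candidate_endpoints({}))
```

Under these assumptions, setting HF_ENDPOINT to the mirror URL moves the mirror to the front of the list rather than trying it twice.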

Schema

H5 layout and view roles

/domains/{domain}/label
/domains/{domain}/views/{view_key}/feature

| Role | Meaning | Typical view |
|---|---|---|
| src | Source supervised features and labels. | *_clean |
| entropy | Target view for LSE/LCE entropy losses. | *_clean |
| cr.view1, cr.view2 | Paired target views for LCR consistency. | *_pool_a, *_pool_b |
| eval | Target view for metrics and predictions. | *_clean |
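The layout above maps directly onto H5 dataset paths. A minimal sketch of reading one view with h5py, assuming the datasets are named exactly as in the schema; the helper names label_path, feature_path, and load_view are hypothetical and not part of fps_uda.

```python
def label_path(domain):
    """H5 path of the label dataset for one domain."""
    return f"/domains/{domain}/label"


def feature_path(domain, view_key):
    """H5 path of the feature dataset for one domain and view."""
    return f"/domains/{domain}/views/{view_key}/feature"


def load_view(bank_path, domain, view_key):
    """Read (features, labels) for one domain and view from an H5 bank."""
    # h5py is third-party; imported here so the path helpers work without it.
    import h5py

    # Open read-only so the bank is never modified.
    with h5py.File(bank_path, "r") as bank:
        features = bank[feature_path(domain, view_key)][...]
        labels = bank[label_path(domain)][...]
    return features, labels


print(feature_path("webcam", "example_clean"))
```

Here example_clean is a placeholder for a real view key matching the *_clean pattern. A task would call load_view once per role, for instance with a *_clean key for src and eval and the *_pool_a and *_pool_b keys for the two consistency views.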

Workflow

Extract and analyze banks

Extract a bank

PYTHONPATH=src python -m fps_uda.cli extract-feature-bank \
  --dataset-config configs/datasets/office31_vit.yaml \
  --out fps_h5cache/banks/office31_vit.h5 \
  --device cuda:0

The dataset YAML declares the domains, manifests, transforms, backbone, pooling strategy, and deterministic views.

Analyze views

PYTHONPATH=src python -m fps_uda.cli analyze-feature-bank \
  --feature-bank fps_h5cache/banks/office31_vit.h5 \
  --source-domain amazon \
  --target-domain webcam \
  --out runs/analysis/office31_vit

The analysis uses labels for debugging and view selection. It writes CSV/JSON/YAML summaries and does not modify the H5 bank.

Image datasets

Download datasets and generate manifests

The helper can download Office31, OfficeHome, and VisDA17, or generate manifests for existing local copies with --skip-download.

PYTHONPATH=src python scripts/download_datasets.py office31 --root data
PYTHONPATH=src python scripts/download_datasets.py office_home --root data
PYTHONPATH=src python scripts/download_datasets.py visda17 --root data

PYTHONPATH=src python scripts/download_datasets.py all --root data --skip-download

Large vision models

SigLIP 2 through Hugging Face AutoModel

Presets such as configs/datasets/office_home_siglip2.yaml load google/siglip2-so400m-patch14-384 through Hugging Face AutoModel.

backbone:
  backend: hf_auto_vision
  name: google/siglip2-so400m-patch14-384
  in_features: 1152
  pooling:
    feature_type: token
    random_strategy: token_channel_squared