Environment Setup Guide¶
Prerequisites¶
- Python 3.10+ (tested with Python 3.13.7)
- Git
- 16GB+ RAM recommended
- (Optional) NVIDIA GPU with CUDA support for training
Installation Instructions¶
Windows¶
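1. Clone the repository.
2. Create a virtual environment:
   python -m venv venv
3. Activate the virtual environment:
   venv\Scripts\activate
4. Upgrade pip:
   python -m pip install --upgrade pip
5. Install the project dependencies with pip.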
Linux / macOS¶
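1. Clone the repository.
2. Create a virtual environment:
   python3 -m venv venv
3. Activate the virtual environment:
   source venv/bin/activate
4. Upgrade pip:
   python -m pip install --upgrade pip
5. Install the project dependencies with pip.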
GPU Setup (CUDA)¶
NVIDIA GPU Requirements¶
- CUDA 12.1+ compatible GPU
- NVIDIA Driver 530.30.02+
- CUDA Toolkit 12.1+
- cuDNN 8.x
Installing PyTorch with CUDA Support¶
Replace the CPU version with GPU-enabled PyTorch:
# Uninstall CPU version
pip uninstall torch torchvision
# Install CUDA 12.1 version
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
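For CUDA 11.8, use the cu118 index instead:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118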
Verify GPU Installation¶
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else \"N/A\"}')"
Expected output (with GPU; the device name depends on your hardware):
CUDA available: True
GPU: <your GPU name>
Verification Commands¶
Check Python Version¶
python --version
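The version check can also be done from inside Python, which is convenient at the top of a setup or training script. This is a minimal sketch that assumes the 3.10 minimum stated under Prerequisites; the helper name `python_is_supported` is hypothetical and not part of the project.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated under Prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the documented minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old, need 3.10+"
    print(f"Python {found}: {status}")
```

Comparing tuples avoids the string-comparison trap where "3.9" sorts after "3.10".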
Verify Core Libraries¶
# PyTorch
python -c "import torch; print(f'PyTorch {torch.__version__}')"
# timm (model architectures)
python -c "import timm; print(f'timm {timm.__version__}')"
# Fairness libraries
python -c "import fairlearn; print(f'Fairlearn {fairlearn.__version__}')"
python -c "import aif360; print('AIF360 installed')"
# Data science stack
python -c "import numpy, pandas, sklearn; print('Data science stack OK')"
# Computer vision
python -c "import cv2, albumentations; print('CV libraries OK')"
# Experiment tracking
python -c "import tensorboard, wandb; print('Tracking tools OK')"
Run All Verifications¶
python -c "
import sys
import torch
import timm
import fairlearn
import aif360
import numpy
import pandas
import sklearn
import cv2
import albumentations
import tensorboard
import wandb
print('=== Environment Verification ===')
print(f'Python: {sys.version}')
print(f'PyTorch: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
print(f'timm: {timm.__version__}')
print(f'Fairlearn: {fairlearn.__version__}')
print(f'NumPy: {numpy.__version__}')
print(f'Pandas: {pandas.__version__}')
print(f'scikit-learn: {sklearn.__version__}')
print('All critical dependencies installed successfully!')
"
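One limitation of a single combined import script is that it stops at the first missing package. The sketch below is one possible alternative that tries each import separately and reports every missing module in one pass. The module list is taken from the verification commands in this guide; the function names `check_modules` and `format_report` are hypothetical.

```python
import importlib

# Import names taken from the verification commands in this guide.
REQUIRED_MODULES = [
    "torch", "timm", "fairlearn", "aif360", "numpy", "pandas",
    "sklearn", "cv2", "albumentations", "tensorboard", "wandb",
]


def check_modules(names):
    """Map each module name to its version string, or None if the import fails."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
            continue
        # Not every package exposes __version__, so fall back to a marker.
        report[name] = getattr(module, "__version__", "unknown")
    return report


def format_report(report):
    """Render one line per module, flagging the ones that could not be imported."""
    lines = []
    for name, version in report.items():
        lines.append(f"{name}: {version if version is not None else 'MISSING'}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(check_modules(REQUIRED_MODULES)))
```

Any line marked MISSING points at a package to reinstall inside the active virtual environment.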
Troubleshooting¶
Issue: ModuleNotFoundError after installation¶
Solution: Ensure the virtual environment is activated:
# Check which Python is being used
which python # Linux/macOS
where python # Windows (use where.exe python in PowerShell)
# The path should point into the venv directory
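The interpreter itself can also report whether it is running inside a virtual environment. This is a small sketch, assuming the environment was created with the standard `venv` module; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print(f"Interpreter: {sys.executable}")
    print(f"Virtual environment active: {in_virtualenv()}")
```

If this prints False, packages were probably installed into a different interpreter than the one running the code.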
Issue: CUDA not available despite GPU present¶
Solution:
1. Verify NVIDIA driver: nvidia-smi
2. Check CUDA toolkit: nvcc --version
3. Reinstall PyTorch with correct CUDA version (see GPU Setup above)
4. Verify with: python -c "import torch; print(torch.cuda.is_available())"
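The checks above can be complemented from the Python side. A common cause of this issue is a CPU-only PyTorch build, which shows up as `torch.version.cuda` being `None`. The sketch below gathers the relevant facts in one place. The function name `diagnose_cuda` is hypothetical, and the script only reports values without attempting to fix anything.

```python
import importlib.util


def diagnose_cuda():
    """Collect the facts that usually explain a False from is_available()."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None here means a CPU-only build of PyTorch is installed.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }
    if info["cuda_available"]:
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


if __name__ == "__main__":
    for key, value in diagnose_cuda().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is None, reinstalling PyTorch from a CUDA wheel index is the likely fix. If it shows a version but `cuda_available` is False, the driver is the more likely culprit.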
Issue: Out of memory during training¶
Solutions:
- Reduce batch size in configuration files
- Enable gradient checkpointing (already configured in model configs)
- Use mixed precision training (FP16)
- Consider using a smaller model architecture
Issue: Slow data loading¶
Solutions:
- Increase num_workers in DataLoader (default: 4)
- Use SSD for dataset storage
- Pre-process and cache augmentations
Issue: ImportError: DLL load failed (Windows)¶
Solution:
1. Install Visual C++ Redistributable: https://aka.ms/vs/17/release/vc_redist.x64.exe
2. Reinstall PyTorch
3. Restart terminal
Issue: Permission denied when creating directories¶
Solution:
# Linux/macOS: take ownership of the project directory
sudo chown -R "$USER" .
# Windows: Run terminal as Administrator
Issue: Pre-commit hooks failing¶
Solution:
# Reinstall pre-commit hooks
pre-commit clean
pre-commit install
# Run manually to test
pre-commit run --all-files
Environment Variables (Optional)¶
Create a .env file in the project root:
# Weights & Biases (optional)
WANDB_API_KEY=your_api_key_here
WANDB_PROJECT=skin-cancer-classification
# Data directories
DATA_ROOT=./data
EXPERIMENTS_ROOT=./experiments
# Training configuration
CUDA_VISIBLE_DEVICES=0 # GPU device ID
OMP_NUM_THREADS=4 # OpenMP threads for CPU operations
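A .env file is not read automatically by Python; something has to load it into the process environment. The project may already rely on a library for this, so the following is only a standard-library sketch under stated assumptions: values are unquoted, and a space followed by `#` starts an inline comment, as in the sample above. The function names `parse_env_file` and `load_env_file` are hypothetical.

```python
import os


def parse_env_file(text):
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: " #" begins an inline comment, as in the sample file.
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def load_env_file(path=".env", override=False):
    """Copy values into os.environ; existing variables win unless override=True."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_file(handle.read())
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values
```

Call `load_env_file()` before importing `torch`, since `CUDA_VISIBLE_DEVICES` only takes effect if it is set before CUDA is initialized.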
Development Tools¶
Jupyter Notebook (Optional)¶
VS Code Extensions (Recommended)¶
- Python (Microsoft)
- Pylance
- Black Formatter
- Jupyter
- GitLens
PyCharm Configuration¶
- File > Settings > Project > Python Interpreter
- Add Interpreter > Existing Environment
- Select venv/bin/python (Linux/macOS) or venv\Scripts\python.exe (Windows)
Next Steps¶
After successful environment setup:
- Review project structure: README.md
- Set up data directories: docs/data_setup.md
- Configure experiments: docs/experiment_tracking.md
- Run baseline experiments: experiments/baseline/README.md
System Requirements¶
Minimum¶
- CPU: 4 cores
- RAM: 16GB
- Storage: 50GB
- Python: 3.10+
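Part of this list can be checked programmatically. The sketch below covers CPU cores and disk space using only the standard library. RAM is left out because the standard library has no portable way to query it. Treating the 50GB figure as free space in the current directory is an assumption, and the names `meets_minimum`, `MIN_CPU_CORES`, and `MIN_FREE_GB` are hypothetical.

```python
import os
import shutil

MIN_CPU_CORES = 4  # from the Minimum list
MIN_FREE_GB = 50   # assumption: interpreted as free space, not total capacity


def meets_minimum(cpu_cores, free_gb):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: {cpu_cores} < {MIN_CPU_CORES}")
    if free_gb < MIN_FREE_GB:
        problems.append(f"Free disk: {free_gb:.0f} GB < {MIN_FREE_GB} GB")
    return problems


if __name__ == "__main__":
    cores = os.cpu_count() or 0  # cpu_count() may return None
    free_gb = shutil.disk_usage(".").free / 1e9
    issues = meets_minimum(cores, free_gb)
    print("Minimum requirements met" if not issues else "\n".join(issues))
```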
Recommended (for training)¶
- CPU: 8+ cores
- RAM: 32GB+
- GPU: NVIDIA RTX 3090 or better (24GB VRAM)
- Storage: 100GB+ SSD
- Python: 3.10-3.12
Cloud Options¶
- Google Colab (Free GPU tier available)
- Kaggle Notebooks (Free GPU: 30hrs/week)
- AWS SageMaker
- Azure ML
- Lambda Labs
Support¶
For issues not covered here:
1. Check GitHub Issues
2. Review PyTorch documentation: https://pytorch.org/docs/
3. Consult timm documentation: https://huggingface.co/docs/timm
4. Fairlearn docs: https://fairlearn.org/
Last Updated: 2025-10-13
Python Version Tested: 3.13.7
PyTorch Version: 2.8.0