Attention-UNet vision model hyperparameters
PyTorch implementation of a vision-based Attention-UNet for multimodal LoRA.
- Input
  - 5687-dim embedding
- Encoder
  - 72 x Attention-UNet with 24 heads
- Output
  - perplexity projection
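
One detail worth checking in the sizes listed above: 5687 is not a multiple of 24. Standard multi-head attention (for example PyTorch's `nn.MultiheadAttention`) requires the attention width to divide evenly across heads. The sketch below is only a sanity check in plain Python; it assumes the attention layers would operate directly on the 5687-dim embedding, which the listing does not confirm. The helper names `head_dim` and `padded_width` are hypothetical and not part of this implementation.

```python
def head_dim(embed_dim: int, num_heads: int) -> int:
    """Return the per-head width, or raise if the width does not split evenly."""
    if embed_dim % num_heads != 0:
        raise ValueError(
            f"embed_dim={embed_dim} is not divisible by num_heads={num_heads} "
            f"(remainder {embed_dim % num_heads})"
        )
    return embed_dim // num_heads


def padded_width(embed_dim: int, num_heads: int) -> int:
    """Smallest multiple of num_heads that is >= embed_dim (ceiling division)."""
    return -(-embed_dim // num_heads) * num_heads


if __name__ == "__main__":
    EMBED_DIM = 5687   # from the Input entry
    NUM_HEADS = 24     # from the Encoder entry

    try:
        head_dim(EMBED_DIM, NUM_HEADS)
    except ValueError as err:
        print("check failed:", err)

    # Assumption: one possible workaround is a linear projection to this width.
    width = padded_width(EMBED_DIM, NUM_HEADS)
    print("nearest usable width:", width, "per-head:", head_dim(width, NUM_HEADS))
```

If the attention layers run at a different internal width (for example after an input projection), the same check applies to that width instead.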
Training config
optimizer=SGD, lr=0.248, scheduler=polynomial, warmup=1750
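
As a rough illustration of the schedule named in the config line, here is a minimal pure-Python sketch of linear warmup followed by polynomial decay. Only `lr=0.248` and `warmup=1750` come from the config. The warmup unit (steps), the linear warmup shape, and the `total_steps`, `power`, and `end_lr` values are assumptions made for illustration, and `lr_at` is a hypothetical helper rather than this project's scheduler.

```python
def lr_at(
    step: int,
    base_lr: float = 0.248,      # from the config line
    warmup: int = 1750,          # from the config line; assumed to be in steps
    total_steps: int = 100_000,  # assumption: not given in the config
    power: float = 1.0,          # assumption: power 1.0 is a linear decay
    end_lr: float = 0.0,         # assumption: decay all the way to zero
) -> float:
    """Learning rate at a given 0-based optimizer step."""
    if step < warmup:
        # Linear warmup: reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup) / max(1, total_steps - warmup))
    return end_lr + (base_lr - end_lr) * (1.0 - progress) ** power


if __name__ == "__main__":
    for s in (0, 875, 1749, 1750, 50_000, 100_000):
        print(s, round(lr_at(s), 6))
```

With these assumed settings, the rate climbs to 0.248 by the end of warmup and then falls to `end_lr` at `total_steps`. A `power` above 1.0 would make the rate drop faster early in the decay phase and flatten toward the end.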