
@ivanfioravanti

MPS support on Apple Silicon devices. This should solve #127.
Many changes had to be applied to make this work properly.

…ilicon

This commit adds full support for training AI models on Apple Silicon Macs using MPS,
including fixes for multiprocessing tensor sharing issues and UI compatibility.

Major changes:
- Created toolkit/device_utils.py with comprehensive MPS device management
- Fixed PyTorch multiprocessing issues on MPS by forcing num_workers=0
- Added Apple Silicon GPU detection in UI API
- Fixed UI client-side errors with type checking and error handling
- Added Apple Silicon-optimized training configs for Flux and Z-Image
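As a rough illustration of the device-specific dataloader handling listed above, here is a minimal sketch. The function names `pick_device` and `dataloader_kwargs` are hypothetical and are not taken from `toolkit/device_utils.py`; the only behavior drawn from this PR is that MPS forces `num_workers=0`. The availability flags are passed in as plain booleans so the sketch runs without PyTorch installed.

```python
from typing import Any, Dict


def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick a device type string, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def dataloader_kwargs(device_type: str, requested_workers: int = 4) -> Dict[str, Any]:
    """Return DataLoader keyword arguments adjusted for the device type.

    On MPS the loader is kept single-process, mirroring the PR's
    num_workers=0 workaround for the tensor sharing error.
    """
    if device_type == "mps":
        # persistent_workers is only valid when num_workers > 0.
        return {"num_workers": 0, "persistent_workers": False}
    return {
        "num_workers": requested_workers,
        "persistent_workers": requested_workers > 0,
    }
```

In real code the flags would presumably come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, and the returned dictionary would be unpacked into `torch.utils.data.DataLoader(dataset, **kwargs)`.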

Key fixes:
- Resolved "_share_filename_: only available on CPU" error
- Fixed .toFixed() errors on string values from MPS API
- Added defensive JSON parsing with fallbacks
- Updated all training processes to use device-specific dataloader settings
- Updated existing job configuration from adamw8bit to adamw for MPS compatibility
- Added device-aware optimizer selection in UI (MPS shows compatible optimizers only)
- Updated default optimizer to use adamw on Mac, adamw8bit on CUDA
- Added backend validation to automatically convert 8-bit optimizers for MPS
- Fixed multiple UI components with .toFixed() errors on string values
- Added comprehensive MPS device utilities and detection
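The optimizer fallback described in the list above could look roughly like the following sketch. `resolve_optimizer` and `default_optimizer` are hypothetical names, and the rule of stripping an `8bit` suffix is an assumption for illustration; the PR only states that `adamw8bit` becomes `adamw` on MPS and that 8-bit optimizers are converted automatically.

```python
def default_optimizer(device_type: str) -> str:
    """Default optimizer name per device: adamw on MPS, adamw8bit on CUDA."""
    # Assumption: any other device type falls back to plain adamw.
    return "adamw8bit" if device_type == "cuda" else "adamw"


def resolve_optimizer(name: str, device_type: str) -> str:
    """Map an 8-bit optimizer name to its full-precision name on MPS."""
    normalized = name.strip().lower()
    if device_type == "mps" and normalized.endswith("8bit"):
        # Assumed convention: the 8-bit variant is the base name plus "8bit".
        return normalized[: -len("8bit")]
    return normalized
```

For example, `resolve_optimizer("adamw8bit", "mps")` yields `"adamw"`, while the same name on `"cuda"` is left unchanged.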
@ivanfioravanti
Author

@jaretburkett you are the boss here, so it's up to you whether to integrate MPS support or not. Things have certainly improved a lot as of PyTorch 2.9.

@jaretburkett
Contributor

I looked over the changes and don't see anything that looks like it would cause an issue. I'll run some tests on it to make sure. @ivanfioravanti, how much RAM did you need on the Mac to train z-image-turbo? I have a 24GB MacBook I can test on.

@ivanfioravanti
Author

Let me try today. I'll keep you posted.

@ivanfioravanti
Author

~30GB. I can grant you access to an M3 Ultra with 512GB, but not until next week.

@FritzTheCatfish

Excellent. What settings are you using? I had to disable transformations to get going using PyTorch 2.7.0.
