|
The audio output I get from this is really clipped and doesn't sound correct. Was this working for you? |
|
@ymgenesis 🤷 I just assumed that's what the model sounds like, e.g. if you integrate too long or your prompt isn't suited to the small model. I didn't change anything but the device it executes on. If I try to run it on CPU, it seems to just hang, so MPS was the only way I could run inference at the time. I'll try to compare against a CUDA run. |
|
Whoa, CUDA and CPU results sound MUCH better than analogous MPS calculations! 😱 I see that there are many *accuracy*-related MPS issues for PyTorch, https://github.com/pytorch/pytorch/issues?q=is%3Aissue%20state%3Aopen%20mps, i.e. not just about unsupported ops. Not sure yet if this is just for half precision or full precision too. This is surely "upstream" from stable-audio-tools. My students and I will try to look into this. Will also monitor issue #181; please share any updates/fixes. |
|
Interesting. Also note that the latest macOS Tahoe 26 uses an upgrade to
MPS called Metal 4 and runs different kernels.
Also note that when you run PyTorch on MPS and hit an unsupported op (even
a nominally supported op may not be supported at a given dimension size),
it falls back to CPU for that op. You may not be getting the speedup on
the GPU!
I believe the large convolutions in the VAE decoder actually fit in the
Metal 4 kernel but not in previous versions.
Would be great to discover which ops have accuracy issues on MPS.
There's likely some shim we can add to avoid numerical error.
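One quick way to hunt for divergent ops (a sketch of the idea, not anything from this repo): run the same op on the same input on CPU and MPS and compare. The op names below are just examples.

```python
import torch
import torch.nn.functional as F

def max_abs_diff(op, x, device="mps"):
    """Apply `op` to `x` on CPU and on `device`; return the max absolute difference."""
    ref = op(x)                    # CPU reference result
    out = op(x.to(device)).cpu()   # same op on the accelerator
    return (ref - out).abs().max().item()

if torch.backends.mps.is_available():
    x = torch.randn(4, 256)
    for name, op in [("exp", torch.exp), ("gelu", F.gelu),
                     ("softmax", lambda t: F.softmax(t, dim=-1))]:
        print(f"{name}: max |cpu - mps| = {max_abs_diff(op, x):.3e}")
```

Small elementwise differences are expected; ops that show differences orders of magnitude larger than their neighbors are the ones worth filing upstream.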
|
|
@Cortexelus We observe this on M1 and M2 chips. Execution is fast but prone to garbage output compared to CUDA. This isn't about CPU fallback or the newer M4 chips. |
|
Hmm, I think there was an issue recently with the torchaudio saving code, where it expects sample values between 0 and 1 but instead receives raw FP16 values, so the audio comes out tens of thousands of times louder than it should be, essentially producing a clipped square wave. I patched it with Claude Code, but I'm not at my computer now so I can't check exactly what it was. |
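I can't check the exact patch right now, but the gist (a hypothetical sketch, not the actual fix) is to cast to float32 and peak-normalize to [-1, 1] before handing the tensor to `torchaudio.save`:

```python
import torch

def peak_normalize(audio: torch.Tensor) -> torch.Tensor:
    """Cast to float32 and scale so the loudest sample sits at +/-1.0,
    the range torchaudio expects for floating-point input."""
    audio = audio.to(torch.float32)
    peak = audio.abs().max()
    return audio / peak if peak > 1.0 else audio

# then something like: torchaudio.save("out.wav", peak_normalize(output), sample_rate)
```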
`device` can also be set to `"mps"` if it's available, which makes things run much faster than CPU on a Mac.
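For example, a common selection pattern (not specific to this repo):

```python
import torch

def pick_device() -> torch.device:
    """Prefer MPS on Apple silicon, then CUDA, then fall back to CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(8, 8).to(device)
```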