Dell reportedly restricts exports of AMD's fastest gaming GPUs to China — Radeon RX 7900 XTX, RX 7900, Pro W7900 purportedly listed as sanctioned tech

imaginary_num6er@alien.top · 1 year ago

Qesa@alien.top · 1 year ago

That is, unfortunately, sorely outdated, particularly with the advent of TensorRT. Best case vs. best case, the 4080 is about twice as fast today.

https://www.tomshardware.com/pc-components/gpus/stable-diffusion-benchmarks#section-stable-diffusion-512x512-performance

From-UoM@alien.top · 1 year ago

The gap would be even larger if, or to be precise WHEN, FP8 and/or sparsity are used on the Ada Lovelace cards.

moofunk@alien.top · 1 year ago

Of note, TensorRT doesn’t support SDXL yet.

DuranteA@alien.top · 1 year ago

This is no longer true.
If you use NV’s TensorRT plugin with the A1111 development branch, TensorRT works very well with SDXL (it’s actually much less painful to use than SD1.5 TensorRT was initially).

The big constraint is VRAM capacity. I can use it for 1024x1024 (and similar-total-pixel-count) SDXL generations on my 4090, but can’t go much beyond that without tiling (though that is generally what you do anyway for larger resolutions).
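
As a rough illustration of the tiling idea above, here is a minimal Python sketch that splits a larger target resolution into overlapping tiles, each no bigger than 1024x1024. The function name `tile_boxes`, the 128-pixel overlap, and the layout strategy are all assumptions made for illustration; this is not how A1111 or the TensorRT plugin actually tiles.

```python
def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes that cover a width x height image.

    Hypothetical helper: `tile` and `overlap` are illustrative defaults,
    not values taken from any real Stable Diffusion tool.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in the range [0, tile)")
    stride = tile - overlap

    def starts(length):
        # A single tile is enough when the axis fits inside one tile.
        if length <= tile:
            return [0]
        # Step across the axis, then pin a final tile to the far edge
        # so the whole axis is covered without running past it.
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    for box in tile_boxes(2048, 1536):
        print(box)
```

Each box stays within the per-generation pixel budget, and the overlap leaves a band where neighbouring tiles can be blended to hide seams.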

Just like for SD1.5, TensorRT speeds up generation by almost a factor of 2 for SDXL (compared to an “optimized” baseline using SDP).

moofunk@alien.top · 1 year ago

Alright thanks. This stuff is moving very fast, and I was only looking at the master branch.