Understanding GPU Virtualization Technologies in VMware: A Comprehensive Guide
- Hans Kraaijeveld
In the world of virtual desktop infrastructure (VDI) and cloud computing, GPU virtualization plays a crucial role in delivering graphics-intensive applications to multiple users efficiently. Technologies like Soft3D, vSGA, vDGA, NVIDIA vGPU, and AMD MxGPU each take a different approach to graphics acceleration in virtual environments. This blog post breaks down how they work, their key differences, and their ideal use cases, and closes with a summary of the most important conclusions.
What is Soft3D and How Does It Work?

Soft3D, also known as Software 3D Renderer, is a CPU-based graphics acceleration method primarily associated with VMware environments. It emulates GPU functionality entirely in software, using the host's CPU to render 3D graphics without requiring any physical GPU hardware. This approach intercepts graphics API calls (like DirectX or OpenGL) from the virtual machine (VM) and processes them on the CPU, making it a fallback option when hardware acceleration isn't available. Unlike hardware-based methods, Soft3D doesn't share or pass through physical GPUs; it's purely software-driven, which limits its performance but ensures broad compatibility.
Its main difference from the other technologies is that it does not depend on dedicated GPU hardware. While vSGA and vDGA rely on VMware's integration with physical GPUs for sharing or dedication, NVIDIA vGPU and AMD MxGPU use vendor-specific hardware partitioning. Soft3D stands out for its simplicity and zero additional hardware cost, but it falls short of the hardware-accelerated alternatives when handling demanding workloads.
What is vSGA and How Does It Work?

vSGA, or Virtual Shared Graphics Acceleration, is a VMware technology that allows multiple VMs to share a single physical GPU through API forwarding. Graphics commands from VMs are intercepted by the hypervisor and routed to the physical GPU for processing, with results sent back to the VMs. It supports up to 512MB of video RAM per VM, with half reserved on the GPU and the rest on system memory, enabling higher VM density. vSGA works with a range of GPUs, including consumer-grade ones, but introduces some overhead due to the software mediation layer.
It differs from Soft3D by incorporating actual GPU hardware for better performance, but unlike vDGA's full passthrough, vSGA shares resources among VMs. Compared to NVIDIA vGPU's time-slicing or AMD MxGPU's SR-IOV hardware virtualization, vSGA is more flexible for VMware-specific features like vMotion and HA, though it may not match their raw efficiency in high-end scenarios.
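The video RAM split described above lends itself to a quick back-of-the-envelope density estimate. The following is a minimal sketch, assuming GPU memory is the only limiting factor and ignoring driver and hypervisor overhead; the function name `vsga_max_vms` and the 4 GB card in the example are made up for illustration.

```python
def vsga_max_vms(gpu_memory_mb: int, vram_per_vm_mb: int = 512) -> int:
    """Estimate how many vSGA VMs fit in one GPU's memory.

    Assumes half of each VM's video RAM is reserved on the GPU and the
    other half lives in system memory. Driver and hypervisor overhead
    are ignored (a simplifying assumption).
    """
    if gpu_memory_mb < 0:
        raise ValueError("gpu_memory_mb must be non-negative")
    if not 0 < vram_per_vm_mb <= 512:
        raise ValueError("vram_per_vm_mb must be between 1 and 512")
    # The GPU-side reservation per VM is vram_per_vm_mb / 2, so
    # gpu_memory_mb / (vram_per_vm_mb / 2) == 2 * gpu_memory_mb / vram_per_vm_mb.
    return (2 * gpu_memory_mb) // vram_per_vm_mb


if __name__ == "__main__":
    # Hypothetical 4 GB card, every VM at the 512 MB maximum.
    print(vsga_max_vms(4096))       # 16
    # Same card with 128 MB of video RAM per VM.
    print(vsga_max_vms(4096, 128))  # 64
```

In practice CPU, system memory, and the workload itself will usually cap density well before this theoretical number is reached.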
What is vDGA and How Does It Work?

vDGA, standing for Virtual Dedicated Graphics Acceleration, is VMware's passthrough method where an entire physical GPU is assigned directly to a single VM via PCI passthrough. This gives the VM full, native access to the GPU's capabilities, bypassing the hypervisor for graphics processing and delivering near-bare-metal performance. It supports advanced APIs like DirectX and OpenGL without limitations, making it ideal for intensive tasks.
The key difference is its dedication model: unlike shared approaches in vSGA, NVIDIA vGPU, or AMD MxGPU, vDGA doesn't allow multiple VMs to use the same GPU, reducing density but maximizing performance per VM. It contrasts with Soft3D's CPU-only emulation by requiring compatible GPU hardware, and it's functionally similar to general GPU passthrough but optimized for VMware ecosystems.
What is AMD MxGPU and How Does It Work?

AMD MxGPU, or Multiuser GPU, leverages SR-IOV (Single Root I/O Virtualization) to create hardware-virtualized GPU instances from a single physical AMD GPU. Each virtual function (VF) acts as an independent GPU for a VM, providing predictable performance without time-slicing overhead. It's designed for VDI and supports dynamic resource allocation.
Differences include its hardware-centric sharing versus vSGA's software mediation or vDGA's dedication. Unlike NVIDIA vGPU's time-based slicing, MxGPU's SR-IOV gives each VM a dedicated hardware slice. It also avoids extensive licensing fees, which makes it more budget-friendly, though it is potentially less scalable for mixed OS environments.
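To make the contrast between fixed hardware slices and time-based slicing a bit more concrete, here is a toy model. It is not either vendor's actual scheduler: it simply assumes that a time-sliced GPU divides its time equally among the VMs that are currently active, while a partitioned GPU gives every VM the same fixed fraction regardless of load. Both function names are hypothetical.

```python
def time_sliced_share(active_vms: int) -> float:
    """Fraction of GPU time per VM under equal round-robin time-slicing.

    Assumption: idle VMs give up their slices, so the share depends on
    how many VMs are active at the moment.
    """
    if active_vms < 1:
        raise ValueError("active_vms must be at least 1")
    return 1.0 / active_vms


def partitioned_share(total_partitions: int) -> float:
    """Fraction of the GPU each VM owns under a fixed hardware partition.

    Assumption: the GPU is split into equal slices that never change,
    no matter how many VMs are busy.
    """
    if total_partitions < 1:
        raise ValueError("total_partitions must be at least 1")
    return 1.0 / total_partitions


if __name__ == "__main__":
    slots = 4  # hypothetical number of VMs configured on one GPU
    for active in (1, 2, 4):
        print(
            f"{active} active: time-sliced={time_sliced_share(active):.2f}, "
            f"partitioned={partitioned_share(slots):.2f}"
        )
```

Under these assumptions the partitioned share stays at 0.25 in every case, which is the "predictable performance" argument, while the time-sliced share ranges from 1.00 down to 0.25 as more VMs become busy.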
What is NVIDIA vGPU and How Does It Work?

NVIDIA vGPU is a hardware-accelerated virtualization technology that partitions a physical NVIDIA GPU into multiple virtual GPUs (vGPUs), each assignable to a VM. It uses time-slicing or MIG (Multi-Instance GPU) to share resources, with the hypervisor managing scheduling to ensure isolation and performance. Profiles define frame buffer allocation, supporting various workloads from VDI to AI.
It differs from VMware-specific options like vSGA (which is API-based sharing) and vDGA (passthrough) by offering sharing that works across hypervisors but only on NVIDIA GPUs, and it comes with licensing requirements. Compared to AMD MxGPU's SR-IOV approach, NVIDIA vGPU provides more flexible profiling and better support for compute-intensive tasks, though it may involve higher costs.
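Because profiles define the frame buffer each vGPU receives, a rough sizing estimate is simple arithmetic. The sketch below assumes that every physical GPU hosts only one profile size at a time and that frame buffer is the only constraint; both are simplifications, and the profile sizes, user counts, and the function name `gpus_needed` are invented for the example rather than taken from NVIDIA's documentation.

```python
import math


def gpus_needed(users_per_profile_mb: dict[int, int], gpu_frame_buffer_mb: int) -> int:
    """Estimate how many physical GPUs a mixed user population needs.

    users_per_profile_mb maps a profile's frame buffer size (MB) to the
    number of users on that profile. Assumes one profile size per
    physical GPU and no overhead (simplifying assumptions).
    """
    total = 0
    for profile_mb, users in users_per_profile_mb.items():
        if not 0 < profile_mb <= gpu_frame_buffer_mb:
            raise ValueError(f"profile of {profile_mb} MB does not fit on the GPU")
        vgpus_per_gpu = gpu_frame_buffer_mb // profile_mb
        total += math.ceil(users / vgpus_per_gpu)
    return total


if __name__ == "__main__":
    # Hypothetical mix: 20 knowledge workers at 1 GB, 6 designers at 4 GB,
    # on GPUs with 16 GB of frame buffer each.
    print(gpus_needed({1024: 20, 4096: 6}, 16384))  # 4
```

Treat the result as a lower bound for planning; licensing, encoder capacity, and actual workload behaviour are outside this model.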
Suitability of Each Technique
Soft3D Suitability
Used to be ideal for basic office tasks and knowledge workers needing light 3D rendering, such as simple CAD previews or web-based graphics.
Used to be suitable for environments without GPU hardware, providing a cost-effective entry point for virtualization setups.
Best for legacy applications limited to older APIs like DirectX 9.0c or OpenGL 2.1, where high performance isn't needed.
vSGA Suitability
Used to be great for moderate graphics workloads in VDI, like 3D modeling for multiple users in education or design teams.
Used to fit high-density deployments where cost efficiency and VMware features like vMotion were prioritized over peak performance.
Used to be appropriate for entry-level professional apps, offering hardware acceleration without dedicating full GPUs.
Best for legacy applications limited to older APIs like DirectX 9.0c or OpenGL 2.1, where high performance isn't needed.
vDGA Suitability
Perfect for power users in engineering or media, requiring full GPU performance for tasks like complex simulations or video editing.
Suited for scenarios demanding native driver support and minimal latency, such as gaming in virtual environments.
Ideal for single-VM heavy workloads where sharing isn't needed, ensuring isolation and maximum resource utilization.
AMD MxGPU Suitability
Well-suited for cost-sensitive VDI deployments, like remote desktops for graphic designers or educators.
Appropriate for consistent, predictable performance in virtualized workstations across teams.
Ideal for SR-IOV-compatible setups focusing on hardware efficiency without extensive licensing fees.
NVIDIA vGPU Suitability
Excellent for high-end design and AI workloads, supporting tools like Dassault CATIA or machine learning training.
Suitable for scalable enterprise VDI with mixed user profiles, from knowledge workers to data scientists.
Best for environments needing advanced features like MIG for AI, with strong ecosystem support.
Summary
In conclusion, while these technologies have historically provided various levels of GPU virtualization, the landscape has evolved significantly by 2025. Both Soft3D and vSGA have largely outlived their practical utility, offering only very light acceleration suitable for minimal workloads.
Their supported API versions (such as DirectX 9 and OpenGL 2.1 for Soft3D, and limited DirectX 11/OpenGL 4.x for vSGA) are severely outdated, failing to meet the demands of modern applications that require DirectX 12, Vulkan, or newer standards.
With vSGA explicitly deprecated in recent vSphere releases like 8.0 Update 3, these options should no longer be considered for new deployments, as they introduce performance bottlenecks and compatibility issues. Similarly, AMD MxGPU, while still documented in some Omnissa Horizon contexts for legacy hardware, has seen no significant updates and is primarily available as an option in cloud environments like Azure (e.g., NVv4 series VMs), rather than in modern on-premises VMware setups where newer GPU technologies dominate.
For contemporary needs, focus on vDGA, NVIDIA vGPU, or emerging alternatives for robust, future-proof performance.
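As a closing illustration, the guidance in this post can be condensed into a tiny helper. This is only a rough encoding of the conclusions above, not an official selection matrix: the `recommend` function and its two parameters are hypothetical, and vSGA and AMD MxGPU are deliberately never returned because the summary steers new deployments away from them.

```python
from enum import Enum


class Tech(Enum):
    SOFT3D = "Soft3D"
    VDGA = "vDGA"
    NVIDIA_VGPU = "NVIDIA vGPU"


def recommend(has_gpu: bool, share_gpu: bool) -> Tech:
    """Pick a technology following this post's conclusions (simplified)."""
    if not has_gpu:
        # Without GPU hardware, CPU rendering is the only fallback,
        # and only for very light or legacy workloads.
        return Tech.SOFT3D
    if not share_gpu:
        # One VM gets the whole GPU via passthrough.
        return Tech.VDGA
    # Several VMs share one physical GPU.
    return Tech.NVIDIA_VGPU


if __name__ == "__main__":
    print(recommend(has_gpu=True, share_gpu=True).value)   # NVIDIA vGPU
    print(recommend(has_gpu=True, share_gpu=False).value)  # vDGA
    print(recommend(has_gpu=False, share_gpu=False).value) # Soft3D
```

A real decision would also weigh licensing cost, the hypervisor in use, and the APIs the applications require.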