{"id":208531,"date":"2026-03-24T19:10:07","date_gmt":"2026-03-24T18:10:07","guid":{"rendered":"https:\/\/liora.io\/en\/pytorch-2-11-ai-performance-upgrades"},"modified":"2026-03-24T19:10:07","modified_gmt":"2026-03-24T18:10:07","slug":"pytorch-2-11-ai-performance-upgrades","status":"publish","type":"post","link":"https:\/\/liora.io\/en\/pytorch-2-11-ai-performance-upgrades","title":{"rendered":"PyTorch 2.11 release fuels major AI performance upgrades"},"content":{"rendered":"<p><strong>\nPyTorch released version 2.11 today, delivering performance improvements of up to 600x for specific AI operations while adding support for next-generation NVIDIA and Intel GPUs. The update, built from 2,723 contributions by 432 developers, introduces differentiable collectives for distributed training, FlashAttention-4 backend, and expanded Apple Silicon compatibility, marking a significant advancement for machine learning researchers and developers worldwide.\n<\/strong><\/p>\n<p>The new capabilities position <a href=\"https:\/\/liora.io\/en\/pytorch-all-about-this-framework\"><b>PyTorch<\/b><\/a> at the forefront of AI framework competition as organizations race to optimize training and inference for increasingly complex models. The differentiable collectives feature fundamentally changes how researchers can approach <a href=\"https:\/\/liora.io\/en\/pytorch-lightning-empowering-scalable-deep-learning-frameworks\">distributed training algorithms<\/a> by allowing gradients to be computed directly through collective communication operations, eliminating the need for custom implementations.<\/p><br><p>Performance gains in the release are particularly striking for linear algebra operations. The <b>torch.linalg.lstsq<\/b> function achieves speedups ranging from <b>1.7x to 620x<\/b>, while <b>torch.linalg.svd<\/b> delivers <b>2x to 400x<\/b> improvements. 
These enhancements stem from replacing the legacy MAGMA backend with optimized cuSOLVER and cuBLAS implementations.<\/p>\n<p><a href=\"https:\/\/liora.io\/en\/pytorchs-flexattention-with-flashattention-4-is-a-game-changer\"><b>FlexAttention<\/b><\/a>, now powered by the FlashAttention-4 backend, provides <b>1.2x to 3.2x speedups<\/b> for compute-bound attention workloads on <b>NVIDIA&#8217;s Hopper and Blackwell GPUs<\/b>. This optimization uses just-in-time compilation to generate kernels specifically tailored for these next-generation architectures.<\/p>\n\n<h2 style=\"margin-top:2rem;margin-bottom:1rem;\">Hardware Compatibility Shifts<\/h2><figure class=\"wp-block-image size-large\" style=\"margin-top:var(--wp--preset--spacing--columns);margin-bottom:var(--wp--preset--spacing--columns)\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"572\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-1024x572.jpg\" alt=\"Interior view of a data center showcasing rows of servers and storage units.\" class=\"wp-image-208521\" srcset=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-56x56.jpg 56w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-115x64.jpg 115w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-150x150.jpg 150w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-210x117.jpg 210w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-300x167.jpg 300w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-410x270.jpg 410w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-440x246.jpg 440w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-448x448.jpg 448w, 
https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-587x510.jpg 587w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-768x429.jpg 768w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-785x438.jpg 785w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-1024x572.jpg 1024w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-1250x590.jpg 1250w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-1440x680.jpg 1440w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-1536x857.jpg 1536w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-2048x1143.jpg 2048w, https:\/\/liora.io\/app\/uploads\/sites\/9\/2026\/03\/data-center-server-infrastructure-scaled.jpg 2560w\" sizes=\"(max-width: 1024px) 100vw, 1024px\"><\/figure>\n\n<p>A significant change accompanies the performance improvements: PyTorch 2.11&#8217;s default installation now ships with <b>CUDA 13.0<\/b>, dropping support for older GPU architectures. <b>Volta, Pascal, and Maxwell GPUs<\/b> are no longer supported in the default build, though users can still access CUDA 12.6 builds for legacy hardware compatibility.<\/p>\n<p>The update expands cross-platform support with enhanced <b>Apple Silicon<\/b> capabilities, adding new distribution functions and improved error reporting for MPS operations. <b>Intel GPU<\/b> users gain XPUGraph support, a feature similar to CUDA Graphs that reduces CPU overhead by capturing and replaying sequences of operations.<\/p>\n<p>The release also marks progress in PyTorch&#8217;s production deployment capabilities. The torch.export API now supports exporting <b>RNN modules including LSTM and GRU<\/b> for GPU execution, broadening the range of models ready for production inference. 
This advancement aligns with PyTorch&#8217;s continued deprecation of TorchScript in favor of the export ecosystem.<\/p>\n\n<h2 style=\"margin-top:2rem;margin-bottom:1rem;\">Security and Migration Considerations<\/h2>\n\n<p>Security improvements include hardening of <b>torch.hub.load<\/b>, which now prompts users for confirmation before executing code from untrusted repositories. Organizations upgrading from PyTorch 2.10 will need to address several breaking changes, particularly around CUDA compatibility and API modifications in attention mechanisms.<\/p>\n<p>The collaborative nature of the release, built from <b>2,723 contributions by 432 developers<\/b>, underscores PyTorch&#8217;s position as a community-driven project competing with proprietary alternatives from major tech companies.<\/p>\n<div style=\"margin-top:3rem;padding-top:1.5rem;border-top:1px solid #e2e4ea;\">\n  <h3 style=\"margin:0 0 0.75rem;font-size:1.1rem;letter-spacing:0.08em;text-transform:uppercase;\">\n    Sources\n  <\/h3>\n  <ul style=\"margin:0;padding-left:1.2rem;list-style:disc;\">\n    <li>pytorch.org\/blog<\/li>\n  <\/ul>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>PyTorch released version 2.11 today, delivering performance improvements of up to 600x for specific AI operations while adding support for next-generation NVIDIA and Intel GPUs. 
The update, built from 2,723 contributions by 432 developers, introduces differentiable collectives for distributed training, a FlashAttention-4 backend, and expanded Apple Silicon compatibility, marking a significant advancement for machine learning researchers and developers worldwide.<\/p>\n","protected":false},"author":87,"featured_media":208522,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"editor_notices":[],"footnotes":""},"categories":[2417],"class_list":["post-208531","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news"],"acf":[],"_links":{"self":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/208531","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/users\/87"}],"replies":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/comments?post=208531"}],"version-history":[{"count":0,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/208531\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media\/208522"}],"wp:attachment":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media?parent=208531"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/categories?post=208531"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}