shakedown.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A community for live music fans with roots in the jam scene. Shakedown Social is run by a team of volunteers (led by @clifff and @sethadam1) and funded by donations.

#avx512

David JONES<p>So here's an idea I had that I'm almost certainly not going to do anything with (so you should). With AVX-512, a single 512-bit register holds 16 x 32-bit lanes. Let's pretend that's a 16-deep stack. The permute instructions let us do a DROP and DUP (except you'd probably want to ROLL them, but whatever). I'm imagining that top-of-stack would always be lane 0; PUSHing something permutes all the lanes 1-higher and replaces lane 0. Now implement a FORTH.<br><a href="https://typo.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a> <a href="https://typo.social/tags/FORTH" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FORTH</span></a></p>
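The stack idea in the post above can be sketched in plain Python. This is an emulation only, not AVX-512 code: a 16-element list stands in for one 512-bit register of 32-bit lanes, and `permute` mimics the semantics of a full-width lane permute such as VPERMD (`dst[i] = src[idx[i]]`). The helper names (`permute`, `push`, `drop`, `dup`, `swap`, `add`, `run`) and the choice to zero the wrapped lane on DROP are assumptions made for illustration; the post does not specify them.

```python
LANES = 16            # 512 bits / 32 bits per lane
MASK32 = 0xFFFFFFFF   # lanes hold unsigned 32-bit values


def permute(src, idx):
    """Emulate a lane permute: output lane i takes src[idx[i]]."""
    return [src[j % LANES] for j in idx]


# Index vectors that would live in a second register in real SIMD code.
SHIFT_UP = [(i - 1) % LANES for i in range(LANES)]    # lane i <- lane i-1
SHIFT_DOWN = [(i + 1) % LANES for i in range(LANES)]  # lane i <- lane i+1
DUP_IDX = [0] + list(range(LANES - 1))                # lanes 0 and 1 <- old lane 0
SWAP_IDX = [1, 0] + list(range(2, LANES))             # exchange lanes 0 and 1


def push(stack, value):
    """Move every lane one higher, then overwrite lane 0 (top of stack)."""
    moved = permute(stack, SHIFT_UP)
    return [value & MASK32] + moved[1:]   # the deepest element is lost


def drop(stack):
    """Move every lane one lower; zero the lane that wrapped around."""
    moved = permute(stack, SHIFT_DOWN)
    return moved[:-1] + [0]


def dup(stack):
    return permute(stack, DUP_IDX)


def swap(stack):
    return permute(stack, SWAP_IDX)


def add(stack):
    """FORTH '+': replace the top two elements with their sum."""
    total = (stack[0] + stack[1]) & MASK32
    return [total] + drop(stack)[1:]


WORDS = {"DUP": dup, "DROP": drop, "SWAP": swap, "+": add}


def run(program, stack=None):
    """Evaluate a whitespace-separated program of integers and WORDS."""
    stack = [0] * LANES if stack is None else stack
    for token in program.split():
        stack = WORDS[token](stack) if token in WORDS else push(stack, int(token))
    return stack


if __name__ == "__main__":
    print(run("2 3 DUP + SWAP")[:4])
```

In this model every stack word is one permute plus, at most, one lane-0 write, which is the property the post is pointing at. A real implementation would also need to track stack depth, since a fixed 16-lane register silently discards the deepest element on overflow.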
nietras 👾<p>New blog post "Sep 0.10.0 - 21 GB/s CSV Parsing Using SIMD on AMD 9950X 🚀"</p><p>📈 Sep <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> from 7 GB/s to 21 GB/s over last two years<br>🧑‍💻 <a href="https://mastodon.social/tags/csharp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>csharp</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> and <a href="https://mastodon.social/tags/x64" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x64</span></a> assembly on <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> 9.0<br>🛠️ Tweaks and new <a href="https://mastodon.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>-to-256 parser<br>🔢 Lots of benchmarks</p><p>👇<br><a href="https://nietras.com/2025/05/09/sep-0-10-0/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nietras.com/2025/05/09/sep-0-1</span><span class="invisible">0-0/</span></a></p>
OSTechNix<p>FFmpeg Sees 94x Performance Boost with Handwritten AVX-512 Code <a href="https://floss.social/tags/ffmpeg" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ffmpeg</span></a> <a href="https://floss.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a> <a href="https://floss.social/tags/AssemblyCode" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AssemblyCode</span></a> <a href="https://floss.social/tags/Opensource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Opensource</span></a> <br><a href="https://ostechnix.com/ffmpeg-sees-94x-performance-boost-with-handwritten-avx-512-code/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">ostechnix.com/ffmpeg-sees-94x-</span><span class="invisible">performance-boost-with-handwritten-avx-512-code/</span></a></p>
FCLC<p>Hi friends! Very excited to announce that I'll be giving an <span class="h-card" translate="no"><a href="https://mast.hpc.social/@easybuild" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>easybuild</span></a></span> Tech Talk on the 13th of October on <a href="https://mast.hpc.social/tags/AVX10" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX10</span></a>!</p><p>The Talk is titled "AVX10 for HPC:<br>A reasonable solution to the 7 levels of AVX-512 folly" </p><p>Registration is free, all <a href="https://mast.hpc.social/tags/x86" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86</span></a>, <a href="https://mast.hpc.social/tags/AVX" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX</span></a>, <a href="https://mast.hpc.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>, <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a>, and <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a> experience levels welcome!</p><p>The page is here: <a href="https://easybuild.io/tech-talks/008_avx10.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">easybuild.io/tech-talks/008_av</span><span class="invisible">x10.html</span></a></p><p>And you can register here! <a href="https://event.ugent.be/registration/ebtechtalk008avx10" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">event.ugent.be/registration/eb</span><span class="invisible">techtalk008avx10</span></a></p>
Linh Pham<p>Double yikes in CPU vulnerabilities! Both articles are from <a href="https://linh.social/tags/ServeTheHome" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ServeTheHome</span></a> </p><p><a href="https://linh.social/tags/Intel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Intel</span></a> DOWNFALL Ultra-Scary <a href="https://linh.social/tags/AVX2" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX2</span></a> and <a href="https://linh.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a> Side channel Attack Discovered</p><p><a href="https://www.servethehome.com/intel-downfall-ultra-scary-avx2-and-avx-512-side-channel-attack-discovered/" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">servethehome.com/intel-downfal</span><span class="invisible">l-ultra-scary-avx2-and-avx-512-side-channel-attack-discovered/</span></a></p><p>New Inception Vulnerability Impacts ALL <a href="https://linh.social/tags/AMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AMD</span></a> Zen CPUs Yikes</p><p><a href="https://www.servethehome.com/new-inception-vulnerability-impacts-all-amd-zen-cpus-yikes-phantom/" rel="nofollow noopener" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">servethehome.com/new-inception</span><span class="invisible">-vulnerability-impacts-all-amd-zen-cpus-yikes-phantom/</span></a></p><p><a href="https://linh.social/tags/Security" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Security</span></a></p>
FCLC<p><a href="https://mast.hpc.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a> extension request: IFMA-52 but for lower-precision integers. </p><p>IFMA-52 is nice because of its high intermediate precision as well as great throughput (but high latency). I suspect 52 is convenient because of the FP64 FMA unit. </p><p>Perhaps an FP32 based IFMA-22 could be doable?</p><p><a href="https://mast.hpc.social/tags/intel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>intel</span></a> <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a> <a href="https://mast.hpc.social/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a> <a href="https://mast.hpc.social/tags/x86" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86</span></a> <a href="https://mast.hpc.social/tags/YetAnotherISAExtension" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>YetAnotherISAExtension</span></a></p>
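To make the "high intermediate precision" point concrete, here is a minimal Python model of one lane of the two AVX-512 IFMA instructions, VPMADD52LUQ and VPMADD52HUQ: the low 52 bits of each source are multiplied into a 104-bit product, and either the low or the high 52 bits of that product are added to a 64-bit accumulator. The function names and the `width` and `lane_bits` parameters are assumptions for illustration. The 22-bit variant at the end is purely hypothetical, modelling what the post asks for with 32-bit lanes; no such instruction exists.

```python
def madd_lo(acc, b, c, width=52, lane_bits=64):
    """acc + low `width` bits of (b * c), using only the low `width` bits of b and c."""
    m = (1 << width) - 1
    product = (b & m) * (c & m)            # up to 2 * width bits, never truncated
    return (acc + (product & m)) & ((1 << lane_bits) - 1)


def madd_hi(acc, b, c, width=52, lane_bits=64):
    """acc + high `width` bits of the same 2 * width bit product."""
    m = (1 << width) - 1
    product = (b & m) * (c & m)
    return (acc + (product >> width)) & ((1 << lane_bits) - 1)


def full_product(b, c, width=52, lane_bits=64):
    """Recombine the two halves to show that no product bits are lost."""
    lo = madd_lo(0, b, c, width, lane_bits)
    hi = madd_hi(0, b, c, width, lane_bits)
    return (hi << width) | lo


if __name__ == "__main__":
    b = c = (1 << 52) - 1                  # largest 52-bit operands
    assert full_product(b, c) == b * c     # all 104 bits recovered

    # Hypothetical "IFMA-22": 22-bit operands in 32-bit lanes (not a real instruction).
    b22 = c22 = (1 << 22) - 1
    assert full_product(b22, c22, width=22, lane_bits=32) == b22 * c22
    print("ok")
```

The accumulator has 64 - 52 = 12 spare bits in the real instructions, so many partial products can be summed before a carry pass is needed. In the hypothetical 22-in-32 layout the headroom would be 10 bits.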
FCLC<p>Time for an <a href="https://mast.hpc.social/tags/introduction" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>introduction</span></a>! <br>I'm a young Canuck with interests/experience in <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a>, <a href="https://mast.hpc.social/tags/Linux" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Linux</span></a>, <a href="https://mast.hpc.social/tags/BLAS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>BLAS</span></a>, <a href="https://mast.hpc.social/tags/SYCL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SYCL</span></a>, <a href="https://mast.hpc.social/tags/C" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>C</span></a>, <a href="https://mast.hpc.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>, <a href="https://mast.hpc.social/tags/Rust" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Rust</span></a>, heterogeneous compute &amp; other such things. </p><p>Currently my personal projects are bringing <a href="https://mast.hpc.social/tags/FP16" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP16</span></a> to the <a href="https://mast.hpc.social/tags/OpenBLAS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenBLAS</span></a> library, working to standardize what Complex domain BLAS FP16 kernels/implementations should look like, and making sure <a href="https://mast.hpc.social/tags/SYCL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SYCL</span></a> is available everywhere. </p><p>I also write every now and again. 
Here's the tale of AVX512 FP16 on Alder Lake <br><a href="https://gist.github.com/FCLC/56e4b3f4a4d98cfd274d1430fabb9458" rel="nofollow noopener" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gist.github.com/FCLC/56e4b3f4a</span><span class="invisible">4d98cfd274d1430fabb9458</span></a></p>
FelixCLC<p>Ok, beyond posting *about* <a href="https://mastodon.social/tags/mastodon" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>mastodon</span></a>, time to post *on* mastodon. <br>For those interested in <a href="https://mastodon.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a>, <a href="https://mastodon.social/tags/CPU" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>CPU</span></a> , <a href="https://mastodon.social/tags/intel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>intel</span></a> , <a href="https://mastodon.social/tags/linux" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>linux</span></a> , <a href="https://mastodon.social/tags/kernel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>kernel</span></a> development and other such things, this blog post/article from the other week may be of interest. </p><p>It chronicles what had already been a year in the making of <a href="https://mastodon.social/tags/avx512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>avx512</span></a> development, the trials and tribulations of dealing with vendors and the quest to bring reduced precision ( <a href="https://mastodon.social/tags/fp16" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>fp16</span></a> ) to mainstream <a href="https://mastodon.social/tags/x86" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86</span></a> </p><p>Post here from my <a href="https://mastodon.social/tags/github" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>github</span></a> : <a href="https://gist.github.com/FCLC/56e4b3f4a4d98cfd274d1430fabb9458" rel="nofollow noopener" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gist.github.com/FCLC/56e4b3f4a</span><span class="invisible">4d98cfd274d1430fabb9458</span></a></p>