#simd
Larry (Mr.Optimization)<p>I decided to share my Arm NEON optimizations for the FFmpeg Cinepak encoder. On Apple Silicon / RPI / NEON 32/64-bit, it gets a 250-300% speedup for encoding:</p><p><a href="https://github.com/bitbank2/FFmpeg-in-Xcode" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/bitbank2/FFmpeg-in-</span><span class="invisible">Xcode</span></a></p><p><a href="https://floss.social/tags/FOSS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FOSS</span></a> <br><a href="https://floss.social/tags/Optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Optimization</span></a> <br><a href="https://floss.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <br><a href="https://floss.social/tags/NEON" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NEON</span></a></p>
nietras 👾<p>Updated "Sep 0.10.0 - 21 GB/s CSV Parsing Using SIMD on AMD 9950X 🚀" to make it 300% clear that the graph shows <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> progression across different Sep and <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> versions and CPUs.</p><p>The aim is to show how runtime and library improvements go hand in hand with hardware changes, e.g. AVX512 <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a>.<br>As it should be 👾</p>
nietras 👾<p>New blog post "Sep 0.10.0 - 21 GB/s CSV Parsing Using SIMD on AMD 9950X 🚀"</p><p>📈 Sep <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> from 7 GB/s to 21 GB/s over last two years<br>🧑‍💻 <a href="https://mastodon.social/tags/csharp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>csharp</span></a> <a href="https://mastodon.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> and <a href="https://mastodon.social/tags/x64" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x64</span></a> assembly on <a href="https://mastodon.social/tags/dotnet" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>dotnet</span></a> 9.0<br>🛠️ Tweaks and new <a href="https://mastodon.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>-to-256 parser<br>🔢 Lots of benchmarks</p><p>👇<br><a href="https://nietras.com/2025/05/09/sep-0-10-0/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">nietras.com/2025/05/09/sep-0-1</span><span class="invisible">0-0/</span></a></p>
IT News<p>Faster Integer Division with Floating Point - Multiplication on a common microcontroller is easy. But division is much more diff... - <a href="https://hackaday.com/2024/12/22/faster-integer-division-with-floating-point/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">hackaday.com/2024/12/22/faster</span><span class="invisible">-integer-division-with-floating-point/</span></a> <a href="https://schleuss.online/tags/softwaredevelopment" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>softwaredevelopment</span></a> <a href="https://schleuss.online/tags/softwarehacks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>softwarehacks</span></a> <a href="https://schleuss.online/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a> <a href="https://schleuss.online/tags/assembly" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>assembly</span></a> <a href="https://schleuss.online/tags/avx" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>avx</span></a>-512 <a href="https://schleuss.online/tags/x86_64" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86_64</span></a> <a href="https://schleuss.online/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a> <a href="https://schleuss.online/tags/x86" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86</span></a></p>
sarah quiñones<p><a href="https://eldritch.cafe/tags/simd" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>simd</span></a> </p><p>there's this trick i randomly found a few years ago and i've been wondering if there's a name for it or if other people have done this before</p><p>```<br>for enforcing floating point determinism with realigned buffers</p><p>if we have<br>x x x 0 1 2 3 4 5 6 7 x x x</p><p>where x is the identity for my operation, and our operation is commutative (not necessarily associative)</p><p>then adding x padding doesn't affect the result as long as we do a tree reduction at the end</p><p>e.g.</p><p>accumulate in register: v = 0+4 1+5 2+6 3+7</p><p>tree reduction step 0: (0+4)+(2+6) (1+5)+(3+7)<br>tree reduction step 1: ((0+4)+(2+6)) + ((1+5)+(3+7))</p><p>if we add padding (e.g., by realigning the buffer and using a masked load)</p><p>accumulate in register: v = x+1+5 x+2+6 x+3+7 0+4+x</p><p>tree reduction step 0: (1+5)+(3+7) (0+4)+(2+6)<br>tree reduction step 1: ((1+5)+(3+7)) + ((0+4)+(2+6))</p><p>commuting the elements shows us that this is the exact same result as the previous one, so the bit pattern of the final result is unaffected (modulo signed zero, nan, etc)<br>```</p>
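<p>A minimal sketch of the trick described in the post above, simulated in plain Python rather than with real SIMD intrinsics. Every name here (WIDTH, lane_accumulate, tree_reduce, deterministic_sum) is hypothetical and chosen only for illustration. The masked load is modelled by padding with the identity 0.0, the tree reduction pairs lane i with lane i + half as in the worked example, and NaN inputs are assumed not to occur.</p>

```python
import random

WIDTH = 4  # lanes per simulated vector register (assumed to be a power of two)


def lane_accumulate(data, offset, identity=0.0):
    """Simulate vertical SIMD adds over a buffer that starts `offset` lanes
    into an aligned block. Lanes a masked load would skip hold `identity`."""
    padded = [identity] * offset + list(data)
    padded += [identity] * (-len(padded) % WIDTH)  # pad the tail to a full register
    acc = [identity] * WIDTH
    for base in range(0, len(padded), WIDTH):
        for lane in range(WIDTH):
            acc[lane] = acc[lane] + padded[base + lane]
    return acc


def tree_reduce(lanes):
    """Combine lane i with lane i + half, halving until one value remains.
    This pairing only depends on lane positions modulo `half`, so rotating
    the lanes merely swaps operands of commutative additions."""
    lanes = list(lanes)
    while len(lanes) > 1:
        half = len(lanes) // 2
        lanes = [lanes[i] + lanes[i + half] for i in range(half)]
    return lanes[0]


def deterministic_sum(data, offset):
    """Sum `data` as if it sat `offset` lanes past an aligned boundary."""
    return tree_reduce(lane_accumulate(data, offset))


# Compare exact bit patterns (via float.hex) for every possible misalignment.
data = [random.uniform(-1e6, 1e6) for _ in range(1000)]
reference = deterministic_sum(data, 0).hex()
print(all(deterministic_sum(data, k).hex() == reference for k in range(WIDTH)))
```

<p>Under these assumptions, each lane sees the same elements in the same order regardless of the offset; only the lane index rotates. A reduction that paired adjacent lanes instead (0 with 1, 2 with 3) would not have this property, which is why the sketch follows the post's i / i + half pairing.</p>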
Karsten Schmidt<p>Yesterday, one year ago... (Still wondering how many people actually have read or tried out any of these)</p><p><a href="https://mastodon.thi.ng/@toxi/111348591236791838" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.thi.ng/@toxi/11134859</span><span class="invisible">1236791838</span></a></p><p><a href="https://mastodon.thi.ng/tags/ThingUmbrella" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ThingUmbrella</span></a> <a href="https://mastodon.thi.ng/tags/HowToThing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HowToThing</span></a> <a href="https://mastodon.thi.ng/tags/TypeScript" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TypeScript</span></a> <a href="https://mastodon.thi.ng/tags/Tutorial" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Tutorial</span></a> <a href="https://mastodon.thi.ng/tags/Shader" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Shader</span></a> <a href="https://mastodon.thi.ng/tags/GIS" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GIS</span></a> <a href="https://mastodon.thi.ng/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.thi.ng/tags/Forth" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Forth</span></a> <a href="https://mastodon.thi.ng/tags/ProcGen" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ProcGen</span></a></p>
FCLC<p>Hey friends! Looking for clarity on the topic of the matrix extensions to <a href="https://mast.hpc.social/tags/RISCV" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RISCV</span></a>. </p><p>I’ve seen a lot of proposals for specs around, but is there an actual, in progress, official *spec* that someone can point me towards?</p><p><a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a> <a href="https://mast.hpc.social/tags/RVV" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RVV</span></a> <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mast.hpc.social/tags/RV" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RV</span></a></p>
FCLC<p>Clang/LLVM friends, trying to understand *why* Clang (18) doesn't see through what seems to me like an obvious optimization. </p><p><a href="https://mast.hpc.social/tags/compiler_explorer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>compiler_explorer</span></a> link here, explanation of what I don't understand follows: <br><a href="https://godbolt.org/z/j8WqsMjb6" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">godbolt.org/z/j8WqsMjb6</span><span class="invisible"></span></a></p><p>Going through Hackers delight and doing some of the dirt simple exercises, I dumped the assembly for Chapter 1 exercise 2 "loop that goes from 1 to 0xFFFFFFFF". (changed to not fault in CE) </p><p>(continues in next post, but putting hashtags here)</p><p><a href="https://mast.hpc.social/tags/clang" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>clang</span></a> <a href="https://mast.hpc.social/tags/compilers" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>compilers</span></a> <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a></p>
Karsten Schmidt<p>One of the best SIMD intro articles I've come across so far. Very nicely explains all the core concepts and operations, lots of sketches/diagrams... Noice! 👏</p><p><a href="https://mcyoung.xyz/2023/11/27/simd-base64/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mcyoung.xyz/2023/11/27/simd-ba</span><span class="invisible">se64/</span></a></p><p>Btw. If you're using TypeScript/JavaScript, you can play with some of these concepts/ops directly from the REPL using <a href="https://thi.ng/simd" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/simd</span><span class="invisible"></span></a>. This package uses WASM behind the scenes, but doesn't expose the full set of available SIMD instructions (it's a lil' bit more high-level...)</p><p>Also see the recent <a href="https://mastodon.thi.ng/tags/HowToThing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HowToThing</span></a> post with a practical example here:<br><a href="https://mastodon.thi.ng/@toxi/111283262419126958" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">mastodon.thi.ng/@toxi/11128326</span><span class="invisible">2419126958</span></a></p><p><a href="https://mastodon.thi.ng/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.thi.ng/tags/Tutorial" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Tutorial</span></a> <a href="https://mastodon.thi.ng/tags/Rust" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Rust</span></a> <a href="https://mastodon.thi.ng/tags/TypeScript" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TypeScript</span></a> <a href="https://mastodon.thi.ng/tags/WebAssembly" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebAssembly</span></a></p>
Karsten Schmidt<p><a href="https://mastodon.thi.ng/tags/HowToThing" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HowToThing</span></a> #025 — Sampling, fitting, transforming &amp; plotting 10k data points per frame using a whole bunch of underexposed thi.ng packages:</p><p>- <a href="https://thi.ng/colored-noise" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/colored-noise</span><span class="invisible"></span></a>: using violet noise as fake data source<br>- <a href="https://thi.ng/matrices" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/matrices</span><span class="invisible"></span></a>: fitting/transformation matrix creation<br>- <a href="https://thi.ng/simd" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/simd</span><span class="invisible"></span></a>: WASM-based batch processing<br>- <a href="https://thi.ng/malloc" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/malloc</span><span class="invisible"></span></a>: Memory management for WASM/SIMD data buffers<br>- <a href="https://thi.ng/hiccup-canvas" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/hiccup-canvas</span><span class="invisible"></span></a>: 2D canvas visualization</p><p>As noted in the comments, the SIMD batch processing here is to illustrate the overall usage and handling. In this specific example, the main bottleneck is the actual canvas drawing step (esp. in Firefox, which in this case is ~3.75x slower than Chrome [latter easily manages 60fps]). The SIMD step could handle magnitude(s) more points per frame, also on FF...</p><p>As an aside, this is now already the 140th (!!!) 
fully documented small example project, bundled as part of the <a href="https://thi.ng/umbrella" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">thi.ng/umbrella</span><span class="invisible"></span></a> monorepo... Please do tell me at which point the prejudice of not having enough starting points &amp; info about these packages will be fading into oblivion... 😅</p><p>Demo:<br><a href="https://demo.thi.ng/umbrella/simd-plot/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">demo.thi.ng/umbrella/simd-plot</span><span class="invisible">/</span></a></p><p>Source:<br><a href="https://github.com/thi-ng/umbrella/tree/develop/examples/simd-plot/src/index.ts" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/thi-ng/umbrella/tre</span><span class="invisible">e/develop/examples/simd-plot/src/index.ts</span></a></p><p>Also big thanks to Maximillian Schulte for sending me off on this topic (as a tangent) via an issue on GitHub... I've been meaning to create more examples for these above packages for a while! Last but not least, hat tip &amp; nerd sniping <span class="h-card" translate="no"><a href="https://mastodon.gamedev.place/@demofox" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>demofox</span></a></span> re: colored noise... 
😎🤩</p><p><a href="https://mastodon.thi.ng/tags/ThingUmbrella" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ThingUmbrella</span></a> <a href="https://mastodon.thi.ng/tags/WebAssembly" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WebAssembly</span></a> <a href="https://mastodon.thi.ng/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> <a href="https://mastodon.thi.ng/tags/SharedMemory" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SharedMemory</span></a> <a href="https://mastodon.thi.ng/tags/DataViz" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DataViz</span></a> <a href="https://mastodon.thi.ng/tags/Noise" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Noise</span></a> <a href="https://mastodon.thi.ng/tags/TypeScript" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TypeScript</span></a> <a href="https://mastodon.thi.ng/tags/JavaScript" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>JavaScript</span></a> <a href="https://mastodon.thi.ng/tags/Tutorial" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Tutorial</span></a></p>
FCLC<p>Hi friends! Very excited to announce that I'll be giving an <span class="h-card" translate="no"><a href="https://mast.hpc.social/@easybuild" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>easybuild</span></a></span> Tech Talk on the 13th of October on <a href="https://mast.hpc.social/tags/AVX10" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX10</span></a>!</p><p>The Talk is titled "AVX10 for HPC:<br>A reasonable solution to the 7 levels of AVX-512 folly" </p><p>Registration is free, all <a href="https://mast.hpc.social/tags/x86" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>x86</span></a>, <a href="https://mast.hpc.social/tags/AVX" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX</span></a>, <a href="https://mast.hpc.social/tags/AVX512" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AVX512</span></a>, <a href="https://mast.hpc.social/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a>, and <a href="https://mast.hpc.social/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a> experience levels welcome!</p><p>The page is here: <a href="https://easybuild.io/tech-talks/008_avx10.html" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">easybuild.io/tech-talks/008_av</span><span class="invisible">x10.html</span></a></p><p>And you can register here! <a href="https://event.ugent.be/registration/ebtechtalk008avx10" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">event.ugent.be/registration/eb</span><span class="invisible">techtalk008avx10</span></a></p>
Karsten Schmidt<p>As I've been updating the build files for my various <a href="https://mastodon.thi.ng/tags/ziglang" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ziglang</span></a> projects &amp; templates, I also learned that quite a few of them have to be overhauled/refactored due to syntax changes and a stricter compiler. One example is this <a href="https://mastodon.thi.ng/tags/WASM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>WASM</span></a> <a href="https://mastodon.thi.ng/tags/voxel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>voxel</span></a> <a href="https://mastodon.thi.ng/tags/renderer" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>renderer</span></a> from 1.5 years ago, which no longer builds without major code updates, but the old build still works:</p><p><a href="https://demo.thi.ng/zig/voxel-trace/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">demo.thi.ng/zig/voxel-trace/</span><span class="invisible"></span></a></p><p>Reload for random views. Press `x` to export the current frame. The renderer is incremental (never finishes) and slowly reduces pixel size from 8 down to 1. It would be much faster, but I had some ideas for creating a more stylistic output and in its current state it only renders a fixed area per frame...</p><p>The 2-bit 512^3 voxel model was generated with a custom fork of <span class="h-card" translate="no"><a href="https://sigmoid.social/@R4_Unit" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>R4_Unit</span></a></span>'s voxel automata... 
🥰</p><p>Other renders &amp; process on my old Twitter:</p><p><a href="https://twitter.com/search?q=from%3A%40toxi+voxel+ziglang" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">twitter.com/search?q=from%3A%4</span><span class="invisible">0toxi+voxel+ziglang</span></a></p><p>Ps. This renderer is heavily using this <a href="https://mastodon.thi.ng/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a> vector library:</p><p><a href="https://github.com/thi-ng/zig-thing/tree/main/vectors" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/thi-ng/zig-thing/tr</span><span class="invisible">ee/main/vectors</span></a></p><p>...and is a rewrite of my 2013 hybrid <a href="https://mastodon.thi.ng/tags/OpenCL" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenCL</span></a> <a href="https://mastodon.thi.ng/tags/Clojure" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Clojure</span></a> voxel renderer:</p><p><a href="https://github.com/thi-ng/raymarchcl" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/thi-ng/raymarchcl</span><span class="invisible"></span></a></p><p><a href="https://mastodon.thi.ng/tags/GenerativeArt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GenerativeArt</span></a> <a href="https://mastodon.thi.ng/tags/ThingUmbrella" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ThingUmbrella</span></a></p>
Nick Doyle<p>An <a href="https://hachyderm.io/tags/introduction" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>introduction</span></a>,</p><p>I'm Nick, a principal software engineer at <a href="https://hachyderm.io/tags/Akamai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Akamai</span></a>. I've been working on Image &amp; Video Manager to make working with images on the <a href="https://hachyderm.io/tags/web" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>web</span></a> fast and easy. I've worked on bits as high as frontend UI and as low as hand written <a href="https://hachyderm.io/tags/ASM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ASM</span></a> and <a href="https://hachyderm.io/tags/SIMD" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SIMD</span></a>. <a href="https://hachyderm.io/tags/webperf" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>webperf</span></a></p><p>I'm enthusiastic about synthesizers, particularly modular synths. I use a large <a href="https://hachyderm.io/tags/Buchla" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Buchla</span></a> clone system in the studio and a <a href="https://hachyderm.io/tags/Eurorack" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Eurorack</span></a> system when I perform live. <a href="https://hachyderm.io/tags/synth" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>synth</span></a> </p><p>I also enjoy ergonomic mechanical keyboards.</p><p>Nice to meet you!</p>