andai 3 months ago

Nice! I'd suggest embedding the simulation in the blog. I had to scroll up and down for a while before finding a link to the actual simulation.

(You might want to pick a value that runs reasonably well on old phones, or have it adjust based on frame rate. Alternatively, just put some links at the top of the article.)

See https://ciechanow.ski/ (very popular on this website) for a world-class example of just how cool it is to embed simulations right in the article.

(Obligatory: back in my day, every website used to embed cool interactive stuff!)

--

Also, I think you can run a particle sim on GPU without WebGPU.

e.g. https://news.ycombinator.com/item?id=19963640

  • w_for_wumbo 3 months ago

    That's one of the best examples of an explanatory blog that I've ever seen. I wish that this would become the standard by which information is shared - if it's worth sharing, it's worth making it easy to understand.

    • Lerc 3 months ago

      I have done a few blog posts with interactive doodads like this. It takes a lot (like really a _lot_) more time to do, but I think it's the right way to go. There is so much noise on the internet caused by people casting their 2 cents into the void.

      Interactive thingywotsits may slow down individuals making posts, but there are a lot of individuals out there.

      • magicalhippo 3 months ago

        Not being a frontend dev, I have no idea how to even start making something like that.

        Are there some frameworks that make interactive simulations like that easier to make, or do you just do it the hard way?

        • interactivecode 3 months ago

          The "hard" way is often the simple way with these sort of things. What makes it easier is while building out your code, you make little pieces of UI to visualize what you're doing. Think of them like unit tests or test driven development. Then you can take those, clean them up a little and publish them.

        • jasonjmcghee 3 months ago

          p5.js is a great medium. I did a short series in this style - you can inspect it to see full source (non minified / obfuscated) with some comments here and there.

          https://jason.today/falling-sand

          • rustystump 3 months ago

            This is an amazing series. Love the style and incremental examples.

          • magicalhippo 3 months ago

            Excellent, that does look quite approachable indeed. Thanks!

  • kragen 3 months ago

    wow, that fluid sim is astounding

  • rustystump 3 months ago

    I do agree about embedding. I thought about embedding each version but was worried about having too many workers all going at once. I'll update the article to include the final version embedded at the end. Thanks for the feedback.

    That blog is amazing. Each example is so polished. I love it.

    edit: I tried adding an embedded version but the required headers didn't play well with other embeds. The older versions are all still stuck in codesandboxes.

  • bahmboo 3 months ago

    "Skip to the end to play around with the final app."

jekude 3 months ago

Demo on mobile [0], pretty incredible to play with.

[0] https://dgerrells.com/sabby

  • jerbear4328 3 months ago

    Woah, it works with multiple fingers! This is wild for pure JS. Interestingly, more fingers means more lag, I guess more stuff being sent between threads.

  • bloopernova 3 months ago

    Wow, looks strangely organic, like lipid structures in primordial ooze.

  • ruined 3 months ago

    on my phone firefox outperforms chrome! that's satisfying

franciscop 3 months ago

Random question (genuine, I do not know if it's possible):

> I decided to have each particle be represented by 4 numbers an x, y, dx, and dy. These will each be 32-bit floating point numbers.

Would it be possible to encode this data into a single JS number (a 53-bit integer, given that MAX_SAFE_INTEGER is 2^53 - 1 = 9,007,199,254,740,991)? Or into the -3.4e38 to 3.4e38 range of the Float32Array used in the blog?

For example, I understand for the screen position you might have a 1000x1000 canvas, which can be represented with 0-1,000,000 numbers. Even if we add 10 sub-pixel divisions, that's still 100,000,000, which still fits very comfortably within JS.

Similar for speed (dx, dy), I see you are doing "(Math.random()*2-1)*10" for calculating the speed, which should go from -10 to +10 with arbitrary decimal accuracy, but I wonder if limiting it to 1 decimal would be enough for the simulation, which would be [-10.0, +10.0] and can also be converted to the -100 to +100 range in integers. That is 201 values per axis, so 201 * 201 = 40,401 numbers are needed to represent all of the possible values.

If you put both of those together, that gives 40,401 * 100,000,000 = 4,040,100,000,000 (about 4T) numbers needed to represent the particles, which still fits within JS' MAX_SAFE_INTEGER. So it seems you might be able to fit all of the data for a single particle within a single MAX_SAFE_INTEGER or a single Float32Array element? Then you don't need the stride and can be a lot more sure about data consistency.

It might be that the encoding/decoding of the data into a single number is slower than the savings in memory and so it's totally not worth it though, which I don't know.
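
A rough sketch of the packing idea above, under the ranges assumed in this comment (1000x1000 canvas, 10 sub-pixel steps per axis, velocities in [-10.0, +10.0] at one decimal, i.e. 201 integer steps per component). The names `quantize`, `packParticle` and `unpackParticle` are hypothetical and not from the article. One caveat: a Float32Array element only holds integers exactly up to 2^24, so the sketch assumes plain JS numbers (or a Float64Array) as storage.

```javascript
// Hypothetical helpers for illustration; not the article's code.
const POS_STEPS = 10000; // 1000 px * 10 sub-pixel steps per axis
const VEL_STEPS = 201;   // integer velocities -100..+100 inclusive

// Scale, round to the nearest integer step, and clamp into [min, max].
function quantize(value, scale, min, max) {
  const q = Math.round(value * scale);
  return Math.min(max, Math.max(min, q));
}

function packParticle(x, y, dx, dy) {
  const xi = quantize(x, 10, 0, POS_STEPS - 1);
  const yi = quantize(y, 10, 0, POS_STEPS - 1);
  const dxi = quantize(dx, 10, -100, 100) + 100; // shift to 0..200
  const dyi = quantize(dy, 10, -100, 100) + 100;
  // Mixed-radix encoding with plain arithmetic: bitwise operators
  // would truncate the value to 32 bits.
  return ((xi * POS_STEPS + yi) * VEL_STEPS + dxi) * VEL_STEPS + dyi;
}

function unpackParticle(packed) {
  const dyi = packed % VEL_STEPS;
  let rest = Math.floor(packed / VEL_STEPS);
  const dxi = rest % VEL_STEPS;
  rest = Math.floor(rest / VEL_STEPS);
  const yi = rest % POS_STEPS;
  const xi = Math.floor(rest / POS_STEPS);
  return { x: xi / 10, y: yi / 10, dx: (dxi - 100) / 10, dy: (dyi - 100) / 10 };
}

const packed = packParticle(123.4, 567.8, -3.2, 9.9);
console.log(unpackParticle(packed));
console.log(Number.isSafeInteger(packed));   // fits in a JS number
console.log(Math.fround(packed) === packed); // whether a 32-bit float keeps it exactly
```

Every update would pay for the divisions and modulo operations in `unpackParticle`, which is the encode/decode cost the comment is unsure about; whether that beats a strided Float32Array would need measuring.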

codelikeawolf 3 months ago

This is really awesome!

I did have a question about this:

> Javascript does support an Atomics API but it uses promises which are gross. Eww sick.

With the exception of waitAsync[1], the Atomics APIs don't appear to use promises. I've used Atomics before and never needed to mess with any async/promise code. Is it using promises behind the scenes or is there something else I'm missing?

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
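
A small sketch of the synchronous calls in question (an illustration only, not code from the article; the 4-slot counter array is an arbitrary choice). Each call below returns a plain number, with no promise involved:

```javascript
// Illustrative only: a 4-slot Int32Array backed by shared memory.
const sab = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);
const counters = new Int32Array(sab);

const before = Atomics.add(counters, 0, 5);                  // returns the previous value
const now = Atomics.load(counters, 0);                       // reads the current value
const swapped = Atomics.compareExchange(counters, 0, 5, 42); // returns the value it found
Atomics.store(counters, 1, 7);

// None of the results are promises.
const anyPromise = [before, now, swapped].some((v) => v instanceof Promise);
console.log(before, now, swapped, Atomics.load(counters, 0), anyPromise);
```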

Edit: formatting

dado3212 3 months ago

The videos look awesome but the "try it out here" codesandbox links don't work for me on MacOS Chrome desktop. I get 'Uncaught ReferenceError: SharedArrayBuffer is not defined' and some CORS errors: 'ERR_BLOCKED_BY_RESPONSE.NotSameOriginAfterDefaultedToSameOriginByCoep'.
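
For context, both messages relate to cross-origin isolation: current browsers only expose SharedArrayBuffer when the page is served with the two response headers below. A minimal sketch, assuming a hypothetical handler named `makeIsolatedHandler` and an arbitrary port (this is not the article's server script):

```javascript
// Response headers that make a page cross-origin isolated,
// which is what exposes SharedArrayBuffer in current browsers.
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Hypothetical request handler: serves one HTML string with the headers attached.
function makeIsolatedHandler(html) {
  return (req, res) => {
    res.writeHead(200, {
      'Content-Type': 'text/html; charset=utf-8',
      ...ISOLATION_HEADERS,
    });
    res.end(html);
  };
}

// Usage with Node's built-in http module (port 8080 is an arbitrary choice):
//   require('node:http').createServer(makeIsolatedHandler('<p>demo</p>')).listen(8080);
// In the served page, `self.crossOriginIsolated` reports whether isolation took effect.
```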

  • rustystump 3 months ago

    You have to open the previews in a dedicated tab as codesandbox's inline editor blocks the header from being set. It also may get blocked if you are using a privacy focused browser.

    I'll try to include embedded examples in the future.

    You can try out the final version here https://dgerrells.com/sabby

    • shash7 3 months ago

      Whoa this is like fine sand. Amazing!

edweis 3 months ago

Marvelous. I spent an hour understanding the code and playing with it. Here is a live implementation: https://particules.kapochamo.com/index.html

  • everyos_ 3 months ago

    When I first opened this, I was stunned! It is really pretty, and I can't believe there are so many simulated particles in JS!

  • rustystump 3 months ago

    Awesome! I am impressed you dug that deep into the code.

  • cowboylowrez 3 months ago

    thanks for that, i was missing out with the desktop!

hereforcomments 3 months ago

Oh, man, can't wait to send it to the UI team who write dead slow React apps. JS is blazing fast. Especially if written well.

  • jsheard 3 months ago

    The problem is that idiomatic JS and blazing fast JS are diametrically opposed to each other, in practice the latter is more like a bad C dialect. You're not allowed to allocate GC objects in fast JS but the language doesn't have good non-allocating alternatives. Nobody is actually going to make a complex JS app where all memory allocations are pointers into a giant ArrayBuffer, it's easier to just switch to WebAssembly at that point.

    • dgb23 3 months ago

      If JS had typed structs (like they have typed arrays) it would definitely be more convenient.

      However, that's not where the problem starts. A lot of web sites are slow because they simply run too much code that doesn't need running in the first place and allocates objects that don't need to be allocated.

      We don't need lower level constructs if we can simply start by removing cruft and be more wary of adding it. Go back to KISS/YAGNI.

  • diggan 3 months ago

    JavaScript is probably the language that has seen the most human-hours spent on optimizations for the various engines.

    Too bad we can't just rely on JS only and have to involve a bunch of DOM operations, which is usually the slow part of the UIs we create.

    • lukan 3 months ago

      "Too bad we cant just rely on JS only and have to involve a bunch of DOM operations, which is usually the slow part of the UIs we create"

      No? With WebGL and soon WebGPU, or in this case here with writing to an image buffer and just passing that to canvas, you haven't had to use the DOM for quite a while.

      (but then you don't get all the nice things html offers, like displaying and styling text etc)

      • diggan 3 months ago

        + built-in accessibility + extensions that do something with the DOM + ...

        In reality, you're right, there are alternatives, but for basic web documents, it kind of hurts more than it helps to use them.

tired_and_awake 3 months ago

Seriously impressive engineering OP, thanks for the awesome writeup too. Looks like you've got a ton of fans now, well earned!

hopfog 3 months ago

Great article and very relevant for me since I'm building a game in JavaScript based on "falling sand" physics, which is all about simulating massive amounts of particles (think Noita meets Factorio - feel free to wishlist if you think it sounds interesting).

My custom engine is built on a very similar solution using SharedArrayBuffers but there are still many things in this article that I'm eager to try, so thanks!

purple-leafy 3 months ago

Such a clever fellow.

How does one get this good with understanding hardware level details like L1 caches and the software implications?

I graduated as an Electrical Engineer, then moved into software for my career. Feel like I’m missing some skills.

Specifically how can I better understand and use:

- the Chrome profiler? It’s scary to me currently.
- Graphics programming
- Optimisations?

  • tripzilch 3 months ago

    About caches, the main important thing is to know they exist. Which you do know now :) The general idea of cache is exactly how he explains it in the article, and is useful to know about as a general concept. Note that the very hardware specific bit of info that the M1 chip has a "chungus big" cache is not mentioned until very late in the article, which I didn't know yet either.

    I'm not super skilled at the chrome profiler either, it seems to be suited better for certain tasks than others, but I might just be doing it wrong ...

int0x29 3 months ago

Might want a strobe warning. At least for Firefox and Chromium in Linux on a desktop it strobes heavily in the starting state.

  • rustystump 3 months ago

    It depends on the display type. When run on something with low per pixel lighting it can flicker a bit due to how quickly the average light changes frame to frame. Anything with local dim zones may struggle. I looked at ways to fix this but could not come up with anything other than running a blur filter, which ends up looking terrible.

    • lukan 3 months ago

      "When run on something with low per pixel lighting it can flicker a bit due to how quickly the average light changes frame to frame"

      Not sure I understand. So the flicker is not due to the screen sometimes being drawn white (like I assumed), but just because of my mobile's light settings?

      Other simulations similar to this don't have this flicker on my device.

      (still impressive work, genuine question so I can avoid this effect in my own experiments)

      And no matter the technical reasons, for some people this might be a serious health issue, so a warning might make sense in the current state.

llmblockchain 3 months ago

Is the code available somewhere? I'd like to see the full code and run locally. It looks like the code sandbox isn't working anymore.

Seb-C 3 months ago

Nice article.

I did a somewhat similar experiment a while ago and managed to fit quite a lot of particles with a basic physics simulation.

https://github.com/Seb-C/gravity

pdsouza 3 months ago

Love this. Enjoyed riding your train of thought from challenge conception through each performance pass to the final form. Surprisingly fun to play around with this sim too. Looking forward to more posts!

iEchoic 3 months ago

Very cool, thank you for sharing.

Has anyone done similar experimentation and/or benchmarking on using webgpu for neural nets in JS?

thomasfromcdnjs 3 months ago

Inspiring tutorial!

Does anyone know why/how it maintains state if you tab out? Does Chrome eventually try to clean up the cache or is it locked in?

  • lukan 3 months ago

    Usually inactive tabs are just paused and their state saved.

  • SamBam 3 months ago

    requestAnimationFrame won't fire while you're tabbed out.

itvision 3 months ago

I've saved it to Web Archive just in case, sadly it doesn't work that way.

lbj 3 months ago

Anyone else having trouble with that web vscode he's using?

  • aap_ 3 months ago

    Yeah, no idea how to run the code. There are links to the final demo at the end, but everything else just links to this editor :/

    • rustystump 3 months ago

      This was prototyped on codesandbox before they nuked their product. Each link goes to a specific version which you can test by running bun http.ts in the terminal which serves the content. I updated the article to include this info.

      In the future I will keep everything self hosted to avoid this issue. I appreciate the patience.

    • pkilgore 3 months ago

      For security reasons you cannot use some of the features in this code without setting a specific header (the blog mentions this).

      The sandbox has a button that's basically "Open a preview in a separate tab". If you click that, the header will be sent, and the demo will work.

      If you only use the "in-editor" preview, the proper header will not be sent.

      Agree not intuitive. Hope it helps, it was a super cool demo.

a-dub 3 months ago

so when do we get WebBLAS and WebFORTRAN?

kinda joking, kinda not.

  • shakna 3 months ago

    Actively in-progress, actually. [0] Since about 2016.

    [0] https://gws.phd/posts/fortran_wasm/

    • a-dub 3 months ago

      there's also numpy and scipy in the webassembly python distro (pyodide). but the "kinda not" part more refers to first class scientific/numerical computing support. it's possible, but the libraries are all disjoint or are webassembly ports, etc.

      • shakna 3 months ago

        Pyodide uses f2c for that, as mentioned in the link, but it isn't great, and barely works. You won't get the expected speed out of BLAS that way.

        Which is why the flang port it's about is attempting to compile to the actual primitives.

        • a-dub 3 months ago

          i wonder if simd is working. that would be cool.

          • shakna 3 months ago

            If I understand the build process correctly... It should be, on systems that support WebAssembly SIMD, like Chrome's V8.

            • a-dub 3 months ago

              yes, but does it actually work end to end and actually deliver meaningful speedups that make it actually useful?

kragen 3 months ago

super cool! i'm thinking webgpu might be usable for a speedup, not sure if webgl would be

  • mandarax8 3 months ago

    A WebGL transform feedback shader would be 100% as performant as what you could write in WebGPU for this use case (independent particle updates).

    • kragen 3 months ago

      thanks! is there a minimal example you'd recommend looking at?

randall 3 months ago

super helpful!!! thanks for this!!