Sam Thorogood

Make Async Methods Sync in Node.js

Myles pointed out that mucking with the event loop, by making async methods sync, in Node.js is a recipe for z̴̗̒͆a̸̘͋̕l̸̜͑̐g̵͕̹̈o̶̪̊͗. And he's right, so we shouldn't do it.

But… what if we could? 😈

This is now a library called syncingabout. ➡️📚

Background

Very broadly¹, JS runs using an event loop. You can think of a script that runs at startup as just code that is pushed into that loop—when you run node foo.js, the contents of "foo.js" are loaded and run in the very first event of the program. Our foo program can then enqueue more events, like callbacks from setTimeout or something passed to somePromise.then(...).

If any individual run of that loop blocks and never finishes, no further enqueued events will ever run. This is what happens if you write while (true) {}: your program just runs forever (not to mention burns your CPU 🔥), and even if you had asked to, say, fetch from the network, that request would never be able to complete.
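Here's a small sketch of that starvation, using a 50ms busy loop rather than a real infinite one so the script still finishes. The names (order, end) are just for illustration.

```javascript
// Records the order in which things happen.
const order = [];

// Enqueue a new event. Its callback can't run until the current event finishes.
setTimeout(() => order.push('timeout'), 0);

// Block the current event for roughly 50ms with a busy loop (this burns CPU).
const end = Date.now() + 50;
while (Date.now() < end) {}

// Even though the timer was due long ago, it has not fired yet.
order.push('end of script');
console.info(order);
```

The timeout callback only gets its turn once the script's own event has returned to the loop.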

¹yes this is probably wrong but it is right enough for now

Atomics

But we have a new primitive in JS: Atomics, which lets code coordinate across multiple JS threads. It gives us a method, Atomics.wait, which blocks until some future condition, and doesn't burn any CPU doing so. (This isn't a great post about Atomics; rather, it's… One Weird Trick.)

Regardless, we use it like this:

const shared = new SharedArrayBuffer(4);
const int32 = new Int32Array(shared);
Atomics.wait(int32, 0, 0);  // blocks until notified at int32[0]

Go ahead, run that code inside Node (not your browser). It'll run forever.
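If you'd rather not kill the process by hand, Atomics.wait also takes an optional timeout in milliseconds as its fourth argument, and it tells you why it returned. A minimal sketch (the variable names are mine):

```javascript
const shared = new SharedArrayBuffer(4);
const int32 = new Int32Array(shared);

// int32[0] is 0 and nobody will notify us, so this gives up after 100ms.
const timedOut = Atomics.wait(int32, 0, 0, 100);

// If the value is not the one we said we expect (0), wait returns immediately.
Atomics.store(int32, 0, 1);
const notEqual = Atomics.wait(int32, 0, 0, 100);

console.info(timedOut, notEqual);
```

The third possible return value is 'ok', which is what you get when another thread actually wakes you up with Atomics.notify.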

So what?

Let's do something with this. The key part here is that our SharedArrayBuffer can be shared with another thread. By starting a new thread to complete a task, we can unblock the main thread when it's done. You could build "main.js" like this:

// main.js
import { Worker } from 'worker_threads';

const shared = new SharedArrayBuffer(4);

// send the shared buffer to the Worker
const w = new Worker('./task.js', {workerData: shared});

const int32 = new Int32Array(shared);
Atomics.wait(int32, 0, 0);  // blocks until notified at int32[0]
console.info('done');

And build "task.js":

// task.js
import { workerData as shared } from 'worker_threads';
import fetch from 'node-fetch';

console.warn('fetching');
await fetch('https://samthor.au').then((r) => r.text());
console.info('fetched a blog');

const int32 = new Int32Array(shared);
Atomics.notify(int32, 0);

Note that "main.js" has no async or Promise code whatsoever, yet still is able to complete an asynchronous fetch to the network via its friend "task.js".

There's also nothing special about "main.js", or that we're running the code at the top-level of the program, or that the example is in ESM. This works inside functions, in CJS, or anywhere you can write code.

🎉
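One thing the example glosses over: if the Worker ever called Atomics.notify before the parent reached Atomics.wait, the wake-up would be lost and the parent would block forever. A hedged sketch of a more defensive variant is to have the task store a non-zero value before notifying, so a late wait returns 'not-equal' straight away. The worker body is inlined as a string with eval: true purely so this runs as one file, the setTimeout stands in for real async work, and runTaskSync is a made-up name.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical worker body, inlined so the sketch is a single file.
const taskSource = `
  const { workerData: shared } = require('node:worker_threads');
  const int32 = new Int32Array(shared);
  setTimeout(() => {
    Atomics.store(int32, 0, 1);  // flip the flag first...
    Atomics.notify(int32, 0);    // ...then wake any waiter
  }, 20);
`;

function runTaskSync() {
  const shared = new SharedArrayBuffer(4);
  const int32 = new Int32Array(shared);
  const worker = new Worker(taskSource, { eval: true, workerData: shared });

  // 'ok' if we were woken, 'not-equal' if the flag was already set,
  // 'timed-out' if the task took longer than 5 seconds.
  const status = Atomics.wait(int32, 0, 0, 5000);
  worker.unref();  // don't let the worker keep the process alive
  return status;
}

console.info(runTaskSync());
```

The timeout is a safety net so a broken task can't hang the parent indefinitely; pick whatever limit suits the work.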

Limitations, parameters & return values

Doing async work synchronously is great, but… has some interesting limitations. You might notice that we don't get any response data inside "main.js", which is probably something nice to get back when you're doing a network fetch.

We can send data back up to the parent, or send arguments to the task, but only things that survive the structured clone algorithm (or that are Transferable, like a MessagePort or an ArrayBuffer). That covers primitives, plain objects, arrays, Maps, Sets and a handful of other built-ins, and maybe more objects in Node.js. You won't be passing complex classes or functions around without some work.
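To see what "without some work" means, here's a rough sketch that pushes a class instance and a function through a MessageChannel on a single thread. Point is a hypothetical class that exists only for this demo.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

// Hypothetical class, used only to show what cloning does to it.
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  length() { return Math.hypot(this.x, this.y); }
}

const { port1, port2 } = new MessageChannel();

// A class instance arrives as a plain object: the data survives, the methods don't.
port1.postMessage(new Point(3, 4));
const { message: copy } = receiveMessageOnPort(port2);
const keptData = copy.x === 3 && copy.y === 4;
const keptClass = copy instanceof Point;

// Functions can't be cloned at all, so posting one throws.
let functionError = null;
try {
  port1.postMessage(() => {});
} catch (e) {
  functionError = e.name;
}

port1.close();
port2.close();
console.info({ keptData, keptClass, functionError });
```

If you need the class back on the other side, one option is to send the plain fields and rebuild the instance yourself after receiving.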

Demo returning the fetch result

Let's get back the contents of our blog (and assume the program always wants to fetch it). This is a bit awkward, but we can update the programs to look like this:

// main.js
import { Worker, MessageChannel, receiveMessageOnPort } from 'worker_threads';
const { port1: localPort, port2: workerPort } = new MessageChannel();

const shared = new SharedArrayBuffer(4);

// send the shared buffer and port to the Worker
const workerData = { shared, port: workerPort };
const w = new Worker('./task.js', { workerData, transferList: [workerPort] });

const int32 = new Int32Array(shared);
Atomics.wait(int32, 0, 0);  // blocks until notified at int32[0]

// receiveMessageOnPort returns { message: ... }, or undefined if nothing is queued
const reply = receiveMessageOnPort(localPort);
console.warn('got fetch data', reply?.message);

And "task.js", which now needs to send data back:

// task.js
import { workerData } from 'worker_threads';
import fetch from 'node-fetch';

const { shared, port } = workerData;

console.warn('fetching');
const text = await fetch('https://samthor.au').then((r) => r.text());
console.info('fetched a blog');

port.postMessage(text);

const int32 = new Int32Array(shared);
Atomics.notify(int32, 0);

As long as we post the reply to the port before notifying via Atomics, it'll be available to the parent—synchronously.

(For those of you in the know, you might be aware that Worker itself supports message passing directly: it has an implicit port that the worker thread can access via parentPort. However, the synchronous receiveMessageOnPort only operates on a MessagePort, not on a Worker, so we can't use it to get the waiting response.)
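Putting the pieces together, here's a single-file sketch of the whole round trip wrapped in an ordinary synchronous function. It is not how any particular library does it. The worker is inlined with eval: true, a timer plus toUpperCase stands in for the real fetch, and upperCaseSync, taskSource and the input field are names I made up for the demo.

```javascript
const { Worker, MessageChannel, receiveMessageOnPort } = require('node:worker_threads');

// Hypothetical worker body, inlined so the sketch is a single file.
const taskSource = `
  const { workerData } = require('node:worker_threads');
  const { shared, port, input } = workerData;

  // Stand-in for real async work such as a network fetch.
  const work = new Promise((resolve) => {
    setTimeout(() => resolve(input.toUpperCase()), 20);
  });

  work.then((value) => {
    port.postMessage(value);  // post the reply first...
    const int32 = new Int32Array(shared);
    Atomics.store(int32, 0, 1);
    Atomics.notify(int32, 0);  // ...then wake the parent
  });
`;

function upperCaseSync(input) {
  const { port1: localPort, port2: workerPort } = new MessageChannel();
  const shared = new SharedArrayBuffer(4);
  const worker = new Worker(taskSource, {
    eval: true,
    workerData: { shared, port: workerPort, input },
    transferList: [workerPort],
  });

  // Block until the worker flips the flag, or give up after 5 seconds.
  const int32 = new Int32Array(shared);
  const status = Atomics.wait(int32, 0, 0, 5000);

  const reply = receiveMessageOnPort(localPort);
  localPort.close();
  worker.unref();

  if (status === 'timed-out' || reply === undefined) {
    throw new Error('task did not reply in time');
  }
  return reply.message;
}

console.info(upperCaseSync('hello'));
```

Callers of upperCaseSync never see a Promise, which is the whole trick. The cost is a fresh Worker per call, which is fine for a demo but probably worth pooling if you call it a lot.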

So, ah,… should I use this idiom?

Maybe! I think it's interesting but probably has limited utility, and that utility is probably mostly in:

There are some alternatives: deasync does basically this, but it needs a compile step (every Node developer's worst nightmare 👻) and has worse issues around timing & CPU usage. On the other hand, it saves you from using Transferable.

This is now a library called syncingabout. Is this useful? Let me know—@-me or whatever. Bye!

Last Words

Wait, I have more.

…these are rants for another post. Bye for real! 👋