Compiling JavaScript ahead-of-time

Existing JS engines today either interpret or compile just-in-time; but what about ahead-of-time? It has been long theorised and dismissed but until recently never produced. Porffor is a new JS engine which compiles JS ahead-of-time to WebAssembly or native. Join to learn more about it!

Oliver Medhurst

Oliver is the creator of the new Porffor JS engine; also an active participant in web standards as co-chair of WinterCG and a TC39 invited expert.

Transcript

Yeah, so I'm going to talk about compiling JavaScript ahead of time.

I'm the creator of Porffor, which you'll see shortly.

I'm also the co-chair of WinterCG, which is working on standardizing APIs across runtimes like Node.js

or Deno, so that you don't have to write code against just one runtime's APIs, which doesn't work

across them.

I'm also an invited expert at TC39, which works on improving the language: they're the ones who actually

write the specification for it.

So how is JavaScript actually run in an engine? Traditionally, it's with interpretation.

The most famous example of this currently is QuickJS, which is a relatively niche engine,

but it's used in various places, especially where you have resource constraints and

can't run a just-in-time compiler. JIT compilation is what engines like V8 or JavaScriptCore

are famous for: they compile JavaScript source code into machine code as it's running,

and they collect data on the JS as it runs to produce optimized machine code, which

is very good for performance.

But there is a downside of that, which is you have to compile that code on the user's

machine.

It's not as if you're writing, say, C++ and just shipping a binary to your users.

So the problem is that the longer you take to compile, the faster the code runs, but then you also have that

initial loading time, which is why most JS engines nowadays have multiple JITs:

there will be one which is really fast to compile, but produces pretty slow code.

And then there's one which is slow to compile, which only runs once the code gets warmed up,

which is terminology you've probably heard before: the more the code

gets run, the more it will be optimized in the background.
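
The warm-up idea can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `makeTiered`, `HOT_THRESHOLD`, and the two implementations are hypothetical names, and real engines decide when to tier up from profiling data rather than a simple call counter.

```javascript
// Hypothetical sketch of tiering: start with a cheap "baseline" version and
// switch to an "optimized" version once the function has been called enough.
const HOT_THRESHOLD = 3; // assumed threshold, chosen only for this demo

function makeTiered(baseline, optimize) {
  let calls = 0;
  let current = baseline;
  let tier = "baseline";
  const fn = (...args) => {
    calls++;
    if (tier === "baseline" && calls > HOT_THRESHOLD) {
      current = optimize(); // the expensive compile only happens for hot code
      tier = "optimized";
    }
    return current(...args);
  };
  fn.tier = () => tier;
  return fn;
}

// Both versions compute 0 + 1 + ... + (n - 1).
const sumTo = makeTiered(
  (n) => { let s = 0; for (let i = 0; i < n; i++) s += i; return s; },
  () => (n) => (n * (n - 1)) / 2
);

const results = [];
for (let i = 0; i < 5; i++) results.push(sumTo(10));
console.log(results, sumTo.tier());
```

The first three calls run the slow loop; from the fourth call on, the closed-form version is used, which mirrors the "fast to compile first, optimize later" trade-off described above.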

And the other concern is security, which can be not great.

Many JS engine security vulnerabilities are via JITs or some of the optimizations they

do.

So ahead-of-time compilation is what other languages do, say C++ or Rust: you give the source to

a tool on your machine, the developer's machine, ahead of time, and then that makes

a binary, whether it's native or WebAssembly, which you then ship, for example over a website.

This fixes those trade-offs I talked about, where you're not worried about a long compile

time because that's on the developer's machine, not the user's.

And security is not a concern because if you compile to, say, WebAssembly, that is sandboxed.

You can control all the input and output from it, rather than, say, native binaries or just-in-time

compiling where that's just machine code and you have no control over it.

You can try and sandbox it, but you'll only get so far.
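
To make the sandboxing point concrete, here is a minimal sketch using the standard WebAssembly JavaScript API. The module bytes are a hand-assembled example (a single exported `add` function over 32-bit integers), not Porffor output. The point is only that a module can reach nothing except what the host passes in as imports, and here it is given none.

```javascript
// Hand-written Wasm module (an assumption for illustration, not compiler output):
// (func (export "add") (param i32 i32) (result i32)
//   local.get 0  local.get 1  i32.add)
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // code body
]);

const wasmModule = new WebAssembly.Module(bytes);

// The import object is empty: the module receives no host functions at all.
const instance = new WebAssembly.Instance(wasmModule, {});

// The only way in or out is through the declared imports and exports.
const importCount = WebAssembly.Module.imports(wasmModule).length;
const exportNames = Object.keys(instance.exports);
const sum = instance.exports.add(2, 3);
console.log(importCount, exportNames, sum);
```

Anything the module should be able to do beyond pure computation, such as printing or reading input, would have to be handed to it explicitly through that import object.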

So you might be wondering, does something exist like this for JavaScript?

And the answer is Porffor, which I make and which I'm here to talk about.

I want to do a live demo of it.

So you have a basic Hello World.

Oh.

I might zoom in a little bit.

Can everyone see that?

Cool.

So you just have a basic Hello World binary.

So you could run this with, say, Node.js, and it works as expected.

Or you can run this with Porffor, which acts...

Oh.

I'm too slow.

But then you can, say, compile it to a native binary, which takes a second, and then it behaves

just the same.

But it's a native binary and it's only 44 kilobytes.

Some runtimes let you do this today, but they just bundle their entire runtime because they

don't do ahead of time compiling, and that ends up being, well, like 90 megabytes, which

is not great for storage constraints.

Additionally...

Yeah, yeah.

That's like...

Oh, my slide's not on right.

But yeah.

So you can do anything you could usually do in a JS REPL, where you can just do anything

you want, like, say, promises.

They just work.

So it has a small event loop where you can just do promises or say dates.

Promises are still a work in progress, I will disclaim.

But yeah, so say Date, or even the not-so-nice JavaScript type coercion.

So you have this.

Oh.

Yeah.

So you can even...

So as you can see, when it does the string minus one, that turns into NaN because

the string is not a number.

But it even has those dynamic type conversions, which JavaScript is infamous for.
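
For readers without the slides, the kind of coercion shown in the demo looks like the following. The exact strings typed on stage are not in the transcript, so the values here are assumptions chosen to illustrate the rule.

```javascript
// Subtraction converts both operands to numbers first.
const a = "abc" - 1; // "abc" is not numeric, so the result is NaN
const b = "5" - 1;   // "5" converts to 5, so the result is 4

// Addition with a string operand concatenates instead.
const c = "5" + 1;   // the number is converted to a string, giving "51"

console.log(a, b, c);
```

An ahead-of-time compiler has to reproduce all three behaviours exactly, even though none of the operand types are declared in the source.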

Yeah.

Or anything like that.

So yeah.

To briefly dive into the architecture: at an oversimplified level, it just takes

in JavaScript source code, gives that to a parser, which I don't make because I don't

want to deal with parsing JS.

And then it turns that into WebAssembly.

A less simplified view: you take that WebAssembly and run an optimizer over it, and there are also built-ins,

which are written in TypeScript, and those are compiled ahead of time as well,

to hopefully keep the compile cost small.

And you can also optionally take that WebAssembly code and compile it to C, then compile

that to native with something like Clang or GCC.
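
The parse-then-generate shape of that pipeline can be sketched with a toy example. Everything here is an assumption for illustration: the AST is hand-built in the ESTree style that common JS parsers produce, and `generate` and `OPS` are hypothetical names. The talk does not show Porffor's actual code generator, and a real one handles far more than arithmetic.

```javascript
// Hand-built ESTree-style AST for the expression (1 + 2) * 3.
// In a real pipeline this would come from an off-the-shelf parser.
const ast = {
  type: "BinaryExpression",
  operator: "*",
  left: {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Literal", value: 1 },
    right: { type: "Literal", value: 2 },
  },
  right: { type: "Literal", value: 3 },
};

// Map JS operators to WebAssembly text-format mnemonics for 64-bit floats.
const OPS = { "+": "f64.add", "-": "f64.sub", "*": "f64.mul", "/": "f64.div" };

// Hypothetical code generator: walks the tree and emits stack-machine
// instructions, operands first and the operator last.
function generate(node) {
  switch (node.type) {
    case "Literal":
      return [`f64.const ${node.value}`];
    case "BinaryExpression": {
      const op = OPS[node.operator];
      if (!op) throw new Error(`unsupported operator: ${node.operator}`);
      return [...generate(node.left), ...generate(node.right), op];
    }
    default:
      throw new Error(`unsupported node type: ${node.type}`);
  }
}

const instructions = generate(ast);
console.log(instructions.join("\n"));
```

Because WebAssembly is a stack machine, a post-order walk of the tree already yields a valid instruction order, which is part of why it is a convenient compile target.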

In an unsimplified view, you can see the bulk of this is optimization, which is a big goal

for the project.

Even getting something as fast as today's JS engines ahead of time is hard.

So I try and have multiple approaches, and you can kind of pick and choose what works

best for your project, rather than just having one bulk massive optimizer.

So Test262 is the, I guess, industry-standard test suite, which is officially made by the

people who make the JS specification. It has tests for every single feature in the language,

which engines are tested against.

And I currently pass over 50% of that, which is pretty nice.

And yeah, you can see the correlation with funding, because for about a year, this was a project

I was doing in my free time, and then I got funding to work on it full time.

And you can see the correlation of actually having the funding to do something.

And yeah, another bonus is that tiny native compilation.

Compare this with something like Deno or Bun, which just bundle their entire runtime.

And yeah, you can also take in TypeScript as the native input.

Like if I go back to this demo.

If you pass -t, then you could, say, do let a: number = 2, and you can just give

it TypeScript.

So it's good.

You don't need Babel or any transpiler.

And yeah, the type annotations are nice for optimization sometimes, but the compiler can also infer types, which can sometimes

work well, and sometimes be more troublesome.

Like to go deeper into it, if you look at the actual WebAssembly outputted by the compiler,

say you just do, oh, I guess this time it won't work, because I made the optimizer too

smart.

But if you say do this, maybe it won't work.

Yeah, so since I made these slides, I updated my optimizer to actually optimize this.

But essentially, if you do, say, let a = 1 and let b = 2 and then add them, if you

don't infer types, you will have no idea what a or b are.

But once you infer them, you know that and can just do, like, an integer addition for that

instruction.

Otherwise it's like, oh, are any of these strings?

It has to check for that and handle the entire string concatenation case.
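
The difference can be sketched in JavaScript itself. This is a simplified illustration, not the compiler's output: `genericAdd` and `intAdd` are hypothetical names, and the full rules for `+` also cover objects, BigInt, and other cases left out here.

```javascript
// Without type information, `a + b` must follow the generic rules:
// check for strings first, otherwise fall back to numeric addition.
function genericAdd(a, b) {
  if (typeof a === "string" || typeof b === "string") {
    return String(a) + String(b); // string concatenation path
  }
  return Number(a) + Number(b); // numeric path
}

// With both operands known to be integers, the checks can be dropped.
// `| 0` mimics a 32-bit integer add such as WebAssembly's i32.add.
function intAdd(a, b) {
  return (a + b) | 0;
}

const mixed = genericAdd("1", 2); // takes the string path
const plain = genericAdd(1, 2);   // takes the numeric path after the checks
const typed = intAdd(1, 2);       // no checks at all
console.log(mixed, plain, typed);
```

With inferred types, the call site for `let a = 1; let b = 2; a + b` can use the second form directly, so the type checks and the string path never appear in the generated code.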

So the timeline for this, right now I would still say pre-alpha, in which you can use

it, but I have disclaimers everywhere saying expect this not to work.

But I hope next year to have it in a state where I'm comfortable with people using it

in an early state.

I hope it's at least now promising where you can see the potential of a new approach to

compiling JavaScript.

So yes, if people want to shout something, I can try and run it.

How would I go about running this?

Okay, so I think I didn't touch upon this.

This is written in JavaScript.

So you can just npm install.

And then you can just run it.

How would something like input/output work with this?

Right now that's one of the constraints of the...

So currently I don't have async I/O or anything, because the event loop is really primitive,

just enough to get promises working, basically.

But it is definitely possible in WebAssembly, it's just a matter of time essentially.

It's just like I haven't...

I'm focusing on the core engine rather than runtime-y stuff.

But it's definitely possible.

So in terms of say in the Unix world of piping something into a...

Say you compile something into a native binary, at the moment you can't read from standard

in or...

Yeah, so you can, but you can't stream it.

You have to read and write synchronously currently.

Okay.

What's the coolest project you think could use this?

Yeah, so there's a bunch of use cases I never even considered when starting it.

Especially, say, if you're a company hosting JavaScript: in theory, ahead-of-time compiling

is much more efficient, because you have the source code and you can optimize it up front.

You could in theory have 10 times less overhead, because instead of running a full runtime

for every customer's project, if you run it with this, in theory you can just run 10 times

as many customers on the same server with the same hardware.

So where are your parse failures currently?

Because it looked like it was almost zero.

So it's kind of like, what little corner of JavaScript are those parse failures coming

from?

But I guess also the big question is in terms of the test failures, is there a particular

chunk of functionality that you've yet to get to?

Yeah.

The main thing right now is internationalization.

All those APIs I have basically never touched yet, which is probably like 20%.

That chunk is mostly features I haven't gotten around to, and they cause some trip-ups.

I think there's some await things, because promises are still relatively new and unstable.

So there's some things where if you have a hundred concurrent promises, you can have

some memory issues or something.

So with the compiler, is that await stuff, or is it some other syntax that you're...

Yeah, it's actual await.

You can do...

Oh, wait, this.

Oh, if it's a module.

JavaScript.

But yeah, you can do like this, but then if you do...

If you force it to be pending, you can see the await doesn't do anything.

It just returns the promise.

So yeah, there's some stuff where it won't await fully properly, because it's like a

primitive event loop for now.
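
As a rough picture of what a primitive event loop means here, the sketch below implements a bare job queue: callbacks are queued while the main script runs and drained afterwards. It is an assumption-laden illustration with hypothetical names (`createLoop`, `enqueue`, `run`), not Porffor's implementation. A complete `await` additionally needs to suspend the running function and resume it from a queued job once the promise settles.

```javascript
// Hypothetical minimal job queue: the simplest possible "event loop".
function createLoop() {
  const jobs = [];
  return {
    enqueue(job) {
      jobs.push(job);
    },
    // Drain the queue in FIFO order; jobs may enqueue further jobs.
    run() {
      while (jobs.length > 0) {
        const job = jobs.shift();
        job();
      }
    },
  };
}

const loop = createLoop();
const log = [];

log.push("script start");
loop.enqueue(() => {
  log.push("job 1");
  loop.enqueue(() => log.push("job 2 (queued by job 1)"));
});
log.push("script end");

// The main script has finished; now the queued callbacks run.
loop.run();
console.log(log);
```

Promise reactions are scheduled in much the same way, which is why a loop this small is enough for already-resolved promises but not for everything a pending one requires.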

Is there a particular...

Actually, someone else.

Is there a particular JavaScript feature you're worried about coming out of the pipeline that

goes to the rack?

The main thing is eval, which I'm not worried about.

It will be...

Basically, for now, I don't plan to support it, because at least I hope people have learned

by now that it's not good.

You just said strict mode by default, right?

Yeah, essentially.

Right now, if you do eval("1"), it will work, but this is only because it sees that the argument

is a string literal.

So if you do...

Let's just say eval is not defined.

In theory, because it's written in JavaScript, it could compile itself and then use that

for eval.

But for now, I'm not worried about that.
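
The string-literal case is the one an ahead-of-time compiler can see through, because the code to evaluate is already present in the source. A minimal sketch of such a check over an ESTree-style node follows. `isStaticEval` is a hypothetical helper written for this illustration, and the nodes are hand-built rather than taken from a parser.

```javascript
// Hypothetical check: is this call `eval("...")` with a single
// string-literal argument whose contents are known at compile time?
function isStaticEval(node) {
  return (
    node.type === "CallExpression" &&
    node.callee.type === "Identifier" &&
    node.callee.name === "eval" &&
    node.arguments.length === 1 &&
    node.arguments[0].type === "Literal" &&
    typeof node.arguments[0].value === "string"
  );
}

// eval("1"): the source text is available to the compiler.
const literalCall = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "eval" },
  arguments: [{ type: "Literal", value: "1" }],
};

// eval(code): the source text only exists at run time.
const dynamicCall = {
  type: "CallExpression",
  callee: { type: "Identifier", name: "eval" },
  arguments: [{ type: "Identifier", name: "code" }],
};

const staticResult = isStaticEval(literalCall);
const dynamicResult = isStaticEval(dynamicCall);
console.log(staticResult, dynamicResult);
```

In the first case the literal's contents can be compiled like any other code; in the second there is nothing to compile until the program is already running, which is where shipping the compiler itself would come in.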

Because I guess, at least for now, I'm not focusing on...

Say you have a 20-year-old JavaScript code base, and you want it to just magically run

with this.

For now, I'm more targeting...

It's the same language, because there's been some stuff before like AssemblyScript, where

it's like a subset of TypeScript, which is very cool, but I wouldn't want someone to

have to re-learn the entire language to use my project.

I'm okay with it not working with a 10-year-old legacy thing, because it's using eval or some

really niche, compact things.

But I think if I just gave you it, and you wrote something for it, you shouldn't have

to think about the JS engine: "Oh, it doesn't support feature X."

Or it's really slow if you use Y.

One more.

Can you use this as part of a toolchain so that you just write hot code for creating

a Wasm file that you use in the browser?

Not yet, but that's definitely a possibility.

For now, I'm more targeting server-side things, but it's definitely a possibility.

There is one use case which is interesting, which is obfuscation.

Say you have some really sensitive code interacting with your server, or for some reason it's

sensitive.

In theory, if you compile that to WebAssembly, it's much harder to de-obfuscate and see how

it works, because it's WebAssembly rather than just plain text.

Thank you very much.

Thanks.

[applause]