Google V8 JavaScript Engine

You can read the docs and download the C++ source here.

V8 is supposedly the main added value of Chrome, the newly announced Google browser.

Our discussion of the Chrome announcement enumerates some of the features of V8.


By no means the only

There are a number of other value-adds for Chrome, notably the multi-processing (as opposed to multi-threaded) architecture, the universal address/search box, and the incognito mode.

Disclaimer: I work for Google, but not on Chrome -- I don't even run Windows, and look forward to trying the Linux version.

I was concentrating,

I was concentrating, naturally, on technical values, not user-experience issues. Of the things you mention, multi-processing is the only one I should have mentioned, but isn't it (partly) a consequence of the way V8 is designed?

What about Tab jails?

I think the jailed tabs are a huge added value. That each tab is a process is neat, and an idea that should have been in our browsers long ago.

Besides that, I find JS-heavy sites to be quite peppy (thanks to V8, I assume), tab behavior to be great, and the minimalism fantastic.

I don't work for Google, nor am I a fan.

Isn't that a big waste of resources?

I think that one process per tab is a silly idea. Lots of computer resources are wasted, and the sole reason behind it is that one tab can crash the application... perhaps if a better programming language were used, crashes would not exist, there would only be exceptions, and each tab could be closed gracefully.

How many resources are wasted, anyway?

Machine-language program text (DLLs) is shared among all processes that use it, as are read-only data segments. Mutable data shouldn't be shared in the first place. In practice, I would expect that the amount of memory being "wasted" is small, and more than made up for by fragmentation avoidance. The lots-of-small-processes model is tried and true.

The traditional argument for the many-threads-one-process model has always been runtime performance--the belief that mutable shared data is faster than message-passing, in particular when said message passing requires a round trip through the kernel. The amount of data that needs to be passed between browser tabs is probably relatively small.
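As an aside, the page-visible analogue of that message passing is HTML5's window.postMessage (distinct from whatever IPC the browser uses internally): the receiver always gets a copy of the data, never a shared mutable reference.

    // otherWindow is assumed to be a handle to another window or tab,
    // e.g. the return value of window.open().
    otherWindow.postMessage('a small, serializable payload', '*');

    window.addEventListener('message', function (e) {
      // e.data arrives by copy; mutating it cannot affect the sender.
      alert('received: ' + e.data);
    }, false);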

One process per tab is important, and a wise design choice, even if you assume there isn't a line of C/C++ anywhere in the browser code base. Why? Browsers, by their nature, download and execute Javascript code which is untrusted. Even if you heavily sandbox the JS, which all modern browsers try to do, there are many ill-formed JS client apps out there that resemble forkbombs in their functionality, and effectively launch a denial-of-service against the entire web browser. This has nothing to do with the implementation language of the browser itself, obviously. Any language, no matter how high level, can fail due to memory exhaustion or nontermination. The former is a catastrophic failure if it occurs, and users don't care about the difference between a hard crash that results in a core dump, and one that results in a "help, I've fallen and I can't get up" dialog--either is a crash in the user's mind. And nontermination isn't solved by a robust runtime, obviously.
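To make those failure modes concrete, here are two toy scripts of the kind described; neither exploits any browser bug, yet either one can take a single-process browser down with it (don't paste these into a browser you care about):

    // 1. Memory exhaustion: allocate without bound until the heap gives out.
    var hog = [];
    while (true) {
      hog.push(new Array(1000000).join('x'));
    }

    // 2. Nontermination: no runtime, however robust, can "fix" this.
    while (true) {}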

At any rate--much browser functionality *is* implemented in a safe language--Javascript. JS isn't just for web pages. Obviously, much is also implemented in C/C++--a lot of that has to do with the use of pre-existing things like Webkit, which are written in these languages.
That said, it would be interesting to see a fully-functional Web browser written in something else. (Given the nature of Web browsers, I'd nominate Erlang for implementing those parts that can't reasonably be done in JS). Quite a few of these are known to exist, actually--though I'm not sure if any of 'em are fully compliant with modern WWW standards, and I'd be surprised if any of them would provide Grandma with a satisfying user experience at this point.

I would expect that the

I would expect that the amount of memory being "wasted" is small, and more than made up for by fragmentation avoidance.

Plus any kernel data structures, such as context save areas, signal masks, file descriptors, etc. Not an insignificant cost, though I do think it's worth it in this case. Threads are slightly slimmer, and would be preferable if the language didn't have mutation.

There are different CPU

There are different CPU contexts for each tab. The page-table caches are flushed or invalidated at every task switch. Resources are consumed to maintain each process's swap space. And I don't think the amount of message passing between the browser process and each tab process is small: the browser needs to keep a whole lot of information regarding each tab (history, connections, etc.).

Regarding the JavaScript bomb programs, I think the solution is memory quotas, just like disk quotas. Why have processes? The JS engine could assign a specific memory limit to each JS program and not let it consume every resource under the sun.
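A minimal sketch of that quota idea, assuming allocations are routed through a single chokepoint (the makeQuota/alloc API here is invented for illustration; a real engine would enforce the limit inside its allocator):

    function makeQuota(limitBytes) {
      var used = 0;
      return {
        alloc: function (nBytes) {
          if (used + nBytes > limitBytes) {
            throw new Error('script exceeded its memory quota');
          }
          used += nBytes;
          return new Array(nBytes); // stand-in for a real allocation
        }
      };
    }

    var quota = makeQuota(16 * 1024 * 1024); // e.g. 16MB per script
    var buf = quota.alloc(1024);             // fine
    // quota.alloc(32 * 1024 * 1024);        // would throw instead of thrashing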

With processes, a denial-of-service attack from a malicious JS script becomes a denial-of-service against the whole computer: the swap file may grow so big that the machine starts thrashing, leaving the user no choice but to hit the hard reset button.

Process overhead...

The per-process overhead on a modern OS is not that large. Task switches happen fairly infrequently, and much of the time it's the browser getting pre-empted for something else, rather than one browser task getting pre-empted for another. And unless I'm mistaken, most modern OSs (Linux, in particular) don't maintain swap on a per-process basis; virtual memory is a global thing.

Many modern browsers are multi-threaded (but with a single process); context switching is an issue there too. Context switches within a process are a bit cheaper, but still require purging of the cache.

Imposing a memory quota on JS is a good idea. While it's possible to do that at the VM level, having processes gives another firebreak for the sandbox to live in.

Malicious or poorly-written JS programs routinely drive your average desktop into thrashing today--a single-process browser implementation can just as easily exhaust all of RAM and most of swap. I see Firefox do this sort of thing all the time. :)

One possible concern with processes that I do see is that a malicious or buggy script might be able to spawn a whole lot of them; IIRC, Chrome does contain safeguards against that (restricting the ability of JS apps to open new windows, for instance--the same popup blocking that browsers feature today). But process isolation is frequently a big win.

Well, it may be that the

Well, it may be that the per-process overhead for a task switch is not that large, but a lot of things that 'are not that large' accumulate, and in the end the computer feels slow and sluggish, or even is slow and sluggish.

Virtual memory is global in Windows as well, but that does not matter: Windows assigns a whole lot of virtual memory to each process. I do not know if Linux or other Unix flavors do that. This means lots of entries in the page tables, as well as many 4K blocks on disk.

Threaded task switching is much less expensive than process task switching, because of the sharing of memory space between threads. There is no purging/invalidation of the page descriptor caches on thread switch, if the two threads belong to the same process. In older CPUs (well, not too old), a task switch caused the whole page descriptor cache to be flushed, which was a fairly expensive operation. In newer CPUs, the whole page descriptor cache is not flushed, but only some entries are invalidated.

Since malicious/poorly written JS programs routinely drive our average desktops into thrashing, it's a good idea to limit the resources available to them. Sandboxing them in separate processes does not solve this problem.

In the end, the only reason tabs are processes is that the engine, and possibly the plugins, are written in C++. The consequences of that language choice are huge. It's a shame that it is still in use.

Still not convincing

You can run everything in the same process in Chrome, if you're that afraid of the overhead of multitasking. I read a paper about a similar architecture change made on top of Konqueror which increased memory usage by about 20%; I couldn't care less about a 20% memory increase in exchange for the increased stability.

As for making the computer slow: in my limited testing, Chrome doesn't make my computer slow, but Firefox does, especially when one website uses 100% CPU and I don't know which one it is (Chrome can give me this information).

[[Since malicious/poorly written JS programs routinely drive our average desktops into thrashing, it's a good idea to limit the resources available to them. Sandboxing them in separate processes does not solve this problem.]]

Huh? I don't understand this point: usually CPU and memory quotas are per process, so using different processes allows you to limit only the greedy or poorly coded websites without disturbing the rest.
Even without quotas, the resource monitor allows the user to see which website is misbehaving, which is very interesting: if users start complaining about and/or avoiding greedy or buggy websites, this could trigger a wave of website improvements (assuming Chrome gets widespread enough, which is far from certain).

VM based languages

might be able to provide instrumentation of tasks running on top of the VM (independent of the VM's architecture).

Languages which compile to native, but don't use kernel objects for multitasking, are probably not inspectable by OS-level profiling or analysis tools.

A question for Achilleas: Which language did you have in mind? Laying bare the deficiencies of C++ isn't hard to do, but were you to write a secure-and-robust browser, which language(s)--including which implementations, for those languages that have more than one significant implementation--would you choose for the task? And how would you handle the concurrency inherent in multiple windows/tabs and/or asynchronous Javascript apps: green threads/coroutining, kernel threads, processes?

Not an architect, I guess.

[[I think that one process per tab is a silly idea.]]

Well, I'm glad you don't work for me, look:
- code reuse: Chrome uses an existing C++ HTML engine; rewriting it in a managed language would be a huge amount of work with dubious benefits at best, especially since Java and the like tend to use much more memory than C++, so I'm not at all sure that such a new engine would use less memory, even with threads.

- memory leaks happen, even in languages with GCs: using processes to make sure that no bug triggered by one webpage has an impact on other webpages is wise, period.

- plus, it allows the user to see which webpage is consuming too much memory or CPU, and to complain to the author of that page.

So Chrome's design is clever because 1) it's simple and 2) it allows the reuse of existing code to make a robust application.

Your idea of rewriting everything in a managed language just because you don't like processes isn't economically wise.

And from a resource-consumption point of view, I've tried Chrome (on a four-year-old PC with 1GB of RAM) and it doesn't seem to be much more resource-hungry than the other browsers, so your suggestion really looks like premature optimisation to me, which as we all know is evil :-)

I see

I agree with Achilleas. Besides, all one needs to achieve that is to start several browser processes and let the desktop manager group all the browser windows in the task bar.

Google browser in my view is attempting to be a complete desktop manager for users by itself, including its own process manager and window/tab manager. Of course, the goal is obvious: to get users into cloud computing and have the PC be just a dumb (but hella fast!) terminal.

Finally, I feel there's a lot of backlash against threads these days, mainly because of the influence of multiprocessor architectures and functional programming, both of which benefit from isolated memory. But I believe threads are still plenty useful (especially for non-persistent computations) and shouldn't be trashed without further reason...

Google browser in my view is

Google browser in my view is attempting to be a complete desktop manager for users by itself, including its own process manager and window/tab manager.

I believe you are correct. Google is attempting to lend web applications a more native presence on a user's desktop. Brilliant strategy IMO.

Except that doesn't work...

When you launch the browser on most platforms (I'm speaking of Firefox on Linux mainly; I suspect Firefox on other platforms behaves similarly. I avoid IE like the plague, haven't yet had time to play with Chrome, and don't use Safari or Opera), one of the first things the startup script does is check for an existing browser process in the user's environment.

If it finds one, it instructs the existing browser process to open a new window.

There may be a way to start multiple Firefox from the same user account, but with a different "session"--a different set of shared configuration files. Multiple Firefox processes can happily run concurrently if each is given its own playpen in the filesystem; this commonly occurs in multiuser Unix environments. But Firefox is not architected such that a user can launch multiple instances of the browser that share the same configuration and data space.

I'm aware of that but kept

I'm aware of that, but kept it simple for the sake of argument. I was guessing you could probably customize the script to prevent it, or simply get around it and call the binary directly with a few command-line options...

There may be a way to start

There may be a way to start multiple Firefox from the same user account, but with a different "session"--a different set of shared configuration files.

No shared configuration files. Firefox has "profiles", and each profile can run in only one instance of FF. To get around FF's single-process limitation, I modified the Quick Launch FF link to be a batch script which launches FF's profile manager, so I can manually launch a new process when I want to. I can't launch a new instance with a running profile, but I have a pool of profiles I cycle through as needed, a process just crying out for automation. Chrome is on the right track here.
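For the record, a batch script along these lines reproduces that workaround; -ProfileManager and -no-remote are Firefox's own command-line flags (the install path below is an assumption):

    @echo off
    rem Open the profile manager, and let this instance start even though
    rem another Firefox is already running (-no-remote skips the
    rem existing-process check described earlier in this thread).
    start "" "C:\Program Files\Mozilla Firefox\firefox.exe" -ProfileManager -no-remote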

Having separate processes

Having separate processes for external plug-ins - Flash, media codecs etc - definitely makes sense: it can only improve stability and security, makes debugging the browser (and the plug-ins) much easier, and allows blame for crashes to be directed more accurately. It is very likely worth the additional runtime cost (probably small).

Separate processes for tabs are perhaps not as important. All code is under the control of a single entity so any crashes can be blamed entirely on the browser/javascript engine code. Most managed environments - JVM, multithreaded Smalltalk and Lisp environments - keep everything in a single process because they anticipate arbitrary references and mutation between threads of control. Web browser tabs being more or less independent, this is not a concern. Furthermore, the number of tabs is low, and they are not created/destroyed with high frequency. I suppose the added robustness against bugs in the browser is a sufficient argument for separate spaces.

OMeta likes Chrome

I ran some examples in the OMeta/JS 2.0 workspace... performance seems quite good, and it definitely feels a lot faster than in other browsers.

Javascript VMs

I know nothing about the contenders. Can anybody provide a brief summary of what's out there? It can't be that Google are the only ones capable of producing an efficient VM, can it?

Other new Javascript VMs

Firefox's new VM is called TraceMonkey. It has a JIT that uses trace-guided specialization. Performance comparison with V8.

Safari/WebKit's new VM is called SquirrelFish. It's a bytecode interpreter (their old VM was an AST walker).

Interesting performance

Interesting performance data. Thanks!

(But really, guys, recursion is your weak spot? That's not LtU-acceptable...)

You know us better than that

Recursion is being tackled this weekend and into next week. Just wait for a new blog update. Tracing can handle general recursion; it's tricky but not rocket science (feels more like brain surgery ;-).

/be

Tamarin

TraceMonkey benefits from code contributed by Adobe to the Tamarin EcmaScript 4 VM:
http://www.mozilla.org/projects/tamarin/

Also used in Flash.

I see exciting times ahead, with high-performance native JavaScript engines in all browsers. Will Firefox finally be the new Emacs? How about doing Ajax and GUI programming in your favorite programming language? Just make sure to load an interpreter for it, written in JavaScript, before loading the code. Or better: make sure your language's compiler has a JavaScript backend.

Chrome is just one more browser for web developers to debug...

Although I admire a lot of different aspects of Google's new browser, I'm worried that adding more widely used browsers will only complicate browser support for web developers. You can build a new JavaScript engine from the ground up every browser version, but at the end of the day it is the web developer who most greatly impacts the end user's experience. JavaScript runs notably smoother with V8, but I still don't have fully working representations of some websites I visit on a daily basis in either Firefox 3 or Chrome. How long should I expect before websites fully support Chrome or even Firefox 3? Is there any chance that web developers could expect more consistency among browsers in the future?

JavaScript is fairly

JavaScript is fairly uniformly implemented - it's the APIs which are not. That's a large part of the reason why we have many of the GUI libraries.

Interestingly, efforts like Caja might be usable on the client to 'fix' even that. So, while not a language (implementation) problem, there might be a language-based solution :)

JavaScript is fairly

JavaScript is fairly uniformly implemented - it's the APIs which are not.

I don't think that's true. It's been a while since I delved deeply into JS, but I recall there were subtle differences between browsers in how they handle the 'this' parameter, and in comparisons/operations on null, undefined, and ''.
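For reference, here is how those corner cases actually behave per ECMA-262 (the major engines do agree on these):

    null == undefined;   // true  (a special case of ==)
    null === undefined;  // false
    null == '';          // false (null doesn't coerce against strings here)
    '' == 0;             // true  ('' coerces to the number 0)

    // 'this' in a bare (non-method) call is the global object in
    // pre-strict-mode JS, a perennial source of confusion:
    function f() { return this; }
    f() === window;      // true in browsers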

Emphasis on 'fairly'.

Emphasis on 'fairly'.

'this' was confusingly defined (see http://jssec.net/semantics/), though most usages are conformant. Another example of an incompatibility is the implementation of slice or splice (I forget which): some engines have it, some don't. Named lambdas were not uniformly implemented a couple of years ago, either. While there are some incompatibilities, the situation is significantly better than, say, the incompatibilities between C++ compilers (which Coverity seems to struggle with the most!). There were a few ridiculous errors, like premature stack-overflow errors in Safari, but most of that sort had been ironed out years ago. That was the whole realization behind Google Maps: it's finally safe to use JS for non-intranet apps.

JavaScript was made to be a scripting glue for the live document, and still largely is. Incompatibilities on that layer are the time-suck in practice. Quirksmode.org has highly trafficked pages on DOM incompatibilities, and nothing I know of on JS ones.

On a side note, it would be interesting to try some of Dawn Song's protocol-implementation delta-extraction techniques on JS engines, though it's likely still too hard a task :)

I know of no such bugs

While there are indeed bugs in current browser-hosted JS engines, as in all non-trivial code (this should not scandalize anyone here!), there are no bugs that I know of with null and undefined comparisons, even in JScript in IE (which has bad bugs to do with named function expressions binding their name in the variable object, and other whoppers, including [1,].length === 2 -- I hope IE8 fixes these!).
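To illustrate the two whoppers just cited:

    // 1. A named function expression should bind its name only inside its
    //    own body, but JScript also leaked it into the enclosing scope:
    var g = function h() { return 1; };
    typeof h;    // 'undefined' per ECMA-262; 'function' in JScript

    // 2. A single trailing comma in an array literal adds no element:
    [1,].length; // 1 per ECMA-262; 2 in JScript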

See my Ecma TC39 colleague Pratap Lakshman's detailed analysis of JScript and other browser-based JS implementation deviations from ECMA-262.

We at Mozilla test against other browsers, because we have no choice but to improve "real-world" interoperation in order to gain market share. Due to its market dominance, IE's JScript has set several de-facto standards, contrary to the ECMA-262 spec, but these are mostly corner-case issues (I cited two I consider more significant above).

The earlier comment is accurate in summary: most browser differences are in the DOM and browser object models.

/be

Don't

How long should I expect before websites fully support Chrome or even Firefox 3?

Please, don't do that. Support W3C standards like HTML, the DOM, XPath, CSS, and ECMAScript. Then pray that users have a modern standards-compliant browser. And don't expect too much from software that is still in beta; give it some time to mature.

BTW, I find it amusing that JITted JavaScript ended up being the tool of choice over JITted Java applets. Isn't it ironic? :)

Amen.

I would expect any noticeable differences in behavior between the various open-source browsers to be viewed as defects, not features. Only one major browser supplier has a history of using deliberate incompatibilities to enable vendor lock-in, and it isn't Apple, Opera, Mozilla, or Google.

The interesting irony with regard to Java vs JS is that what killed Java on the browser (beyond the machinations of the anonymous browser vendor mentioned in the prior paragraph) was a) the horrible VM startup time; and b) the necessity that the VM be a plugin. Someone could probably bundle Java into a browser these days (now that Sun no longer restricts its distribution as tightly), and the startup time issue is less relevant on newer and faster computers with more sophisticated VMs--but the train has left the station, and Java ain't on board. But yes--JS implementations seem to be approaching Java implementations in some ways.

JS implementations seem to

JS implementations seem to be approaching Java implementations in some ways.

More important: it is, AFAIK, the first scripting language with a JIT implementation.

More important: it is,

More important: it is, AFAIK, the first scripting language with a JIT implementation.

Not sure if these are considered "scripting" languages, but the dynamically-typed language Smalltalk has a JIT implementation called Strongtalk. The V8 docs even reference papers about Smalltalk (and Self).

Many dynamically-typed languages have implementations that target JVM bytecode, which means they eventually get JIT-compiled. Jython (Python) and Rhino (JavaScript) were the early ones. Then again, the JVM was designed for statically-typed OO languages, so I doubt the Jython+JVM combination performs optimizations specific to dynamically-typed languages (the new invokedynamic opcode may help).

Edit: I can't believe I forgot to mention Parrot. Originally conceived as Perl's next-generation VM, it now intends to be a generic VM for all dynamically-typed languages. (I have a feeling it is this ambition that is slowing them down.)

That’s not where Parrot came from!

Parrot was originally conceived as an April Fool’s joke: an announcement that the next versions of Python and Perl would merge. A year or two later, someone decided they would write a virtual machine to run both Python and Perl, and named it after the joke. It is progressing slowly.

Lua also has a JIT implementation

And it has existed since 2005.
http://luajit.org/luajit.html

PLT JIT

PLT Scheme has had a JIT compiler for PPC and x86 for a couple of years.

It isn’t; there were

It isn’t; there were already Python’s Psyco, Lua’s LuaJIT, Smalltalk (for which JIT compilation was invented in, I think, 1982), and the runtime-provided JIT compilers for all the scripting languages that compile to bytecode on the Java and CLR virtual machines (Jython, IronPython, JRuby, IronRuby, Groovy, Kawa, and, oh yeah, Rhino, plus a bunch that don’t matter).

However, it may be the scripting language with the best JIT implementation.

As long as we’re dealing with scripting, though, in the sense of orchestrating big blocks of stuff that happen outside the language itself, JIT is fairly irrelevant; if it takes 100μs for a function call but most function calls are to native-code things (you know, /bin/sort on a file, invoking GCC, multiplying a bag of vectors through a transformation matrix, adding a node to the DOM, that kind of thing) that take 10ms each, it doesn't matter if you can get function-call time down to 100ns.
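Spelling that arithmetic out:

    // Call overhead is noise when every call does heavyweight native work.
    var calls      = 1000;
    var nativeWork = 10e-3;  // 10ms of native work per call
    var slowCall   = 100e-6; // 100us of interpreted-call overhead
    var fastCall   = 100e-9; // 100ns of JIT-compiled-call overhead

    var slowTotal = calls * (nativeWork + slowCall); // 10.1 seconds
    var fastTotal = calls * (nativeWork + fastCall); // ~10.0001 seconds
    // The JIT buys well under 1% here -- "fairly irrelevant" for pure glue.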

Scripting languages can't be slow glue

About scripting as glue: you are right in the context you describe. However, limiting oneself to such a context does not work. Once you can program in a better language than C, why not just do it? Writing the performance-critical parts in C is possible, but then every user who needs performance has to do that. Doing it once and for all in the implementation is better.

In general, a language implementation should thus offer _efficient_ support for abstraction techniques. I was lucky enough to be taught this in a course held by Lars Bak, V8's author, in person.

That's why we see people on forums even suggesting manually inlining functions in Python. I think we agree that inlining is a task for the language implementation, not for the programmer, and we are glad that for many languages it is.
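The same point in JS terms; this sketch (the function names are mine) shows the abstraction penalty a good implementation should erase:

    // What we want to write: a small abstraction...
    function sq(x) { return x * x; }
    function sumOfSquares(n) {
      var s = 0;
      for (var i = 0; i < n; i++) s += sq(i);
      return s;
    }

    // ...versus what a slow implementation pushes programmers toward:
    function sumOfSquaresHandInlined(n) {
      var s = 0;
      for (var i = 0; i < n; i++) s += i * i;
      return s;
    }

    // With inlining done by the VM, the two should run identically fast;
    // the programmer shouldn't have to make that trade.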

Finally, most language implementations compiling to the JVM/CLR are not that much faster than interpreters, and they can be slower: JRuby seems to be the only exception out of Jython, IronPython, and IronRuby (and Ruby 1.9, with an important change to its interpreter, has just caught up). And if a JIT is slower than an interpreter, even if the fault lies with the underlying platform, it's not that useful.

Other things that killed Java applets

The JVM was not just slow at startup; it was also slow to run, unstable, and had even worse UI incompatibilities across browsers and platforms and versions than DHTML did. The Java language was designed to appeal to C++ programmers, not AppleScript and VBA and HyperTalk programmers, and as such it kind of sucked for the folks who are going around writing things like Prototype and MooTools and jQuery plugins and Flash videos.

It’s easy to forget now that we’re all running Eclipse or IDEA on 2GHz machines with 2GB of RAM, but the Java libraries are kind of a pain to navigate, and the language lacks the abstraction-power and conciseness of modern programming languages such as Common Lisp, Scheme, Perl, Python, Tcl, OCaml, JavaScript, Haskell, Ruby, Smalltalk-80, es, rc, Lua, or PostScript.

It isn’t ironic

JavaScript is a much better language than Java, and DHTML is a much better UI toolkit than AWT, Swing, or even SWT. JIT is a good language implementation technique, and probably should be the default way to implement interpreters. Everything is unfolding exactly as it should.

A useful blog post.

A useful blog post.

v8 source

Get it here:

http://burton.samograd.googlepages.com/google-chrome-v8-javascript-engine--.bz2

This was pulled and packaged from the current svn sources on Fri-Sep-5 2008.

--
burton

Chrome is about isolation

I think the most important thing about Chrome is the isolation it provides. Given the state of browsers today, which are badly written and mostly unstable, it can only be a good idea.

However, I do not see why you have to use the OS abstraction of processes to achieve isolation. That is a very general-purpose, heavyweight mechanism that works at the instruction level. Is that really needed? Isn't lightweight, userspace, source-level isolation enough?

Moreover, killing a process to clean it up isn't really the best idea. What if it holds resources other than memory, or was in the middle of a transactional operation?
Good code is needed for that, whether you want it or not. RAII plus thread cancellation implemented by asynchronous exception raising seems like a pretty good model to me. Note that this also makes the whole codebase leak-free, so the rationale of using processes because they make leaks easier to fight goes away.

killing processes to clean them up is a great idea

Moreover, killing a process to clean it up isn't really the best idea. What if it holds resources other than memory, or was in the middle of a transactional operation?

In the case of Chrome, the folks who decided to kill the processes also have control over all the code in them, so they can avoid this problem in any case. But I don't think it's a real problem.

If your program holds system resources other than memory, such as a lock on a file, then the operating system will automatically free those resources, or it’s junk. If your program was engaged in a transaction that had not yet committed, then the “A” property of transactions, “atomicity”, is invoked, and it is as if the transaction had never begun. If the program was engaged in a transaction but committed it before you killed it, then the “D” property of transactions, “durability”, is invoked, and it doesn’t matter that you killed it; the changes performed in the transaction live on. This is exactly why we have transactions: so that processes can die.

The wonderful thing about shared-nothing processes, as distinct from shared-memory threads, is that because they don’t have any shared mutable state, you can kill them whenever you want. Any mutable state that they might have in an inconsistent state dies with them. (Of course, once you bring mutable files or shared memory segments into it, you can lose again; building transactions on top of filesystem or disk operations can be a pain, but it can solve that problem for persistent state.) If you have threads that share mutable state, then a thread can be in the middle of mutating that state when you kill it, leaving it in an inconsistent state. (In fact, that’s a problem with exceptions as well. RAII suffices to release resources as an exception propagates, but restoring arbitrary shared data structures to a consistent state when an exception propagates from an arbitrary point is much more painful. The C++ FAQ-Lite has some examples of how to achieve exception-safety, as long as exceptions can’t arise inside your catch blocks, and it’s not trivial, in general.)
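Here is that inconsistent-state hazard in miniature (account and riskyAudit are invented for illustration):

    var account = { balance: 100, log: [] };

    function riskyAudit(amount) {
      if (amount > 50) throw new Error('audit failed'); // simulated failure
    }

    function withdraw(amount) {
      account.balance -= amount;       // mutation #1
      riskyAudit(amount);              // may throw here...
      account.log.push('-' + amount);  // ...so mutation #2 may never happen
    }

    try { withdraw(60); } catch (e) {}
    // account.balance is now 40 but account.log is empty: the shared
    // structure is inconsistent. A killed thread can leave state like this
    // behind; a killed shared-nothing process takes its heap with it.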

Processes are going to get killed from time to time, whether you like it or not. CPUs overheat; you get OOM kills; memory gets parity errors; the diesel generator at 365 Main fails to spin up fast enough during a power outage; the battery in my laptop runs out; the sysadmin kills the process because he thinks it’s hung and he wants the web site to be providing service to users again. If your software corrupts my data when this happens, your software sucks. Consequently it needs to be able to deal gracefully with unexpected process termination. (Have you seen the “Crash-Only Software” paper?)

However, I do not see why you have to use the OS abstraction of processes to achieve isolation. That is a very general-purpose, heavyweight mechanism that works at the instruction level. Is that really needed? Isn't lightweight, userspace, source-level isolation enough?

The instruction level is the only level we have in common between all the programming languages in a browser, and it has simple models of programs and memory, so the simple mechanisms of memory protection and preemptive multitasking suffice for fairly complete isolation. Furthermore, context-switching on modern Linux is fairly fast, although it can’t cope with more than a few thousand processes. So I think it's a good mechanism.