Kerry Seitz

Major Goals

Hardware/Platform Independence
Encoding optimization Hints
Quick JIT / Interpreter
Relatively Small Code Files

Virtual Machines

Virtual-Machine Abstraction and Optimization Techniques (http://portal.acm.org/citation.cfm?id=1660797&CFID=8920236&CFTOKEN=79689016) - Does not have full text

Efficient interpretation using quickening (http://portal.acm.org/citation.cfm?id=1869633&CFID=8920236&CFTOKEN=79689016) - Optimization techniques for improving interpreters

Tracing for web 3.0: trace compilation for the next generation web applications (http://portal.acm.org/citation.cfm?id=1508304&CFID=8920236&CFTOKEN=79689016) - trace-based just-in-time compiler for JavaScript

Comparison of application virtual machines (http://en.wikipedia.org/wiki/Comparison_of_application_virtual_machines)

Bytecodes

Comparison of Bytecodes

Intermediate language design of high-level language virtual machines: towards comprehensive concurrency support (http://portal.acm.org/citation.cfm?id=1711509&CFID=8920236&CFTOKEN=79689016)

Lua programming langauge - register-based bytecode (dynamically typed) (http://en.wikipedia.org/wiki/Lua_(programming_language))

Stack vs. Register Based Bytecode

Virtual Machine Showdown: Stack Versus Registers (2008) - update of the paper below (http://portal.acm.org/citation.cfm?id=1328197)

  • Compiling source to stack-based bytecode is usually simpler than to register code. One of the reasons is that there is no need for a register allocator. (21:3-4)
  • Stack-based bytecode may be better for JIT compilation because there is no assumption about the number of available registers. (21:4)
  • Bytecode verification is simplified by the simpler stack IR. How much more difficult bytecode verification is with register IR is an issue of concern. (21:4)
  • Register code requires more memory fetches. (21:7)
  • Stack code must keep track of the bytecode instruction pointer (IP), the stack pointer (SP), and the frame pointer (FP) while register code only needs the IP and the FP. Thus, register code has one variable fewer, resulting in less register pressure. Stack VM must update SP as values are pushed or popped. (21:7)
  • Used Cacao 0.95 as a base VM to implement the virtual register machine. (21:14)
  • Register code is 26% larger but has 44% fewer static instructions than stack code. (21:17)
  • Speed-ups of register over stack code for different architectures and different dispatch methods. (21:23-25)
  • Steed-ups when eliminating redundant loads. (21:28-30)
  • Register IR may be easier to compile to native code because processors are register based. (21:32)
  • Stack code may be easier to compile to native code because it does not make assumptions about the number of registers. (21:32)
  • Mixed mode JIT compilers (like Sun's Hotspot VM) uses both an interpreter and a JIT, so a register based VM might aid in performance (21:33)

Virtual Machine Showdown: Stack Versus Registers (2005) (http://www.usenix.org/events/vee05/full_papers/p153-yunhe.pdf)

Dalvik

Brief overview of the Dalvik virtual machine and its insights (http://www.dalvikvm.com/)

  • optimized for low memory requirements
  • designed to allow multiple VM instances to run at once

Tech lead for the Dalvik project's blog post about the Dalvik JIT (http://android-developers.blogspot.com/2010/05/dalvik-jit.html)

Lecture given at Google Google I/O 2010: "A JIT Compiler for Android's Dalvik VM" (http://www.google.com/events/io/2010/sessions/jit-compiler-androids-dalvik-vm.html)

Lecture given at Google Google I/O 2008: Dalvik VM Internals (http://sites.google.com/site/io/dalvik-vm-internals)

The Dalvik Virtual Machine Architecture - technical reasons why Google chose to develop their own non-standard virtual machine (http://davidehringer.com/software/android/The_Dalvik_Virtual_Machine.pdf)

Dalvik Docs Mirror (http://milk.com/kodebase/dalvik-docs-mirror/)

List of Dalvik Opcodes (http://developer.android.com/reference/dalvik/bytecode/Opcodes.html)

Method parameter register information (http://code.google.com/p/smali/wiki/Registers)

The JVM Bytecode

Compilation

Register Allocation

GPU Programming

Compilers

Algorithms

Program Analysis

Analysis Types

Effect system

(http://en.wikipedia.org/wiki/Effect_system)

  • Check a function for side effects to determine if it can safely be parallelized

Profiling

(http://en.wikipedia.org/wiki/Performance_analysis)

  • Investigates program performance while it is running - think HotSpot JVM, which optimizes code while it is running

Tail-Recursion

In Java

Reasons why the JVM does not allow tail calls (but Scala has some workarounds). (http://stackoverflow.com/questions/105834/does-the-jvm-prevent-tail-call-optimizations)
Informal blog post from 2007 about adding tail calls to the JVM. (http://blogs.sun.com/jrose/entry/tail_calls_in_the_vm)

In Scala

In order to have tail-recursion, a function must be private or the containing class or object must be final to insure that a subclass will not implement this function.
(http://stackoverflow.com/questions/1702831/scala-and-tail-recursion)

Scala's workarounds for implementing tail calls. Compiler looks for tail calls and translates them into loops. Can optimize self calls in final methods and in local functions. @tailrec annotation (Scala 2.8) - use it on methods you hope the compiler will optimize and you will get a warning if they are not optimized by the compiler.
Trampolines - read more about these
(http://blog.richdougherty.com/2009/04/tail-calls-tailrec-and-trampolines.html).

Very Long Instruction Word

Validation for Thesis

Intel plans to come out with a chip with up to 50 cores by 2012 (http://www.hpcwire.com/hpcwire/2011-04-21/tacc_steps_up_to_the_mic.html)