I very much enjoyed this post on implementing fast interpreters and its follow-up. An interpreter executing an integer ADD instruction is a classic example of code where the overhead of getting to the operation is much greater than the cost of the operation itself. The classic technique for handling this, and the most effective optimisation in a VM runtime like HotSpot, is inlining, which the post mentions.
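To make the overhead concrete, here is a minimal sketch of a stack-based bytecode interpreter. It is not taken from the post or from HotSpot; the `Op` opcodes, the `run` function and the sample `program` are all hypothetical names invented for illustration. The point is how much fetching, dispatching and stack shuffling surrounds the single `a + b` that ADD exists to perform.

```python
from enum import IntEnum


class Op(IntEnum):
    # Hypothetical opcodes, invented for this sketch.
    PUSH = 0   # push the next code word as a constant
    ADD = 1    # pop two values, push their sum
    HALT = 2   # stop and return the top of the stack


def run(code):
    """Execute a list of ints as bytecode and return the top of the stack."""
    stack = []
    pc = 0
    while True:
        op = code[pc]              # fetch the opcode
        pc += 1
        if op == Op.PUSH:          # decode and dispatch
            stack.append(code[pc])
            pc += 1
        elif op == Op.ADD:
            b = stack.pop()        # operand shuffling
            a = stack.pop()
            stack.append(a + b)    # the only "real" work in this branch
        elif op == Op.HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


program = [Op.PUSH, 2, Op.PUSH, 3, Op.ADD, Op.HALT]
result = run(program)
```

Every ADD pays for a fetch, a chain of comparisons, two pops and a push around one addition. Compiled code with the addition inlined at the use site skips all of that bookkeeping.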
The follow-up post points to this thesis, which is an interesting read on the subject. Sometimes it is just too much work to write a compiler, and the thesis gives a number of straightforward techniques for getting good performance from an interpreter.