Modern JavaScript for the win

Practical Modern JavaScript by Nicolas Bevacqua

I needed a book to get up to speed with ECMAScript 6, one that would assume some knowledge of JavaScript and take me through the new features. This book meets those requirements exactly. As it works through the sections, it takes its time to describe what problems the new features are trying to fix, and the result is a really informative 300-page book.

The introduction discusses the various themes behind the added features, and then tells you how to configure Babel so that you can see how some of the new features are transpiled down into code that will run on a large number of browsers. There’s an online REPL for Babel here. ES6 has a large amount of new syntactic sugar, and being able to play around and see how such additions are converted into the old language really aids understanding.
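For instance, a one-line arrow function might be turned into something like this (a sketch – the exact output depends on the Babel version and presets):

  // ES6 input
  const add = (a, b) => a + b;

  // roughly the ES5 output Babel produces
  "use strict";
  var add = function add(a, b) {
    return a + b;
  };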

It impressed me how far JavaScript has moved forward as a language. I remember going to a talk by Mark Miller many years ago where he discussed using revocable proxies to enable mashups, and it is impressive to see how the proxy mechanism can now be used for this kind of thing, as well as for many other types of meta-programming. I also remember reading articles that described the various patterns for defining object-oriented, class-like entities, and it’s great to see that one standard pattern has now been put into the language.

Promises, generators and iterators (which should really be listed the other way round to show how the features build on one another) bring really useful higher-level programming to the language, and the inclusion of proper maps and sets finally rounds out the standard library. There’s also a new module system to make programming in the large much easier, plus a number of new built-ins and additional methods on existing types.
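For example, the new collection types behave the way you’d hope:

  const m = new Map([[1, "one"], [2, "two"]])
  m.get(2)               // "two"
  const s = new Set([1, 2, 2, 3])
  s.size                 // 3 – duplicates are dropped
  s.has(2)               // true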

What features do I think are really useful?

The extensions to the property syntax – method definitions, shorthands and computed property names

  var x = "hello"
  var y = { x, [x]:29, test(x) {return x+2}}

Destructuring and the spread/rest operators.

  var f = x => x + 1     // arrow function, covered below
  var x = [1, 2, 3, 4]
  var y = [...x, ...x]   // spread: [1, 2, 3, 4, 1, 2, 3, 4]
  var [p, ...q] = x      // destructuring with rest: p = 1, q = [2, 3, 4]

Arrow functions, and let and const together with the temporal dead zone

  const x = 20
  let y = 40
  let f = a => a * x + y   // captures x and y lexically: f(1) === 60

The class statement, which expands to one of the old class patterns where you define a constructor function and set up the relevant prototype (a sketch of the old pattern follows the example below).

class MyPoint {
    constructor(x,y) {
        this.x = x
        this.y = y
    }
    sum() {
      return this.x + this.y
    }
}
class NextPoint extends MyPoint {
    constructor(x,y,z) {
        super(x,y) 
        this.z = z
    }
}

var p = new NextPoint(1,2,3)
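
For comparison, here’s a rough sketch of the old pattern that a declaration like MyPoint corresponds to (MyPointOld is just an illustrative name):

function MyPointOld(x, y) {
    this.x = x
    this.y = y
}
MyPointOld.prototype.sum = function () {
    return this.x + this.y
}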

ES6 adds a new extensibility mechanism, the Symbol, which lets you add items to an object that won’t be found by the standard for..in, Object.keys and Object.getOwnPropertyNames. This gives you a way to add methods to make an object iterable, for example, without old code suddenly seeing new keys when it iterates over the object’s properties.
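A quick sketch of the idea:

  const hidden = Symbol("hidden")
  const obj = { visible: 1, [hidden]: 2 }
  Object.keys(obj)                    // ["visible"] – the symbol key is not listed
  obj[hidden]                         // 2 – still reachable via the symbol itself
  Object.getOwnPropertySymbols(obj)   // [Symbol(hidden)]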

There are some new utility methods on Object, such as assign, which lets you copy properties onto a target object – handy for merging user options over a set of defaults:

Object.assign({}, defaults, options);  // new object: defaults, overridden by options

There’s also Object.is, a variant of the === operator that differs in how it treats NaN and signed zeroes:

Object.is(NaN, NaN); // is true
Object.is(+0, -0); // is false

and Object.setPrototypeOf for setting an object’s prototype (if you didn’t set it up front using Object.create).

There is also the concept of decorators (still a proposal rather than part of ES6), which allow you to add additional functionality to classes and statically defined properties. This post offers a few examples.

The most interesting part of the book was the chapter on Promises, which also covers iterators and the implementation of async. The book works through the design of promises, starting with a description of callback hell and showing how promises help with the problem.

    var reject, resolve
    var p = new Promise((a, b) => { resolve = a; reject = b })
    p.then(() => console.log("ok1"));   // never runs – p is rejected below
    p.catch(() => console.log("fail")).then(() => console.log("ok2"));
    reject(new Error("testing"))        // prints "fail", then "ok2"

It builds up nicely, with the author spending time talking about Promise.all and Promise.race.
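Roughly, Promise.all waits for all of its promises (keeping the results in input order) while Promise.race settles as soon as the first one does; a minimal sketch:

  const slow = new Promise(resolve => setTimeout(() => resolve("slow"), 200))
  const fast = new Promise(resolve => setTimeout(() => resolve("fast"), 100))
  Promise.all([slow, fast]).then(both => console.log(both))    // ["slow", "fast"]
  Promise.race([slow, fast]).then(first => console.log(first)) // "fast"

The book then moves on to the iterable protocol, where an object advertises its iterator via a Symbol.iterator method: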

const mySequence = {
    [Symbol.iterator]() {
        let i = 0
        return {
            next() {
                i++
                return { value: i, done: i > 5 }
            }
        }
    }
}

for (var x of mySequence) { console.log(x); }  // prints 1 through 5

The book then talks about generators,

function* test() {
  yield 1;
  yield 2;
}

for (var x of test()) { console.log(x); }  // prints 1, then 2

and then async/await

var resolve;

var p = new Promise(r => resolve=r)

async function play() {
  console.log("start")
  await p
  console.log("end")
}

play()  // prints "start"

resolve(20)  // prints "end"

The author spends time describing the transformation that happens when translating an async function into a generator, and how the steps of the generator are driven.
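Roughly, the idea looks like this (a simplified sketch that ignores rejection handling; the real transform is more careful):

  // hand-written equivalent of the play function above
  function play2() {
    return spawn(function* () {
      console.log("start")
      yield p                 // each await becomes a yield
      console.log("end")
    })
  }

  // a minimal driver: step the generator, resuming it when each yielded promise settles
  function spawn(generatorFn) {
    const gen = generatorFn()
    function step(input) {
      const { value, done } = gen.next(input)
      if (done) return Promise.resolve(value)
      return Promise.resolve(value).then(step)
    }
    return step(undefined)
  }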

Chapter 5 looks at the new Map/WeakMap/Set/WeakSet data types, and this is followed by a chapter on the new proxies, which allow controlled access to objects. This part of the language lets you hand objects to foreign libraries without giving those libraries full control of the object. There is a lot of discussion of the various traps that can be set up and handled, and of how you can use the Reflect object to carry on with the default action after intercepting a call.
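A tiny sketch of the trap mechanism: a get trap that logs each property access and then defers to the default behaviour via Reflect:

  const target = { a: 1 }
  const proxied = new Proxy(target, {
      get(obj, key, receiver) {
          console.log(`read ${String(key)}`)
          return Reflect.get(obj, key, receiver)   // carry on with the default action
      }
  })
  proxied.a   // logs "read a" and evaluates to 1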

Chapter 7 of the book discusses a whole load of built-in improvements in ES6, including much better handling of Unicode characters outside the BMP, and this is followed by a great chapter on the new module system.
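For example, ES6 makes characters outside the BMP much less painful to work with:

  const face = "😀"                              // U+1F600, outside the BMP
  console.log(face.length)                       // 2 – two UTF-16 code units
  console.log([...face].length)                  // 1 – iteration is code-point aware
  console.log(face.codePointAt(0).toString(16))  // "1f600"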

In the last, short chapter of the book, the author steps back to discuss the changes as a whole and which parts of the language he considers most important.

All things considered, this is a very good book, showing the numerous changes to the language and discussing the reasoning behind them.


Dependent Types

I’ve been doing a fair amount of reading about dependent types recently, mainly because we have been working through the Idris book as a weekly lunchtime activity at work.

For the current state of play in Haskell, Richard Eisenberg’s thesis is very good – chapter 2 discusses a load of extensions to Haskell that enable type-level programming. He also has a good series of blog posts on various aspects of dependent types. There are a number of interesting YouTube videos too, including this one on using dependent types in a RegExp library.

At some level, type checking becomes theorem proving. For the Idris language, there are slides from a talk on how this process happens, and there’s another PhD thesis, on type checking Haskell, which discusses some of the trade-offs.

While we are talking about type checking, there is a repository where the author writes variants of Hindley-Milner type checking in F#. It’s a very interesting learning resource, as the examples are small enough to step through in the debugger, so you can watch what happens when a let-bound function is generalised by the system.

There’s always that interesting boundary between type checking and proof, and this GitHub repository uses Idris to generate proofs about various Minesweeper boards.


C# fun with structs

I’ve just spent a while trying to catch up with all the changes to value types in C# 7.x. As someone who doesn’t use the struct keyword in C# very often, I find it quite amazing how much of C# has changed in this area. This talk is a very good summary.

The key point about structs is that they allow you to avoid some of the heap allocation that can be very bad for certain types of application. Recent versions of C# have changed to let the user hold a reference to a struct, and hence work with that alias rather than a copy of the original struct.

The classic example is something like the following (where we use a local function to make the example easier to read).

  var data = new[] { 1, 2, 3, 4, 5 };
  ref int second = ref GetItem(data);
  second = 5;   // writes through the alias: data is now { 1, 2, 5, 4, 5 }

  ref int GetItem(int[] incoming)
  {
    ref int x = ref incoming[2];   // alias the third element of the array
    return ref x;
  }

In the example, the local variable second is an alias for the item in the data array, so a change made through either can be seen when viewing the data through the other. The first thing one notices is the addition of a lot of “ref” keywords to make it clear that the local variable is holding an alias, and that the result of the method call is an alias to something else. It’s a shame that there wasn’t a better syntax for this.

There are two other things that have happened in this area: the “in” parameter modifier and the ability to define a struct as readonly.

In the classic case of a struct as a parameter, the struct is copied on the way into the method. Hence, for the example struct,

struct A
{
  public int _x;
  public void Increment() => _x++;
}

we get this behaviour:

  A a = new A();
  Increment(a);

  void Increment(A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 2);
  }

  Debug.Assert(a._x == 0);

We can avoid the copy using ref, but this then affects the caller.

  A a = new A();
  Increment(ref a);

  void Increment(ref A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 2);
  }

  Debug.Assert(a._x == 2);

We can pass the argument as an “in” parameter, which stops it being copied on entry to the method.

  A a = new A();
  Increment(in a);

  void Increment(in A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 0);
  }

  Debug.Assert(a._x == 0);

Now, of course, we have to answer a question: how is it that we don’t see the parameter x changing across the calls to x.Increment()? The answer is that the compiler takes a defensive copy each time it makes the call. You can see this in the generated IL (and this is all covered really well by this blog post).

In the above code, the IL for the two calls to Increment is

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldobj ConsoleApp15.Program/A
    L_0007: stloc.0 
    L_0008: ldloca.s a
    L_000a: call instance void ConsoleApp15.Program/A::Increment()
    L_000f: nop 
    L_0010: ldarg.0 
    L_0011: ldobj ConsoleApp15.Program/A
    L_0016: stloc.0 
    L_0017: ldloca.s a
    L_0019: call instance void ConsoleApp15.Program/A::Increment()
    L_001e: nop 

Changing the definition of A to

readonly struct A
{
  public readonly int _x;
  public void Increment() => Console.WriteLine(_x);   // _x can no longer be mutated
}

the compiler notices that the defensive copy can be avoided, and hence the IL changes to the more expected

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: call instance void ConsoleApp15.Program/A::Increment()

It all goes to show that value types are a bit confusing in C#, and it is hard to know whether you need to optimise them. You need performance measurements to tell whether the extra copying really makes a difference to your application.

I got interested in the struct-related changes after looking into the implementation of Span, which is implemented as a “ref struct”. Span is a powerful new abstraction over data types such as arrays; it lets you slice them without copying and without allocating performance-sapping view objects. To implement such a thing, the view – the Span instance – needs to be stack allocated and guaranteed to be de-allocated before the stack frame is unwound. This is a new idea for the CLR, which never previously guaranteed that stack-allocated values were treated differently.

You can play with Span using the pre-release System.Memory NuGet package.

  var data = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  var subdata = data.AsSpan().Slice(3, 4);   // a view of { 4, 5, 6, 7 }, no copying

The subdata item is a Span<int>, and looking at it in the debugger’s raw view shows that it has the fields:

   _byteOffset 0x00000010 System.IntPtr
   _length 4 int 
   _pinnable {int[9]} System.Pinnable 

This efficient implementation means that the Span object must be confined to the stack, which is all covered in this proposal document. Span (and Memory, the heap-allocated version) are likely to make their way into many framework classes in the coming releases because of how much they can reduce allocation when data is pulled from a buffer. The System.Memory package targets .NET Standard and so already runs on a large number of platforms.
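As a sketch of what being confined to the stack means in practice (assuming C# 7.2 and the Span types from the package above), the compiler rejects any use of a ref struct that could let it outlive its stack frame:

  class Holder
  {
      // Span<int> _cached;   // won't compile: a ref struct can't be a field of a class
  }

  static Span<int> Slices()
  {
      Span<int> s = stackalloc int[3];   // backed by stack memory
      // return s;                       // won't compile: s may refer to stack memory

      var heap = new int[] { 1, 2, 3 };
      return heap.AsSpan();              // fine: this view is backed by the heap
  }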


Not sure I love it, but I do understand it better

There was a recent talk at NDC entitled How to stop worrying and love msbuild by Daniel Plaisted. It was an interesting talk that discussed the history of the change to the csproj file that made it a lot smaller and tidier. The new format makes a project file look like the following, a massive contrast to the old multi-line mess.

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.0</TargetFramework>
  </PropertyGroup>
</Project>

Of course, we have to ask where all of the existing XML has gone. The build system needs to get it from somewhere, and in the other parts of the talk the speaker discusses various debugging tools that give you a better idea of what is going on.

In the past I’ve always found it really hard to debug msbuild proj files. There seem to be masses of files that get imported, and the trick has always been to pick a suitable target and work from there. Using verbose logging you can then track the actions of the build system as it does its work.

The first trick that the talk mentions is to use the pre-processor to get all of the content into a single text file.

msbuild /pp:out.txt

That works really well. Our small project file above expands to around 10,000 lines of text, which are commented so that you can see the various imports and what they contain. It’s really interesting to look through and see what the build system defines.

To understand the dynamic side of things, there is a new binary logging format that the build system can dump. You can then load this into a tool to search through the execution of the build.

msbuild /bl

The structured log viewer tool makes it really easy to find your way around the build. You can search for a text string, which makes it easy to find tasks and properties, and there is a timeline that tells you when the various targets ran and how long they took to execute. It is fascinating to see how much work the system does before it actually calls the Csc task to compile a single file of C#.

I’ve also noticed that the documentation about msbuild has got better. A good starting point is the page that talks about the difference between Items and Properties.
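To illustrate the distinction with a hypothetical target (not taken from the docs): a property holds a single value, read with $(), while an item holds a named list, expanded with @().

<Project>
  <PropertyGroup>
    <!-- a property: one scalar value -->
    <OutDir>bin\</OutDir>
  </PropertyGroup>
  <ItemGroup>
    <!-- an item: a named list, here every .cs file under the project -->
    <MyFiles Include="**\*.cs" />
  </ItemGroup>
  <Target Name="Show">
    <Message Text="OutDir is $(OutDir); files: @(MyFiles)" />
  </Target>
</Project>

Running msbuild /t:Show against this file prints the property value and the semicolon-separated item list.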

msbuild has always felt like the kind of technology I would love. I like programming systems with a small core language (for msbuild that’s items, properties, targets and tasks) which then add abstractions on top of that small core. This kind of approach leaves you needing to understand only the small core language and lets you explore the abstractions that have been built up around it. For many Lisp and Smalltalk systems it is this exploration that is made easy by the tooling (the development environment), and this has always felt like the part the msbuild ecosystem was missing. Maybe these new tools will take that pain away at long last, though there still seems to be no way to single-step through the execution of a build in some kind of build debugger that would let the user change things dynamically to see the effects. [Apparently you could at one point debug build scripts using Visual Studio, though the debug flag doesn’t seem to be available any more.]


Some recent C# learnings

There are a couple of C#-related things that I’ve come across recently, plus some interesting C# talks from NDC, which I detail below.

The first interesting thing I came across is that the names of your method parameters are semantically meaningful. I should probably have realised this before, but didn’t until someone at work showed me how to resolve an ambiguous call.

The example was something like:

interface IA { }
interface IB { }
class C : IA, IB { }

static void Method(IA a) { }  // two overloads that match equally well
static void Method(IB b) { }

static void Main()
{
   Method(new C());
}

The call to Method in Main cannot be resolved to either of the possible overloads. However, if you change it to

Method(a: new C());

or

Method(b: new C());

then the call is no longer ambiguous.

I’d obviously realised in the past that you have to be careful with optional parameters, since their default values are baked in at the call site, potentially across assemblies, but I had never seen overload resolution changed by naming the parameters.

The second observation is to do with inlining in a Parallel.ForEach. I had some code that simplifies to roughly the following.

static void Main(string[] args)
{
  // each call schedules another Parallel.ForEach over a single-element array,
  // so the recursion never terminates
  void Recurse(int x) =>
    Parallel.ForEach(new int[] { x }, Recurse);
  Recurse(1);
}

In the real code I’d added some code to try to check that we didn’t end up calling a Parallel.ForEach in the context of another Parallel.ForEach using a ThreadStatic variable to record the context on the thread.

If you run the above code with a breakpoint on the Recurse method, you’ll see it using the stack of the thread where you first ran it, because the system tries to run each call inline [see TaskScheduler.TryRunInline]. However, rather than a StackOverflowException, the computation eventually moves to a different thread.

Poking around in Reflector, you can find the class System.Threading.Tasks.StackGuard, whose CheckForSufficientStack method uses Win32 calls to see how much stack space is left on the current thread. If there isn’t enough, the recursive call isn’t inlined and is instead moved to another thread.

I’ve also seen some good C# talks: C# 7.1 and 7.2 from NDC, high-performance C# code from experience working on Bing, msbuild (a great introduction that covers the history of the transition to JSON project files and back), what is .NET Standard?, more C# 7 related performance changes, and a discussion of the future of Rx and the associated Reactor project.


So what types of reflection are there?

I can’t remember how it came up in conversation, but the other day I was reminded of the famous paper by Brian Smith that introduced the idea of the reflective tower of interpreters. The paper is a little hard to read these days, but it does try to explain the notions of structural and behavioural reflection in programming languages.

To try to understand a little more deeply, I turned to two other Scheme-related papers for help. The first talks about reflection, and this is extended in the second paper, which gives a different semantics for the reflective tower. There is a slightly more modern implementation of the tower in Common Lisp, and some good explanations in this paper.

I also came across a number of related papers – a general discussion of reflection, and a discussion of how to allow compiled methods in this reflective-tower world. A lot of this more recent work is also connected to multi-stage programming and partial evaluation.


How do you nest your virtualization?

I was looking through some of the YouTube talks from Ignite when I came across this interesting talk on nested virtualization in Hyper-V. Since September you have been able to provision virtual machines on Azure that support nested virtualization. This is obviously a very powerful feature, and it enables many scenarios (such as testing) that you couldn’t easily do before.

This made me start thinking about how you get nested virtualization to work on other platforms such as AWS. I’d come across virtualization using binary translation in the past (as that was how VMware did things back in the day), and I found this fairly recent paper that talks about the method and covers how the resulting virtualization can run in a cloud environment.

That then leads to the question of whether a software implementation can compare with hardware-assisted virtualization, and there are papers such as this one that study the problem. Hardware support on Intel requires the so-called VT-x extensions, which are available on more modern processors and make things a lot easier for the implementation.
