Monadic LINQ in JavaScript

Introduction

I have interviewed a number of .NET developers recently and have been surprised by the low depth of LINQ skills.  It may well just be interview nerves, although LINQ is such a powerful concept I think everyone should understand it!

A large proportion of my time is spent in JavaScript and despite the huge proliferation of libraries out there, it seems JavaScript developers are missing out on LINQ.  There’s nothing built into the language to support this out of the box, so perhaps people aren’t aware it’s out there!  It was a revolutionary update to .NET back in 2007 and am sure it will be fully embraced by JavaScript developers eventually.

We have LINQ in JavaScript already!

I won’t name and shame particular articles here, although I have seen time and time again developers stating that [].map().filter() is the same as new object[0].select().where() in LINQ.  It’s simply not true!  Whilst they both achieve the same result, LINQ is far superior in terms of efficiency and operational space.

Advantages of LINQ

There’s a huge number of things that I love about LINQ, it’s a beautiful elegant way to express intent through a monadic structure.

Declarative over imperative

One of the main reasons for clearly expressing intent, is its declarative approach over old-school imperative code.

As a refresher, imperative code explains how to achieve the end goal and usually consists of loops, assignments and state all over the place like so:

var count = 10;
var range = new int[count];
for (var index = 0; index < count; index++)
{
range[index] = index;
}

I have been developing for some time now and I have to read this in detail to see what is going on.  Thoughts when reading the code go from bounds checking to ensuring items are not missed, dealing with zero indices etc.  It's all hard work and this is a basic loop!

The declarative approach is (in my opinion) much easier to reason about:

var range = Enumerable.Range(0, 10).ToArray();

What if we want to start our range from something greater than 0?  In the imperative code, that likely involves another variable and more code to read and understand.  The declarative approach just updates an argument to the range method.

Isn't this the same as the imperative version under the covers?  Not quite… wait for the next section, although let's assume Enumerable.Range returns an array for a minute… if this we're the case, then rather than writing that for loop out in many places throughout your codebase, you can use the declarative code.  The horrible imperative code is out of the way, and fully unit tested so you can rely on it being correct.

It’s lazily evaluated

This is the biggest advantage and best explained by example… take the following contrived JavaScript:

Array.from({length: 10000}, (v, k) => k + 1)
.map((num: number) => num + 1)
.filter((num: number) => num % 3 === 0)
.slice(0, 2);
You may be fooled into thinking it’s the same as the following in LINQ:
Enumerable.Range(0, 10000)
.Select(num => num + 1)
.Where(num => num % 3 == 0)
.Take(2);
Let’s take a look at what the JavaScript version is actually doing…
Array.from({length: 10000}, (v, k) => k + 1) // Create an array of 10,000 elements
.map((num: number) => num + 1) // Create another 10,000 element array with numbers incremented
.filter((num: number) => num % 3 === 0) // Create another 3333 element array filtered down
.slice(0, 2); // Create another 2 element array to take the first two elements.
How many arrays did we just create?  How many items did we need to iterate through?  A fair few!  I may be over optimising in my mind, although I do have an instant dislike when I see this in code bases.  It’s probably fine if you have smaller arrays and are not so concerned with memory or performance.
Now let’s look at the LINQ code:
Enumerable.Range(0, 10000) // Create an enumerable capable of yielding 10,000 items (it doesn't create any arrays or even iterate anything right now)
.Select(num => num + 1) // When something is iterated, plus one to it (no arrays yet)
.Where(num => num % 3 == 0) // When something is iterated, filter out non-divisible by three items (no arrays yet)
.Take(2) // Only take the first two that make it this far
.ToArray(); // Pull values through, until we have two and then create an array!
Only when you iterate (i.e. ForEach or “.Array()”) the enumerable do you pull a value through the pipeline.  The entire thing is deferred until you need it.  We don’t create arrays after arrays and in this case we don’t even iterate over the entire 10,000 items once, let alone multiple times as is the case in the JavaScript version.
This model is known as the pull based model, since the consumer (the thing performing the iteration) is responsible for when it pulls a new value through the pipe.  The consumer can’t be overloaded, since they are only pulled at the rate the consumer is happy with.

Control to the consumer

The deferred nature allows consumers to pull values at the optimal speed, there won’t be any back pressure.  Consumers get to choose how many items they would like to take and are in complete control as to when they stop yielding values.

Consumers have access to a wide array of extension methods and can write their own, so they can build up pretty complicated pipelines which is easy to read at a high level.

Thoughts about Building LINQ in JavaScript

All of this is possible in .NET because of Iterators, JavaScript has a similar concept called Generators (as of ECMAScript2015).  I am not the first to think of bringing LINQ to JavaScript, plenty of libraries already exist, although I haven’t found one that uses the built in generators (they often roll their own) and supports functional composition over method chaining.

The method chaining works well in .NET, we have extension methods to add behaviour to types, without actually changing the underlying type.  Sadly we don’t have this in JavaScript, so users needing to add their own LINQ operator would usually need to extend someone elses prototype (not ideal!).

I’m a huge fan on Reactive Extensions, which is a monadic type for push based events.  You build RX pipes by chaining methods together in .NET, just as you would LINQ.  They even named the methods the same, so it’s intuitive for LINQ users.  As of RXJS 6, the team changed their approach from method chaining to composition using a pipe operator, which solves having to change someones prototype.

Initial Stab at LINQ in JavaScript

I hacked together a very rough proof of concept in Quokka (as an aside, it’s an excellent scratchpad for VSCode users!).  My code is not production ready code by any means, although it illustrates the point.
// Types...
type IEnumerable = IterableIterator | any[];

// Genenrators...
function*range(from: number, to: number) {
    for (let i = from; i  any) {
    return function*(enumerable: IEnumerable) {
        for (const item of enumerable) {
            yield map(item);
        }
    };
}

// Operators
function select(map: (x: any) => any) {
    return function*(enumerable: IEnumerable) {
        for (const item of enumerable) {
           yield map(item);
        }
    };
}

function where(predicate: (x: any) => boolean) {
    return function*(enumerable: IEnumerable) {
        for (const item of enumerable) {
            if (predicate(item)) {
                yield item;
            }
        }
    };
}

function take(count: number) {
    return function*(enumerable: IEnumerable) {
        let counter = 0;

        for (const item of enumerable) {
            yield item;

            counter++;
            if (counter === count) {
                break;
            }
        }
    };
}

// Helper functions
function toArray(enumerable: IEnumerable) {
    const array: any[] = [];
    for (const item of enumerable) {
        array.push(item);
    }
    return array;
}

// Composition
const pipe = (...generators: Array IEnumerable)>) => (enumerable: IEnumerable) =>
	generators.reduce((acc, cur) => cur(acc), enumerable);

// Examples
const processNumbers = pipe(
                        select(num => num * 2),
                        where(num => num > 5),
                        take(2))
                      (range(0, 100000000))

const processedNumbers = toArray(processNumbers); // ?

Summary

It’s actually pretty trivial to implement and saves many wasted iterations and array creations.  Hopefully we will see some libraries embracing functional composition through piping functions together in the near future.
*Update*… Just after posting this article, I was reading the examples on IxJS and they have a pipe operator!  The framework looks really good, perhaps check it out!

Leave a comment