Introduction
I have interviewed a number of .NET developers recently and have been surprised by the low depth of LINQ skills. It may well just be interview nerves, although LINQ is such a powerful concept I think everyone should understand it!
A large proportion of my time is spent in JavaScript and despite the huge proliferation of libraries out there, it seems JavaScript developers are missing out on LINQ. There’s nothing built into the language to support this out of the box, so perhaps people aren’t aware it’s out there! It was a revolutionary update to .NET back in 2007 and am sure it will be fully embraced by JavaScript developers eventually.
We have LINQ in JavaScript already!
I won’t name and shame particular articles here, although I have seen time and time again developers stating that [].map().filter()
is the same as new object[0].select().where()
in LINQ. It’s simply not true! Whilst they both achieve the same result, LINQ is far superior in terms of efficiency and operational space.
Advantages of LINQ
There’s a huge number of things that I love about LINQ, it’s a beautiful elegant way to express intent through a monadic structure.
Declarative over imperative
One of the main reasons for clearly expressing intent, is its declarative approach over old-school imperative code.
As a refresher, imperative code explains how to achieve the end goal and usually consists of loops, assignments and state all over the place like so:
var count = 10; var range = new int[count]; for (var index = 0; index < count; index++) { range[index] = index; }
I have been developing for some time now and I have to read this in detail to see what is going on. Thoughts when reading the code go from bounds checking to ensuring items are not missed, dealing with zero indices etc. It's all hard work and this is a basic loop!
The declarative approach is (in my opinion) much easier to reason about:
var range = Enumerable.Range(0, 10).ToArray();
What if we want to start our range from something greater than 0? In the imperative code, that likely involves another variable and more code to read and understand. The declarative approach just updates an argument to the range method.
Isn't this the same as the imperative version under the covers? Not quite… wait for the next section, although let's assume Enumerable.Range returns an array for a minute… if this we're the case, then rather than writing that for loop out in many places throughout your codebase, you can use the declarative code. The horrible imperative code is out of the way, and fully unit tested so you can rely on it being correct.
It’s lazily evaluated
This is the biggest advantage and best explained by example… take the following contrived JavaScript:
Array.from({length: 10000}, (v, k) => k + 1) .map((num: number) => num + 1) .filter((num: number) => num % 3 === 0) .slice(0, 2);
Enumerable.Range(0, 10000) .Select(num => num + 1) .Where(num => num % 3 == 0) .Take(2);
Array.from({length: 10000}, (v, k) => k + 1) // Create an array of 10,000 elements .map((num: number) => num + 1) // Create another 10,000 element array with numbers incremented .filter((num: number) => num % 3 === 0) // Create another 3333 element array filtered down .slice(0, 2); // Create another 2 element array to take the first two elements.
Enumerable.Range(0, 10000) // Create an enumerable capable of yielding 10,000 items (it doesn't create any arrays or even iterate anything right now) .Select(num => num + 1) // When something is iterated, plus one to it (no arrays yet) .Where(num => num % 3 == 0) // When something is iterated, filter out non-divisible by three items (no arrays yet) .Take(2) // Only take the first two that make it this far .ToArray(); // Pull values through, until we have two and then create an array!
Control to the consumer
The deferred nature allows consumers to pull values at the optimal speed, there won’t be any back pressure. Consumers get to choose how many items they would like to take and are in complete control as to when they stop yielding values.
Consumers have access to a wide array of extension methods and can write their own, so they can build up pretty complicated pipelines which is easy to read at a high level.
Thoughts about Building LINQ in JavaScript
All of this is possible in .NET because of Iterators, JavaScript has a similar concept called Generators (as of ECMAScript2015). I am not the first to think of bringing LINQ to JavaScript, plenty of libraries already exist, although I haven’t found one that uses the built in generators (they often roll their own) and supports functional composition over method chaining.
The method chaining works well in .NET, we have extension methods to add behaviour to types, without actually changing the underlying type. Sadly we don’t have this in JavaScript, so users needing to add their own LINQ operator would usually need to extend someone elses prototype (not ideal!).
I’m a huge fan on Reactive Extensions, which is a monadic type for push based events. You build RX pipes by chaining methods together in .NET, just as you would LINQ. They even named the methods the same, so it’s intuitive for LINQ users. As of RXJS 6, the team changed their approach from method chaining to composition using a pipe operator, which solves having to change someones prototype.
Initial Stab at LINQ in JavaScript
// Types... type IEnumerable = IterableIterator | any[]; // Genenrators... function*range(from: number, to: number) { for (let i = from; i any) { return function*(enumerable: IEnumerable) { for (const item of enumerable) { yield map(item); } }; } // Operators function select(map: (x: any) => any) { return function*(enumerable: IEnumerable) { for (const item of enumerable) { yield map(item); } }; } function where(predicate: (x: any) => boolean) { return function*(enumerable: IEnumerable) { for (const item of enumerable) { if (predicate(item)) { yield item; } } }; } function take(count: number) { return function*(enumerable: IEnumerable) { let counter = 0; for (const item of enumerable) { yield item; counter++; if (counter === count) { break; } } }; } // Helper functions function toArray(enumerable: IEnumerable) { const array: any[] = []; for (const item of enumerable) { array.push(item); } return array; } // Composition const pipe = (...generators: Array IEnumerable)>) => (enumerable: IEnumerable) => generators.reduce((acc, cur) => cur(acc), enumerable); // Examples const processNumbers = pipe( select(num => num * 2), where(num => num > 5), take(2)) (range(0, 100000000)) const processedNumbers = toArray(processNumbers); // ?