Building a range generator in JavaScript – Part III

Well, in the last posts we covered how to write a range generator function using for-loop and recursion. While iterative algorithms are more efficient because they do not create additional stack frames and pass data across function calls, however, in these times of functional programming, imperative programming is getting less popular.

So how do we create iterative and yet not recursive algorithm to generate range() in JavaScript?

Array method and spread magic
function generate(size) {

  // Create an empty array 
  let empty = new Array(size);

  // It contains 5 empty slots
  console.log(empty);

  // Now create a filled array
  // Use empty array's offset
  let filled = [...empty.keys()];

  // [0, 1, 2, 3, 4 ]
  console.log(filled);
}

generate(5);
Output?
empty [ <5 empty items> ]
filled [ 0, 1, 2, 3, 4 ]

Hey! We have already generated a range(5) of 0..4. Why not take this further and make it flexible?

We are getting closer
function generate(size, skip=1) {

  // Create an empty array 
  let empty = new Array(size);

  // It contains 5 empty slots
  console.log("empty", empty);

  // Now create a filled array
  // Use empty array's offset
  let filled = [...empty.keys()];

  // [0, 1, 2, 3, 4 ]
  console.log("filled", filled);

  // Range with skip
  
  let ready = [...empty.keys()]
    .map(value => value * skip);
  console.log("ready", ready);
}

generate(5, 2);
Output
empty [ <5 empty items> ]
filled [ 0, 1, 2, 3, 4 ]
ready [ 0, 2, 4, 6, 8 ]

So now we can generate an array of specific size, fill it with values and those values can come from an expression like value * skip

We are ready!
function* range(start, end, skip) {

  // Sanity checks
  if ((start == undefined) ||
    (start != undefined && 
      end == undefined && 
      skip != undefined) ||
    (start < end && skip <= 0) ||
    start > end && skip >= 0)
    return null;

  // Check for situations like 
  // range(5) and make it range(0, 5)
  end == undefined && 
    ([end, start] = [start, 0]);

  // If skip is undefined 
  // set it to +1 or -1 depending on range
  !skip && (skip = (start < end) ? +1 : -1);

  yield* [...
    [...Array(Math.abs(end - start) 
       / Math.abs(skip)).keys()]
      .map(value => start + value * skip)
  ];
}

Woah! What happened!!!

Nothing much.. just two things
1. Generate an array of given size e.g.
range(5) -> array of 5 (0, 1, 2, 3, 4)
range(0, 3) -> array of 3 (0, 1, 2)
range(0, 6, 2) -> array of 3 (0, 2, 4) we are skipping 2!

// Generate array of start to end / skip
[...Array(Math.abs(end - start) / Math.abs(skip)).keys()]

But the above array is empty, so we are using keys(), which will only give us 0, 1, 2, 3…
It won’t apply skip if its other than 1.

2. Fill the array with values
range(0, 6, 2) should generate (0, 2, 4)

Ascending with skip
// generate(3, 8, 2) -> (3, 5, 7)
.map(value => start + value * skip)
// 0 => 3 + 0 * 2 == 3
// 1 => 3 + 1 * 2 == 5
// 2 => 3 + 2 * 2 == 7
Descending
// generate(6, 0, -2) -> (6, 4, 2)
.map(value => start + value * skip)
// 0 => 6 + 0 * -2 == 6
// 1 => 6 + 1 * -2 == 4
// 2 => 6 + 2 * -2 == 2

.map() is so sweet! It transform our offsets into final values.

// Yield values from the array
// Where were generate array with keys
// Then map the offsets into range values
yield* [...
    [...Array(Math.abs(end - start) 
        / Math.abs(skip)).keys()]
      .map(value => start + value * skip)
  ];

So there we have it. An iterative range generator function which does not us any form of imperative statements like if and for and is more efficient than a recursive range generator.

If you know another way of generating range function, do post a comment!

Range Generator Series

  1. Building a range generator in JavaScript – Part I (iterative – for loop)
  2. Building a range generator in JavaScript – Part II (recursive)
  3. Building a range generator in JavaScript – Part III (iterative – no for loop)

Building a range generator in JavaScript – Part II

We did a for loop based range generator.
How about doing range generator without for loop?

Well, usually eliminating for loop can be done by converting the function to a recursive one.

Iterative approach, using a for loop
function factorial(num) {
  let f = 1;

  for (let value=1; value<=num; value++)
    f *= value;

  return f;
} 
Converted to recursive (eliminate for)
function factorial(num) {
   if (num < 1)
      return 1;
   return num * factorial(num - 1);
}

See! For loop gone!!

We can perhaps do the same thing with our for-loop based generator too.

Let's take this for based generator
function* range(start, end, skip) {

  // If end not provided, assume 0..start
  end == undefined && ([end, start] = [start, 0]);

  // Check order and set skip, if needed
  let ascending = start < end;

  if (ascending && skip <= 0 ||
    !ascending && skip >= 0)
    return null;

  !skip && (skip = ascending ? +1 : -1);

  // Set which direction we want to skip
  const condition =
    ascending
      ? ((value, end) => value < end)
      : ((value, end) => value > end);

  // Now 'yield' values in a loop
  for (var value = start;
    condition(value, end);
    value += skip)
    yield value;
}
And convert it into recursion based

But how?

JavaScript uses yield to generate a value, but it also has yield * to invoke a function or a source of values to generate individual yields

function* pie() {
  yield* [3, 1, 4, 1, 5, 2, 9];
}

for (let digit of pie)
  console.log(digit);
// 3, 1, 4, 1, 5, 2, 9

We can also apply yield* to a function, like this

function* realGenerator(num) {
  for (let value = 0;
       value < num;
       value++) {

    yield value;
  }
}

function* generator() {
  yield *realGenerator(5);
}

console.log([...generator()]);

So yield* can apply to other generators and return us a stream of values.. hmmm.

What if we applied yield* to our own function? Yeah! Recursively!

function* natural(num) {
  if (num > 1)
    yield* natural(num - 1);
  yield num;

}

console.log([...natural(5)]);

The above code prints a list of natural numbers by recursively calling itself and yielding on natural number at a time. See! No for-loop.

We are READY!
function* range(start, end, skip) {
  
  // Sanity checks
  if ((start == undefined) ||
    (start != undefined 
      && end == undefined && skip != undefined) ||
    (start < end && skip <= 0) ||
    start > end && skip >= 0)
    return null;

  // Check for situations like range(5) and make it range(0, 5)
  end == undefined && 
    ([end, start] = [start, 0]);

  // If skip is undefined, set it to +1 or -1 depending on range
  !skip && (skip = (start < end) ? +1 : -1);

  // While recursing, have we hit the end?
  if (start != end) {
    recursing = true;

    // THERE! We are using recursion
    yield* range(start, end - skip, skip);
    yield end - skip;
  }
}

However, this is wasteful. With every recursion we are doing sanity checks, end check, skip check, which should be done only once. We can put debug statements to follow the flow.

debug = console.log;
function* range(start, end, skip) {
  debug(`Called for ${start}, ${end}, ${skip}`);

  // Sanity checks
  if ((start == undefined) ||
    (start != undefined && end == undefined && skip != undefined) ||
    (start < end && skip <= 0) ||
    start > end && skip >= 0)
    return null;

  // Check for situations like range(5) and make it range(0, 5)
  end == undefined && ([end, start] = [start, 0]);

  // If skip is undefined, set it to +1 or -1 depending on range
  !skip && (skip = (start < end) ? +1 : -1);

  // While recursing, have we hit the end?
  if (start != end) {
    recursing = true;
    yield* range(start, end - skip, skip);
    yield end - skip;
  }
}

// Should print 10, 8, 6, 4, 2
console.log("range(10, 0, -2): ", [...range(10, 0, -2)]);

And here is the output
Called for 10, 0, -2
Called for 10, 2, -2
Called for 10, 4, -2
Called for 10, 6, -2
Called for 10, 8, -2
Called for 10, 10, -2
range(10, 0, -2):  [ 10, 8, 6, 4, 2 ]

As we can see, when we recursively call the function for yield, it goes thru all the checks all over again, which it should do only once.

Hmm! How do we avoid it? IDEA! Closure.
We can decalre a closure variable ‘recursing’ and set it to false. When we call the function the first time, it will be false and we can do all the checks. However, upon recursion, we will set it to true and skip all the check. Neat!

var recursing = false;
const nop = (...rest) => { };
let debug = console.log;

function* range(start, end, skip) {

  // Change to tru to see unnecessary recursion
  if (!recursing) {
    debug(`Called for ${start}, ${end}, ${skip}`);

    // Sanity checks
    if ((start == undefined) ||
      (start != undefined && 
        end == undefined && skip!= undefined) ||
      (start < end && skip <= 0) ||
      start > end && skip >= 0)
      return null;

    // Check for situations like range(5)
    // and make it range(0, 5)
    end == undefined && 
      ([end, start] = [start, 0]);

    // If skip is undefined, 
    // set it to +1 or -1 depending on range
    !skip && (skip = (start < end) ? +1 : -1);
  }

  // While recursing, have we hit the end?
  if (start != end) {
    recursing = true;
    yield* range(start, end - skip, skip);
    yield end - skip;
  }
  recursing = false;
}

// Remove this line to see debug prints
debug = nop;

// Should print [ 0, 1, 2, 3, 4 ]
console.log("range(5): ",  [...range(5)]);

What's the output now?
Called for 10, 0, -2
range(10, 0, -2):  [ 10, 8, 6, 4, 2 ]

Yep! We have an optimised recursion based range generator, which works asc/desc. with any value for skip.
However, recursion is an expensive proposition. It creates a new stack frame for every recursive call and always has the danger of running out of stack space, resulting in StackOverflow.

I tend to prefer iterative solutions over recursive ones, unless recursive leads to elegant code.

So, the next blog? Iterative range generator – WITHOUT recursion and WITHOUT for-loop

Range Generator Series

  1. Building a range generator in JavaScript – Part I (iterative – for loop)
  2. Building a range generator in JavaScript – Part II (recursive)
  3. Building a range generator in JavaScript – Part III (iterative – no for loop)

Building a range generator in JavaScript – Part I

I came across a very near trick in Python, for creating quick enums. Instead of assigning values one by one to a set of variables, it uses range() generator to assign them.

Normal code
# Assigning variables one at a time
RED = 1
GREEN = 1
BLUE = 2
print ( 'RED={}, GREEN={}, BLUE={}'
  .format(RED, GREEN, BLUE)
Using range generator
# Use Python's multi variable assignment
RED, GREEN, BLUE = range(3)
print ( 'RED={}, GREEN={}, BLUE={}'
  .format(RED, GREEN, BLUE)

Sometimes, small little things like this give joy! So, I wanted to do the same thing in JavaScript

// Use javaScript destructuring to multiple assignments
var [RED, GREEN, BLUE] = range(3);

// However, there is no built-in range()

No problems, lets write it ourselves… It should be easy!

/*
 * Module: range() generator
 * Author: Sanjay Vyas
 *

// Generators should be function*
function *range(end) {
  for (var value = 0;
       value < end;
       value++) {
    // generator magic
    // Yield create a "state meachine"
    yield value;
  }
}

// Prints 0, 1, 2
for (var x of range(3))
  console.log(x); 

var [RED, GREEN, BLUE] = range(3);

// Prints 0, 1, 2
console.log(RED, GREEN, BLUE); 

Yay! We have just written our own range generator in JavaScript. But wait a minute.. Python allows the following

# range(start, end, number_to_skip)
# range stops BEFORE end, never returns end
range(5) # 0, 1, 2, 3, 4
range(3, 6) # 3, 4, 5
range(0, 6, 2) # 0, 2, 3, 4
range(6, 3, -1) # 6, 4
range(10, 0, -2) # 10, 8, 6, 4, 2
Alright! Let's have a go at it again
function* range(start, end, skip) {

  // If end not provided, assume 0..start
  // e.g. range(5) will become range(0. 5)
  if (end == undefined) {
    end = start;
    start = 0;
  }

  // Reject wrong skips
  // like range(0, 5, -1)
  // or range(6, 0, +1)
  let ascending = start < end;
  if (ascending && skip <= 0 || 
     ! ascending && skip >= 0)
    return null;
  
  // If user did not give skip
  // range(0, 5) -> skip = +1
  // range(6, 0) -> skip = -1
  if (skip == undefined)
    skip = ascending ? +1 : -1;

  // Now 'yield' values in a loop
  if (ascending) {
    for (var value  = start;
      start<end;
      value += skip)
      yield value;
  } else {
    for (var value  = start;
         start<end;
         value += skip)
      yield value;
}

There! We are done!
But is this optimised? We have for loop twice and, once for ascending and once more for descending

Ummm.. let's try to optimise it!
function* range(start, end, skip) {

  // If end not provided, assume 0..start
  end == undefined && ([end, start] = [start, 0]);

  // Check order and set skip, if needed
  let ascending = start < end;

  if (ascending && skip <= 0 || 
     ! ascending && skip >= 0)
    return null;
  
  // Use "truthy" short-circuiting
  // and ternary assignment
  !skip && (skip = ascending ? +1 : -1);

  // Replace 2 fors with 1
  // Use lambda to check condition in for
  const condition = 
       ascending 
           ? ((value, end) => value < end) 
           : ((value, end) => value > end);

  // Now 'yield' values in a loop
  for (var value = start;
    condition(value, end);
    value += skip)
    yield value;
}

Tada!! Finally done. Let's use it and see
// Should print [ 0, 1, 2, 3, 4 ]
console.log("range(5): ", 
  [...range(5)]);

// Should print [ 0, 1, 2 ]
console.log("range(0, 3): ",
  [...range(0, 3)]);

// Should print 6, 5, 4
console.log("range(6, 3): ",
  [...range(6, 3)]);

// Should print 6, 4, 2
console.log("range(6, 0, -2): ",
  [...range(6, 0, -2)]);

// Create enum using range(3)
var [RED, GREEN, BLUE] = range(3);
console.log(`RED=${RED}, GREEN=${GREEN}, BLUE=${BLUE}`);

// Should print 10, 8, 6, 4, 2
console.log("range(10, 0, -2): ",
  [...range(10, 0, -2)]);

// Should not print anything
console.log("range(): ",
  [...range()]);

// Should not print anything
console.log("range(0): ",
  [...range(0)]);

// Should print 0, 2, 4, 6, 8
console.log("range(0, 10, 2): ",
  [...range(0, 10, 2)]);

// Should not print anything
console.log("range(10, undefined, 2): ",
  [...range(10, undefined, 2)]);

// Should print 5, 4, 3, 2, 1
console.log("range(5, 0): ",
  [...range(5, 0)]);

And here is the output
range(5): [ 0, 1, 2, 3, 4 ]
range(0, 3): [ 0, 1, 2 ]
range(6, 3): [ 6, 5, 4 ]
range(6, 0, -2): [ 6, 4, 2 ]
RED=0, GREEN=1, BLUE=2
range(10, 0, -2): [ 10, 8, 6, 4, 2 ]
range(): []
range(0): []
range(0, 10, 2): [ 0, 2, 4, 6, 8 ]
range(10, undefined, 2): [ 0, 2, 4, 6, 8 ]
range(5, 0): [ 5, 4, 3, 2, 1 ]

Is that it? YES! This is a short and sweet implementation of range() generator in JavaScript.
Can we do it in a better way? Well, we can do it using recursion and array expansion with map.

Those will be Part II and III

Range Generator Series

  1. Building a range generator in JavaScript – Part I (iterative – for loop)
  2. Building a range generator in JavaScript – Part II (recursive)
  3. Building a range generator in JavaScript – Part III (iterative – no for loop)

Garbage Collection Internals (Part 3 – Java)

In Part 1 and Par 2 we saw that there is no Garbage Collection at language level in C and C++, even though we can use “smart” pointers in C++ to carry out deallocation when it goes out of scope.

Languages which came after C/C++ understood the problems of memory leaks and developers putting substantial efforts in managing memory that they implemented Garbage Collection at the platform/VM level.

Java equivalent of earlier C++ program
/*
 * File:      GCJava.java
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 * 
 * Revision History:
 * 2020-Jun-21	[SV]: Created
 */

class AllocatedResource {
    private int data;

    public AllocatedResource(int x) {
        this.data = x;
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Deallocating " + this.data);
    }
}

public class GCJava {
    public static void main(String[] args) {
        AllocatedResource resource = new AllocatedResource(5);
        System.out.println("End of program");
    }
}
And here is the output
End of program

That’s it!!
If we were expecting the finalize() method to blurt out something, well, it didn’t even get called. Not even when the program came to an end. In fact, there is no gaurantee that finalize will ever get called. Morover, starting from Java version 9, it has been deprecated.

Let’s make our code put some pressure on GC and create thousands of objects and see what happens

/*
 * File:      GCJava.java
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 * 
 * Revision History:
 * 2020-Jun-21	[SV]: Created
 */

class AllocatedResource {
    private int data;

    public AllocatedResource(int x) {
        this.data = x;
    }

    @Override
    protected void finalize() throws Throwable {
        System.out.println("Deallocating " + this.data);
    }
}

public class GCJava {
    public static void main(String[] args) {
        
        // Put pressure on GC
        for (int i = 0; i < 1000000; i++) {
            AllocatedResource resource = new AllocatedResource(i);
        }
        System.out.println("End of program");
    }
}
And what do we get?
:
Deallocating 822025
Deallocating 822024
Deallocating 822023
Deallocating 822022
Deallocating 822021
Deallocating 822020
Deallocating 822019
Deallocating 651277
Deallocating 822013
Deallocating 822012
Deallocating 822011
Deallocating 822010
:
End of Program

Finally! The Garbage Collector ran… but only when it started to run out of heap because we allocated 1000000 objects.

So, don’t use finalize() because there is no guarantee that it will be called and also its deprecated now.

So how does Java manage Garbage Collection?

Java has multiple heaps – Eden, Survivors, Tenured, Permanent Generation

Java, unlike C/C++, has multiple heaps like Eden, Survivors, Tenured and Permanent generation. It’s stack is also slightly different with operand stack part for computation (not shown in the Concept Visual, maybe topic of another blogpost).

Java Garbage collector works across 4 heaps (Eden, Survivor 1&2 and Tenured) to keep track of “unreachable” objects which it marks-sweeps-moves across these segments.

HeapPurpose
EdenNew object allocations are stored here
Survivor 1After GC, objects from Eden are moved here
Survivor 2After GC, objects from Survivor 1 are moved here
TenuredLong surviving objects are moved here
PermanentNot really object heap but keeps class artefacts

So what, happens when GC runs?
Let’s say we release two rerefences a and d by setting them to null. As a result, objects containing “5” and “8” and now “unrechable”, i.e. no one is pointing to them. So what happens? GC will come running and remove them? NO! From the first example in this post we know that GC doesn’t run as soon as a single object is orphaned. In fact, the GC wont run unless it starts to run low on heap memory. So the object will keep lying around in the heap.

However, let’s assume that the GC runs, then what happens? Well, GC will mark those objects which are still “reachable”, in this case “6” and “7”, move them to Survivor generation (S0) and then clear the Eden heap.

Mark-Sweep and Move From Eden to Survivor 0 heap

As a result, Eden now becomes empty and surviving objects are “compacted” into Survivor 0. Why not let object remain is just one heap and remove those which are unreachable? Well, if they remain where they are, the free space between them will be fragmented and if there is a larger object is to be allocated, we may not have enough “contiguous” free space in the heap. That’s the reason existing objects are “moved” to a different heap and “compacted”.

So what does the Process Map looks like now?

Post GC, object’s are moved and compacted in Survivor 0

Efficient, isn’t it? Periodically, Garbage Collector removes unreachable objects and moves the remaining in a compacted manner in Survivor generation. So why 2 Survivor generations? Survivor 0 and Survivor 1?

Well, if memory gets fragmented in Survivor 0, then is moves and compacts to Survivor 1 and vice versa. This way new (and may shortlived) objects can be allocated in Eden space while long surviving objects osciallte between S0 and S1. Huh? They will oscillate forever? Hehe.. no. If an object survives a threashold, it’s finally promoted to Tenured generation, where it will no longer be moved around (old people, you see :-)).

Here is what it might look in Tenured.

Tenured Generation

An object reaches “Tenured” generation if it survives N GC passes threshold. Interesting, isn’t it?
We have compared non-GC languages like C and library based GC like C++ Smart pointer. Java was one of the first commercial language to bring in Garbage Collection (the other was Python) and .NET followed with its own Garbage Collection system. In fact, it has 8 heaps (compared to 4 heaps of Java) and 9th has been just added to .NET 5. So, another post?

Garbage Collection Internals Series

  1. Part 1 – C language
  2. Part 2 – C++ Language
  3. Part 3 – Java

Garbage Collection Internals (Part 2 – C++)

In the previous post we saw that C progamming language does not have a built-in garbage collector, but what about C++? Yes and No.

C++ does NOT have a built-in garbage collector at language level but over a period of time it has added “smart” pointers, which deallocate memory when they go out of scope. This at least stems the memory leaks caused in typical C++ program, provided the developer uses these smart pointers.

How smart are these pointers?
/*
 * File:      SmartPointer.cpp
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 * 
 * Revision History:
 * 2020-Jun-20	[SV]: Created
 */

#include <iostream>
using namespace std;

// Resource we will allocate on heap
class AllocatedResource
{
private:
    int data;

public:
    // Constructor
    AllocatedResource(int x)
    {
        this->data = x;
        cout << "Allocated " << x << "\n";
    }

    // Destructor
    ~AllocatedResource()
    {
        cout << "Deallocated " << this->data << "\n";
    }

    void print()
    {
        cout << "Resource: " << this->data << "\n";
    }
};

int main()
{
    unique_ptr<AllocatedResource>
        {new AllocatedResource(5)};
    return 0;
}

And here is the output
Allocated 5
Deallocated 5

As we can see, unique_ptr automatically deallocates the pointer when it goes out of “scope”, in this case, when main ends. This frees us up from keeping track of allocation and manually calling delete when it’s not longer needed.

unique_ptr is an object itself and keeps track of allocation it does
So what happens if we dont deallocate
and simply reallocate a unique_ptr
/*
 * File:      AutoPtr.cpp
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 *      GC is not at language level in C++
 *      We can use auto_ptr/unique_ptr to automate it    
 * 
 * Revision History:
 * 2020-Jun-20	[SV]: Created
 */
#include <iostream>
#include <memory>

using namespace std;

// Resource we will allocate on heap
class AllocatedResource
{
private:
    int data;

public:
    // Constructor
    AllocatedResource(int x)
    {
        this->data = x;
        cout << "Allocated " << x << "\n";
    }

    // Destructor
    ~AllocatedResource()
    {
        cout << "Deallocated " << this->data << "\n";
    }

    void print()
    {
        cout << "Resource: " << this->data << "\n";
    }
};

int main()
{
    // Raw pointer (C style)
    AllocatedResource *rawPointer;
    cout << "Raw pointer allocation\n";
    for (int i = 0; i < 5; i++)
    {
        // This will allocate and fire constructor
        rawPointer = new AllocatedResource(i);
        // WARNING! Memory leak occuring on rawPointer
    }
    cout << "Raw pointer causes memory leak\n\n";

    // Unique pointer (C++ style)
    unique_ptr<AllocatedResource> smartPointer{};
    cout << "Automatic pointer allocation\n";
    for (int i = 0; i < 5; i++)
    {
        // This will allocate and fire constructor
        // When assigned again
        //  It will fire destructor on previous allocation
        //  and fire constructor on new allocation
        smartPointer = unique_ptr<AllocatedResource>
            { new AllocatedResource(i + 100) };
    }
}

And here is the output
Raw pointer allocation
Allocated 0
Allocated 1
Allocated 2
Allocated 3
Allocated 4
Raw pointer causes memory leak
Automatic pointer allocation
Allocated 100
Allocated 101
Deallocated 100
Allocated 102
Deallocated 101
Allocated 103
Deallocated 102
Allocated 104
Deallocated 103
Deallocated 104

As we can see, the C style “raw” pointer allocates memory but if we don’t call delete on it, the memory will not be deallocated, causing it to become “garbage” which is not “garbage collected”.

However, unique_ptr is a class which automatically deallocates when it goes out of scope (basically does a delete when its destructor is called), or when we assign a new allocation to it. This brings some semblance of sanity in deallocating unused memory allocations in C++. However, let me repeat… the is NO built-in Grabage Collector in C++, instead we have to use unique_ptr or shared_ptr and make sure we reduce memory leaks from our code.

If unique_ptr is again allocated, it auto deallocates the previous allocation

C and C++ both don’t have built-in garbage collector, but at least C++ provides built-in classes like unique_ptr and shared_ptr which automatically manage the allocation. Internally shared_ptr uses reference counting to keep track of how many pointers are pointing to a given allocation. However, this is not language level Reference counting (Swift or Python).

Garbage Collection Internals Series

  1. Part 1 – C language
  2. Part 2 – C++ Language
  3. Part 3 – Java

Garbage Collection Internals (Part 1 – C)

Why do programs produce garbage? πŸ€”

Well, we first have to understand memory internals of programming languages to understand what is dynamic memory and what constitutes “garbage”. So let’s take a look at this Digital ConceptVisual video which explains Process Map, Segments, Global Data area, Stack and Heap, the last one being the focus of our attention.

Compilation Process of C/C++ and Process Map

As we can see from the above video, global variables (if any in the language) are stored in DATA/BSS segment, local variables are in the STACK and any data created at run time are stored in HEAP.

Languages like C/C++ were mostly Global and Local variables and only when the developer specifically called malloc/calloc or new, it allocated memory in the HEAP.

One great thing about HEAP allocation is that it’s allocated at runtime and can be “released” by the developer. If the developer stops using the allocated memory and does not release it, well, that’s when it’s called “Garbage” memory.

Different languages have different strategy for releasing heap allocated memory once it’s no longer needed

TechnologyGrabage Collection
CYou clean up your own mess
C++You clean up your own mess
JavaMark and Sweep Garbage Collector (GC)
SwiftAutomatic Reference Counting (ARC)
COMReference Counting (RC)
C#Mark and Sweep Garbage Collector (GC)
PythonAutomatic Reference Counting (ARC) + Mark and Sweep Garbage Collector (GC)
JavaScriptEarlier was ARC but now most JS Engine implement Mark and Sweep (GC)
Language/Technology and Garbage Collection

So there are 4 Garbage collection strategies

  1. No automatic Garbage Collection – C/C++
  2. Manual Reference Counting (RC) – COM
  3. Automatic ReferenceCounting (ARC) – Swift
  4. Mark and Sweep (Compact or Copy) – Java, .NET, JavaScript
1. No Automatic Garbage Collection - C/C++
/*
 * File:      NoGC.c
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 *      C program to demonstarte No GC
 * 
 * Revision History:
 * 2020-Jun-20	[SV]: Created
 */

#include <stdio.h>  // printf/scanf
#include <stdlib.h> // malloc/qsort

// Maximum random number
#define MAX (100)

// Helper function to print array
void print(int *array, int size)
{
    int offset;

    // Print comma (upto 2nd last)
    for (offset = 0;
         offset < size-1;
         offset++)
    {
        printf("%d, ", array[offset]);
    }

    // Print last element
    printf("%d\n", array[size-1]);
}

// Compare function required for qsort
// Return
//      0 if both are equal
//     <0 if lhs is < rhs
//     >0 if lhs is > rhs
int intcmp(const void *lhs, const void *rhs) 
{
    // We are given 2 elements as void *
    // We need to typecast them back to int
    return *(int *)lhs - *(int *)rhs;
}

int main()
{
    int *pointer = NULL;    // Defensive programming πŸ™‚
    int size;
    int offset;

    while (printf("Enter size (0 to quit): "),
           scanf("%d", &size),
           size)
    {
        // Allocate heap memory
        pointer = (int *)malloc(size * sizeof(int));

        // Randomize it
        for (offset = 0;
             offset < size;
             offset++)
        {
            pointer[offset] = rand()%MAX;
        }

        // Now print the array
        print(pointer, size);

        // Let's sort it using built in function
        qsort(pointer, size, sizeof(int), intcmp);

        // And print it again
        print(pointer, size); 

        // WARNING! Memory is LEAKING here
        // We did not free(pointer)
        // The loop will use same pointer
        // to malloc again, losing the previous address
    }
}
What happens we do a malloc or new
This is what happens when we do a malloc
And what is a memory leak?
Losing address without doing free is MEMORY LEAK

What we see in the above image is a result of not deallocating memory (free or delete) and losing the address. Over a period of time, the application will fill up its heap area and would no longer be able to allocate more memory as previously unused allocations are occupying the heap area. These unused allocations are called GARBAGE.

Remember to free your allocations
/*
 * File:      FreeMem.c
 * Project:   CPP
 * Author:    Sanjay Vyas
 * 
 * Description:
 *      C program to demonstarte No GC
 * 
 * Revision History:
 * 2020-Jun-20	[SV]: Created
 */

#include <stdio.h>  // printf/scanf
#include <stdlib.h> // malloc/qsort

// Maximum random number
#define MAX (100)

// Helper function to print array
void print(int *array, int size)
{
    int offset;

    // Print comma (upto 2nd last)
    for (offset = 0;
         offset < size - 1;
         offset++)
    {
        printf("%d, ", array[offset]);
    }

    // Print last element
    printf("%d\n", array[size - 1]);
}

// Compare function required for qsort
// Return
//      0 if both are equal
//     <0 if lhs is < rhs
//     >0 if lhs is > rhs
int intcmp(const void *lhs, const void *rhs)
{
    // We are given 2 elements as void *
    // We need to typecast them back to int
    return *(int *)lhs - *(int *)rhs;
}

int main()
{
    int *pointer = NULL; // Defensive programming πŸ™‚
    int size;
    int offset;

    while (printf("Enter size (0 to quit): "),
           scanf("%d", &size),
           size)
    {
        // Allocate heap memory
        pointer = (int *)malloc(size * sizeof(int));

        // Randomize it
        for (offset = 0;
             offset < size;
             offset++)
        {
            pointer[offset] = rand() % MAX;
        }

        // Now print the array
        print(pointer, size);

        // Let's sort it using built in function
        qsort(pointer, size, sizeof(int), intcmp);

        // And print it again
        print(pointer, size);

        // Remember to deallocate memory
        free(pointer);
    }
}

In C programming there is no built-in GC (Garbage Collection) and as developers we must keep track of the memory we are allocating and runtime and deallocate it when not required. This is tough given that the codebase may be huge and memory addresses may be passed around, losing track of who allocated it and who will deallocate it. Add to this, many developers do not have visualisation of what is happening in memory.

No wonder, languages which came after C/C++ implemented some from of Garbage Collection (Java, .NET, Python, Swift)

Garbage Collection Internals Series

  1. Part 1 – C language
  2. Part 2 – C++ Language
  3. Part 3 – Java

Private in JavaScript ESNext (Part 3)

In the previous two posts we talked about how to create “private” fields for a class. Those involved kind of convoluted mechanisms like closure or WeakMaps.

JavaScript is constantly evolving and in the new ES2019 standard there is a clean way of creating “private” fields of a class

Let's see how ES2019 creates private
class Account {

    // True private fields
    #id = 0;
    #name = "";
    #balance = 0;

    constructor(id, name, balance) {
        this.#id = id;
        this.#name = name;
        this.#balance = balance;

    }

    withdraw(amount) {
        this.#balance -= amount;
    }

    deposit(amount) {
        this.#balance += amount;
    }

    print() {
        console.log(
            `ID: ${this.#id}`,
            `Name: ${this.#name}`,
            `Balance: ${this.#balance}`
        );
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print(); 

// This doesn't work
eich.#balance = 50000;
And here is the output
ID: 1 Name: Brendan Balance: 10000

eich.#balance = 50000;
^
SyntaxError: Private field '#balance' must be declared in an enclosing class

That’s it! No more closures, Symbols or WeakMaps to implement private in JavaScript latest version.
But JavaScript… why oh why, did you not use the reserved word private??? It would have been so simple to write something like

// Warning: This is NOT valid syntax

class Account {

    // What it COULD have been
    private id = 0;
    private name = "";
    private balance = 0;

    constructor(id, name, balance) {
        this.id = id;
        this.name = name;
        this.balance = balance;

    }

    withdraw(amount) {
        this.balance -= amount;
    }

    deposit(amount) {
        this.balance += amount;
    }

    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${this.balance}`
        );
    }
}

However, there is one thing STILL elusive, even in the latest version of JavaScript.. private functions. There is still no way to mark a function as private. So the following still doesn’t work

class Account {

    // True private fields
    #id = 0;
    #name = "";
    #balance = 0;

    constructor(id, name, balance) {
        this.#id = id;
        this.#name = name;
        this.#balance = balance;
    }

    // Cannot create private functions
    // # works on variables but not functions
    #sendSms() {
        console.log(`Sms Sent: Updated balance=${this.#balance}`);    
    }

    withdraw(amount) {
        this.#balance -= amount;
    }

    deposit(amount) {
        this.#balance += amount;
    }

    print() {
        console.log(
            `ID: ${this.#id}`,
            `Name: ${this.#name}`,
            `Balance: ${this.#balance}`
        );
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print();

So, the question remains… what is the most elegant way of creating private functions in ES6 or later such that we don’t land up adding the same function in EVERY instantiated object.

If you have ideas, drop me a comment

Private in JavaScript Series

Private in JavaScript ES6 (Part 2)

JavaScript ES6 brought a lot of changes like const, let, default arguments, spread operator, classes..
Wait WHAT? JavaScript now is a “class” based OO instead of “prototype” based OO???

Relax! class is just syntactical sugar coating and it’s still “prototype” based OO language

I repeat.. internally JS is NOT class based OO language

Hey, but there is a class keyword! Yes, and it still creates objects in memory.

Everything in JavaScript is an “Object” inside memory

Let’s compare ES5 “class” with ES6 class

How to create a "class" in ES5
// The following is NOT ES6 class keyword
// But if we "imagine" a class
// function looks like a constructor πŸ™‚
// class Account
// {
        function Account(id, name, bal) {

            // Public "fields"
            this.id = id;
            this.name = name;

            // Private "field"
            // NOT stored in the object
            // Stored in a "closure"dictionary
            // Which is visible ONLY inside this scope
            let balance = bal;

            // Operations
            this.deposit = function (amount) {
                
                // Inner functions can access
                // closure variables like balance
                balance += amount;
            }

            this.withdraw = function (amount) {
                // balance is NOT a field
                // But we can access outer variables
                balance -= amount;
            }

            this.print = function () {
                console.log(
                    `ID: ${this.id}`,
                    `Name: ${this.name}`,
                    `Balance: ${balance}`
                )
            }
        }
// }

let eich = new Account(1, "Brendan", 10000);
eich.print();
And here is the output
ID: 1 Name: Brendan Balance: 10000

So how do we create class in ES6?

I repeat.. internally JS is NOT class based OO language
/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Using ES6 syntax
class Account {

    // Instead of function Account(...)
    constructor(id, name) {

        // Public "fields"
        this.id = id;
        this.name = name;
    }

    // Member function
    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,

        )
    }
}

let eich = new Account(1, "Brendan");
eich.print();
Here is the output
ID: 1 Name: Brendan

Now let’s add a “private” field called balance in our Account class. Hmm.. where?
In our ES6, we just defined a closure variable in the outer function Account and that was available to inner functions like deposit and withdraw. So let’s try that

/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Using ES6 syntax
class Account {

     balance = 10000;

    // Instead of function Account(...)
    constructor(id, name) {

        // Public "fields"
        this.id = id;
        this.name = name;
    }

    // Member function
    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${balance}`
        )
    }
}

let eich = new Account(1, "Brendan");
eich.print();
And here is the output
`Balance: ${balance}` 
          ^
ReferenceError: balance is not defined

Huh? What just happened? Isn’t balance a “closure” variables like ES5? NO!
class is NOT a function and variables inside are not local variables. In fact, balance=10000 has become a class level static field. It’s NOT in some closure.

Oh ok, SO? Even if it’s part of the class object, should I be able to acccess it? Yes and No.
class object itself is not part of eich object’s prototype chain (topic for another blog?), so it wont show up. Even if it did, there would be just one balance field for ALL objects. Hey, it would be nice to share such Account implementation with some billionaire – YOUR balance is now MINE! πŸ™‚

Ok ok, so how do we create “private” variables in ES6? Unfortunately, there is no direct way!
But we can store data in closure variables, just like ES5, though it would have to be outside the class.

/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Local variable to module
let balance = 0;

// Using ES6 syntax
class Account {

    // Instead of function Account(...)
    constructor(id, name, bal) {

        // Public "fields"
        this.id = id;
        this.name = name;

        // Closure variable
        balance = bal;
    }

    // Member function
    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${balance}`
        )
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print(); // Should print 10000

let ray = new Account(2, "Romano", 20000);
ray.print(); // Should print 20000

eich.print(); // ?
And here is the output
ID: 1 Name: Brendan Balance: 10000
ID: 2 Name: Romano Balance: 20000
ID: 1 Name: Brendan Balance: 20000

Oops.. eich seems to be using ray’s balance!
Well, that’s because there is only ONE closure variable called balance, so whoever sets it last.. wins!

Fixing it.. with Map!!
/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Local variable to module
let balances = new Map();

// Using ES6 syntax
class Account {

    // Instead of function Account(...)
    constructor(id, name, bal) {

        // Public "fields"
        this.id = id;
        this.name = name;

        // Closure map
        // each object (this) has its own balance
        balances.set(this, bal);
    }

    // Member function
    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${balances.get(this)}`
        )
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print(); // Should print 10000

let ray = new Account(2, "Romano", 20000);
ray.print(); // Should print 20000

eich.print(); // Should print 10000
ID: 1 Name: Brendan Balance: 10000
ID: 2 Name: Romano Balance: 20000
ID: 1 Name: Brendan Balance: 10000

There is however ONE problem… balances map will live on forever (technically until the process end). That’s because even after eich and ray objects are long gone, balances map keeps waiting that someday they will return and ask for their balance, but they are dead and long gone!

Hmm.. so what do we do? How can we ensure that if an object dies, it’s balance is also released?

WeakMap!!!

A WeakMap is a Map which does hold reference to other objects, but when it realises that it’s the only one holding on to that reference, it releases it. Problem SOLVED!

/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Local variable to module
// WeakMap releases reference
// If it's the only one holding it
let balances = new WeakMap();

// Using ES6 syntax
class Account {

    // Instead of function Account(...)
    constructor(id, name, bal) {

        // Public "fields"
        this.id = id;
        this.name = name;

        // Closure map
        balances.set(this, bal);
    }

    // Member function
    print() {
        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${balances.get(this)}`
        )
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print(); // Should print 10000

let ray = new Account(2, "Romano", 20000);
ray.print(); // Should print 20000

eich.print(); // Should print 10000

There! It’s done. We have implemented private fields in JavaScript ES6 with help of WeakMaps. Callers cannot access balance, as balance is not a “field” inside the object (ALL object properties are accessible, yes, even if you create them as Symbol, but balance being a closure variable, it’s accessible only within the class which has captured it.

Complete code with all fields as private
/*
 * ES6 has a class keyword
 * But internally there are no classes
 */

// Local variable to module
// WeakMap releases reference
// If it's the only one holding it
let _private = new WeakMap();

// Using ES6 syntax
class Account {

    // Instead of function Account(...)
    constructor(id, name, balance) {

        // Let's create an internal object 
        let _fields = {
            _id: id,
            _name: name,
            _balance: balance
        };

        // Now add it to closure variable
        _private.set(this, _fields);
    }

    // Nice lil touch... getters
    get id() {
        return _private.get(this)._id;
    };

    get name() {
        return _private.get(this)._name;
    };

    get balance() {
        return _private.get(this)._balance;
    }

    set balance(value) {
        throw 'balance is private';
    }

    // Account operations
    withdraw(amount) {
        
        // Modify our private field
        _private.get(this)._balance -= amount;
    }

    deposit(amount) {
        
        // _balance is in closure WeakMap
        // Not accessible from outside
        _private.get(this)._balance += amount;
    }

    // Member function
    print() {

        console.log(
            `ID: ${this.id}`,
            `Name: ${this.name}`,
            `Balance: ${this.balance}`
        )
    }
}

let eich = new Account(1, "Brendan", 10000);
eich.print(); // Should print 10000

let ray = new Account(2, "Romano", 20000);
ray.print(); // Should print 20000

// We can "access" balance thru getters
console.log(eich.id, eich.name, eich.balance);

// Cannot set balance though
eich.balance = 50000;      
console.log(eich.balance);
And here is the output
ID: 1 Name: Brendan Balance: 10000
ID: 2 Name: Romano Balance: 20000
1 Brendan 10000

throw 'balance is private';
^
balance is private

Yay! We have created “private” fields in ES6 class, but how about private functions?

Well, we can store functions also in the _fields object, but that would be wasteful because it will store a copy of the function in EACH object. The SAME function in EVERY object?

There has to be a more efficient way… If you have ideas about that, write them in comment

Private in JavaScript Series

Private in JavaScript ES5 (Part 1)

Did you know that private, protected and public are reserved words in JavaScript?
Really.. I mean it!!

Take a look at this MDN page – Reserved Identifiers.
Oh Wow! It means I can create private, protected fields and methods in JavaScript? Sadly NO!
These have been reserved since ages but JavaScript hasn’t used them.

Oh! So how do I create private field in JavaScript?
The answer is – It Depends!

Depends on what?
Depends on which version of ECMAScript you are writing code in.

Let's first understand why everything is public in JavaScript

JavaScript does NOT have classes (yea, yea.. that class keyword is just sugar coating)
JavaScript is a “prototype” object oriented language, which means there are ONLY objects.

Watch this DigitalCV video to find out How JavaScript Dictionaries look like

JavaScript Process Map and Dictionaries


And guess what these objects are? DICTIONARIES! That too with all members being visible, though you can change some properties to be non-enumerable, but it doesn’t really hide them.

So how do we hide properties of an object?

Even Object is made up of properties (keys in its dictionary)
Every property is again a dictionary (called Descriptors)

There are 4 descriptors

DescriptorDescription
valuevalue of the property
writable if false, the property is readonly
enumerableif false, the property does not show up in for..in or console.log
configurableif false, writable and enumerable can no longer be changed
Property Descriptors
Let's try hiding by property descriptor
/*
 * We are using old style for creating objects
 * Actually the "true" style as there is no class in JS
 */
function Person(id, name) {

    // Property descriptor for id property
    const idPropertyDescriptor = {
        value: id,
        writable: true,
        enumerable: false
    };
    Object.defineProperty(this, 
        "id", 
        idPropertyDescriptor
    );
    
    // Property descriptor for name property
    const namePropertyDescriptor = {
        value: name,
        writable: false,
        enumerable: true
    };
    Object.defineProperty(this, 
        "name", 
         namePropertyDescriptor
    );
}

// Let's create an object of type Person
let eich = new Person(1, "Brendan");

// id is enumerable false, hence won't show up
console.log('Printing entire object');
console.table(eich);

// But id exists and CAN be changed and printed
eich.id = 7;

// However, name is read-only... wont change
console.log('Trying to change name to Eich')
eich.name = "Eich";

// So id is not really private, just hidden
console.log('Printing properties by name')
console.log(`id=${eich.id}, name=${eich.name}`);

And.. here is the output!
Printing entire object
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ (index) β”‚  Values   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  name   β”‚ 'Brendan' β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Trying to change name to Eich

Printing properties by name
id=7, name=Brendan

Nope! Enumeration does not really make the property private, it merely “tries” to hide it, and that too is not very successfully because if you know the name, you can not only print it but even modify it.

We failed! πŸ˜”

Closures to the rescue

Huh? What is a closure?
When a function can “capture” (or close over) variables from outer functions, it creates a “closure” scope.

/*
 * Function is a way of creating objects in JS
 * Think of Account() as cosntructor
 */
function Account(id, name, bal) {

    // properties of the object
    this.id = id;
    this.name = name;

    // This is NOT a property
    // Local variable to Account
    let balance = bal;

    this.deposit = function(amount) {
        // deposit "closes over" balance 
        balance += amount;
    }

    this.withdraw = function(amount) {
        // balance is NOT a property
        // but local variable of outer function
        balance -= amount;
    }

    this.print = function print() {
        console.log(`ID=${this.id}, 
                     Name=${this.name}, 
                     Balance=${balance}`);
    }
};

Now lets runs this code
var eich = new Account(1, "Brendan", 10000.00);
eich.print();
eich.deposit(1000);
eich.print();
eich.withdraw(500);
eich.print();
And here is the output
ID=1, Name=Brendan, Balance=10000
ID=1, Name=Brendan, Balance=11000
ID=1, Name=Brendan, Balance=10500

Wait a minute… balance is behaving just like a field. We can update it, print it and YET it’s NOT a property of the object. So where is it?

The answer is – “It’s NOT inside the object dictionary but has been put in a different dictionary called “closure”, which is available to deposit and withdraw functions when they are called.

Sweet! So it it hidden from us? Like a private field? YES!

console.log(eich);
Account {
  id: 1,
  name: 'Brendan',
  deposit: [Function (anonymous)],
  withdraw: [Function (anonymous)],
  print: [Function: print]
}

YESS!! We have managed to “hide” balance and made it a “private” field of Account class.
But wait a minute.. what if we directly access it, like…

console.log(eich.id, eich.name, eich.balance);

Out of luck! It’s NOT inside the object, so we CANNOT access it at all. Only when the functions deposit and withdraw are called, they can “see” it in the closure dictionary

1 Brendan undefined
So, is this the way we create private methods too?

Absolutely! Just like local variables of outer function, even local functions are “captured” by closures.

Creating "private" functions
function Account(id, name, bal) {

    // properties of the object
    this.id = id;
    this.name = name;

    // Local variable to Account
    let balance = bal;

    // NOT part of the object
    // Local function (like private)
    // Cannot be called from outside
    function sendSMS() {
        console.log(`Sent SMS: Updated balance={balance}`);
    }

    this.deposit = function (amount) {
        // deposit "closes over" balance 
        balance += amount;

        // Can call local function
        sendSMS();
    }

    this.withdraw = function(amount) {
        // balance is NOT a property
        // but local variable of outer function
        balance -= amount;

        // Can call local function
        sendSMS();
    }

    this.print = function print() {
        console.log(`ID=${this.id}, 
                     Name=${this.name},
                     Balance=${balance}`);
    }    
};
Let's call it from outside
let eich = new Account(1, "Brendan", 10000.00);
eich.print();
eich.deposit(1000);
eich.print();
eich.withdraw(500);
eich.print();

// Try to sendSMS from outside the object
eich.sendSMS();

Nope! Does not work
ID=1, Name=Brendan, Balance=10000
Sent SMS: Updated balance=11000
ID=1, Name=Brendan, Balance=11000
Sent SMS: Updated balance=10500
ID=1, Name=Brendan, Balance=10500

eich.sendSMS();
     ^
TypeError: eich.sendSMS is not a function

Wonderful! We have managed to create private fields and functions using ES5 syntax!

Wait… what about ES6? class syntax?
Wait for the next blog post πŸ™‚

Private fields in JavaScript Series

Switch is EVIL! (Part 3b)

Its only fair that we did deworming of switch-case/if-else ladder from Java code, we should have equivalent C# code. But let’s take this opportunity to explore what .NET/C# have to offer in terms of deworming tools… like delegates, lambdas, Func<> and Action<>

Let's look at the beast
using System;
using System.Text;
class Evil
{
string changeCase(string caseType, params string[] words)
{
// Sanitize the input
caseType = caseType.Trim().ToLower();
// Accumulate result in StringBuilder
var result = new StringBuilder();
// We currently support lower and camel case
if (caseType == "lower")
{
// Every word is converted to lower
// And concatenated
foreach (var word in words)
{
result.Append(
word
.Trim()
.ToLower());
}
}
else if (caseType == "camel")
{
// First word is lower
result.Append(
words[0]
.Trim()
.ToLower()
);
// Subsequent words are Proper case
for (var offset = 1;
offset < words.Length;
offset++)
{
// First letter is uppercase
result.Append(
words[offset]
.Substring(0, 1)
.ToUpper());
// Rest of the letters are lowercase
result.Append(
words[offset].
Substring(1).
ToLower());
}
}
return result.ToString();
}
static void Main(string[] args)
{
Evil obj = new Evil();
Console.WriteLine(obj.changeCase("camel", "Two", "Words"));
}
}
view raw NetSwitch.cs hosted with ❤ by GitHub

The story is the same in all programming languages. Developers tend to use switch-case or if-else ladders as we were probably “taught” to use them when we started learning programming. It’s deeply ingrained into our brains, to the extent that we justify it as being simple.
I hear a a lot of people describe use of maps to eliminate switch as “OVER-ENGINEERING”.

So, we can use interface in C# (like we did with Java in Part 3a), but we could use something else this time… maybe delegate, lambda or Func<>

Get ready to Func
/**
* Func, Action and Predicate are generic deleagtes
* Action<int> -> takes one parameter and returns void
* Func<string, int> -> takes one parameter string and returns int
* Predicate<int> -> takes one parameter int and returns bool
*
* using in C# allows us to "typedef" a complex type
*/
using ConverterFunction = System.Func<string[], string>;
view raw TypeDef.cs hosted with ❤ by GitHub
How do we create a map of funcs?
/**
* Maps are called Dictionary in .NET
* We are creating a map between
* string -> "camel"
* Func<> -> camelConverter function
*/
Dictionary<string, ConverterFunction> converters =
new Dictionary<string, ConverterFunction>
{
{"lower", lowerConverter},
{"camel", camelConverter}
};
view raw Dictionary.cs hosted with ❤ by GitHub
We are set.. onward to refactoring
/**
* changeCase now uses converter Dictionary
* If a converter is found, it invokes it
* or else returns null
*/
public string changeCase(string caseType, params string[] words)
{
// Sanitize the input
caseType = caseType.Trim().ToLower();
// Lookup the converter function in dictonary
ConverterFunction converter;
converters.TryGetValue(caseType, out converter);
if (converter !=null)
return converter(words);
return null;
}
Now it's just a matter of creating converters
static string camelConverter(string[] words) {
var result = new StringBuilder();
// First word is lower
result.Append(
words[0]
.Trim()
.ToLower()
);
// Subsequent words are Proper case
for (var offset = 1;
offset < words.Length;
offset++)
{
// First letter is uppercase
result.Append(
words[offset]
.Substring(0, 1)
.ToUpper());
// Rest of the letters are lowercase
result.Append(
words[offset].
Substring(1).
ToLower());
}
return result.ToString();
}
static string lowerConverter(params string[] words) {
// Accumulate result in StringBuilder
var result = new StringBuilder();
// Every word is converted to lower
// And concatenated
foreach (var word in words)
{
result.Append(
word
.Trim()
.ToLower());
}
return result.ToString();
}
view raw Converters.cs hosted with ❤ by GitHub
It's all over... not!

In all the examples we have seen (JavaScript, Java and in this C# code), we have only used converter that are pre-defined. Which makes it difficult to add new converters at runtime. This means our code will have to be modified to include more converters. So what’s the use of eliminating switch-case if we still have to :manually” add the converter to the dictionary? Good point!

Let’s make our code flexible, so that apart from built in converters, we can add more converters at runtime WITHOUT having to modify our existing code.

It’s easy! Add two methods to our class which can add/remove converters.

Dynamism to the max!

Let’s first add a method to our Converter class, which will add more entries to the dictionary

/**
* Dynamically adding more converters
*/
public void AddConverter(string caseType, ConverterFunction func)
{
// That's it!
converters[caseType] = func;
}
Flexible client class
/**
* Client class can add a new converter
* either defined by itself or from some other source
*/
class ClientClass
{
// Snake case is very polular
// Where else? Python πŸ™‚
static string snakeConverter(params string[] words)
{
// snake_case_is_separated_by_underscore
var result = new StringBuilder();
// Convert each word to lower
// and separate using underscore
for (var offset = 0;
offset < words.Length1;
offset++)
{
result.Append(
words[offset]
.Trim()
.ToLower()
);
result.Append("_");
}
// Add the last word without _
result.Append(
words[words.Length 1]
.Trim()
.ToLower()
);
return result.ToString();
}
static void Main(string[] args)
{
var obj = new Angel();
// Add our own converter and call it
// No need to modify the coverter class
obj.AddConverter("snake", snakeConverter);
Console.WriteLine(obj.changeCase("snake", "Two", "Words"));
}
}

That’s it! We have created a Case Converter class which has a few built-in converters, like lower and camel, but we have also made it extensible, so that others can provide “plugins” for our class. third-party developers could come up with their own converters which the client can use plugin and call.

Here is the final code
using System;
using System.Collections.Generic;
using System.Text;
using ConverterFunction = System.Func<string[], string>;
class Angel
{
static string camelConverter(string[] words)
{
var result = new StringBuilder();
// First word is lower
result.Append(
words[0]
.Trim()
.ToLower()
);
// Subsequent words are Proper case
for (var offset = 1;
offset < words.Length;
offset++)
{
// First letter is uppercase
result.Append(
words[offset]
.Substring(0, 1)
.ToUpper());
// Rest of the letters are lowercase
result.Append(
words[offset].
Substring(1).
ToLower());
}
return result.ToString();
}
static string lowerConverter(params string[] words)
{
// Accumulate result in StringBuilder
var result = new StringBuilder();
// Every word is converted to lower
// And concatenated
foreach (var word in words)
{
result.Append(
word
.Trim()
.ToLower());
}
return result.ToString();
}
Dictionary<string, ConverterFunction> converters =
new Dictionary<string, ConverterFunction>
{
{"lower", lowerConverter},
{"camel", camelConverter}
};
public string changeCase(string caseType, params string[] words)
{
// Sanitize the input
caseType = caseType.Trim().ToLower();
ConverterFunction converter;
converters.TryGetValue(caseType, out converter);
if (converter != null)
return converter(words);
return null;
}
public void AddConverter(string caseType, ConverterFunction func)
{
converters[caseType] = func;
}
}
class ClientClass
{
static string snakeConverter(params string[] words)
{
// snake_case_is_separated_by_underscore
var result = new StringBuilder();
// Convert each word to lower
// and separate using underscore
for (var offset = 0;
offset < words.Length1;
offset++)
{
result.Append(
words[offset]
.Trim()
.ToLower()
);
result.Append("_");
}
// Add the last word without _
result.Append(
words[words.Length 1]
.Trim()
.ToLower()
);
return result.ToString();
}
static void Main(string[] args)
{
var obj = new Angel();
obj.AddConverter("snake", snakeConverter);
Console.WriteLine(obj.changeCase("snake", "Two", "Words"));
}
}

Are you now convinced that ALL switch MUST DIE! ?

Switch is EVIL Series

  1. Switch is EVIL! (Part 1) – Convert it to arrays
  2. Switch is EVIL! (Part 2) – Convert it to maps
  3. Switch is EVIL! (Part 3) – Create function maps
  4. Switch is EVIL! (Part 3a) – Implementing in Java
  5. Switch is EVIL! (Part 3b) – Implementing in C#