Why do programs produce garbage? 🤔
Well, we first have to understand memory internals of programming languages to understand what is dynamic memory and what constitutes “garbage”. So let’s take a look at this Digital ConceptVisual video which explains Process Map, Segments, Global Data area, Stack and Heap, the last one being the focus of our attention.
As we can see from the above video, global variables (if any in the language) are stored in DATA/BSS segment, local variables are in the STACK and any data created at run time are stored in HEAP.
Languages like C/C++ were mostly Global and Local variables and only when the developer specifically called malloc/calloc or new, it allocated memory in the HEAP.
One great thing about HEAP allocation is that it’s allocated at runtime and can be “released” by the developer. If the developer stops using the allocated memory and does not release it, well, that’s when it’s called “Garbage” memory.
Different languages have different strategy for releasing heap allocated memory once it’s no longer needed
Technology | Grabage Collection |
---|---|
C | You clean up your own mess |
C++ | You clean up your own mess |
Java | Mark and Sweep Garbage Collector (GC) |
Swift | Automatic Reference Counting (ARC) |
COM | Reference Counting (RC) |
C# | Mark and Sweep Garbage Collector (GC) |
Python | Automatic Reference Counting (ARC) + Mark and Sweep Garbage Collector (GC) |
JavaScript | Earlier was ARC but now most JS Engine implement Mark and Sweep (GC) |
So there are 4 Garbage collection strategies
- No automatic Garbage Collection – C/C++
- Manual Reference Counting (RC) – COM
- Automatic ReferenceCounting (ARC) – Swift
- Mark and Sweep (Compact or Copy) – Java, .NET, JavaScript
1. No Automatic Garbage Collection - C/C++
/*
* File: NoGC.c
* Project: CPP
* Author: Sanjay Vyas
*
* Description:
* C program to demonstarte No GC
*
* Revision History:
* 2020-Jun-20 [SV]: Created
*/
#include <stdio.h> // printf/scanf
#include <stdlib.h> // malloc/qsort
// Maximum random number
#define MAX (100)
// Helper function to print array
void print(int *array, int size)
{
int offset;
// Print comma (upto 2nd last)
for (offset = 0;
offset < size-1;
offset++)
{
printf("%d, ", array[offset]);
}
// Print last element
printf("%d\n", array[size-1]);
}
// Compare function required for qsort
// Return
// 0 if both are equal
// <0 if lhs is < rhs
// >0 if lhs is > rhs
int intcmp(const void *lhs, const void *rhs)
{
// We are given 2 elements as void *
// We need to typecast them back to int
return *(int *)lhs - *(int *)rhs;
}
int main()
{
int *pointer = NULL; // Defensive programming 🙂
int size;
int offset;
while (printf("Enter size (0 to quit): "),
scanf("%d", &size),
size)
{
// Allocate heap memory
pointer = (int *)malloc(size * sizeof(int));
// Randomize it
for (offset = 0;
offset < size;
offset++)
{
pointer[offset] = rand()%MAX;
}
// Now print the array
print(pointer, size);
// Let's sort it using built in function
qsort(pointer, size, sizeof(int), intcmp);
// And print it again
print(pointer, size);
// WARNING! Memory is LEAKING here
// We did not free(pointer)
// The loop will use same pointer
// to malloc again, losing the previous address
}
}
What happens we do a malloc or new

And what is a memory leak?

What we see in the above image is a result of not deallocating memory (free or delete) and losing the address. Over a period of time, the application will fill up its heap area and would no longer be able to allocate more memory as previously unused allocations are occupying the heap area. These unused allocations are called GARBAGE.
Remember to free your allocations
/*
* File: FreeMem.c
* Project: CPP
* Author: Sanjay Vyas
*
* Description:
* C program to demonstarte No GC
*
* Revision History:
* 2020-Jun-20 [SV]: Created
*/
#include <stdio.h> // printf/scanf
#include <stdlib.h> // malloc/qsort
// Maximum random number
#define MAX (100)
// Helper function to print array
void print(int *array, int size)
{
int offset;
// Print comma (upto 2nd last)
for (offset = 0;
offset < size - 1;
offset++)
{
printf("%d, ", array[offset]);
}
// Print last element
printf("%d\n", array[size - 1]);
}
// Compare function required for qsort
// Return
// 0 if both are equal
// <0 if lhs is < rhs
// >0 if lhs is > rhs
int intcmp(const void *lhs, const void *rhs)
{
// We are given 2 elements as void *
// We need to typecast them back to int
return *(int *)lhs - *(int *)rhs;
}
int main()
{
int *pointer = NULL; // Defensive programming 🙂
int size;
int offset;
while (printf("Enter size (0 to quit): "),
scanf("%d", &size),
size)
{
// Allocate heap memory
pointer = (int *)malloc(size * sizeof(int));
// Randomize it
for (offset = 0;
offset < size;
offset++)
{
pointer[offset] = rand() % MAX;
}
// Now print the array
print(pointer, size);
// Let's sort it using built in function
qsort(pointer, size, sizeof(int), intcmp);
// And print it again
print(pointer, size);
// Remember to deallocate memory
free(pointer);
}
}
In C programming there is no built-in GC (Garbage Collection) and as developers we must keep track of the memory we are allocating and runtime and deallocate it when not required. This is tough given that the codebase may be huge and memory addresses may be passed around, losing track of who allocated it and who will deallocate it. Add to this, many developers do not have visualisation of what is happening in memory.
No wonder, languages which came after C/C++ implemented some from of Garbage Collection (Java, .NET, Python, Swift)