How Programs Remember: Stack and Heap
Introduction
Learning high-level programming languages usually involves the concept that “unused variables are automatically garbage collected”. However, as one gets closer to the lower levels or starts exploring performance issues, it becomes apparent that the core memory concepts of Stack and Heap are not so clear. How do programming languages allocate and manage memory? Specifically, what does garbage collection (GC) do?
Why Do We Need Stack and Heap?
Programs need to store variables in memory while the operating system, acting as a mediator between the program and hardware, uses Stack and Heap to manage that memory:
- Efficiency: Modern CPUs use caches to store frequently accessed memory segments. When memory changes, the cache must be updated, which incurs a cost. From a performance standpoint, the more contiguous and compact the data in memory, the higher the CPU cache hit rate, leading to better performance. Using the predictable memory allocation of the Stack allows for extremely high access efficiency.
- Flexibility: For data that needs to change dynamically and for large storage demands, the flexible storage characteristics of the Heap complement the weaknesses of the Stack.
Stack
The Stack is a “last in, first out” data structure used to store information during function execution. Imagine a stack of pancakes eaten from the top downward, with the last pancake put on the stack being the first to be eaten, and content is modified through push (adding) and pop (removing) operations. The principle is as follows:
- The system creates a new Stack Frame on the Stack.
- It places the function’s parameters, local variables, return address, register state, etc., into it.
- After the function completes, the entire frame is popped off.
-
Advantages:
- Fast and Frequent Operations: Structurally, there’s no need to find positions or determine whether data is still in use, leading to high operational efficiency due to predictable outcomes.
- Compact Storage: Efficient use of memory leads to convenient caching.
-
Disadvantages:
- Inability to Handle Dynamic Size Data: Structurally, it is challenging to elegantly handle data with dynamic sizes.
- Limited Space: Stack requires fixed size allocation, usually several MB, which may lead to Stack Overflow in cases of large data or recursion.
Heap
Process ├─ Thread A → Stack A ├─ Thread B → Stack B └─ Heap (shared)The Heap is an area of memory used for storing “dynamically allocated data”. It does not have strict entry/exit order constraints and can adopt different allocation strategies (First Fit, Best Fit, Worst Fit).
Due to the Stack’s smaller fixed size and the need to compactly push data during writes, the Stack typically only holds pointers to the Heap, with actual data allocation occurring on the Heap to avoid small Stack capacity and lifecycle limitations.
-
Advantages:
- Flexible Writing and Release: Can dynamically request more space from the OS.
- Can Store Data That Lives Across Multiple Functions: Since the Stack has fixed capacity and limited data lifetimes, data that needs to persist across function calls or sizes that cannot be determined at compile time is usually allocated on the Heap, with the Stack only saving pointers to the Heap.
-
Disadvantages:
- Slower Allocation Speed: Requires searching for suitable memory blocks to store data.
- Requires Additional Management Mechanisms: In high-level languages, this is usually handled by garbage collectors (GC). In C, it is done through:
malloc / calloc / realloc: Allocating memory from the Heap.free: Releasing allocated memory.
- Can Cause Memory Fragmentation: Long-term allocation and deallocation of differently sized blocks can create non-contiguous available space.
Go: Observing Variables Escaping to the Heap
The Go compiler can use escape analysis to decide whether variables should be allocated on the Stack or Heap. If the compiler cannot prove that a variable will not be used after a function returns, the variable will “escape” to the Heap:
go build -gcflags="-m" main.go
./heap-test.go:5:5: moved to heap: x./heap-test.go:11:5: moved to heap: count./heap-test.go:20:9: moved to heap: arr./heap-test.go:27:12: 100 escapes to heappackage main
import "fmt"
// Example 1: Returning a pointer to a local variablefunc createPointer() *int { x := 42 // x is initially on the stack return &x // Returning the pointer causes x to escape to the heap}
// Example 2: Variable being captured by closurefunc createClosure() func() int { count := 0 // count escapes to the heap return func() int { count++ return count }}
// Example 3: Variable too largefunc largeArray() *[10000]int { var arr [10000]int // Too large, escapes to the heap return &arr}
// Example 4: Interface typefunc useInterface() interface{} { x := 100 // Assigning to interface causes escape return x}
func main() { // Example 1 ptr := createPointer() fmt.Println("Pointer value:", *ptr)
// Example 2 counter := createClosure() fmt.Println("Count:", counter()) // 1 fmt.Println("Count:", counter()) // 2
// Example 3 arr := largeArray() fmt.Println("First element of array:", arr[0])
// Example 4 val := useInterface() fmt.Println("Interface value:", val)}Further Reading
- Memory Explained - Why Your Code Crashes (Stack vs Heap) - LearnThatStack
- Golang pointers explained, once and for all - JamieGo
- WHY IS THE STACK SO FAST? - Core Dumped
- WHY IS THE HEAP SO SLOW? - Core Dumped
- CPU Core / Process / Thread / Fiber / Coroutine 差異與解釋 - WebDong