GC 구현
Garbage Collection Implementation
This document describes the implementation of optional garbage collection in Vais.
Architecture
Components
-
vais-gc (Rust crate)
- Core GC implementation
- Mark-and-Sweep algorithm
- C FFI exports
-
std/gc.vais (Vais module)
- Vais API for GC functions
- High-level wrappers
- Helper utilities
-
Codegen Integration
- GC-aware allocation
- Automatic root registration
- Runtime function calls
-
CLI Support
--gcflag to enable GC mode--gc-thresholdto configure collection threshold
Design Decisions
Why Mark-and-Sweep?
- Simplicity: Easy to understand and implement
- Reliability: Well-tested algorithm
- No cyclic issues: Handles cycles naturally
- Low complexity: Suitable for prototypes and REPL
Conservative vs. Precise
We use conservative scanning because:
- No need to modify LLVM IR generation extensively
- Works with existing allocations
- Minimal changes to type system
- Good enough for scripting/REPL use cases
Tradeoffs:
- May retain some garbage (false positives)
- Acceptable for non-production environments
Memory Layout
GC Object Header
struct GcObjectHeader {
size: usize, // Object size (excluding header)
marked: bool, // Mark bit for GC
ref_count: usize, // Optional reference counting
type_id: u32, // Type identifier (for debugging)
}
Total header size: 24 bytes (on 64-bit systems)
Object Layout
[Header: 24 bytes][User Data: N bytes]
^ ^
| |
| +-- Returned to user
+-- Managed by GC
Allocation Flow
Without GC
ptr := malloc(100)
# Use ptr
free(ptr)
LLVM IR:
%1 = call i8* @malloc(i64 100)
# ...
call void @free(i8* %1)
With GC
ptr := gc_alloc(100, 0)
# Use ptr
# No free() needed - GC handles it
LLVM IR:
%1 = call i8* @vais_gc_alloc(i64 100, i32 0)
# ...
# Automatic collection when threshold reached
Collection Algorithm
Mark Phase
fn mark(&mut self) {
// Clear all marks
for obj in self.objects.values_mut() {
obj.header.marked = false;
}
// Mark from roots
for root in &self.roots {
self.mark_object(root.ptr);
}
}
Sweep Phase
fn sweep(&mut self) {
let mut to_remove = Vec::new();
for (ptr, obj) in &self.objects {
if !obj.header.marked {
to_remove.push(*ptr);
self.bytes_allocated -= obj.header.size;
self.objects_freed += 1;
}
}
for ptr in to_remove {
self.objects.remove(&ptr);
}
}
Root Management
Automatic Roots
In GC mode, local variables are automatically registered as roots:
#[gc]
F example() -> i64 {
x := gc_alloc(100, 0) # Automatically registered as root
# ...
0 # x automatically unregistered on return
}
Manual Roots
For global data or long-lived pointers:
V global_ptr: i64 = 0
F init() -> i64 {
global_ptr = gc_alloc(1000, 0)
gc_add_root(global_ptr) # Prevent collection
0
}
F cleanup() -> i64 {
gc_remove_root(global_ptr) # Allow collection
0
}
Configuration
Threshold
Controls when automatic collection triggers:
gc_set_threshold(2097152) # 2 MB
When bytes_allocated >= threshold, GC automatically runs.
Default Settings
- Threshold: 1 MB (1048576 bytes)
- Conservative scanning: Enabled
- Stack scanning: Not yet implemented (manual roots only)
Performance Characteristics
Time Complexity
- Allocation: O(1)
- Collection: O(n) where n = number of objects
- Mark: O(m) where m = number of reachable objects
Space Complexity
- Header overhead: 24 bytes per object
- Root set: O(r) where r = number of roots
- Total: O(n + r)
Benchmark Results
From gc_test.vais:
Stress test: 100 objects, 256 bytes each
- Allocations: 100
- Collections: ~6
- Time: ~5ms
- Memory: ~25KB
Integration Examples
Example 1: Simple Allocation
U std/gc
F main() -> i64 {
gc_init()
# Allocate
ptr := gc_alloc(1024, 1)
# GC stats
printf("Allocated: %ld bytes\n", gc_bytes_allocated())
0
}
Example 2: Vector with GC
S GcVec {
data: i64, # GC-managed pointer
len: i64,
cap: i64
}
X GcVec {
F new() -> GcVec {
data_ptr := gc_alloc(32, 100)
gc_add_root(data_ptr)
GcVec { data: data_ptr, len: 0, cap: 4 }
}
F drop(&self) -> i64 {
gc_remove_root(self.data)
0
}
}
Example 3: Automatic Collection
F process_large_data() -> i64 {
gc_set_threshold(4096) # Low threshold
i := 0
L i < 100 {
# Each allocation might trigger GC
temp := gc_alloc(1024, 0)
# Process temp
i = i + 1
}
# Unreferenced objects automatically collected
0
}
Testing
Unit Tests
cargo test -p vais-gc
Tests cover:
- Basic allocation
- Collection
- Root preservation
- Threshold behavior
- Stress scenarios
Integration Tests
vaisc build examples/gc_test.vais --gc && ./examples/gc_test
vaisc build examples/gc_vec_test.vais --gc && ./examples/gc_vec_test
Future Work
Short Term
- Stack scanning for automatic root detection
- Improved type information
- Finalizers for cleanup
Medium Term
- Incremental collection
- Write barriers for generational GC
- Parallel marking
Long Term
- Concurrent GC
- Compaction
- Thread-local heaps
References
- "The Garbage Collection Handbook" by Jones et al.
- Boehm-Demers-Weiser Conservative GC
- LLVM Stack Map documentation
- Rust's memory model
See Also
/crates/vais-gc/README.md- GC crate documentation/std/gc.vais- Vais GC API/examples/gc_test.vais- Comprehensive tests