Constructors and Object Creation
This chapter assumes familiarity with function calls and type checking from previous parts.
When you write p = new Point(10, 20), several things happen:
- Memory is allocated on the heap
- The constructor (__init__) is called
- The object is initialized with the given values
- A pointer to the object is returned
Let us understand each step.
The __init__ Constructor
In Thirdlang (like Python), the constructor is named __init__:
class Point {
    x: int
    y: int
    def __init__(self, x: int, y: int) {
        self.x = x
        self.y = y
    }
}
The constructor:
- Is always named __init__
- Always takes self as its first parameter
- Has no explicit return type (implicitly returns nothing)
- Is responsible for initializing all fields
Constructor Parameters
The constructor parameters (after self) become the arguments to new:
def __init__(self, x: int, y: int) { ... }
#                  ^^^^^^^^^^^^^^ These become:
p = new Point(10, 20)
#             ^^^^^^ Constructor arguments
When you call new Point(10, 20):
- self is the newly allocated object
- x is 10
- y is 20
Initializing Fields
Inside __init__, we use self.field = value to set fields:
def __init__(self, x: int, y: int) {
    self.x = x  # Set the 'x' field to parameter 'x'
    self.y = y  # Set the 'y' field to parameter 'y'
}
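Fields must be initialized before the object can be used. The language treats reading a field that __init__ never assigned as undefined behavior (like uninitialized variables in C), even though the code generator shown later in this chapter happens to zero every field before it calls the constructor.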
The new Expression
The new keyword creates objects:
p = new Point(10, 20)
This is an expression that:
- Allocates memory: enough bytes for all fields
- Calls __init__: passing the new object as self
- Returns a pointer: to the newly created object
How new Works
The result is a pointer to initialized memory.
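The three steps can be mimicked in plain Rust. This is only a sketch under stated assumptions, not the compiler's code: Heap, alloc, point_init, and new_point are hypothetical names, and an index into a Vec<i64> stands in for the pointer that malloc would return.

```rust
/// Toy heap: each field occupies one 8-byte slot, modeled as an i64.
/// (Hypothetical model; the real compiler calls malloc.)
struct Heap {
    slots: Vec<i64>,
}

impl Heap {
    fn new() -> Self {
        Heap { slots: Vec::new() }
    }

    /// Step 1: reserve `field_count` zeroed slots (at least one) and
    /// return the index of the first slot, which acts as the pointer.
    fn alloc(&mut self, field_count: usize) -> usize {
        let ptr = self.slots.len();
        self.slots.resize(ptr + field_count.max(1), 0);
        ptr
    }
}

/// Step 2: the constructor body, with `this` playing the role of `self`.
fn point_init(heap: &mut Heap, this: usize, x: i64, y: i64) {
    heap.slots[this] = x; // self.x = x
    heap.slots[this + 1] = y; // self.y = y
}

/// `new Point(x, y)`: allocate, call the constructor, return the pointer.
fn new_point(heap: &mut Heap, x: i64, y: i64) -> usize {
    let ptr = heap.alloc(2);
    point_init(heap, ptr, x, y);
    ptr // Step 3: the caller receives the pointer
}
```

Calling new_point twice on the same Heap yields two distinct "pointers", each referring to its own pair of slots.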
Type Checking Constructors
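The type checker verifies constructor calls. Its entry point runs four passes, and classes are registered in the first pass, so by the time a new expression is checked its class is already known: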
/// Type check and infer types for a program
pub fn typecheck(program: &mut Program) -> Result<ClassRegistry, String> {
    let mut ctx = TypeContext::new();
    // First pass: register all classes
    for item in program.iter() {
        if let TopLevel::Class(class) = item {
            register_class(&mut ctx, class)?;
        }
    }
    // Second pass: collect function signatures
    for item in program.iter() {
        if let TopLevel::Stmt(Stmt::Function {
            name,
            params,
            return_type,
            ..
        }) = item
        {
            let param_types: Vec<Type> = params.iter().map(|(_, t)| t.clone()).collect();
            let func_type = Type::Function {
                params: param_types,
                ret: Box::new(return_type.clone()),
            };
            ctx.global_env.insert(name.clone(), func_type);
        }
    }
    // Third pass: type check classes
    for item in program.iter_mut() {
        if let TopLevel::Class(class) = item {
            typecheck_class(&mut ctx, class)?;
        }
    }
    // Fourth pass: type check statements
    // Use a persistent environment for top-level statements
    let mut top_level_env = ctx.global_env.clone();
    for item in program.iter_mut() {
        if let TopLevel::Stmt(stmt) = item {
            typecheck_stmt(&mut ctx, stmt, &mut top_level_env)?;
        }
    }
    Ok(ctx.classes)
}
For new ClassName(args):
- Check class exists: is there a class named ClassName?
- Check constructor exists: does it have __init__?
- Check argument count: right number of arguments (excluding self)?
- Check argument types: do types match the constructor parameters?
- Return class type: the expression has type Class("ClassName")
Example Type Check
class Point {
    x: int
    y: int
    def __init__(self, x: int, y: int) { ... }
}
p = new Point(10, 20)    # OK: 2 args match (int, int)
q = new Point(10)        # ERROR: expected 2 args, got 1
r = new Point(10, true)  # ERROR: expected int, got bool
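The five checks, applied to the three calls above, can be sketched as a standalone function. The types here are simplified stand-ins, not the checker's real definitions: ClassSig, check_new, and point_registry are hypothetical, and the sketch assumes the argument types have already been inferred.

```rust
use std::collections::HashMap;

/// Simplified stand-ins for the checker's types (hypothetical shapes).
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int,
    Bool,
    Class(String),
}

/// Per-class constructor signature: parameter types after `self`,
/// or `None` when the class declares no `__init__`.
struct ClassSig {
    init_params: Option<Vec<Type>>,
}

/// Check `new class(args)` given the already-inferred argument types.
fn check_new(
    classes: &HashMap<String, ClassSig>,
    class: &str,
    arg_types: &[Type],
) -> Result<Type, String> {
    // 1. The class must exist.
    let sig = classes
        .get(class)
        .ok_or_else(|| format!("unknown class '{}'", class))?;
    // 2. It must have a constructor.
    let params = sig
        .init_params
        .as_ref()
        .ok_or_else(|| format!("class '{}' has no __init__", class))?;
    // 3. Argument count (self is excluded).
    if params.len() != arg_types.len() {
        return Err(format!(
            "expected {} args, got {}",
            params.len(),
            arg_types.len()
        ));
    }
    // 4. Argument types, position by position.
    for (expected, got) in params.iter().zip(arg_types) {
        if expected != got {
            return Err(format!("expected {:?}, got {:?}", expected, got));
        }
    }
    // 5. The expression has the class type.
    Ok(Type::Class(class.to_string()))
}

/// Registry containing only the `Point` class from the example.
fn point_registry() -> HashMap<String, ClassSig> {
    let mut classes = HashMap::new();
    classes.insert(
        "Point".to_string(),
        ClassSig { init_params: Some(vec![Type::Int, Type::Int]) },
    );
    classes
}
```

With point_registry(), passing [Int, Int] succeeds with Class("Point"), while [Int] and [Int, Bool] are rejected at steps 3 and 4 respectively.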
Code Generation for new
Here is how we generate LLVM IR for object creation:
Expr::New { class, args } => {
    // Get struct type and size
    let struct_type = self.class_types.get(class).ok_or("Class type not found")?;
    // Calculate size (number of fields * 8 bytes)
    let class_info = self.classes.get(class).ok_or("Class not found")?;
    let size = (class_info.size() * 8).max(8) as u64; // At least 8 bytes
    let size_val = self.context.i64_type().const_int(size, false);
    // Call malloc
    let malloc_fn = self.module.get_function("malloc").unwrap();
    let ptr = self
        .builder
        .build_call(malloc_fn, &[size_val.into()], "obj")
        .unwrap()
        .try_as_basic_value()
        .unwrap_basic()
        .into_pointer_value();
    // Initialize fields to zero
    for (i, _) in class_info.field_order.iter().enumerate() {
        let field_ptr = self
            .builder
            .build_struct_gep(*struct_type, ptr, i as u32, "init_field")
            .unwrap();
        let zero = self.context.i64_type().const_int(0, false);
        self.builder.build_store(field_ptr, zero).unwrap();
    }
    // Call constructor if exists
    let ctor_name = format!("{}____init__", class);
    if let Some(ctor) = self.functions.get(&ctor_name).cloned() {
        let mut ctor_args: Vec<BasicMetadataValueEnum> = vec![ptr.into()];
        for arg in args {
            ctor_args.push(self.compile_expr(arg)?.into());
        }
        self.builder.build_call(ctor, &ctor_args, "").unwrap();
    }
    Ok(ptr.into())
}
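The generated LLVM IR looks roughly like this (value names are illustrative, and the exact instruction flags depend on the LLVM version):
; new Point(10, 20)
%obj = call ptr @malloc(i64 16)
%init_field = getelementptr inbounds %Point, ptr %obj, i32 0, i32 0
store i64 0, ptr %init_field
%init_field1 = getelementptr inbounds %Point, ptr %obj, i32 0, i32 1
store i64 0, ptr %init_field1
call void @Point____init__(ptr %obj, i64 10, i64 20)
; %obj is now a pointer to an initialized Point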
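Two small computations in the code generator are easy to check in isolation: the allocation size and the constructor's symbol name. The helpers below are hypothetical (ctor_symbol and alloc_size do not appear in the compiler); they copy the format string and the size formula from the Expr::New arm, and assume class_info.size() returns the number of fields, as the comment there suggests.

```rust
/// Mangled constructor symbol, built with the same format string
/// as the code generator: class name, two underscores, then `__init__`.
fn ctor_symbol(class: &str) -> String {
    format!("{}____init__", class)
}

/// Allocation size in bytes: 8 per field, never less than 8,
/// so a class with no fields still gets a valid allocation.
fn alloc_size(field_count: usize) -> u64 {
    (field_count * 8).max(8) as u64
}
```

For Point, these give the symbol Point____init__ and a 16-byte allocation.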
Memory Layout
Objects are laid out in memory as LLVM structs:
class Point {
    x: int    # offset 0, 8 bytes
    y: int    # offset 8, 8 bytes
}             # total: 16 bytes
In LLVM IR:
%Point = type { i64, i64 }
;               ^^^  ^^^
;                x    y
Field order matters! We use field_order in ClassInfo to maintain consistent layout.
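To see why a stable order matters, here is a minimal sketch of how byte offsets follow from declaration order. It assumes every field occupies one 8-byte slot, as in the Point layout; field_offsets and offset_of are hypothetical helpers, and a plain slice of names stands in for the field_order list in ClassInfo.

```rust
/// Byte offset of each field, assuming every field occupies one
/// 8-byte slot in declaration order.
fn field_offsets(field_order: &[&str]) -> Vec<(String, u64)> {
    field_order
        .iter()
        .enumerate()
        .map(|(i, name)| (name.to_string(), i as u64 * 8))
        .collect()
}

/// Look up one field's offset; `None` if the class has no such field.
fn offset_of(field_order: &[&str], field: &str) -> Option<u64> {
    field_order
        .iter()
        .position(|name| *name == field)
        .map(|i| i as u64 * 8)
}
```

If the order changed between the code that writes a field and the code that reads it, the same field name would resolve to different offsets, which is why a single ordered list is kept per class.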
Constructors Without Parameters
Some classes have zero-parameter constructors:
class Counter {
    count: int
    def __init__(self) {
        self.count = 0
    }
}
c = new Counter()  # No arguments
The constructor still receives self, but no other arguments.
Multi-Field Initialization
For classes with many fields, the constructor initializes them all:
class Rectangle {
    x: int
    y: int
    width: int
    height: int
    def __init__(self, x: int, y: int, w: int, h: int) {
        self.x = x
        self.y = y
        self.width = w
        self.height = h
    }
}
r = new Rectangle(0, 0, 100, 50)
Each field gets initialized in the constructor body.
Common Patterns
Default Values in Constructor
class Config {
    value: int
    enabled: bool
    def __init__(self) {
        self.value = 42      # Default value
        self.enabled = true  # Default enabled
    }
}
Computed Initialization
class Square {
    side: int
    area: int
    def __init__(self, side: int) {
        self.side = side
        self.area = side * side  # Computed from input
    }
}
Validation (sort of)
Since we do not have exceptions, validation is limited:
class PositiveInt {
    value: int
    def __init__(self, v: int) {
        # Cannot truly validate, but can clamp
        if (v < 0) {
            self.value = 0
        } else {
            self.value = v
        }
    }
}
What About Failure?
Our constructors cannot fail. In real languages, constructors might:
- Throw exceptions (Java, Python)
- Return Result or Option (Rust)
- Use factory methods instead
We keep things simple: constructors always succeed.
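For contrast, here is a sketch of the Rust approach applied to the PositiveInt example. This is ordinary Rust, not Thirdlang, and the function names are illustrative: new reports invalid input through Result, while clamped mirrors the always-succeeding Thirdlang constructor.

```rust
/// A value that must be non-negative (mirrors the PositiveInt example).
#[derive(Debug, PartialEq)]
struct PositiveInt {
    value: i64,
}

impl PositiveInt {
    /// Fallible constructor function: rejects bad input instead of clamping.
    fn new(v: i64) -> Result<PositiveInt, String> {
        if v < 0 {
            Err(format!("expected a non-negative value, got {}", v))
        } else {
            Ok(PositiveInt { value: v })
        }
    }

    /// Infallible variant that clamps, like the Thirdlang constructor.
    fn clamped(v: i64) -> PositiveInt {
        PositiveInt { value: v.max(0) }
    }
}
```

The fallible version forces every caller to handle the error case; the clamping version never fails but silently changes the input, which is the trade-off Thirdlang accepts.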
Summary
| Concept | Syntax | Purpose |
|---|---|---|
| Constructor | def __init__(self, ...) | Initialize new objects |
| Object creation | new ClassName(args) | Allocate and initialize |
| Field assignment | self.field = value | Set field values |
| Return type | Implicit Unit | Constructors don’t return values |
In the next chapter, we look at methods and the self parameter.