Constructors and Object Creation

This chapter assumes familiarity with function calls and type checking from previous parts.

When you write p = new Point(10, 20), several things happen:

  1. Memory is allocated on the heap
  2. The constructor (__init__) is called
  3. The object is initialized with the given values
  4. A pointer to the object is returned

Let us understand each step.

The __init__ Constructor

In Thirdlang (like Python), the constructor is named __init__:

class Point {
    x: int
    y: int

    def __init__(self, x: int, y: int) {
        self.x = x
        self.y = y
    }
}

The constructor:

  • Always named __init__
  • First parameter is always self
  • Has no explicit return type (implicitly returns nothing)
  • Responsible for initializing all fields

Constructor Parameters

The constructor parameters (after self) become the arguments to new:

def __init__(self, x: int, y: int) { ... }
#                  ^^^^^^^^^^^^^ These become:
p = new Point(10, 20)
#             ^^^^^^ Constructor arguments

When you call new Point(10, 20):

  1. self is the newly allocated object
  2. x is 10
  3. y is 20

Initializing Fields

Inside __init__, we use self.field = value to set fields:

def __init__(self, x: int, y: int) {
    self.x = x    # Set the 'x' field to parameter 'x'
    self.y = y    # Set the 'y' field to parameter 'y'
}

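Every field should be assigned in __init__ before the object is used. The code generator shown later in this chapter zero-fills each field before it calls the constructor, so a field that __init__ skips reads as 0 rather than as garbage (unlike an uninitialized variable in C). Relying on that default usually hides a bug.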

The new Expression

The new keyword creates objects:

p = new Point(10, 20)

This is an expression that:

  1. Allocates memory - Enough bytes for all fields
  2. Calls __init__ - Passing the new object as self
  3. Returns a pointer - To the newly created object

How new Works


[Figure: How new works]

The result is a pointer to initialized memory.
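The three steps can be modeled in plain Rust. This is only a sketch of the runtime behavior, not the compiler's code: Object, new_object, and point_init are hypothetical names, and the zeroed slots mirror the zero-fill done by the code generator later in this chapter.

```rust
/// Toy model of an object: one 8-byte slot per field.
#[derive(Debug, PartialEq)]
struct Object {
    fields: Vec<i64>,
}

/// Hypothetical stand-in for `new`: allocate, run the constructor,
/// then return the pointer to the object.
fn new_object(field_count: usize, init: impl FnOnce(&mut Object)) -> Box<Object> {
    // Step 1: allocate heap memory with one zeroed slot per field.
    let mut obj = Box::new(Object {
        fields: vec![0; field_count],
    });
    // Step 2: call the constructor with the fresh object as `self`.
    init(&mut *obj);
    // Step 3: hand back the pointer to the initialized object.
    obj
}

/// Plays the role of Point.__init__(self, x, y); slot 0 is x, slot 1 is y.
fn point_init(this: &mut Object, x: i64, y: i64) {
    this.fields[0] = x;
    this.fields[1] = y;
}
```

With these definitions, new_object(2, |this| point_init(this, 10, 20)) plays the role of new Point(10, 20).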

Type Checking Constructors

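The type checker verifies constructor calls. Its entry point registers every class in a first pass, so that by the time statements are checked in the last pass, a new expression can be resolved against the full set of classes: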

/// Type check and infer types for a program
pub fn typecheck(program: &mut Program) -> Result<ClassRegistry, String> {
    let mut ctx = TypeContext::new();

    // First pass: register all classes
    for item in program.iter() {
        if let TopLevel::Class(class) = item {
            register_class(&mut ctx, class)?;
        }
    }

    // Second pass: collect function signatures
    for item in program.iter() {
        if let TopLevel::Stmt(Stmt::Function {
            name,
            params,
            return_type,
            ..
        }) = item
        {
            let param_types: Vec<Type> = params.iter().map(|(_, t)| t.clone()).collect();
            let func_type = Type::Function {
                params: param_types,
                ret: Box::new(return_type.clone()),
            };
            ctx.global_env.insert(name.clone(), func_type);
        }
    }

    // Third pass: type check classes
    for item in program.iter_mut() {
        if let TopLevel::Class(class) = item {
            typecheck_class(&mut ctx, class)?;
        }
    }

    // Fourth pass: type check statements
    // Use a persistent environment for top-level statements
    let mut top_level_env = ctx.global_env.clone();
    for item in program.iter_mut() {
        if let TopLevel::Stmt(stmt) = item {
            typecheck_stmt(&mut ctx, stmt, &mut top_level_env)?;
        }
    }

    Ok(ctx.classes)
}

thirdlang/src/typeck.rs

For new ClassName(args):

  1. Check class exists - Is there a class named ClassName?
  2. Check constructor exists - Does it have __init__?
  3. Check argument count - Right number of arguments (excluding self)?
  4. Check argument types - Do types match the constructor parameters?
  5. Return class type - The expression has type Class("ClassName")
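The five checks above can be sketched as a single function. This is a simplified illustration rather than the code in typeck.rs: the Type enum is cut down to three variants, and ClassSig, check_new, and point_registry are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Type {
    Int,
    Bool,
    Class(String),
}

/// Hypothetical, simplified class record: the constructor's parameter
/// types (excluding `self`), or None if the class has no __init__.
struct ClassSig {
    init_params: Option<Vec<Type>>,
}

/// Checks `new class(args)` given the already-inferred argument types.
fn check_new(
    classes: &HashMap<String, ClassSig>,
    class: &str,
    arg_types: &[Type],
) -> Result<Type, String> {
    // 1. The class must exist.
    let sig = classes
        .get(class)
        .ok_or_else(|| format!("unknown class '{}'", class))?;
    // 2. The class must define a constructor.
    let params = sig
        .init_params
        .as_ref()
        .ok_or_else(|| format!("class '{}' has no __init__", class))?;
    // 3. The argument count must match (self is not counted).
    if params.len() != arg_types.len() {
        return Err(format!(
            "expected {} args, got {}",
            params.len(),
            arg_types.len()
        ));
    }
    // 4. Each argument type must equal the declared parameter type.
    for (expected, got) in params.iter().zip(arg_types) {
        if expected != got {
            return Err(format!("expected {:?}, got {:?}", expected, got));
        }
    }
    // 5. The whole expression has the class type.
    Ok(Type::Class(class.to_string()))
}

/// Builds a registry that contains only the Point class from this chapter.
fn point_registry() -> HashMap<String, ClassSig> {
    let mut classes = HashMap::new();
    classes.insert(
        "Point".to_string(),
        ClassSig {
            init_params: Some(vec![Type::Int, Type::Int]),
        },
    );
    classes
}
```

Returning on the first failed check keeps the error specific: a wrong argument count is reported before any argument type is compared.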

Example Type Check

class Point {
    x: int
    y: int
    def __init__(self, x: int, y: int) { ... }
}

p = new Point(10, 20)      # OK: 2 args match (int, int)
q = new Point(10)          # ERROR: expected 2 args, got 1
r = new Point(10, true)    # ERROR: expected int, got bool

Code Generation for new

Here is how we generate LLVM IR for object creation:

            Expr::New { class, args } => {
                // Get struct type and size
                let struct_type = self.class_types.get(class).ok_or("Class type not found")?;

                // Calculate size (number of fields * 8 bytes)
                let class_info = self.classes.get(class).ok_or("Class not found")?;
                let size = (class_info.size() * 8).max(8) as u64; // At least 8 bytes
                let size_val = self.context.i64_type().const_int(size, false);

                // Call malloc
                let malloc_fn = self.module.get_function("malloc").unwrap();
                let ptr = self
                    .builder
                    .build_call(malloc_fn, &[size_val.into()], "obj")
                    .unwrap()
                    .try_as_basic_value()
                    .unwrap_basic()
                    .into_pointer_value();

                // Initialize fields to zero
                for (i, _) in class_info.field_order.iter().enumerate() {
                    let field_ptr = self
                        .builder
                        .build_struct_gep(*struct_type, ptr, i as u32, "init_field")
                        .unwrap();
                    let zero = self.context.i64_type().const_int(0, false);
                    self.builder.build_store(field_ptr, zero).unwrap();
                }

                // Call constructor if exists
                let ctor_name = format!("{}____init__", class);
                if let Some(ctor) = self.functions.get(&ctor_name).cloned() {
                    let mut ctor_args: Vec<BasicMetadataValueEnum> = vec![ptr.into()];
                    for arg in args {
                        ctor_args.push(self.compile_expr(arg)?.into());
                    }
                    self.builder.build_call(ctor, &ctor_args, "").unwrap();
                }

                Ok(ptr.into())
            }

thirdlang/src/codegen.rs
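Two small computations in that match arm are easy to miss: the allocation size and the constructor's symbol name. The sketch below isolates them. alloc_size and ctor_symbol are hypothetical helper names, and it assumes, as the comment in the listing says, that class_info.size() is the number of fields.

```rust
/// Bytes to request from malloc: 8 per field, but never less than 8,
/// so a class without fields still gets a valid allocation.
fn alloc_size(field_count: usize) -> u64 {
    (field_count * 8).max(8) as u64
}

/// Symbol name looked up for a class's constructor. Reading the format
/// string as class name + "__" + "__init__" is an interpretation;
/// the listing only shows the combined four underscores.
fn ctor_symbol(class: &str) -> String {
    format!("{}____init__", class)
}
```

For Point this gives an allocation of 16 bytes and the symbol Point____init__.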

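The generated LLVM IR looks roughly like this (simplified; value names are illustrative):

; new Point(10, 20)
%obj = call ptr @malloc(i64 16)                  ; 2 fields * 8 bytes
%f0 = getelementptr inbounds %Point, ptr %obj, i32 0, i32 0
store i64 0, ptr %f0                             ; zero field x
%f1 = getelementptr inbounds %Point, ptr %obj, i32 0, i32 1
store i64 0, ptr %f1                             ; zero field y
call void @Point____init__(ptr %obj, i64 10, i64 20)
; %obj is now a pointer to an initialized Point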

Memory Layout

Objects are laid out in memory as LLVM structs:

class Point {
    x: int    # offset 0, 8 bytes
    y: int    # offset 8, 8 bytes
}             # total: 16 bytes

In LLVM IR:

%Point = type { i64, i64 }
;              ^^^  ^^^
;               x    y

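Field order matters: the code generator addresses fields by position (the index passed to build_struct_gep), so every access to a field must use the same index. We use field_order in ClassInfo to keep that layout consistent.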
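As a rough illustration of how a fixed field order determines the layout, the sketch below derives byte offsets and struct indices from a list of field names. It assumes every field occupies one 8-byte slot, as in the Point example; field_offsets and field_index are hypothetical helpers, not part of ClassInfo.

```rust
/// Returns (field name, byte offset) pairs, assuming every field
/// occupies one 8-byte slot in declaration order.
fn field_offsets(field_order: &[&str]) -> Vec<(String, usize)> {
    field_order
        .iter()
        .enumerate()
        .map(|(index, name)| (name.to_string(), index * 8))
        .collect()
}

/// Looks up the struct index for a field access such as `self.y`.
/// Returns None when the class has no field with that name.
fn field_index(field_order: &[&str], name: &str) -> Option<usize> {
    field_order.iter().position(|f| *f == name)
}
```

For the order ["x", "y"], x sits at offset 0 and y at offset 8, matching the comments in the Point layout above.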

Constructors Without Parameters

Some classes have zero-parameter constructors:

class Counter {
    count: int

    def __init__(self) {
        self.count = 0
    }
}

c = new Counter()   # No arguments

The constructor still receives self, but no other arguments.

Multi-Field Initialization

For classes with many fields, the constructor initializes them all:

class Rectangle {
    x: int
    y: int
    width: int
    height: int

    def __init__(self, x: int, y: int, w: int, h: int) {
        self.x = x
        self.y = y
        self.width = w
        self.height = h
    }
}

r = new Rectangle(0, 0, 100, 50)

Each field gets initialized in the constructor body.

Common Patterns

Default Values in Constructor

class Config {
    value: int
    enabled: bool

    def __init__(self) {
        self.value = 42      # Default value
        self.enabled = true  # Default enabled
    }
}

Computed Initialization

class Square {
    side: int
    area: int

    def __init__(self, side: int) {
        self.side = side
        self.area = side * side  # Computed from input
    }
}

Validation (sort of)

Since we do not have exceptions, validation is limited:

class PositiveInt {
    value: int

    def __init__(self, v: int) {
        # Cannot truly validate, but can clamp
        if (v < 0) {
            self.value = 0
        } else {
            self.value = v
        }
    }
}

What About Failure?

Our constructors have no way to report failure. In other languages, constructors might:

  • Throw exceptions (Java, Python)
  • Return Result or Option (Rust)
  • Use factory methods instead

We keep things simple: constructors always succeed.
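For comparison, here is a minimal sketch of the Result-returning style in Rust, reusing the PositiveInt idea from the validation example. It shows a pattern Thirdlang deliberately leaves out; the error message text is made up for this example.

```rust
/// A value that is guaranteed to be non-negative once constructed.
#[derive(Debug, PartialEq)]
struct PositiveInt {
    value: i64,
}

impl PositiveInt {
    /// Fallible factory: rejects negative input instead of clamping it.
    fn new(v: i64) -> Result<PositiveInt, String> {
        if v < 0 {
            Err(format!("{} is negative", v))
        } else {
            Ok(PositiveInt { value: v })
        }
    }
}
```

The caller has to handle the Err case explicitly, which is the bookkeeping Thirdlang avoids by letting every constructor succeed.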

Summary

Concept           Syntax                    Purpose
Constructor       def __init__(self, ...)   Initialize new objects
Object creation   new ClassName(args)       Allocate and initialize
Field assignment  self.field = value        Set field values
Return type       Implicit Unit             Constructors don’t return values

In the next chapter, we look at methods and the self parameter.