Building an Interpreter in Go - Chapter 2: Pratt Parsing

Photo by Andrew Neel on Unsplash

Building an Interpreter in Go - Chapter 2: Pratt Parsing

Overview

In the previous chapter, we built a basic lexer that could break down our source code into individual tokens. This provided the foundation for the next step - parsing those tokens into an Abstract Syntax Tree (AST) that represents the structure of the program.

In this chapter, we dive into Pratt Parsing, an elegant parsing technique that allows us to handle complex expressions in a clean and extensible way. We'll learn how to parse let statements, return statements, if-else expressions, function definitions, and function calls using this method.

Key Concepts Learned

Pratt Parsing

Pratt Parsing is a top-down parsing technique developed by Vaughan Pratt. It allows us to parse expressions of arbitrary complexity by associating precedence and binding power to our language's operators.

The core idea is to have separate parsing functions for each precedence level, allowing the parser to "climb up and down" the expression tree as it encounters different operators.

Parsing let Statements

Let's start with the basic let statement, which assigns a value to a variable:

let x = 5;
let y = x + 3;

We'll need to parse the variable name, the assignment operator, and the expression on the right-hand side. Pratt Parsing makes this relatively straightforward.

Parsing Return Statements

Next, we'll tackle return statements, which allow us to exit a function and provide a return value:

let addTwo = fn(x) { return x + 2; };

Parsing a return statement involves identifying the keyword, parsing the expression being returned, and creating the appropriate AST node.

Parsing If-Else Expressions

We'll also implement parsing for if-else expressions, which provide conditional branching in our language:

let result = if (x > 10) {
    return "greater";
} else {
    return "less";
};

This requires parsing the if keyword, the condition expression, the consequent block, the else keyword (if present), and the alternative block.

Parsing Function Definitions

Next, we'll handle function definitions, which allow us to create reusable pieces of code:

let addTwo = fn(x) { return x + 2; };

Parsing a function definition involves identifying the fn keyword, parsing the parameter list, and parsing the function body.

Parsing Function Calls

Finally, we'll implement parsing for function calls, which allow us to execute the defined functions:

let result = addTwo(5);

Parsing a function call requires identifying the function name, parsing the argument list, and creating the appropriate AST node.

Code Implementation Highlights

Parsing the Expression Statement and the Prefix and Infix Expressions does all the heavy lifting so I am including those as snippets here.


var precedences = map[token.TokenType]int{
    token.EQ:       EQUALS,
    token.NOT_EQ:   EQUALS,
    token.LT:       LESSGREATER,
    token.GT:       LESSGREATER,
    token.PLUS:     SUM,
    token.MINUS:    SUM,
    token.SLASH:    PRODUCT,
    token.ASTERISK: PRODUCT,
    token.LPAREN:   CALL,
}

func (p *Parser) parseExpressionStatement() *ast.ExpressionStatement {
    stmt := &ast.ExpressionStatement{Token: p.curToken}

    stmt.Expression = p.parseExpression(LOWEST)

    if p.peekTokenIs(token.SEMICOLON) {
        p.nextToken()
    }

    return stmt
}

func (p *Parser) parseExpression(precedence int) ast.Expression {
    prefix := p.prefixParseFns[p.curToken.Type]
    if prefix == nil {
        p.noPrefixParseFnError(p.curToken.Type)
        return nil
    }
    leftExp := prefix()

    for !p.peekTokenIs(token.SEMICOLON) && precedence < p.peekPrecedence() {
        infix := p.infixParseFns[p.peekToken.Type]
        if infix == nil {
            return leftExp
        }

        p.nextToken()

        leftExp = infix(leftExp)
    }

    return leftExp
}


func (p *Parser) parsePrefixExpression() ast.Expression {
    pexp := &ast.PrefixExpression{
        Token:    p.curToken,
        Operator: p.curToken.Literal,
    }

    p.nextToken()

    pexp.Right = p.parseExpression(PREFIX)

    return pexp
}

func (p *Parser) parseInfixExpression(left ast.Expression) ast.Expression {
    expression := &ast.InfixExpression{
        Token:    p.curToken,
        Operator: p.curToken.Literal,
        Left:     left,
    }

    precedence := p.curPrecedence()
    p.nextToken()
    expression.Right = p.parseExpression(precedence)

    return expression
}

Results

As can be seen below, the parser is able to assign correct precedence to operators and is able to identify synctatical errors as well (notice how I got a cute monkey face, when I mistyped “let” as “ley”).

Challenges Faced

Understanding how the Pratt Parser Works was the biggest challenge in this stage. It would require multiple revisits and implementations for a greater clarity.

Go Language Features Learned

(Any new Go features used for the function definition and call parsing)

Project Progress

  • [✓] Lexer Implementation

  • [✓] Parsing let Statements

  • [✓] Parsing Return Statements

  • [✓] Parsing If-Else Expressions

  • [✓] Parsing Function Definitions

  • [✓] Parsing Function Calls

  • Parsing Array Literals

  • Parsing Hash Literals

Code Repository

github: https://github.com/AkshayContributes/interpreter

Resources

Next Steps

The next step would be adding Evaluative Capabilities to our Monkey Language

Personal Notes

Feels really good to be able to complete half of the book in a couple of weeks despite the time constraints at work.


Last Updated: November 5, 2024

Progress: 50% through the book

Next Update Expected: November 10, 2024