11 changes: 11 additions & 0 deletions .claude/commands/checkpoint.md
@@ -0,0 +1,11 @@
---
description: Update teaching-todo.md with progress and notes from our session
---

Review the conversation and update the learning progress in @teaching-todo.md:

1. Mark completed items with `[x]`
2. Mark partially completed items with `[~]`
3. Add **Notes from session:** sections under completed topics with key learnings, Q&A, and decisions made

Focus on what we actually discussed and learned - don't mark items complete unless we covered them.
120 changes: 120 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,120 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Teaching Mode

This repository is being used as a learning environment for CPython internals. The goal is to teach the user how CPython works, not to write code for them.

**Behavior Guidelines:**
- Describe implementations and concepts, don't write code unless explicitly asked
- Ask questions to verify understanding ("What do you think ob_refcnt does?")
- Point to specific files and line numbers for the user to read
- When the user is stuck, give hints before giving answers
- Reference `teaching-todo.md` for the structured curriculum
- Reference `teaching-notes.md` for detailed research (student should not read this)
- Encourage use of `dis` module, GDB, and debug builds for exploration

**IMPORTANT - Don't write code for the student:**
- Give hints, not implementations
- Point to similar code in tupleobject.c as reference
- Explain what needs to happen, let them write it
- If they're stuck on API: name the function/macro, don't write the call
- Only write code if explicitly asked ("write this for me")
- 2-3 line snippets for syntax are OK; 10+ line functions are NOT

**The learning project:** Implementing a `Record` type and `BUILD_RECORD` opcode (~300 LoC). This comprehensive project covers:
- PyObject/PyVarObject fundamentals (custom struct, refcounting)
- Type slots (tp_repr, tp_hash, tp_dealloc, tp_getattro, sq_length, sq_item)
- The evaluation loop (BUILD_RECORD opcode in ceval.c)
- Build system integration

A working solution exists on the `teaching-cpython-solution` branch for reference.
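
As a rough mental model only, the behavior implied by the slots listed above can be sketched in pure Python. The class name `RecordModel` and its semantics (read-only, fields reachable by name or by position, hashable when its values are) are assumptions inferred from that slot list, not a description of the solution branch.

```python
class RecordModel:
    """Hypothetical pure-Python stand-in for the C Record type."""

    __slots__ = ("_names", "_items")

    def __init__(self, **fields):
        # Mirrors the C layout: one tuple of names plus a parallel array of values.
        object.__setattr__(self, "_names", tuple(fields))
        object.__setattr__(self, "_items", tuple(fields.values()))

    def __getattr__(self, name):            # roughly what tp_getattro would do
        try:
            return self._items[self._names.index(name)]
        except ValueError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):     # assumption: records are read-only
        raise AttributeError("RecordModel is read-only")

    def __len__(self):                      # sq_length
        return len(self._items)

    def __getitem__(self, index):           # sq_item
        return self._items[index]

    def __hash__(self):                     # tp_hash
        return hash((self._names, self._items))

    def __repr__(self):                     # tp_repr
        body = ", ".join(f"{n}={v!r}" for n, v in zip(self._names, self._items))
        return f"RecordModel({body})"


r = RecordModel(x=1, y=2)
print(r, r.x, r[1], len(r))
```

The C version has to do by hand what Python does implicitly here: allocate the item array, hold a reference to every stored object, and release those references in `tp_dealloc`.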

## Build Commands

```bash
# Debug build (required for learning - enables assertions and refcount tracking)
./configure --with-pydebug
make

# Smoke test (the binary is ./python.exe on macOS; on Linux it is ./python)
./python.exe --version
./python.exe -c "print('hello')"

# Run specific test
./python.exe -m test test_sys
```

After modifying opcodes or grammar:
```bash
make regen-all # Regenerate generated files
make # Rebuild
```

## Architecture Overview

### The Object Model (start here)
- `Include/object.h` - PyObject, PyVarObject, Py_INCREF/DECREF
- `Include/cpython/object.h` - PyTypeObject (the C struct behind every type object, including `type` itself)
- `Objects/*.c` - Concrete type implementations

### Core Data Structures
| Type | Header | Implementation |
|------|--------|----------------|
| int | `Include/longintrepr.h` (`Include/cpython/longintrepr.h` from 3.11 on) | `Objects/longobject.c` |
| tuple | `Include/cpython/tupleobject.h` | `Objects/tupleobject.c` |
| list | `Include/cpython/listobject.h` | `Objects/listobject.c` |
| dict | `Include/cpython/dictobject.h` | `Objects/dictobject.c` |
| set | `Include/setobject.h` | `Objects/setobject.c` |

### Execution Engine
- `Include/opcode.h` - Opcode definitions
- `Lib/opcode.py` - Python-side opcode definitions (source of truth)
- `Include/cpython/code.h` - Code object structure
- `Include/cpython/frameobject.h` - Frame object (execution context)
- `Python/ceval.c` - **The interpreter loop** - giant switch on opcodes, stack machine

### Compiler Pipeline
- `Grammar/python.gram` - PEG grammar
- `Parser/` - Tokenizer and parser
- `Python/compile.c` - AST to bytecode
- `Python/symtable.c` - Symbol table building

## Key Concepts for Teaching

**Everything is a PyObject:**
```c
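/* Simplified: the real struct starts with _PyObject_HEAD_EXTRA,
   which is empty unless Py_TRACE_REFS is defined. */
typedef struct {
    Py_ssize_t ob_refcnt;    // Reference count
    PyTypeObject *ob_type;   // Pointer to type object
} PyObject;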
```
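
Both fields can be observed from Python without touching C. The sketch below assumes CPython: `sys.getrefcount` reports `ob_refcnt` (plus one temporary reference for the call itself), and `type()` returns whatever `ob_type` points to. The variable names are illustrative.

```python
import sys

obj = object()                     # a fresh object that nothing else shares
before = sys.getrefcount(obj)      # includes one temporary reference for the call

holders = [obj, obj]               # the list now owns two more references
after = sys.getrefcount(obj)

print(after - before)              # references added by the list
print(type(obj) is object)         # type(obj) reads what ob_type points to
```

Comparing counts before and after an operation is more reliable than reading an absolute number, because the absolute value depends on how many temporaries the interpreter happens to hold.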

**The stack machine:** Bytecode operates on a value stack. `LOAD_FAST` pushes, `BINARY_ADD` pops two and pushes one, etc.
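
A toy interpreter makes the push/pop discipline concrete. This is a deliberately tiny model, not how `ceval.c` is written: the function `run`, the tuple-based program format, and the three opcodes it handles are all assumptions chosen for illustration.

```python
def run(program):
    """Execute a list of (opname, argument) pairs on a value stack."""
    stack = []
    for opname, arg in program:
        if opname == "LOAD_CONST":       # push a constant
            stack.append(arg)
        elif opname == "BINARY_ADD":     # pop two operands, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opname == "RETURN_VALUE":   # pop the result and stop
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {opname}")
    raise RuntimeError("program ended without RETURN_VALUE")


program = [
    ("LOAD_CONST", 1),
    ("LOAD_CONST", 2),
    ("BINARY_ADD", None),
    ("RETURN_VALUE", None),
]
print(run(program))
```

A `BUILD_RECORD`-style opcode would follow the same pattern as `BINARY_ADD`: pop the operands it needs, build one object, push it back.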

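**Type slots:** `PyTypeObject` holds function pointers (`tp_hash`, `tp_repr`, `tp_call`, ...) that define a type's behavior. For example, `len(x)` goes through `PyObject_Size`, which calls `Py_TYPE(x)->tp_as_sequence->sq_length` when that slot is set and otherwise falls back to `tp_as_mapping->mp_length`.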
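
One observable consequence of slot dispatch is that built-ins such as `len()` consult the type, never the instance. A small experiment, with the class name `Box` chosen only for illustration:

```python
class Box:
    def __len__(self):             # defining this fills the type's length slot
        return 3


b = Box()
b.__len__ = lambda: 99             # instance attribute; the slot lives on the type

print(len(b))                      # uses the type's slot
print(b.__len__())                 # ordinary attribute lookup finds the instance attribute
```

The first call prints 3 and the second prints 99, which is why a C type must fill the slot itself rather than rely on an attribute named `__len__`.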

## Useful Commands for Learning

```bash
# Disassemble Python code
./python.exe -c "import dis; dis.dis(lambda: [1,2,3])"

# Check reference count (works in any build; the result includes one temporary reference for the call)
./python.exe -c "import sys; x = []; print(sys.getrefcount(x))"

# Show total refcount after each statement (debug build)
./python.exe -X showrefcount

# Run with GDB
gdb ./python.exe
(gdb) break _PyEval_EvalFrameDefault
(gdb) run -c "1 + 1"
```
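
The same disassembly is available programmatically, which is handy for checking which opcodes a construct compiles to. The function `make_pair` is a made-up example; exact opcode names vary between CPython versions, so the snippet only prints what it finds.

```python
import dis


def make_pair(a, b):
    return [a, b]


# dis.get_instructions yields one Instruction per bytecode operation.
opnames = [ins.opname for ins in dis.get_instructions(make_pair)]
print(opnames)
```

A list display with non-constant elements compiles to a `BUILD_LIST` instruction, the closest existing analogue of the `BUILD_RECORD` opcode this project adds.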

## External Resources

- Developer Guide: https://devguide.python.org/
- CPython Internals Book: https://realpython.com/products/cpython-internals-book/
- PEP 3155 (Qualified name for classes and functions): background on `__qualname__`
9 changes: 8 additions & 1 deletion Grammar/python.gram
@@ -696,7 +696,7 @@ atom[expr_ty]:
| NUMBER
| &'(' (tuple | group | genexp)
| &'[' (list | listcomp)
| &'{' (dict | set | dictcomp | setcomp)
| &'{' (dict | set | dictcomp | setcomp | record)
| '...' { _PyAST_Constant(Py_Ellipsis, NULL, EXTRA) }

strings[expr_ty] (memo): a=STRING+ { _PyPegen_concatenate_strings(p, a) }
@@ -725,6 +725,13 @@ dict[expr_ty]:
CHECK(asdl_expr_seq*, _PyPegen_get_values(p, a)),
EXTRA) }
| '{' invalid_double_starred_kvpairs '}'
record[expr_ty]:
| '{' '|' a=[kwargs] '|' '}' {
_PyAST_Record(
CHECK(asdl_expr_seq*, _PyPegen_get_record_keys(p, a)),
CHECK(asdl_expr_seq*, _PyPegen_get_record_values(p, a)),
EXTRA) }
| '{' '|' invalid_kwarg '|' '}'

dictcomp[expr_ty]:
| '{' a=kvpair b=for_if_clauses '}' { _PyAST_DictComp(a->key, a->value, b, EXTRA) }
1 change: 1 addition & 0 deletions Include/Python.h
@@ -89,6 +89,7 @@
#include "rangeobject.h"
#include "memoryobject.h"
#include "tupleobject.h"
#include "recordobject.h"
#include "listobject.h"
#include "dictobject.h"
#include "cpython/odictobject.h"
12 changes: 12 additions & 0 deletions Include/cpython/recordobject.h
@@ -0,0 +1,12 @@
#ifndef Py_CPYTHON_RECORDOBJECT_H
# error "this header file must not be included directly"
#endif

typedef struct {
    PyObject_VAR_HEAD
    PyObject *names;
    /* ob_item contains space for 'ob_size' elements.
       Items must normally not be NULL, except during construction when
       the record is not yet visible outside the function that builds it. */
    PyObject *ob_item[1];
} PyRecordObject;
23 changes: 16 additions & 7 deletions Include/internal/pycore_ast.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Include/internal/pycore_ast_state.h
@@ -122,6 +122,7 @@ struct ast_state {
PyObject *RShift_singleton;
PyObject *RShift_type;
PyObject *Raise_type;
PyObject *Record_type;
PyObject *Return_type;
PyObject *SetComp_type;
PyObject *Set_type;
1 change: 1 addition & 0 deletions Include/opcode.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

22 changes: 22 additions & 0 deletions Include/recordobject.h
@@ -0,0 +1,22 @@
/* Record object interface */

#ifndef Py_RECORDOBJECT_H
#define Py_RECORDOBJECT_H
#ifdef __cplusplus
extern "C" {
#endif

PyAPI_DATA(PyTypeObject) PyRecord_Type;
PyAPI_FUNC(PyObject *) PyRecord_New(Py_ssize_t size);
#define PyRecord_Check(op) Py_IS_TYPE(op, &PyRecord_Type)

#ifndef Py_LIMITED_API
# define Py_CPYTHON_RECORDOBJECT_H
# include "cpython/recordobject.h"
# undef Py_CPYTHON_RECORDOBJECT_H
#endif

#ifdef __cplusplus
}
#endif
#endif /* !Py_RECORDOBJECT_H */
1 change: 1 addition & 0 deletions Lib/opcode.py
@@ -147,6 +147,7 @@ def jabs_op(name, op):
def_op('BUILD_LIST', 103) # Number of list items
def_op('BUILD_SET', 104) # Number of set items
def_op('BUILD_MAP', 105) # Number of dict entries
def_op('BUILD_RECORD', 166) # Number of record fields
name_op('LOAD_ATTR', 106) # Index in name list
def_op('COMPARE_OP', 107) # Comparison operator
hascompare.append(107)
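
For readers unfamiliar with how an opcode gets its number, the registration can be modeled in a few lines. This is a simplified sketch, not the real `Lib/opcode.py`: the `opmap` and `opname` tables are modeled loosely on that module, and the collision check is an addition made here to show why 166 has to be an unused number.

```python
# Simplified model of the opcode tables; not the real Lib/opcode.py.
opmap = {}
opname = [f"<{op}>" for op in range(256)]   # placeholder names for unassigned numbers


def def_op(name, op):
    """Register an opcode, refusing a number that is already taken (check added for illustration)."""
    if op in opmap.values():
        raise ValueError(f"opcode number {op} is already in use")
    opname[op] = name
    opmap[name] = op


def_op("BUILD_MAP", 105)
def_op("BUILD_RECORD", 166)
print(opmap["BUILD_RECORD"], opname[166])
```

Where the line sits in the file does not matter; only the number does, which is why the new entry can appear next to the other `BUILD_*` opcodes while using 166.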
3 changes: 3 additions & 0 deletions Makefile.pre.in
@@ -438,6 +438,7 @@ OBJECT_OBJS= \
Objects/sliceobject.o \
Objects/structseq.o \
Objects/tupleobject.o \
Objects/recordobject.o \
Objects/typeobject.o \
Objects/unicodeobject.o \
Objects/unicodectype.o \
@@ -1100,6 +1101,7 @@ PYTHON_HEADERS= \
$(srcdir)/Include/traceback.h \
$(srcdir)/Include/tracemalloc.h \
$(srcdir)/Include/tupleobject.h \
$(srcdir)/Include/recordobject.h \
$(srcdir)/Include/unicodeobject.h \
$(srcdir)/Include/warnings.h \
$(srcdir)/Include/weakrefobject.h \
@@ -1138,6 +1140,7 @@ PYTHON_HEADERS= \
$(srcdir)/Include/cpython/sysmodule.h \
$(srcdir)/Include/cpython/traceback.h \
$(srcdir)/Include/cpython/tupleobject.h \
$(srcdir)/Include/cpython/recordobject.h \
$(srcdir)/Include/cpython/unicodeobject.h \
\
$(srcdir)/Include/internal/pycore_abstract.h \
@@ -0,0 +1 @@
{"name":"Local: object","url":"/Users/nickvandermerwe/repos/cpython/Objects/object.c","tests":[{"id":1766539391971,"input":"","output":""}],"interactive":false,"memoryLimit":1024,"timeLimit":3000,"srcPath":"/Users/nickvandermerwe/repos/cpython/Objects/object.c","group":"local","local":true}