This weekend, I was cleaning out my dad’s boat. It was full of leaves. As I was pushing everything toward the scupper, I noticed something:
Some leaves flowed through easily.
Some got stuck and blocked the whole drain.
That moment reminded me of what I was teaching earlier this week with my exploratory CS students. We were looking at how tokenization works in large language models. (DM us for more info about this amazing lesson!)
We used the tiktoken Python library to experiment with how GPT-4 tokenizes text. The students were trying to guess how many tokens a word or phrase would generate and getting pretty into it. Spoiler: the token count isn’t always what you expect.
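Here’s a minimal sketch of that guessing game, assuming the tiktoken package is installed (pip install tiktoken); the sample words below are just ones picked for illustration:

```python
import tiktoken

# Grab the tokenizer GPT-4 uses (the cl100k_base encoding).
enc = tiktoken.encoding_for_model("gpt-4")

# Have students guess the token count before running each line.
for text in ["leaves", "scupper", "tokenization is fun", "antidisestablishmentarianism"]:
    tokens = enc.encode(text)
    print(f"{text!r} -> {len(tokens)} token(s): {tokens}")
```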
Here’s what we learned:
Tokens aren’t the same as words. A token can be a full word, part of a word, or punctuation.
Different models tokenize the same text differently.
GPT models use Byte Pair Encoding (BPE), which merges common character combinations into single tokens to reduce the total count.
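To see both ideas at once, you can decode each token ID back into its text piece and compare two vocabularies. This is a rough sketch; the word is just an example, and the exact splits depend on each encoding:

```python
import tiktoken

# Two different BPE vocabularies: GPT-2's and the one GPT-4 uses (cl100k_base).
gpt2 = tiktoken.get_encoding("gpt2")
gpt4 = tiktoken.encoding_for_model("gpt-4")

word = "tokenization"
for name, enc in [("gpt2", gpt2), ("gpt-4", gpt4)]:
    ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in ids]  # each ID maps back to a word fragment
    print(f"{name}: {len(ids)} tokens -> {pieces}")
```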
We also talked about token limits: the limit covers both your prompt and the AI’s response. If you go over, the oldest tokens get dropped, so the model effectively forgets what you told it earlier.
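One way to make that concrete in class is to check a prompt against a token budget before sending it. The numbers here are hypothetical; the real context window depends on the model:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

CONTEXT_WINDOW = 8192    # hypothetical total budget for prompt + response
RESPONSE_BUDGET = 1024   # room reserved for the model's reply

prompt = "Explain how Byte Pair Encoding works, in one short paragraph."
prompt_tokens = len(enc.encode(prompt))

if prompt_tokens > CONTEXT_WINDOW - RESPONSE_BUDGET:
    print(f"Prompt uses {prompt_tokens} tokens; trim it or earlier context gets dropped.")
else:
    print(f"Prompt uses {prompt_tokens} tokens; "
          f"{CONTEXT_WINDOW - RESPONSE_BUDGET - prompt_tokens} tokens to spare.")
```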
It’s like trying to shove too many leaves through a scupper. The system can only handle so much before something gets stuck or falls off the end.
And just like that scupper, certain language makes things worse:
Ambiguous words like “pitch” or “bank”
Slang or rare words the model hasn’t seen much
Typos, run-ons, idioms—they all mess with the flow
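You can see part of that clogging directly in the token counts: typos, slang, and rare words tend to break into more pieces than common ones. A quick sketch (the example words are mine, and exact counts vary by vocabulary):

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

# Common words usually map to a single token; misspellings and rare words split apart.
for text in ["because", "becuase", "scupper", "rizz"]:
    ids = enc.encode(text)
    print(f"{text!r} -> {len(ids)} token(s)")
```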
For CS teachers, this is gold. Tokenization is a great way to connect string handling, memory management, and prompt design. It’s also a good reminder that input clarity matters—especially when working with systems that rely on structured text.
We’re not just teaching kids to code. We’re teaching them how machines “think.”