I Built a Personalized Study System with Claude. Here Is the Science Behind Why It Worked.

A few weeks ago, I had to prepare for something high-stakes. The content was deep, the breadth was wide, and I had a week to be ready. I had been close to the subject matter for years, but knowing something and being able to retrieve it under pressure, respond to it in real time, and demonstrate fluency across a broad range of topics are entirely different cognitive demands.

I could have gone the traditional route: read through notes, highlight documents, review a few key points the night before. Most professionals do exactly this. Most professionals also walk into high-stakes performances feeling underprepared.

I decided to do it differently. I used Claude and the principles of the science of learning to build a full, personalized study system from scratch. What follows is an honest account of how I did it, what I built, and what I learned about the gap between passive review and genuine preparation. My goal is to show how we can integrate the science of learning with AI so that AI supports learning constructively, rather than becoming a place to offload the thinking, which undermines learning. Much of what I discuss here comes from the draft of an upcoming book on integrating cognitive science principles with AI to support learning.


The Science of Learning: What the Research Actually Says

Before I describe the tools, I want to name the research principles that shaped every design decision. These are not new ideas. Cognitive scientists have studied how humans learn and retain information for decades. The problem is that most of us, educators included, still default to study methods that feel productive but produce shallow encoding.

Retrieval practice is the process of actively recalling information rather than passively re-reading it. The testing effect, documented extensively by Roediger and Karpicke (2006), shows that retrieving information from memory strengthens that memory trace far more than additional exposure to the material. Every time you try to pull something from memory and succeed, you make it easier to pull again later. Every time you try and fail, you create a desirable difficulty that strengthens encoding when you do get the answer.

Spaced practice is the distribution of study sessions over time rather than massing them in a single sitting. Ebbinghaus’s forgetting curve established, over a century ago, that memory decays rapidly without reinforcement. Spacing study across multiple days, with gaps between sessions, leverages the way memory consolidates during rest and sleep. One long session the night before a performance is one of the least effective study strategies known to science.

Interleaving is the mixing of different content types and topics within a single study session rather than blocking by category. Bjork and colleagues have demonstrated that interleaving feels harder and less efficient in the moment but produces significantly stronger long-term retention and transfer. When you mix topics, your brain cannot rely on context to retrieve the answer — it has to do the actual cognitive work of discrimination and categorization.

Formative assessment with immediate feedback closes the loop. Knowing whether your retrieval attempt was correct, and understanding why, accelerates learning. Without feedback, incorrect responses get reinforced just as strongly as correct ones.

These four principles shaped everything I built.


Step One: Building the Knowledge Base

The first thing I did was upload the relevant documents, data, and research to Claude. This is where the process starts and where most AI-assisted study attempts end: with a summary or a set of bullet points someone reads once and forgets.

I did not ask for a summary. I asked Claude to build.

Working from the uploaded materials and our ongoing conversation, Claude and I built a comprehensive knowledge base that covered conceptual frameworks, key data points, specific names and roles, vocabulary, and procedural knowledge. The breadth of the content was organized into distinct categories — roughly a dozen — which became the architecture for everything that followed.


Step Two: Building the Artifacts

This is where the science of learning came in as the design principle. I did not ask for one study tool. I asked for several, each targeting a different cognitive demand.

Flashcards were the retrieval practice foundation. Claude built a bank of over 100 flashcards organized across all twelve content categories, with a flip mechanic that required me to actively retrieve the answer before revealing it. A “Critical Only” filter surfaced the highest-stakes cards — the ones where retrieval failure during the actual performance would hurt most. I marked each card as “Got It” or “Still Learning,” which the system used to track my mastery over time.
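For readers who want to see the logic behind a deck like this, here is a minimal Python sketch. It is an illustration only, not the artifact Claude generated: the Flashcard class, its field names, the "Unseen" starting status, and the placeholder cards are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Flashcard:
    # All field names here are hypothetical, chosen for illustration.
    category: str
    prompt: str
    answer: str
    critical: bool = False
    status: str = "Unseen"  # "Unseen", "Got It", or "Still Learning"


def critical_only(deck):
    """Return only the cards flagged as highest-stakes."""
    return [card for card in deck if card.critical]


def mark(card, got_it):
    """Record the learner's self-rating after flipping the card."""
    card.status = "Got It" if got_it else "Still Learning"


def mastery(deck):
    """Fraction of the deck currently marked 'Got It'."""
    if not deck:
        return 0.0
    return sum(card.status == "Got It" for card in deck) / len(deck)


# Placeholder cards; real content would come from the uploaded materials.
deck = [
    Flashcard("Frameworks", "Placeholder prompt A", "Placeholder answer A", critical=True),
    Flashcard("Vocabulary", "Placeholder prompt B", "Placeholder answer B"),
    Flashcard("Key data", "Placeholder prompt C", "Placeholder answer C", critical=True),
    Flashcard("Roles", "Placeholder prompt D", "Placeholder answer D"),
]

mark(deck[0], got_it=True)
mark(deck[2], got_it=False)

print(len(critical_only(deck)))  # 2
print(mastery(deck))             # 0.25
```

The point of keeping the status on each card is that the rating becomes data: it can drive which cards come back in later sessions instead of relying on a feeling of familiarity.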

Matching activities required a different kind of retrieval — not just recognition of a fact but the ability to map relationships between concepts, people, data points, and frameworks. Ten separate matching sets covered different content areas. The left-right format meant I had to commit to a choice before seeing whether I was correct.

Multiple choice data drills introduced the pressure of discrimination: selecting the right answer from plausible alternatives. Twenty-five questions covered the most specific and easily confused content. Each answer received immediate feedback, with a full explanation of why the correct answer was correct and why the distractors were wrong.

A mock performance assessment was the highest-order artifact. Claude asked me open-ended performance questions one at a time. I typed my answers as I would actually deliver them. Claude then assessed each response, identified what landed and what was missing, scored it on a ten-point scale, and provided specific coaching on exactly what the stronger answer would have included. This was formative assessment with feedback at its most direct.

A block-assembly activity was the final artifact and the one that pushed my thinking most. For each performance question, Claude presented six answer blocks — three that belonged in a strong response and three distractors that sounded relevant but actually belonged to different questions. I selected three, assembled them into a preview of what the answer would sound like as a complete response, then checked my selections for accuracy. This activity specifically targeted the kind of synthesis and discrimination that retrieval practice flashcards alone cannot develop.
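The checking step in an activity like this can be sketched in a few lines of Python. This is a hypothetical reconstruction of the idea, not the code behind the actual artifact; the function name, the block identifiers, and the shape of the feedback dictionary are assumptions.

```python
def check_assembly(selected, correct):
    """Compare the learner's chosen blocks against the blocks that belong."""
    selected, correct = set(selected), set(correct)
    return {
        "hits": sorted(selected & correct),                # chosen and belongs
        "distractors_chosen": sorted(selected - correct),  # chosen, belongs elsewhere
        "missed": sorted(correct - selected),              # belongs, not chosen
        "all_correct": selected == correct,
    }


# Six placeholder blocks for one question: three belong, three are distractors.
correct_blocks = {"block-1", "block-3", "block-4"}
learner_choice = ["block-1", "block-2", "block-4"]

result = check_assembly(learner_choice, correct_blocks)
print(result["hits"])                # ['block-1', 'block-4']
print(result["distractors_chosen"])  # ['block-2']
print(result["missed"])              # ['block-3']
print(result["all_correct"])         # False
```

Reporting the chosen distractor and the missed block separately matters: one tells you which idea you confused with this question, the other tells you what your answer would have left out.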


Step Three: How I Studied: Spacing and Interleaving in Practice

Building the tools was not the work. Using them consistently over time was.

I spread my study across multiple sessions over several days rather than concentrating it in one long block. Each session began with a category I had already worked through in a previous session before introducing new categories. This is the interleaving principle in action: old content alongside new content, so I could not rely on recency or context to retrieve answers. I had to do the cognitive work each time.
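One way to picture this schedule is as a small planning function. The Python sketch below is an assumption about how such a plan could be generated, not a record of my actual sessions; the category names, the number of new categories per session, and the rotation rule for choosing the review category are all illustrative.

```python
def plan_sessions(categories, new_per_session):
    """Build session plans that open with one earlier category, then add new ones."""
    sessions = []
    seen = []  # categories introduced in earlier sessions, in order
    for start in range(0, len(categories), new_per_session):
        new = categories[start:start + new_per_session]
        # Rotate through earlier categories so each one returns after a gap.
        review = [seen[(len(sessions) - 1) % len(seen)]] if seen else []
        sessions.append(review + new)
        seen.extend(new)
    return sessions


# Placeholder category names.
categories = ["Frameworks", "Key data", "Roles", "Vocabulary", "Procedures", "Policies"]
plan = plan_sessions(categories, new_per_session=2)

for number, session in enumerate(plan, start=1):
    print(f"Session {number}: {', '.join(session)}")
# Session 1: Frameworks, Key data
# Session 2: Frameworks, Roles, Vocabulary
# Session 3: Key data, Procedures, Policies
```

Within a session, the cards from those categories would then be shuffled together rather than studied one category at a time.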

I used the flashcard progress tracker to identify which cards I had marked “Still Learning” across sessions and returned to those deliberately in later sessions rather than randomly cycling through the full deck. Struggling with a card in Session 1 and successfully retrieving it in Session 3 is stronger encoding than getting it right both times on the same day.
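The selection rule described above can be expressed as a short Python sketch. It assumes a hypothetical history log that maps each card to its rating in every session so far (True for "Got It", False for "Still Learning"); the card identifiers and the ordering rule are illustrative, not taken from the tracker itself.

```python
def review_queue(history):
    """Return cards whose latest rating was 'Still Learning', most-missed first."""
    struggling = [
        (card_id, ratings.count(False))
        for card_id, ratings in history.items()
        if ratings and not ratings[-1]
    ]
    # Sort by number of misses (descending), then by card id for a stable order.
    struggling.sort(key=lambda item: (-item[1], item[0]))
    return [card_id for card_id, _ in struggling]


# Placeholder ratings from two sessions: True = "Got It", False = "Still Learning".
history = {
    "card-01": [True, True],
    "card-02": [False, False],
    "card-03": [True, False],
    "card-04": [False, True],
}

print(review_queue(history))  # ['card-02', 'card-03']
```

A card that was missed once and then retrieved successfully, like card-04 here, drops out of the queue, which keeps later sessions focused on what is still fragile.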

The mock performance assessment happened in the middle of the study week, not at the end. This was intentional. Using it as a midpoint diagnostic rather than a final review meant I had time to act on the feedback. The ten questions I answered were scored and analyzed, and the two or three areas where my performance was weakest became the focus of the sessions that followed.
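Turning scored answers into a study focus is simple arithmetic. The Python sketch below assumes each of the ten questions is tagged with a content area and scored out of ten; the area names and scores are invented placeholders, and averaging by area is one reasonable rule, not necessarily the analysis Claude performed.

```python
from statistics import mean


def weakest_areas(scores, n=3):
    """Return the n content areas with the lowest average score."""
    averages = {area: mean(values) for area, values in scores.items()}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(averages, key=lambda area: (averages[area], area))[:n]


# Placeholder results: ten questions, each scored on a ten-point scale.
scores = {
    "Frameworks": [8, 9],
    "Key data": [5, 6],
    "Vocabulary": [7, 7],
    "Procedures": [4, 6],
    "Roles": [9, 8],
}

print(weakest_areas(scores, n=2))  # ['Procedures', 'Key data']
print(weakest_areas(scores, n=3))  # ['Procedures', 'Key data', 'Vocabulary']
```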


What Happened When It Actually Mattered

By the time I had to perform, I noticed something I did not fully anticipate: I was not just remembering information. I was thinking in the frameworks I had studied. The content had moved from recall (knowing the answer when prompted) to fluency: using the knowledge naturally in response to questions I had not specifically prepared for.

That is the difference the science of learning makes. Passive review produces recognition memory: you see a term and remember having seen it before. Active retrieval practice, spaced across time, interleaved across topics, and calibrated by feedback, produces something closer to genuine expertise: knowledge that is accessible under pressure, generative under uncertainty, and stable over time.

I felt confident walking in. Not because I had read everything the night before. Because I had retrieved it, been wrong about some of it, corrected the gaps, and retrieved it again.


What This Looks Like for Your Context

You do not need to be preparing for a high-stakes performance to apply this model. Any professional who needs to build fluency in a new domain (a new role, a new curriculum, a new policy landscape, a new technology) can use this approach.

The process is replicable with Claude or any capable AI assistant. Upload your relevant materials. Ask for flashcards across the key categories, not a summary. Ask for matching activities, multiple choice drills, and open-ended performance questions with scored feedback. Use the artifacts across multiple sessions, not one. Mix old content with new in every session. Take the performance assessment before you feel ready. The gaps it reveals are the most valuable information you will get: return to them in later sessions, and aim to improve on your previous performance each time.

The science is not complicated. The application just requires the discipline to study the way the research says works rather than the way that feels comfortable.


References

Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe & A. P. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 185-205). MIT Press.

Ebbinghaus, H. (1885/1913). Memory: A contribution to experimental psychology. Teachers College, Columbia University.

Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learning: Taking memory tests improves long-term retention. Psychological Science, 17(3), 249-255.

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12(2), 257-285.

Published by Matthew Rhoads, Ed.D.

Innovator, EdTech Trainer and Leader, University Lecturer & Teacher Candidate Supervisor, Consultant, Author, and Podcaster
