The Illusion of Intelligence
What AI gets right...and wrong in rehab

Volume 11 | June 9, 2025
From the Editor
We got an upgrade! Future Reps is now coming to you from Beehiiv. Same great content, just a new sender address. Be sure to star this email so future issues don’t land in your junk folder.
This week, we’re diving into ChatGPT’s ability to assist with plantar fasciitis care and zooming out to ask some tough questions about how much these AI tools actually understand. Are they truly smart or just very good at feigning intelligence? Plus, if you remember the lumbosacral radiculopathy study from Volume 7, we’ve got a fresh podcast link that takes the discussion even deeper. Enjoy the read…and the listen.
📈 This Week's Highlights
ChatGPT Adheres to Plantar Fasciitis Guidelines—Mostly
A recent study tested how closely ChatGPT-4 follows the 2023 APTA clinical practice guidelines (CPGs) for plantar fasciitis. Researchers posed 21 structured questions based on guideline recommendations, then evaluated the responses. Overall agreement with the official CPGs was 85.7%, with slight variations between GPT-4o and GPT-4 Turbo. Agreement was strongest on questions backed by higher-grade evidence and dipped slightly for lower-grade or more nuanced recommendations.
Why this matters for PTs:
If used wisely, LLMs could help streamline evidence-based decision-making—especially for newer clinicians or when quick refreshers are needed. But “close enough” isn’t always good enough in patient care. PTs should double-check AI suggestions against trusted guidelines and clinical reasoning.
Training AI Agents with Simulation Before Deployment
Apple and academic collaborators released a framework called EVA that trains AI agents in simulated environments before they are deployed in real-world settings. The idea is to let agents "practice" decision-making under safe, high-volume conditions, a promising approach for robotics or automated assistants that might eventually support clinical workflows.
Why this matters for PTs:
Think of this as an AI residency. Before tools like this assist with patient mobility, triage, or scheduling, simulation-first training could help minimize error and boost safety.
Embodied AI Tackles Real-World Movement Challenges
A new benchmark released in May tests how well AI systems can integrate visual, proprioceptive, and decision-making abilities. This type of "embodied intelligence" is critical for robotics that may assist with gait training, balance, or ADLs. The benchmark stresses how these systems perform not just in code but in physical space.
Why this matters for PTs:
Rehab robotics isn’t just about motors—it’s about movement quality, context, and adaptation. Advances like this bring us a step closer to tools that actually understand the body in motion.
The Limits of Multimodal AI
An essay from The Gradient argues that today's AI models, despite being "multimodal", still lack general intelligence. Tools like GPT-4 or Gemini can process text, image, and audio inputs, but according to the essay they still rely on surface-level pattern matching rather than deep understanding.
Why this matters for PTs:
If you’re relying on AI to help with patient education or clinical decisions, it helps to know the ceiling. Understanding the limitations of current models can keep expectations (and safety) in check.
Apple on the "Illusion of Thinking" in AI
Apple researchers recently published a paper on how LLMs often appear more intelligent than they really are. These models may produce fluent, confident-sounding answers that mask shallow reasoning, a phenomenon the authors call "the illusion of thinking."
Why this matters for PTs:
Even when AI sounds right, it might not be right. This illusion is especially risky in clinical documentation or patient communication, where clarity and accuracy matter most.
🎧 Featured Podcast
AI in MSK Rehab: Insights from Dr. Giacomo Rossettini
In a podcast episode from Summit Physical Therapy Perspectives, Dr. Giacomo Rossettini discusses how large language models like ChatGPT compare to clinical practice guidelines in musculoskeletal care. Drawing from recent research, he explores where LLMs align—and where they fall short—when responding to clinical scenarios. The conversation also covers ethical boundaries, potential misuse, and the future of AI as a clinical co-pilot, not a replacement.
Why this matters for PTs:
This episode expands on a research article we first highlighted in Volume 7 of Future Reps. If you’ve been wondering how seriously to take AI tools in your daily clinical decisions, it’s worth a listen.
Bonus Reads
Harvard HealthTech Review: "When AI Gives Bad Advice," a breakdown of what happens when clinicians over-rely on flawed outputs.
AI Snake Oil Newsletter: "The Trouble with Trusting Models That Sound Smart," a short, sharp piece that pairs well with Apple’s new paper.