June 9, 2026

ChatGPT as a Running Coach: The Best Prompt We Could Build (and Where It Breaks)

A copy-paste coaching prompt that gets the most out of ChatGPT, plus what peer-reviewed research, expert coach ratings, and OpenAI's own postmortems say about where AI training advice goes wrong.

Search "ChatGPT running coach" and you'll find two camps: people who swear it wrote them a great marathon block, and people who got hurt following confident nonsense. Both camps are real, and we now have actual research explaining why both exist.

In 2024, a group of sports scientists had expert coaches rate ChatGPT-generated six-week training plans across 22 quality criteria. With sparse input ("make me a plan"), the plans scored below 3 out of 5 on 19 of the 22 criteria. With detailed athlete information, the ratings improved substantially, though the coaches still didn't rate any plan optimal (Düking et al., Journal of Sports Science and Medicine). A separate study scoring ChatGPT's exercise advice against ACSM guidelines found it 90.7% accurate but only 41.2% comprehensive: rarely flat-out wrong, routinely missing things that matter.

Read those two findings together and you get the practical takeaway: output quality tracks input quality, and the model's failure mode is omission more than error. Which means the prompt is most of the game. We build an AI running coach for a living; here's the best ChatGPT coaching prompt we know how to write, followed by where even this one breaks.

The prompt

You are an experienced running coach with a background in exercise physiology.
You coach conservatively and prioritize long-term health over short-term gains.

Before giving me any plan, ask me for anything you need. Assume nothing.
At minimum, establish:
- My goal race, distance, and date (and goal time, if I have one)
- A recent race result or all-out time trial (distance and time), if I have one
- My weekly mileage for each of the last 4 weeks, and my longest recent run
- My current easy pace, and any tested thresholds or HR zones
- Injury history and any current niggles
- Days per week I can train, and any fixed schedule constraints
- My age, and how long I've been running

Then build a week-by-week plan with these rules:
- Build volume gradually. Hold week-over-week increases to roughly 10% as a
  guardrail, never jump volume by 30% or more, and schedule a down week
  about every 4th week.
- Keep ~80% of running easy (conversational, zone 2) and ~20% hard. Give me
  the target pace OR heart-rate zone AND the perceived effort (RPE 1-10) for
  every session, and explain what each quality session is for.
- Never prescribe a pace faster than my demonstrated fitness implies. If I
  haven't given you a recent race or tested threshold, say so explicitly and
  use effort-based targets instead of inventing numbers.
- Flag any week where the plan assumes fitness I haven't shown you.
- After each week, ask how the sessions actually went (completed? how did
  they feel? any pain?) and adjust the NEXT week based on my answers. Do not
  hand me a fixed block and move on.

Be honest. If my goal or timeline is unrealistic or risky, say so plainly
and explain why. Do not flatter me.

Every line in there is doing a job, and the jobs are worth understanding, because they map directly onto the documented failure modes.

Why the prompt looks like this

It demands a recent race result before any pace. This is how real pace prescription works. Jack Daniels' VDOT system, the standard for forty years, derives every training pace (easy, marathon, threshold, interval, repetition) from one input: a recent all-out race performance. No race, no paces. A language model without that anchor doesn't leave the field blank; it interpolates from population averages and the round numbers in its training data, and hands you something authoritative-looking like "4:30/km" that was derived from nothing about you.
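
To make the "one input" point concrete, here is a minimal sketch of how a single race result becomes a fitness score. It uses the widely published Daniels-Gilbert regression. The coefficients are reproduced from public sources rather than from anything in this article, so treat them as an assumption. The function names vdot_from_race and pace_anchor are ours, purely for illustration.

```python
import math

def vdot_from_race(distance_m: float, time_s: float) -> float:
    """Estimate a VDOT-style score from one all-out race result.

    Assumption: coefficients follow the commonly published
    Daniels-Gilbert regression; check them against a VDOT table
    before relying on the output.
    """
    if distance_m <= 0 or time_s <= 0:
        raise ValueError("distance and time must be positive")
    t_min = time_s / 60.0
    v = distance_m / t_min  # average race velocity in metres per minute
    # Oxygen cost of running at velocity v (ml/kg/min).
    vo2_cost = -4.60 + 0.182258 * v + 0.000104 * v ** 2
    # Fraction of max a runner can sustain for a race lasting t_min.
    pct_max = (0.8
               + 0.1894393 * math.exp(-0.012778 * t_min)
               + 0.2989558 * math.exp(-0.1932605 * t_min))
    return vo2_cost / pct_max

def pace_anchor(race):
    """race is None or a (distance_m, time_s) tuple."""
    if race is None:
        # Mirrors the prompt rule: no race result, no invented paces.
        return "no recent race: use effort-based targets (RPE), not paces"
    return f"VDOT ~{vdot_from_race(*race):.1f}"

print(pace_anchor((5000, 20 * 60)))  # a 20:00 5K
print(pace_anchor(None))
```

The second call is the important one. Without a race result, the honest output is an effort target, which is exactly what the prompt asks the model to fall back to.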

It encodes 80/20, which has real evidence behind it. Stephen Seiler's analyses of elite endurance athletes (2006, 2010) found they converge on roughly 80% of training at low intensity. Intervention studies in recreational runners are smaller and more mixed, but they lean the same direction: one 10-week randomized trial found a polarized group improved 10K times by 5.0% versus 3.5% for threshold-heavy training. Structure-wise, this is the safest bet in endurance training.
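
If you want to check a week the model hands you against that split, the arithmetic is short. This is a rough sketch under our own assumptions: it counts by minutes (counting by sessions or by distance gives different numbers), it treats a session as wholly easy or wholly hard, and the helper names and the 5-point tolerance are hypothetical.

```python
def easy_share(sessions):
    """Fraction of total training time spent easy.

    sessions: list of (minutes, is_easy) tuples. Assumption: each
    session counts as entirely easy or entirely hard.
    """
    total = sum(minutes for minutes, _ in sessions)
    if total <= 0:
        raise ValueError("need at least one session with positive minutes")
    easy = sum(minutes for minutes, is_easy in sessions if is_easy)
    return easy / total

def is_roughly_80_20(share, tolerance=0.05):
    """True if the easy share is within tolerance of 0.80."""
    return abs(share - 0.80) <= tolerance

week = [
    (45, True),    # easy run
    (50, False),   # interval session, counted as hard
    (45, True),    # easy run
    (60, True),    # easy run
    (100, True),   # long run at easy effort
]
share = easy_share(week)
print(f"easy share: {share:.0%}, roughly 80/20: {is_roughly_80_20(share)}")
```

Counting a whole interval session as hard overstates the hard share a little, since warm-ups and recoveries are easy running. That errs on the conservative side, which suits the prompt.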

It treats the 10% rule as a guardrail, not science. Here's something ChatGPT will confidently get wrong: it recites the 10% rule as settled fact. The one large randomized trial that tested it (Buist et al., 2008, 532 novice runners) found a graded 10%-rule program produced the same injury rate as a standard program, about 20% in both groups. What the data does support is avoiding spikes: a GPS study of 874 runners found those who jumped weekly distance by more than 30% got injured more than those who progressed under 10%. So: gradual, yes; the magic number, folklore. We wrote the prompt accordingly.
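
The two thresholds in the prompt are easy to check mechanically on any plan ChatGPT produces. A minimal sketch, assuming weekly distance in kilometres and a naive week-over-week comparison. The function name and labels are ours, and the 10% and 30% defaults simply mirror the prompt.

```python
def flag_progression(weekly_km, guardrail=0.10, spike=0.30):
    """Label each week's change in volume against the previous week.

    Returns a list of (week_number, label), starting from week 2.
    Labels: "ok", "above guardrail" (more than 10%), "spike" (30% or more).
    """
    flags = []
    for i in range(1, len(weekly_km)):
        prev, cur = weekly_km[i - 1], weekly_km[i]
        if prev <= 0:
            # No meaningful percentage change from a zero week.
            flags.append((i + 1, "no baseline"))
            continue
        change = (cur - prev) / prev
        if change >= spike:
            label = "spike"
        elif change > guardrail:
            label = "above guardrail"
        else:
            label = "ok"
        flags.append((i + 1, label))
    return flags

plan = [30, 32, 36, 27, 40]  # week 4 is a down week
for week, label in flag_progression(plan):
    print(f"week {week}: {label}")
```

Note the limitation in the example: week 5 is flagged as a spike because it is compared with the down week before it. A more careful version would compare against a rolling average of recent weeks; we kept the naive comparison so the logic stays readable.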

It orders the model not to flatter you. That line earns its place. More below.

Where it breaks anyway

A good prompt narrows the gap between ChatGPT and a coach. It does not close it, and the remaining gap isn't fixable with better wording.

1. Numbers and dates without provenance

Even with the race-result rule, models drift toward clean, confident figures. And the failure isn't only physiological. When a GearJunkie writer trained for a half marathon with ChatGPT as his coach in late 2025, the model twice produced plans where the days of the week didn't match the calendar dates. Twice. His overall verdict was that it made "a dedicated running assistant rather than a coach," which matches our experience exactly. You must check its arithmetic, its dates, and its paces, which is a strange relationship to have with a coach.
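
The date check, at least, is one you can automate. Here is a small sketch that compares each weekday label in a plan against the real calendar, using only Python's standard datetime module. The plan format, a list of (weekday, ISO date) pairs, is our assumption; you would need to copy the model's schedule into that shape by hand.

```python
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def mismatched_days(plan):
    """Return entries whose weekday label disagrees with the calendar.

    plan: list of (weekday_label, iso_date) pairs, e.g. ("Tuesday", "2026-06-09").
    Each problem is reported as (iso_date, label_given, actual_weekday).
    """
    problems = []
    for label, iso in plan:
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        actual = WEEKDAYS[date.fromisoformat(iso).weekday()]
        if actual.lower() != label.strip().lower():
            problems.append((iso, label, actual))
    return problems

plan = [
    ("Tuesday", "2026-06-09"),   # correct
    ("Thursday", "2026-06-10"),  # wrong: that date is a Wednesday
]
for iso, given, actual in mismatched_days(plan):
    print(f"{iso}: plan says {given}, calendar says {actual}")
```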

2. It doesn't really remember your season

Since April 2025, ChatGPT's memory has combined saved memories with the ability to reference your past chats. It helps. It is also not what it sounds like. The recall is model-mediated and selective, not a verbatim training log; you can't fully audit what it retained, and it will cheerfully miss that your easy runs have been creeping faster for three weeks, because nobody saved that as a memory. The arc of a season (the calf strain in March, the down week you skipped, the long-run progression) lives with you, and you re-supply it or it's gone.

3. Sycophancy is a measured property, not a vibe

Anthropic researchers showed in 2023 that state-of-the-art assistants consistently exhibit sycophancy across tasks, partly because human raters prefer agreeable answers, so agreeableness gets trained in. In April 2025 this stopped being academic: OpenAI shipped a GPT-4o update so sycophantic that it rolled the update back within days, writing that the model had skewed toward "responses that were overly supportive but disingenuous."

Coaching is close to the worst-case domain for this trait, because the most valuable thing a coach says is no. No, not a 3:00 marathon off 30 km weeks. No, not intervals on that calf. Tell ChatGPT your plan and it wants to help you execute it; the prompt's "do not flatter me" line resists the pull but doesn't remove it.

4. It can't see your runs

The prompt assumes you report your own training, accurately, forever. The model has no live connection to your watch. You can pipe data in by hand (we've written up the real options for Strava and Garmin; neither is pretty, and Strava's June 2026 API policy now actively restricts the DIY versions), but a coach who only knows what you remember to mention is working from your self-image, not your training. The signal that predicts trouble (heart rate drifting on easy runs, paces stalling, load spiking) is exactly the signal that never makes it into the chat.

5. The adaptation loop runs on you

The prompt tells the model to adjust each week based on your report, and it will, every time you show up and report. Skip a check-in and the plan freezes. There is no background process noticing Thursday's bad sleep or Saturday's blown workout. Adaptation you have to trigger manually is just a smarter spreadsheet, and a 16-week marathon block is precisely the length of time over which manual habits decay.

Questions people actually ask

Can ChatGPT write me a marathon training plan? Yes, and with the prompt above, a decent one. Mainstream structure is 16 to 20 weeks (Higdon and Pfitzinger's classic plans are both 18) with a 2-to-3-week taper, and the model knows those conventions cold. The first plan isn't the problem. The 16 weeks of adjusting it are.

Should I trust the paces it gives me? Trust the structure (easy/hard balance, progression, down weeks) more than the numbers. Unless you gave it a recent race result, anchor easy runs to breathing and effort, not its pace tables.

Will it remember my training between chats? Partially, since the 2025 memory features, and not dependably. Assume continuity is your job.

Is it better than a free plan off the internet? For understanding why a plan looks the way it does, clearly yes; it's interactive in a way a PDF can't be. One of the earliest serious attempts we know of, a half-marathon plan on r/AdvancedRunning back in 2022, fell apart when the runner pushed past the template into specific pacing. The models have improved enormously since; the shape of the failure hasn't.

Can it put workouts on my watch? No. Reading your data is a manual pipeline you maintain; writing a structured workout to a Garmin goes through partner APIs no chat setup touches.

So is ChatGPT a good running coach?

It's a good running advisor, sincerely. Prompted well, it explains training theory clearly, sanity-checks plans, and talks you through decisions at 11pm when no human coach is awake. The research backs the "mostly accurate, persistently incomplete" read, and the prompt above pushes it about as far as prompting goes.

What it can't be is the other thing: a system with a durable model of you, eyes on every run as it happens, the standing to push back, and a loop that adapts the plan and puts the next session on your wrist without being asked. That isn't a prompt. It's infrastructure. We know because we tried the prompt first, and then we went and built the infrastructure.