Routine work

Reduce Maintenance Costs with AI in 10 Easy Prompts

Dustin Boston

"You Need AI That Reduces Maintenance Costs" is a great article. You should definitely read it. It doesn't cover how to reduce maintenance costs with AI, though, so I thought I'd take a stab at that. First, let's talk about the non-AI scenario.

How to Reduce Maintenance Costs without AI (traditional advice)

Before diving into how to reduce maintenance costs with AI, I think it's worth going over the traditional ways we might reduce costs:

  1. Write less code. I'm not trying to be a smartass, I swear. The cheapest code to maintain is code that doesn't exist. Aggressively delete unused features and dead branches. Resist premature generalization - build for the problem you have, not the one you imagine (this one has been huge for me).

  2. Favor boring, simple solutions. For example, maybe just stick with React, or better yet, plain HTML. Clever code is expensive code. Use mature, well-understood languages and frameworks; standardize on a small stack so you're not paying a cognitive tax across dozens of tools. This is what I used to do at the web agency I worked for, and eventually I automated the creation of new projects. "Choose Boring Technology" by Dan McKinley is the canonical essay here.

  3. Write tons of automated tests. Without a solid test suite, every change becomes risky and slow. Tests are what make refactoring safe, which is what keeps a codebase from rotting. A reasonable target is enough coverage that developers feel comfortable changing things without fear.

  4. Refactor infinitely. Technical debt compounds. Small, ongoing cleanups are dramatically cheaper than periodic "rewrite the legacy system" projects. "Leave the code cleaner than you found it," should be the norm. NOTE: If you feel like being chaotic-neutral at work, which I sometimes like to do, start referring to the current system as the "legacy system."

  5. Modular architecture with clear boundaries. Loose coupling and high cohesion let you change one part without breaking others. Architecture is a bit of a weak spot for me. AI can help me learn by applying specific architectures and walking me through them.

  6. Good documentation. Document the "why," not the "what." Code shows what it does; what's missing is why a decision was made. Inline comments for non-obvious choices and runbooks for ops scenarios save lots of time when someone unfamiliar (probably future-you) is debugging at 2am.

  7. Strong observability. Good logging, metrics, and tracing turn hour-long debugging sessions into ten-minute ones. You only pay the cost once.

  8. Invest in developer experience. Fast builds, fast tests, one-command local setup, fast CI. Every minute of friction multiplied by every developer over years adds up to staggering costs.

  9. Manage dependencies carefully. Each dependency is a long-term liability. Prefer fewer, more mature ones. Keep them updated incrementally. Falling years behind on a major framework is one of the most common ways maintenance costs explode.

  10. Knowledge transfer. Pair programming, code reviews, rotating ownership, and good onboarding docs reduce bus-factor risk. A system only one person understands is one resignation away from a crisis.

For starting a new project, see my Enterprise Project Checklist. For an existing codebase, here are ten ways to reduce maintenance costs with AI.

How to Reduce Maintenance Costs with AI

1. Write Less Code

Tell AI to delete some stuff. Ask it to find dead code, unused exports, unreachable branches, and duplicated logic that could be combined - the kind of audit that used to be too tedious to do regularly. AI is also good at simplifying clever code, pointing out unnecessary abstractions, and spotting similar modules that should be merged. You could even set "lines deleted" as an explicit metric alongside features shipped, if you wanted to.

Audit this codebase for code that should be deleted or consolidated. Look for:

1. Dead code - functions, classes, or files that are never called or imported
2. Unused exports - public symbols with no consumers
3. Unreachable branches - conditionals that can never be true given surrounding logic
4. Duplicated logic - multiple places implementing the same thing with minor variations
5. Speculative abstractions - interfaces, factories, or generic wrappers with one
   concrete implementation and no near-term plan for more
6. Over-clever code that would be clearer as straightforward procedural code

For each finding, return: file and line range, category, a one-paragraph reason it's
a candidate for removal, and a confidence level (high/medium/low) with what would
change your mind.

Sort by confidence, highest first. Don't propose changes that require business
context you don't have - flag those as "needs human input" instead.
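
To make finding 5 concrete, here's an invented TypeScript example of the kind of speculative abstraction these audits surface, next to the boring replacement. The notifier names are hypothetical - the point is the shape, not the domain.

    // Stub standing in for whatever email client the project already uses.
    async function sendEmail(recipient: string, message: string): Promise<void> {
      void recipient;
      void message;
    }

    // Before: an interface and factory with exactly one implementation and no
    // concrete plan for a second one.
    interface Notifier {
      send(recipient: string, message: string): Promise<void>;
    }

    class EmailNotifier implements Notifier {
      async send(recipient: string, message: string): Promise<void> {
        await sendEmail(recipient, message);
      }
    }

    class NotifierFactory {
      static create(): Notifier {
        return new EmailNotifier();
      }
    }

    // After: the abstraction is deleted until a second notifier actually exists.
    async function notify(recipient: string, message: string): Promise<void> {
      await sendEmail(recipient, message);
    }

Either version works; the second one is simply less to read, test, and keep in your head.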

2. Boring Solutions

AI can lay out trade-offs between technology choices. Before adopting a new library or pattern, ask it what the maintenance implications are, what the common pitfalls look like, and what you give up by sticking with the boring option.

It's also useful for translating clever code into boring code. For example, it can turn a dense functional one-liner into the readable, debuggable equivalent. Push back when it suggests a design pattern where a plain function would do; its training data over-indexes on enterprise patterns.

Audit this codebase for clever code that would be clearer as boring code.
Optimize for the person who will debug this at 2am with no context. Look for:

1. Nested ternaries, point-free chains, dense reduce/flatMap stacks, or
   functional one-liners that hide what the code is actually doing
2. Single-letter variable names outside trivial scopes
3. Deep nesting that could be flattened with early returns
4. Anonymous functions doing non-trivial work that should be named
5. Premature abstractions - factories, generic wrappers, or interfaces with
   one concrete implementation and no near-term plan for more
6. Macros, decorators, or metaprogramming where straightforward code would do

For each finding, return: file and line range, category, a one-paragraph
explanation of why a boring version would be clearer, a proposed rewrite,
and a confidence level (high/medium/low) with what would change your mind.

Sort by confidence, highest first. Skip code where the cleverness is earning
its keep - hot paths, library boundaries, places where the plain version
would be substantially longer. Flag those as "intentional - skipping"
rather than dropping them from the report.
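
Here's the kind of clever-to-boring rewrite I mean, as a made-up TypeScript illustration (the line-item shape and tax logic are invented for the example):

    type LineItem = { price: number; quantity: number; taxable: boolean };

    // Before: correct, but you have to unpack it mentally to debug it.
    const calculateTotalTerse = (items: LineItem[], taxRate: number): number =>
      items.reduce((t, i) => t + i.price * i.quantity * (i.taxable ? 1 + taxRate : 1), 0);

    // After: the same calculation, written for the person reading it at 2am.
    function calculateTotal(items: LineItem[], taxRate: number): number {
      let total = 0;
      for (const item of items) {
        const subtotal = item.price * item.quantity;
        const tax = item.taxable ? subtotal * taxRate : 0;
        total += subtotal + tax;
      }
      return total;
    }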

3. Automated Tests

This is probably the biggest way to reduce maintenance costs with AI. AI can generate unit tests for code, propose edge cases humans tend to miss, and write tests that lock in current behavior before you refactor. It can also find flaky tests, slow tests, and redundant tests, and convert integration tests into faster unit tests where that makes sense. There is absolutely no reason to have anything less than 100% code coverage, and 100% is easy for AI to track. Going from 20% coverage to even 80% used to be a quarter-long slog; now you can do it in a week.

Audit this codebase for test coverage gaps and weak tests. Aim for full
coverage by surfacing what's missing. Look for:

1. Public functions, methods, and exported modules with no tests at all
2. Tested code where only the happy path is covered, missing: null/undefined
   inputs, empty collections, boundary values (0, 1, max), unicode and
   encoding, concurrency, clock and timezone dependencies
3. Source code paths that aren't reached by any test
4. Tests that don't actually assert anything meaningful (no expects, only
   verifying that the code didn't throw)
5. Flaky tests, slow tests, and tests that duplicate what another test
   already covers
6. Integration tests that could be unit tests if a dependency were mocked

For each finding: file and line range (or test file), category, a
one-paragraph reason it matters, and a confidence level.

Sort by impact: untested code central to the system first, edge cases in
critical paths next, redundancy and flake last. For each untested function,
draft a test in the existing framework and style, including a one-line
comment on what it verifies. Flag any behavior you can't test because it
requires external state or context you don't have - don't fake what you
don't know.
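
To show what "edge cases humans tend to miss" looks like, here's a sketch in Vitest. The splitName function is invented for illustration, and the same idea translates to whatever framework you already use:

    import { describe, expect, it } from "vitest";

    // Hypothetical function under test: splits a full name into first and last.
    function splitName(fullName: string): { first: string; last: string } {
      const parts = fullName.trim().split(/\s+/);
      return { first: parts[0] ?? "", last: parts.slice(1).join(" ") };
    }

    describe("splitName", () => {
      // The happy path most suites already have.
      it("splits a simple two-part name", () => {
        expect(splitName("Ada Lovelace")).toEqual({ first: "Ada", last: "Lovelace" });
      });

      // The edge cases that tend to be missing.
      it("handles an empty string", () => {
        expect(splitName("")).toEqual({ first: "", last: "" });
      });

      it("handles extra whitespace", () => {
        expect(splitName("  Ada   Lovelace  ")).toEqual({ first: "Ada", last: "Lovelace" });
      });

      it("keeps multi-part last names together", () => {
        expect(splitName("Gabriel García Márquez")).toEqual({
          first: "Gabriel",
          last: "García Márquez",
        });
      });
    });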

4. Refactor Infinitely

The small refactors engineers used to skip - rename a poorly-named variable across the codebase, extract a duplicated block, split a 2,000-line file - become minutes of work. A weekly audit surfaces dozens in one pass, sorted by impact. "Leave it cleaner than you found it" used to mean 30 seconds at the end of each task. Now you sweep the whole codebase at once.

Audit this codebase for small refactors worth doing - the kind that used
to get skipped because they weren't worth the context switch. Look for:

1. Poorly-named variables, functions, or types (cryptic abbreviations,
   misleading names, single letters outside trivial scopes)
2. Duplicated code blocks that could be extracted into a function
3. Files over 500 lines that have grown two or more distinct
   responsibilities and should be split
4. Functions over 50 lines doing more than one thing
5. Long parameter lists (5+) that should be objects
6. Conditional ladders that should be lookup tables or polymorphism
7. Magic numbers and string literals that should be named constants

For each finding: file and line range, category, the current state, the
proposed refactor, estimated work (lines touched, files affected), and a
confidence level.

Sort by impact divided by risk - high-leverage, low-risk refactors first.
Don't propose refactors that require business context you don't have; flag
those as "needs human input."
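
As a small illustration of finding 6, here's an invented TypeScript example of a conditional ladder turned into a lookup table:

    type Plan = "free" | "pro" | "enterprise";

    // Before: a ladder that grows a branch every time a plan is added.
    function seatLimitBefore(plan: Plan): number {
      if (plan === "free") {
        return 3;
      } else if (plan === "pro") {
        return 25;
      } else {
        return 500;
      }
    }

    // After: the policy is data, and adding a plan is a one-line change.
    const SEAT_LIMITS: Record<Plan, number> = {
      free: 3,
      pro: 25,
      enterprise: 500,
    };

    function seatLimit(plan: Plan): number {
      return SEAT_LIMITS[plan];
    }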

5. Modular Architecture

AI can analyze a codebase for boundary violations: modules reaching into each other's internals, circular dependencies, leaky abstractions. Combine it with static analysis and you get a dependency graph with the worst offenders highlighted. Then have it report back with suggestions. This one catches a lot for me.

For new code, AI can enforce architectural rules in code review. For existing, tangled systems, it's useful for proposing where a monolith can be split, and for doing the mechanical work of moving code once you've decided.

Audit this codebase for boundary violations:

1. Modules importing each other's internals instead of going through a public interface
2. Circular dependencies between modules
3. Layer violations - a lower layer (data, infrastructure) importing from a higher
   layer (business logic, presentation)
4. Leaked abstractions - a module that's supposed to hide a detail but exposes it
   through return types, error messages, or required parameters
5. Modules that have grown two unrelated responsibilities and should be split
6. Two modules with overlapping responsibilities that should merge

For each finding: the modules involved and the specific import or call that's the
violation, why it's a problem concretely (what it couples, what it blocks), a
suggested fix including the rough shape of any new boundary, and a rough estimate
of how much work the fix is.
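
Here's a hypothetical TypeScript sketch of finding 1 - one module reaching into another's internals instead of its public interface. The module layout and names are invented:

    // orders/index.ts - the public surface of the orders module. Everything in
    // orders/internal/ is an implementation detail.
    export interface Order {
      id: string;
      subtotal: number;
    }

    export function priceOrder(order: Order): number {
      return applyDiscounts(order.subtotal);
    }

    // orders/internal/discounts.ts - deliberately not re-exported above.
    function applyDiscounts(subtotal: number): number {
      return subtotal; // stand-in for the real discount rules
    }

    // billing/invoice.ts
    // Before (boundary violation - couples billing to orders' file layout):
    //   import { applyDiscounts } from "../orders/internal/discounts";
    // After (depends only on the published interface):
    //   import { priceOrder } from "../orders";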

6. Good Documentation

AI can draft decision docs from your commit history, capturing reasoning that might otherwise get buried. It can turn incident postmortems into first-draft runbooks, generate onboarding docs by reading the codebase, and correct docs that have drifted from the code.

The new problem is making sure AI-generated docs (and comments) capture the reasoning rather than restating what the code obviously does. Some docs you might want: architecture, auth, data migrations, database schema, environment setup, features, a design guide, and a test plan.

Audit this codebase's documentation. Compare what exists in /docs (or
equivalent) and any README files against what's actually in the code.
Look for:

1. Modules, features, or subsystems with no documentation at all
2. Documentation that describes behavior that no longer exists, or
   contradicts the current code
3. Decisions captured nowhere - non-obvious architectural choices, library
   selections, or design tradeoffs that a new engineer would have to
   reverse-engineer from git blame
4. Setup or operational steps documented partially but missing crucial
   details (env vars, gotchas, recovery procedures)
5. API surface that's documented for some endpoints or functions but not
   others
6. Comments that restate what the code does without explaining why

Useful targets if you find gaps: architecture, auth, data migrations,
database schema, environment setup, features, design guide, test plan.

For each finding: location, category, what's wrong or missing, what
should exist, and a confidence level. For each gap, draft the missing
documentation - capture the "why," not the "what." Cite commit messages,
PR descriptions, or comments when you can find the reasoning. If you
can't, mark it "needs human input" rather than inventing.

Sort by impact: things a new engineer would hit on day one first.
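
As a tiny illustration of finding 6 - the difference between a comment that restates the code and one that captures the why. The retry numbers and the vendor constraint are invented:

    const MAX_RETRIES = 3;
    const RETRY_DELAY_MS = 2000;

    // What-comment (adds nothing the code doesn't already say):
    // Retry the request up to MAX_RETRIES times with a delay between attempts.

    // Why-comment (the part a new engineer can't reverse-engineer from the code):
    // The payments vendor's sandbox intermittently returns 502s for ~5 seconds
    // after a deploy on their side. Three retries two seconds apart covers that
    // window; more than that and we start tripping their rate limiter.
    async function chargeWithRetry(charge: () => Promise<void>): Promise<void> {
      for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
        try {
          await charge();
          return;
        } catch (error) {
          if (attempt === MAX_RETRIES) throw error;
          await new Promise((resolve) => setTimeout(resolve, RETRY_DELAY_MS));
        }
      }
    }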

7. Strong Observability

AI can add logs, instrumentation, and analytics. It's also useful for writing alerts that catch problems without firing constantly, building better dashboards, and summarizing noisy log streams into something actionable. Over time, incidents resolve faster.

Audit this codebase and propose observability instrumentation. For each suggestion:
- Location (file and line)
- Type: log (with level), metric (counter/gauge/histogram), or tracing span
- Exact name and tags/attributes
- The question this instrumentation helps answer ("Is the payment gateway slow?"
  not "It logs the duration.")

Prioritize:
- Seams between services (network calls, queue publishes, DB queries)
- Decision points where the code chooses between paths
- Error handlers and retry logic
- Anything with a timeout, backoff, or circuit breaker

Don't suggest logging inside tight loops, logging that restates the function name,
or metrics no dashboard would ever look at. If you're unsure something's worth
instrumenting, leave it out.
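
Here's a rough sketch of what an accepted suggestion might look like: timing a seam (an outbound payment call) and recording the outcome. The gateway, names, and fields are invented, and plain console logging stands in for whatever logger you actually use:

    // Stub standing in for the real payment gateway client.
    async function gatewayCharge(orderId: string, amountCents: number): Promise<{ ok: boolean }> {
      return { ok: amountCents > 0 && orderId.length > 0 };
    }

    async function chargeOrder(orderId: string, amountCents: number): Promise<boolean> {
      const startedAt = Date.now();
      try {
        const result = await gatewayCharge(orderId, amountCents);

        // Answers "is the payment gateway slow?" - not "it logs the duration".
        console.info("payment.gateway.charge", {
          orderId,
          durationMs: Date.now() - startedAt,
          outcome: result.ok ? "approved" : "declined",
        });
        return result.ok;
      } catch (error) {
        // Error handlers and retry logic are worth instrumenting too: record the
        // failure with enough context to debug it without reproducing it.
        console.error("payment.gateway.charge_failed", {
          orderId,
          durationMs: Date.now() - startedAt,
          error: error instanceof Error ? error.message : String(error),
        });
        throw error;
      }
    }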

8. Developer Experience

AI can audit your build, test, and CI pipelines for slowness and propose specific fixes like parallelization, caching, and test splitting. It can write the scripts that make local setup one command. It's good at upgrading dev tooling, writing better error messages in internal tools, and generating IDE configs and pre-commit hooks tailored to your codebase.

Audit this project for developer experience problems. Look at:

1. Build config (package.json, Makefile, build scripts) - what's slow, redundant,
   or could be cached or parallelized
2. Test config - slow suites, sequential runs that could parallelize, repeated
   setup that could be hoisted
3. CI pipeline - jobs blocking other jobs unnecessarily, repeated work, missing
   caching
4. Local setup - how many commands does a new dev run before they can boot the
   app? How many implicit dependencies (specific Node version, local Postgres,
   env vars to copy)?
5. Error messages in internal tooling - anything that says "Error" with no
   actionable suggestion

For each finding: current state with the offending file/line, a concrete fix,
and estimated time saved per developer per day if you can estimate it. Sort by
impact (time saved × developers affected).
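
The "one command" local setup item tends to be the quickest win. Here's a minimal sketch assuming a Node project with an .env.example file - the script name, the Node version check, and the migration command are placeholders for whatever your project actually needs:

    // scripts/setup.ts - hypothetical "one command and you can boot the app" script.
    import { execSync } from "node:child_process";
    import { copyFileSync, existsSync } from "node:fs";

    const REQUIRED_NODE_MAJOR = 20; // placeholder: match your .nvmrc / engines field

    function main(): void {
      const major = Number(process.versions.node.split(".")[0]);
      if (major < REQUIRED_NODE_MAJOR) {
        // An actionable error message, not just "Error".
        console.error(
          `Node ${REQUIRED_NODE_MAJOR}+ required, found ${process.versions.node}. Try: nvm use ${REQUIRED_NODE_MAJOR}`,
        );
        process.exit(1);
      }

      if (!existsSync(".env")) {
        copyFileSync(".env.example", ".env");
        console.log("Created .env from .env.example - fill in secrets before running the app.");
      }

      execSync("npm install", { stdio: "inherit" });
      execSync("npm run db:migrate", { stdio: "inherit" }); // placeholder for your migration step

      console.log("Setup complete. Start the app with: npm run dev");
    }

    main();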

9. Dependencies

Major version upgrades that used to be quarter-long projects can often be done in days. AI reads the migration guide, applies the breaking changes across the codebase, updates the tests, and asks about anything that needs human judgment.

It can also audit your codebase for unmaintained dependencies, known vulnerabilities, or things that could be replaced with standard-library equivalents. Run dependency audits regularly.

Audit every dependency in this project. For each one, assess:

1. Currency - how many major/minor versions behind latest? When was the
   package last released? Is it still actively maintained?
2. Security - any known CVEs? Any transitive dependencies with known issues?
3. Necessity - how much of the dependency does this codebase actually use?
   Could it be replaced with standard-library equivalents or a few lines of
   in-house code?
4. Migration cost - for any major version we're behind on, read the
   migration guide and estimate the breaking-change surface area in our code.
5. Risk concentration - are we depending on multiple packages from a single
   maintainer or org that's a single point of failure?

For each dependency, return: name, current version, latest version, last
release date, maintenance status, CVE summary, usage scope in our code,
upgrade recommendation (upgrade now / upgrade soon / replace / remove /
leave alone), estimated migration effort, and a confidence level.

Sort by risk: unmaintained or vulnerable packages first, then large
version-lag packages with security implications, then everything else.
Don't suppress deprecation warnings to make things look healthier than
they are - if a deprecation will become a breaking change in the next
version, surface it.
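
As a sketch of the "run dependency audits regularly" habit, here's a small script that leans on npm outdated to flag packages a major version behind. It assumes npm, and it relies on npm outdated exiting non-zero when anything is outdated, which is why the error is caught rather than treated as a failure:

    // scripts/dependency-drift.ts - flag dependencies a major version behind.
    import { execSync } from "node:child_process";

    interface OutdatedEntry {
      current?: string;
      latest?: string;
    }

    function readOutdated(): Record<string, OutdatedEntry> {
      try {
        // If nothing is outdated, npm exits 0 and there's nothing to report.
        execSync("npm outdated --json", { encoding: "utf8" });
        return {};
      } catch (error) {
        // Outdated packages make npm exit non-zero; the JSON is still on stdout.
        const stdout = (error as { stdout?: string }).stdout ?? "{}";
        return JSON.parse(stdout || "{}");
      }
    }

    const major = (version?: string): number => Number(version?.split(".")[0] ?? 0);

    for (const [name, entry] of Object.entries(readOutdated())) {
      const lag = major(entry.latest) - major(entry.current);
      if (lag >= 1) {
        console.log(`${name}: ${entry.current} -> ${entry.latest} (${lag} major version(s) behind)`);
      }
    }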

10. Spread Knowledge

AI partially solves bus-factor risk by reading the codebase and producing the docs that should already exist. Anyone can ask "how does payments work in this system?" and get a useful answer, instead of waiting for the one person who built it. It can also generate role-specific onboarding curricula, summarize how features evolved over time, and create searchable indexes of past decisions. (The manual diagnostic lives in a few git log commands - same analysis, more typing.)

Don't let it replace human knowledge-sharing, though. Pair programming and code review still matter because they build judgment, not just transfer facts. Use AI for the "where is X?" questions so humans can focus on the "why did we decide X?" conversations that actually require dialogue.

Audit this codebase for bus-factor risk and produce documentation that
reduces it. The goal: any engineer should be able to answer "how does X
work?" by reading what you produce, without waiting for the one person who
built it. Look for:

1. Subsystems where git blame is dominated by a single author
2. Modules with no documentation, sparse comments, and non-obvious behavior
3. Critical paths (auth, payments, data migrations, anything customer-facing)
   that lack runbooks or onboarding notes
4. Tribal knowledge encoded in code review history, commit messages, or
   PR descriptions but not in any document a new engineer would find

For each high-risk area, produce a short knowledge doc covering: what the
subsystem does, how it works at a high level, why it's designed this way
(cite sources when you can), the failure modes worth knowing about at 2am,
and where to look first when debugging.

Ground every claim in specific files, functions, and line ranges. Skip
what's obvious from reading the code - focus on the why and the gotchas.
If two parts of the codebase contradict each other, say so. Mark anything
you couldn't determine as "needs human input" rather than inventing.

Sort output by bus-factor risk: highest concentration of single-author
ownership first.
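
For the curious, the "few git log commands" version of the diagnostic looks roughly like this - a sketch that shells out to git shortlog and flags directories dominated by a single author. The directory layout and the 75% threshold are arbitrary placeholders:

    // scripts/bus-factor.ts - rough single-author concentration check per directory.
    import { execSync } from "node:child_process";
    import { readdirSync } from "node:fs";

    // Placeholder: scan top-level directories under src/.
    const directories = readdirSync("src", { withFileTypes: true })
      .filter((entry) => entry.isDirectory())
      .map((entry) => `src/${entry.name}`);

    for (const dir of directories) {
      // "git shortlog -sn" prints "<commit count>\t<author>" per author, largest first.
      const output = execSync(`git shortlog -sn HEAD -- ${dir}`, { encoding: "utf8" });
      const counts = output
        .trim()
        .split("\n")
        .filter(Boolean)
        .map((line) => Number(line.trim().split("\t")[0]));

      const total = counts.reduce((sum, n) => sum + n, 0);
      const topShare = total > 0 ? counts[0] / total : 0;

      // Arbitrary threshold: flag directories where one author has 75%+ of commits.
      if (topShare >= 0.75) {
        console.log(`${dir}: top author owns ${Math.round(topShare * 100)}% of ${total} commits`);
      }
    }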

Schedule the Prompts

You don't have to run these prompts manually. Claude Code has a /schedule slash command that sets up recurring runs as "routines" - saved configurations that fire on a schedule, clone your repo, and write back a report or open a PR. The Cowork tab in the Claude app does the same thing for people who'd rather not be in the terminal.

The setup is a conversation. Claude asks what you want to schedule, then handles the boring stuff: converting plain English into cron, scoping tool permissions, picking a model, and adding repo-specific context so the agent isn't completely cold when it wakes up next Monday.

Here's me setting up the dead-code audit (truncated):

❯ /schedule

● What should this agent do?

❯ [pasted the dead-code audit prompt]
  Let's use claude-opus-4-7. We can run it every Monday at 10 am.

● Quick conversion: 10am Monday America/Los_Angeles = 17:00 UTC.
  Cron: 0 17 * * 1.

  Here's the full config:
  - Name: Weekly dead-code audit
  - Schedule: 0 17 * * 1 (Mondays 10am PDT / 9am PST)
  - Model: claude-opus-4-7
  - Repo: dustinboston/dblogit-next
  - Tools: Read, Glob, Grep, Bash (read-only - analysis only)

  Prompt (your text, with a preamble so the agent has repo context):

      You are running a weekly dead-code audit on the dblogit-next repo
      (Next.js 16 App Router, TypeScript, Drizzle ORM, Vitest + Playwright).

      Focus on `src/`, `scripts/`, and `tests/`. Use Grep liberally to verify
      "unused" claims - search for the symbol across the whole repo, not
      just the file where it's defined. Remember that Next.js App Router
      files (page.tsx, layout.tsx, route.ts, server actions) are entrypoints
      and won't have explicit imports.

      [...original prompt...]

● Routine created.
  Next run: Mon, May 18, 10:08 AM PDT
  Manage: https://claude.ai/code/routines/trig_011f32v4Q9YuyH4xU1ZHTKBX

A few things to notice. "10am Monday" turned into 0 17 * * 1 automatically. I didn't have to think about timezones or learn cron. Claude added a preamble to my prompt with the repo's stack and a note about Next.js entrypoint files not having explicit imports, which is the kind of context-setting that makes a cold-start agent useful instead of confused. Tools were locked to read-only because the audit is analysis, not changes. I asked for Opus 4.7 instead of the Sonnet default since these audits are worth a bit more thinking.

All of these prompts work as routines now that they're framed as codebase audits. Setting them up takes about a minute each, and then I just read the PRs that show up Monday morning. The routines surface the work; the engineer in the loop still decides what to merge.

To reduce maintenance costs with AI, you don't need new advice. Most of it has been canon for decades, and every senior engineer would nod along reading the list. What changed is the cost. Maintenance work was always worth doing; it just wasn't affordable for most teams. Now it is.