Most teams can tell you when their app crashes. Few can tell you why it's slow. An API endpoint that should respond in 100ms creeps toward 500ms, page loads that used to take 2 seconds stretch to 8, and your users blame the internet while the real culprit hides in your backend. Application performance monitoring (APM) is the practice of measuring how fast your app actually responds and pinpointing where it slows down. Unlike error tracking, which answers "what broke?", APM answers "why is it slow?", and slowdowns tend to be more common than outright failures.
This guide explains what APM is, how it differs from error tracking and logging, and why every team serious about reliability needs all three.
What APM actually measures
At its core, APM captures timing data from your application. Every request — an HTTP call, a database query, an async task — is measured from start to finish. That timing data, combined with trace context, lets you see exactly where your requests spend their time: waiting for a database, calling a third-party API, processing data, or something else entirely.
A real APM system gives you four things:
- Transactions — high-level requests you care about (a web request, a scheduled job, a message handler).
- Spans — the smaller operations inside a transaction (a database query, a cache lookup, an external API call).
- Percentiles — not just average, but P50, P95, P99 so you see the worst experiences.
- Cross-service correlation — when one request touches multiple services, traces stitch them together into a single waterfall.
The moment you have this data, problems that seemed mysterious become obvious. A slow checkout endpoint isn't slow everywhere — it's slow for 5% of requests, and only when the inventory service is saturated. An average latency of 200ms hides that some requests take 10 seconds. This is what APM reveals.
The four core performance metrics
Every APM system tracks these four metrics (though vendors often give them different names):
Response time is how long a request takes from start to finish. If your API endpoint takes 500ms on average, that's your baseline. But average hides the truth.
P95 and P99 latency tell you what users experience on your slowest requests. If P95 is 2 seconds, 5% of requests take 2 seconds or longer. At scale, that matters: percentiles reveal the tail of your distribution, where the most frustrated users live.
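To make the gap between the average and the tail concrete, here is a minimal sketch that computes the mean and nearest-rank percentiles over a list of request durations. The `percentile` and `mean` helpers and the sample numbers are illustrative assumptions, not output from any APM tool, and production systems often use approximate histograms rather than sorting every value.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(durationsMs, p) {
  if (durationsMs.length === 0) throw new Error("no samples");
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(durationsMs) {
  return durationsMs.reduce((sum, d) => sum + d, 0) / durationsMs.length;
}

// Hypothetical sample: 94 fast requests, 5 slow ones, 1 very slow one.
const durations = [
  ...Array(94).fill(100),
  ...Array(5).fill(2000),
  ...Array(1).fill(10000),
];

console.log(mean(durations));           // 294
console.log(percentile(durations, 50)); // 100
console.log(percentile(durations, 95)); // 2000
console.log(percentile(durations, 99)); // 2000
```

In this sample the mean (294ms) describes almost no real request: the typical one takes 100ms, while the slowest 6% take 2 seconds or more.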
Throughput is requests per second. This matters because performance degrades under load. An endpoint that serves 100 req/s at 100ms may start timing out at 500 req/s. APM shows you the relationship.
Error rate is often tracked alongside latency. Some errors are fast (a rejected request), others slow (a timeout). Performance and correctness are intertwined.
A transaction is like a container. A span is the work inside it. If your transaction is "POST /checkout", the spans are "query inventory", "call payment processor", "log order", each with its own timing.
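As a rough sketch of that container relationship, the checkout example could be modeled as a plain object. The field names (`op`, `description`, `durationMs`) and the timings are hypothetical, chosen for illustration; real SDKs use their own schemas. The `share` figure only makes sense here because the example spans run one after another without overlapping.

```javascript
// Hypothetical shape for illustration; real SDKs use their own field names.
const transaction = {
  name: "POST /checkout",
  durationMs: 500,
  spans: [
    { op: "db.query", description: "query inventory", durationMs: 380 },
    { op: "http.client", description: "call payment processor", durationMs: 90 },
    { op: "log", description: "log order", durationMs: 10 },
  ],
};

// Sum span time per operation type and report its share of the transaction.
function timeByOperation(txn) {
  const totals = {};
  for (const span of txn.spans) {
    totals[span.op] = (totals[span.op] || 0) + span.durationMs;
  }
  return Object.entries(totals)
    .map(([op, ms]) => ({ op, ms, share: ms / txn.durationMs }))
    .sort((a, b) => b.ms - a.ms);
}

console.log(timeByOperation(transaction)[0]);
// { op: 'db.query', ms: 380, share: 0.76 }
```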
How APM works: tracing under the hood
Modern APM relies on distributed tracing. When a request enters your system, it gets a trace ID. As the request bounces between services — web server to API to database to cache — every hop adds a span with that same trace ID. At the end, you have a complete waterfall showing every millisecond, every service, every operation.
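The stitching step can be sketched in a few lines: collect every span that shares a trace ID, then walk parent-to-child links to print an indented waterfall. The span records and field names below (`traceId`, `spanId`, `parentId`, and so on) are assumptions made for this example, not the wire format of any particular tracer.

```javascript
// Hypothetical span records, in the arbitrary order services might report them.
const reported = [
  { traceId: "t1", spanId: "c", parentId: "b", service: "db", name: "SELECT inventory", startMs: 20, endMs: 400 },
  { traceId: "t1", spanId: "a", parentId: null, service: "web", name: "POST /checkout", startMs: 0, endMs: 500 },
  { traceId: "t2", spanId: "x", parentId: null, service: "web", name: "GET /health", startMs: 5, endMs: 7 },
  { traceId: "t1", spanId: "b", parentId: "a", service: "api", name: "reserve items", startMs: 10, endMs: 450 },
];

// Collect the spans of one trace and render them as an indented waterfall.
function waterfall(spans, traceId) {
  const inTrace = spans.filter((s) => s.traceId === traceId);
  const childrenOf = (parentId) =>
    inTrace
      .filter((s) => s.parentId === parentId)
      .sort((a, b) => a.startMs - b.startMs);

  const lines = [];
  const visit = (span, depth) => {
    const indent = "  ".repeat(depth);
    lines.push(`${indent}${span.service}: ${span.name} (${span.endMs - span.startMs}ms)`);
    for (const child of childrenOf(span.spanId)) visit(child, depth + 1);
  };
  for (const root of childrenOf(null)) visit(root, 0);
  return lines;
}

console.log(waterfall(reported, "t1").join("\n"));
// web: POST /checkout (500ms)
//   api: reserve items (440ms)
//     db: SELECT inventory (380ms)
```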
This is how you find slow database queries. In the span waterfall, you see that database spans account for 400ms of a 500ms request, then inspect those specific queries in the database itself, and there's your bottleneck.
Without tracing, you see "requests are slow" and spend hours guessing. With it, you see "this specific query to the inventory table takes 380ms and runs 1000x more than it should" — and the N+1 query problem is obvious.
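One way to surface an N+1 pattern from span data is to normalize each query into a fingerprint and count repeats within a single transaction. This is a simplified sketch: the span shape, the regex-based `fingerprint` helper, and the threshold of 10 are all assumptions for illustration, and a real tool would parse SQL more carefully.

```javascript
// Replace literal values so repeated queries share one fingerprint.
function fingerprint(sql) {
  return sql
    .replace(/'[^']*'/g, "?") // string literals
    .replace(/\b\d+\b/g, "?") // numeric literals
    .replace(/\s+/g, " ")
    .trim();
}

// Return query shapes that repeat at least `threshold` times in one transaction.
function findRepeatedQueries(spans, threshold = 10) {
  const stats = new Map();
  for (const span of spans) {
    if (span.op !== "db.query") continue;
    const key = fingerprint(span.description);
    const entry = stats.get(key) || { query: key, count: 0, totalMs: 0 };
    entry.count += 1;
    entry.totalMs += span.durationMs;
    stats.set(key, entry);
  }
  return [...stats.values()].filter((e) => e.count >= threshold);
}

// Hypothetical trace: one order lookup followed by one inventory query per item.
const spans = [
  { op: "db.query", description: "SELECT * FROM orders WHERE id = 42", durationMs: 4 },
  ...Array.from({ length: 50 }, (_, i) => ({
    op: "db.query",
    description: `SELECT * FROM inventory WHERE item_id = ${i + 1}`,
    durationMs: 3,
  })),
];

console.log(findRepeatedQueries(spans));
// [ { query: 'SELECT * FROM inventory WHERE item_id = ?', count: 50, totalMs: 150 } ]
```

Each query here is fast on its own (3ms); the count is what gives the problem away.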
The same principle applies across services. A request that touches your web app, your API, your payment processor, and a third-party analytics service becomes one readable waterfall. You're not staring at separate dashboards wondering which service is the bottleneck. You see it.
Why APM is separate from error tracking
Teams often think error tracking and APM are the same tool. They're not — they answer different questions.
| Aspect | Error Tracking | APM |
|---|---|---|
| Answers | What broke? | Why is it slow? |
| Trigger | An exception is thrown | A request is made |
| Context | Stack trace, breadcrumbs, user | Timing, database queries, external calls |
| Grouping | By root cause (fingerprint) | By transaction type / endpoint |
Error tracking captures exceptions. APM captures performance. An endpoint can be fast and full of errors, or slow and perfect. A well-tuned system needs both: error tracking tells you what's breaking, APM tells you what's slowing down.
What they share is context. When an error occurs inside a trace, that error carries the trace ID, so you can jump from "500 errors on checkout" to "every timeout happened when the inventory service was saturated." Both pieces of observability matter. Error tracking without APM means you find bugs but not bottlenecks. APM without error tracking means you know where your service is slow but not what is failing.
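The jump from errors to traces is essentially a join on the trace ID. The sketch below uses hypothetical error and trace records (the field names and values are assumptions for illustration) to show how a pile of timeouts can be traced back to one slow dependency.

```javascript
// Hypothetical records; field names are assumptions for illustration.
const errors = [
  { message: "TimeoutError: inventory", traceId: "t1" },
  { message: "TimeoutError: inventory", traceId: "t3" },
];
const traces = {
  t1: { transaction: "POST /checkout", durationMs: 10020, slowestSpan: "inventory-service: reserve" },
  t2: { transaction: "POST /checkout", durationMs: 180, slowestSpan: "db: insert order" },
  t3: { transaction: "POST /checkout", durationMs: 10007, slowestSpan: "inventory-service: reserve" },
};

// For each error, look up the trace it happened in and count the slowest spans.
function slowestSpansForErrors(errorEvents, tracesById) {
  const counts = {};
  for (const err of errorEvents) {
    const trace = tracesById[err.traceId];
    if (!trace) continue; // the trace may not have been sampled
    counts[trace.slowestSpan] = (counts[trace.slowestSpan] || 0) + 1;
  }
  return counts;
}

console.log(slowestSpansForErrors(errors, traces));
// { 'inventory-service: reserve': 2 }
```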
Look for tools that combine both. LightTrace gives you error tracking and distributed tracing so you see failures and slowdowns in the same place. Separate dashboards multiply the time to diagnosis.
Where slow things hide
Slow performance rarely announces itself. Here's where to look:
Database queries — the most common culprit. A well-written query returns in 10ms. A bad one, or one that runs 1000 times per request, can stretch an endpoint from 100ms to 2 seconds. Trace spans show every query so you spot the offender.
External APIs — third-party services you call from your code (payment processors, analytics, CDNs). A 500ms call to a slow external API doesn't fail your request — it just slows it down. At scale, this dominates latency.
Cascading calls — sometimes a single request triggers a chain of operations that could run in parallel. Tracing shows the critical path and where you can parallelize.
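As a small illustration of the cascading-call case, the sketch below compares awaiting two independent calls one after the other with running them together via `Promise.all`. The `fetchInventory` and `fetchShippingQuote` functions are hypothetical stand-ins that just wait on a timer; the point is only that independent work does not need to be serialized.

```javascript
// Resolve with `value` after `ms` milliseconds (stand-in for real I/O).
const delay = (ms, value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Hypothetical independent calls: neither needs the other's result.
const fetchInventory = () => delay(30, { inStock: true });
const fetchShippingQuote = () => delay(40, { cents: 499 });

// Sequential: total time is roughly the sum of both calls (about 70ms).
async function checkoutSequential() {
  const inventory = await fetchInventory();
  const shipping = await fetchShippingQuote();
  return { inventory, shipping };
}

// Parallel: total time is roughly the slower of the two calls (about 40ms).
async function checkoutParallel() {
  const [inventory, shipping] = await Promise.all([
    fetchInventory(),
    fetchShippingQuote(),
  ]);
  return { inventory, shipping };
}

checkoutParallel().then((result) => console.log(result));
```

In a trace, the sequential version shows two spans laid end to end, while the parallel version shows them overlapping, with the longer span setting the critical path.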
Load and contention — performance isn't constant. A request that takes 100ms at noon takes 2 seconds at 2 p.m. Percentile tracking catches this; averages hide it.
Getting started with application performance monitoring
If you're already sending errors to LightTrace, you're most of the way there. LightTrace includes distributed tracing so you can see transaction performance and trace spans across services. The Sentry SDKs that send your errors also support tracing in most cases, so you just set a sample rate and deploy.
Here's the minimal setup for Node.js:
```javascript
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: "https://<key>@your-lighttrace-host/1",
  tracesSampleRate: 0.1, // Capture 10% of transactions
  environment: "production",
  release: "api@2.1.0",
});
```
That's it. Sampled requests (about one in ten at this rate) now become transactions, with spans for the database queries and outgoing HTTP calls the SDK instruments automatically, plus any custom spans you add around operations you care about. You'll see response times, percentiles, and error rates grouped by endpoint.
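If performance is already a known pain point, start with a higher sample rate, up to 100% on the endpoints you suspect, so you catch the worst cases. With a fixed sample rate, the decision is made when a request starts, before its duration is known, so you can't choose to keep only the slow requests up front. As your infrastructure stabilizes, lower the sample rate to keep volume manageable.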
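One way to raise the rate selectively is to sample per endpoint instead of using a single global rate. Because a head-based sampler decides before a request's duration is known, targeting the endpoints you suspect is a practical approximation of "capture the slow ones". The sketch below is a plain function with hypothetical endpoint names and rates. The Sentry SDKs expose a `tracesSampler` option that accepts a function returning a rate, but the shape of its argument differs between SDK versions, so treat the wiring as something to check against your SDK's documentation.

```javascript
// Hypothetical per-endpoint rates: full capture where latency is a known
// problem, nothing for health checks, a low default everywhere else.
const RATES = [
  { match: /^GET \/health/, rate: 0 },
  { match: /^POST \/checkout/, rate: 1.0 },
];
const DEFAULT_RATE = 0.1;

// Pick the sample rate for a transaction name; first matching rule wins.
function sampleRateFor(transactionName) {
  const rule = RATES.find((r) => r.match.test(transactionName));
  return rule ? rule.rate : DEFAULT_RATE;
}

// Head-based decision: made when the request starts, before its duration
// is known. `random` is injectable so the decision can be tested.
function shouldSample(transactionName, random = Math.random) {
  return random() < sampleRateFor(transactionName);
}

console.log(sampleRateFor("POST /checkout")); // 1
console.log(sampleRateFor("GET /products"));  // 0.1
```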
Sampling is essential at high volume. A service that does 10,000 req/s produces 864 million transactions a day, and capturing every one of them gets expensive fast. Sample 10% and you still have about 86 million a day, more than enough to see patterns.
The payoff is concrete. Reducing API latency by even 100ms improves user experience and reduces load on your infrastructure. APM shows you where those 100ms live, so you fix the right thing. Tag every transaction with a release and you can spot the exact deploy that introduced a slowdown, then roll it back before most users notice.
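Spotting a slow deploy comes down to comparing the same metric across releases. As a minimal sketch, assuming each transaction record carries a `release` tag and a `durationMs` (hypothetical field names and made-up numbers), you could group by release and compare P95:

```javascript
// Nearest-rank 95th percentile.
function p95(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil((95 * sorted.length) / 100) - 1];
}

// Group transaction durations by release and compute P95 for each.
function p95ByRelease(transactions) {
  const byRelease = new Map();
  for (const t of transactions) {
    if (!byRelease.has(t.release)) byRelease.set(t.release, []);
    byRelease.get(t.release).push(t.durationMs);
  }
  return [...byRelease].map(([release, durations]) => ({
    release,
    p95Ms: p95(durations),
  }));
}

// Hypothetical data: the same endpoint before and after a deploy.
const transactions = [
  ...Array.from({ length: 20 }, () => ({ release: "api@2.0.9", durationMs: 120 })),
  ...Array.from({ length: 18 }, () => ({ release: "api@2.1.0", durationMs: 130 })),
  ...Array.from({ length: 2 }, () => ({ release: "api@2.1.0", durationMs: 900 })),
];

console.log(p95ByRelease(transactions));
// [ { release: 'api@2.0.9', p95Ms: 120 }, { release: 'api@2.1.0', p95Ms: 900 } ]
```

Most requests in the newer release are only 10ms slower, so an average would barely move; the P95 comparison is what exposes the regression.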
Start monitoring performance in minutes
Start monitoring your app's performance and see where every millisecond goes — set up distributed tracing in minutes with LightTrace.
The difference between a fast app and a slow one often isn't code quality. It's visibility. Error tracking tells you what's wrong. APM tells you what's slow. Together, they're how modern teams ship fast and stable.