When Your SQL Query "Works" But Everything Is Wrong

This week, I've been debugging a SQL query that runs perfectly. No errors. Executes in seconds. Returns results.

There's just one problem: it returns 148 rows.

In January this year, the same query returned 1.5 million rows.

Something is very, very wrong.

And this is the part of being a data analyst that no bootcamp, course, or tutorial prepares you for.

The Story: When "Working" Code Is Actually Broken

Let me set the scene: I'm working with a long, complex query that creates a specific table. This query depends on multiple intermediate tables, each built from other queries.

It's one of those queries where you inherit someone else's work, or your own work from months ago, and you're just running it again with updated data.

The query executes. No syntax errors. No obvious issues. Everything looks fine.

Until you look at the output.

148 rows.

I stare at the number. I run the same query for January 2025: 1.5 million rows.

That's not a small difference. That's not "oh, the business slowed down a bit." That's "something is fundamentally broken."

But the query works. The code is fine. So what's wrong?

The Investigation Begins

When I was just starting to work with SQL, my ultimate goal was to write a query that executes. No errors, no syntax issues — just get it to run.

What I wasn't prepared for is what to do when your technically correct query returns completely incorrect results.

This is where the real work begins.

My debugging process:

Step 1: Confirm the problem

  • Pull historical results to compare

  • Check if business reality actually changed this dramatically

Step 2: Break down the complex query

  • This isn't a simple SELECT statement

  • It's a chain of intermediate tables, each depending on the previous one

  • Multiple CTEs, multiple JOINs, filters, aggregations

Step 3: Debug one table at a time

When debugging complex queries, you have two approaches:

Option 1: Work backwards

  • Start with the final output table: 148 rows (wrong)

  • Go back one step: Check the table that feeds into it

  • Go back another step: Check that table's input

  • Keep going backwards until you find where the numbers diverge

Option 2: Work forwards

  • Start from the very beginning

  • Run the first query/subquery and compare to previous period

  • Then the next one, comparing each step

  • Keep going forward until you spot where it breaks
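
Whichever direction you go, the check at each step can be as simple as a row count. A minimal sketch — the intermediate table names (stg_orders, enriched_orders, final_output) are made up for illustration:

-- Row counts for each stage of the pipeline in one result set
SELECT 'stg_orders' AS step, COUNT(*) AS row_count FROM stg_orders
UNION ALL
SELECT 'enriched_orders', COUNT(*) FROM enriched_orders
UNION ALL
SELECT 'final_output', COUNT(*) FROM final_output;

The first step where the count collapses from millions to hundreds is where to start digging.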

What I found: The culprit is an input table. One of the source tables feeding into this complex query.

And I'm still investigating exactly what changed and why. Wish me luck!

This is the reality of being a data analyst:

Understanding your data:

  • Where does it come from?

  • How is it structured?

  • What does each field actually mean?

  • What are the business rules?

Validating your results:

  • Does this number make business sense?

  • How does it compare to previous periods?

  • Should this metric be going up or down?

  • What's normal vs. what's a red flag?

Detective work:

  • When something's wrong, tracing back through the pipeline

  • Understanding dependencies between tables

  • Figuring out what changed and why

Business logic:

  • Understanding what the number SHOULD roughly be (a feel for numbers comes with time: you'll roughly know how many customers you have, what monthly revenue looks like, and so on)

  • Asking "does this make sense?" constantly

This is the job. The SQL is just the tool.

How to Develop Business Logic Intuition

1. Always know the rough order of magnitude

Before running a query, ask yourself:

  • What do I expect to see? (Roughly)

  • Hundreds? Thousands? Millions? Check previous periods, old files and presentations, or ask colleagues

You don't need to know the exact number. But you should know if you're expecting thousands vs. millions.

2. Compare to previous periods

Always. Every time. Without exception.

  • Same query last month?

  • Same query last quarter?

  • Same query last year?

Another story from this week: I have a report that showed ~24k users per month from Jan to Oct '24, then 950 users in Nov '24 and 60 in Dec '24. The business hasn't changed, so…
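
One query can show you the whole trend at a glance. A sketch in T-SQL — the table and column names (user_events, event_date, user_id) are assumptions:

-- Monthly user counts side by side, so a cliff is impossible to miss
SELECT FORMAT(event_date, 'yyyy-MM') AS month,
       COUNT(DISTINCT user_id) AS users
FROM user_events
GROUP BY FORMAT(event_date, 'yyyy-MM')
ORDER BY month;

A drop from 24k to 950 to 60 jumps out immediately in output like this.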

3. Understand the business context

  • Did a promotion end? (Expect sales to drop)

  • Did we launch in a new country? (Expect new customers)

  • Is this our busy season? (Expect higher numbers)

  • Did we stop offering a product? (Expect that product's sales to be zero)

Changes should map to business reality.

4. Sanity check aggregations

  • Does total revenue = sum of all transactions?

  • Do regional sales add up to national sales?

  • Is average between min and max?

  • Are percentages between 0 and 100?

These seem obvious, but you'd be surprised how often they catch errors.

For example, the other day I wrote sum(sales_jan) as jan, sum(sales_jan) as oct. As you can guess, my Jan and Oct figures were suspiciously identical. Sometimes it's important to step back and check your numbers once again.
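
Here's that bug next to the fix, with an assumed monthly_sales table:

-- Broken: both aliases sum the SAME column
SELECT SUM(sales_jan) AS jan, SUM(sales_jan) AS oct FROM monthly_sales;

-- Fixed: each alias sums its own column
SELECT SUM(sales_jan) AS jan, SUM(sales_oct) AS oct FROM monthly_sales;

Two identical aggregates in one result set are almost always a copy-paste error, not a coincidence.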

5. Check for silent data loss

This is what happened to me this week.

Places where data can disappear without errors:

  • JOINs that exclude unmatched rows

  • WHERE clauses that filter too aggressively

  • Data type mismatches

  • Changed table structures

  • Incomplete data loads

Always compare row counts at each step.
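
For JOINs specifically, you can count exactly how many rows a join is silently dropping. A sketch with assumed orders and customers tables:

-- Rows that survive the INNER JOIN
SELECT COUNT(*) AS matched_rows
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id;

-- Rows the INNER JOIN silently throws away
SELECT COUNT(*) AS dropped_rows
FROM orders o
LEFT JOIN customers c ON o.customer_id = c.customer_id
WHERE c.customer_id IS NULL;

If dropped_rows is large, your "working" join is quietly losing data.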

Here's what I wish someone had told me when I started:

You will spend more time debugging than writing new code.

You will spend more time understanding data than analysing it.

You will question your results more than you trust them.

And this is normal. This is the job.

As I write this, I'm still debugging that query. My query isn't "broken" in the traditional sense. It runs. It returns results.

It's just returning the wrong results.

The next time your query runs successfully, don't stop there.

Ask yourself:

  • Does this make sense?

  • How does this compare to last time?

  • What story is this data telling, and is it a believable story?

Because "working" code isn't the goal.

Correct code is the goal.

And sometimes, you're the only one who can tell the difference.

Keep pushing 💪

Karina

SQL Tip

If you're copy-pasting the same query 4 times just to change one value, it works... but there's a better way.

SELECT * FROM Orders WHERE Region = 'West';
SELECT * FROM Orders WHERE Region = 'East';
SELECT * FROM Orders WHERE Region = 'South';
SELECT * FROM Orders WHERE Region = 'Central';

Copy. Paste. Change value. Repeat. It works. But it's painful.

The better way: Use a variable

Change the region? Change one line. Done.

DECLARE @Region VARCHAR(50) = 'West';

SELECT *
FROM Orders
WHERE Region = @Region;

  • No more find-and-replace across 50 lines

  • Change values in one place

  • Easier to test different scenarios

  • Cleaner stored procedures

  • Reusable scripts
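
And if you actually need every region in one result set, you can skip the variable entirely and aggregate instead (one possible variation):

SELECT Region, COUNT(*) AS order_count
FROM Orders
GROUP BY Region
ORDER BY Region;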

Grab your freebies if you haven't already:

Data Playbook (CV template, Books on Data Analytics and Data Science, Examples of portfolio projects)

Need more help?

Just starting with Python? Wondering if programming is for you?

Master key data analysis tasks like cleaning, filtering, pivoting and grouping data with Pandas, and learn how to present your insights visually with Matplotlib in the ‘Data Analysis with Python’ masterclass.

Building your portfolio?
Grab the Complete EDA Portfolio Project — a full e-commerce analysis (ShopTrend 2024) with Python notebook, realistic dataset, portfolio template, and step-by-step workflow. See exactly how to structure professional portfolio projects.

Grab your Pandas CheatSheet here. Everything you need to know about Pandas - from file operations to visualisations in one place.

More from me: YouTube | TikTok | Instagram | Threads | LinkedIn

Data Analyst & Data Scientist