SQL Tutorial for QA Database Testing and Verification

🗄️
🗄️ SQL for QA
SQL Introduction
April 2026 | TechWorld Labs

SQL is a critical QA skill. Use it to verify database state, validate data integrity, set up test data, and investigate backend issues.

Tutorial
Interview Q&A

What is SQL and Why It's Essential for QA

SQL (Structured Query Language) is the standard language for interacting with relational databases. Every modern application stores its data in a database — whether it's PostgreSQL, MySQL, Oracle, SQL Server, or others. As a QA engineer, you need to be able to directly query that database to verify data integrity, set up test scenarios, and investigate test failures.

Many QA engineers make the mistake of only testing through the UI. But the real test is in the database: did the application actually save the data? Did it transform it correctly? Did it enforce constraints? SQL skills allow you to answer these questions directly, making you a far more effective and credible QA professional.

Why QA Engineers Must Master SQL

Database verification is one of the most critical yet often overlooked aspects of QA. A UI test might show a success message, but if the data wasn't actually saved to the database, that's a critical bug that a UI-only test would never catch.

🔍
Data Verification
Query the database directly to confirm data is saved correctly, transactions completed, and constraints enforced.
🏗️
Test Data Setup
Insert, update, and delete test data directly. No need to manually create data through the UI — much faster and more reliable.
🐛
Root Cause Investigation
When a test fails, query the database to investigate. Is the data missing? Incorrect format? Corrupted? SQL gives you answers.
ETL & Data Pipeline Testing
Validate data transformations, aggregations, and migrations — all done in SQL before UI verification.

SQL Statement Types

CategoryStatementsQA Use
DQLSELECTQuery data for verification
DMLINSERT UPDATE DELETEManage test data
DDLCREATE ALTER DROPSchema validation
TCLCOMMIT ROLLBACKTransaction testing
Q1

What is the difference between WHERE and HAVING?

WHERE filters rows before grouping. HAVING filters groups after GROUP BY — used with aggregate functions like COUNT, SUM.

Q2

What is a PRIMARY KEY?

A column (or set of columns) that uniquely identifies each row in a table. It cannot be NULL and must be unique.

Q3

Difference between TRUNCATE and DELETE?

DELETE removes rows one by one and can be rolled back. TRUNCATE removes all rows instantly, cannot be rolled back, and resets auto-increment.

🔎
🗄️ SQL
SELECT Queries
April 2026
SQL
-- Select all columns
SELECT * FROM users;

-- Select specific columns
SELECT id, name, email FROM users;

-- With column aliases
SELECT name AS full_name, email AS contact FROM users;

-- Distinct values (find duplicates)
SELECT DISTINCT role FROM users;

-- Count rows
SELECT COUNT(*) AS total_users FROM users;

-- Order results
SELECT * FROM orders ORDER BY created_at DESC LIMIT 10;
🔬
🗄️ SQL
WHERE & Filters
April 2026
SQL — Filter examples
-- Exact match
SELECT * FROM users WHERE role = 'admin';

-- Numeric comparison
SELECT * FROM orders WHERE amount > 1000;

-- Pattern match (LIKE)
SELECT * FROM users WHERE email LIKE '%@gmail.com';

-- IN list
SELECT * FROM tests WHERE status IN ('PASS', 'FAIL');

-- NULL check
SELECT * FROM tests WHERE executed_at IS NULL;

-- Range (BETWEEN)
SELECT * FROM orders WHERE amount BETWEEN 100 AND 500;

-- Combined conditions
SELECT * FROM users
WHERE role = 'admin' AND created_at > '2025-01-01';
✏️
🗄️ SQL
INSERT / UPDATE / DELETE
April 2026
SQL — Data Manipulation
-- INSERT single row (test data setup)
INSERT INTO users (name, email, role, created_at)
VALUES ('QA Tester', '[email protected]', 'user', NOW());

-- INSERT multiple rows
INSERT INTO users (name, email) VALUES
  ('Test User 1', '[email protected]'),
  ('Test User 2', '[email protected]');

-- UPDATE (always use WHERE!)
UPDATE users
SET role = 'admin', updated_at = NOW()
WHERE email = '[email protected]';

-- DELETE (test data cleanup)
DELETE FROM users WHERE email LIKE '%@test.com';
⚠️
Safety: Always use a WHERE clause with UPDATE and DELETE. A missing WHERE clause modifies or deletes ALL rows in the table!
🔗
🗄️ SQL
JOINs Explained
April 2026
JOIN TypeReturnsQA Use
INNER JOINOnly rows matching in BOTH tablesUsers who placed orders
LEFT JOINAll left rows + matching right (NULL if no match)Users even without orders
RIGHT JOINAll right rows + matching leftOrders even with deleted users
FULL OUTER JOINAll rows from both tablesComplete reconciliation
SQL — JOIN examples
-- INNER JOIN: orders with user details
SELECT o.id, u.name, u.email, o.amount, o.status
FROM orders o
INNER JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
ORDER BY o.created_at DESC;

-- LEFT JOIN: all users including those without orders
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id;
📊
🗄️ SQL
Aggregate Functions
April 2026
FunctionPurposeQA Verification Use
COUNT(*)Count rowsVerify number of records created
SUM(col)Sum valuesValidate order totals
AVG(col)Average valueCheck calculated statistics
MIN(col)Minimum valueBoundary value testing
MAX(col)Maximum valueBoundary value testing
SQL — Aggregates + GROUP BY
-- Test results summary
SELECT
  status,
  COUNT(*) AS count,
  ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM test_runs), 1) AS pct
FROM test_runs
GROUP BY status
ORDER BY count DESC;
🪆
🗄️ SQL
Subqueries
April 2026
SQL — Subquery types
-- IN subquery: users who placed orders
SELECT * FROM users
WHERE id IN (
  SELECT DISTINCT user_id FROM orders
);

-- NOT IN: users who have NOT placed orders
SELECT * FROM users
WHERE id NOT IN (
  SELECT DISTINCT user_id FROM orders WHERE user_id IS NOT NULL
);

-- Scalar subquery: compare to average
SELECT * FROM orders
WHERE amount > (SELECT AVG(amount) FROM orders);
🗄️ SQL
DB Verification Testing
April 2026

Use SQL assertions to verify that your application stored data correctly after test actions.

SQL — QA Verification Patterns
-- ✅ After registration test: verify user created
SELECT COUNT(*) FROM users WHERE email = '[email protected]';
-- Expected: 1

-- ✅ After order test: verify correct amount saved
SELECT amount, status, user_id
FROM orders WHERE reference = 'TEST-ORDER-001';

-- ✅ Verify no duplicate emails
SELECT email, COUNT(*) AS count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
-- Expected: 0 rows

-- 🧹 Test data cleanup (run after tests)
DELETE FROM orders WHERE reference LIKE 'TEST-%';
DELETE FROM users  WHERE email     LIKE '%@test.com';
📦
🗄️ SQL for QA
Stored Procedures
April 2026  |  TechWorld Labs

Package SQL logic for reuse — great for consistent test data setup and teardown scripts across your automation framework.

Tutorial
Programs
Interview Q&A

What is a Stored Procedure?

A stored procedure is a named, precompiled SQL block stored in the database. It accepts parameters, executes multiple SQL statements, and can return result sets. Call it with CALL procedure_name(args);

Parameter Types

TypeDirectionUse
INInput onlyPass values into the procedure
OUTOutput onlyReturn values to the caller
INOUTBothPass in and get a modified value back
1

Test Data Setup

Call a procedure before each test to insert a clean, consistent set of test data — no manual INSERT statements in test scripts.

2

Test Data Cleanup

Call a teardown procedure in @AfterMethod to delete all test records — keeps the database clean between runs.

3

IF / LOOP / Cursors

Procedures support conditionals, loops, and cursors — enabling complex data generation logic impossible in a single SQL statement.

4

CALL from Java

Use JDBC CallableStatement to invoke stored procedures directly from Selenium/Java test code using {CALL procedure_name(?,?)}.

Stored Procedure — MySQL

SQL — SetupTestData procedure
DELIMITER //

CREATE PROCEDURE SetupTestData(
  IN  p_email    VARCHAR(100),
  IN  p_role     VARCHAR(20),
  OUT p_user_id  INT
)
BEGIN
  -- Clean any existing record first
  DELETE FROM users WHERE email = p_email;

  -- Insert fresh test user
  INSERT INTO users (email, role, created_at)
  VALUES (p_email, p_role, NOW());

  -- Return the generated ID
  SET p_user_id = LAST_INSERT_ID();
END //

DELIMITER ;

-- Call the procedure
CALL SetupTestData('qa@test.com', 'user', @uid);
SELECT @uid;

Teardown Procedure

SQL — CleanupTestData procedure
DELIMITER //
CREATE PROCEDURE CleanupTestData()
BEGIN
  DELETE FROM orders WHERE reference LIKE 'TEST-%';
  DELETE FROM users  WHERE email     LIKE '%@test.com';
END //
DELIMITER ;

-- Call in test teardown
CALL CleanupTestData();

Call from Java (JDBC)

JAVA — CallableStatement
Connection conn = DriverManager.getConnection(url, user, pass);
CallableStatement cs = conn.prepareCall(
    "{CALL SetupTestData(?, ?, ?)}");

cs.setString(1, "qa@test.com");
cs.setString(2, "user");
cs.registerOutParameter(3, Types.INTEGER);
cs.execute();

int userId = cs.getInt(3);
System.out.println("Created user ID: " + userId);
Q1

What is a stored procedure and why use it in QA?

A stored procedure is a precompiled, named SQL block stored in the database. QA uses them for reusable test data setup/teardown — call one procedure instead of writing 10 INSERT statements in every test.

Q2

Difference between IN, OUT, and INOUT parameters?

IN passes a value into the procedure (read-only inside). OUT returns a value to the caller (write-only inside). INOUT does both — the value is passed in, modified, and returned back.

Q3

How do you call a stored procedure from Java?

Use JDBC's CallableStatement with the syntax {CALL procedure_name(?,?,?)}. Register OUT parameters with registerOutParameter() before calling execute(). Retrieve OUT values with getInt(), getString(), etc.

Q4

Stored procedure vs function in SQL?

A function returns a single value and can be used in SELECT statements. A stored procedure can return multiple result sets, has OUT params, and is called with CALL — it cannot be used inside a SELECT expression.

🏗️
🗄️ SQL for QA
CREATE TABLE & DDL
April 2026  |  TechWorld Labs

Data Definition Language — create, modify, and drop database objects. Understand data types, constraints, and schema design for database testing.

Tutorial
Programs
Interview Q&A

Common SQL Data Types

TypeUseExample
INT / BIGINTWhole numbers, IDsuser_id INT
VARCHAR(n)Variable-length textname VARCHAR(100)
CHAR(n)Fixed-length textstatus CHAR(1)
TEXTLong text contentdescription TEXT
DECIMAL(p,s)Exact decimal numbersamount DECIMAL(10,2)
BOOLEANTrue/false flagsis_active BOOLEAN
DATEDate only (YYYY-MM-DD)dob DATE
DATETIMEDate + timecreated_at DATETIME
TIMESTAMPDate + time + timezoneupdated_at TIMESTAMP

Constraints

ConstraintPurpose
PRIMARY KEYUniquely identifies each row — cannot be NULL, must be unique
FOREIGN KEYLinks to primary key of another table — enforces referential integrity
UNIQUEAll values in column must be unique (allows one NULL)
NOT NULLColumn cannot store NULL values
CHECKValidates that values meet a condition
DEFAULTProvides a default value when none is supplied

CREATE TABLE with Constraints

SQL — CREATE TABLE
CREATE TABLE users (
  id          INT           AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(100)  NOT NULL,
  email       VARCHAR(150)  NOT NULL UNIQUE,
  role        VARCHAR(20)   NOT NULL DEFAULT 'user',
  age         INT           CHECK (age >= 18),
  is_active   BOOLEAN       DEFAULT TRUE,
  created_at  DATETIME      DEFAULT NOW()
);

CREATE TABLE orders (
  id          INT           AUTO_INCREMENT PRIMARY KEY,
  user_id     INT           NOT NULL,
  amount      DECIMAL(10,2) NOT NULL,
  status      VARCHAR(20)   DEFAULT 'pending',
  created_at  DATETIME      DEFAULT NOW(),
  FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
);

ALTER TABLE & DROP

SQL — ALTER & DROP
-- Add a column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Modify a column type
ALTER TABLE users MODIFY COLUMN name VARCHAR(200) NOT NULL;

-- Drop a column
ALTER TABLE users DROP COLUMN phone;

-- Rename a table
ALTER TABLE users RENAME TO app_users;

-- Drop table (permanent — cannot be rolled back)
DROP TABLE IF EXISTS temp_test_data;

-- Truncate (remove all rows, reset auto-increment)
TRUNCATE TABLE test_logs;
Q1

What is the difference between DROP, TRUNCATE, and DELETE?

DELETE removes specific rows (can use WHERE, can be rolled back, triggers fire). TRUNCATE removes ALL rows instantly, resets auto-increment, cannot be rolled back, no triggers. DROP removes the entire table structure and data permanently.

Q2

Difference between CHAR and VARCHAR?

CHAR(n) is fixed-length — always stores exactly n bytes, padding with spaces if shorter. VARCHAR(n) is variable-length — stores only as many bytes as the actual value plus a length prefix. Use CHAR for codes with fixed length (status, country code), VARCHAR for names and emails.

Q3

What is a FOREIGN KEY and what does ON DELETE CASCADE do?

A FOREIGN KEY enforces referential integrity — it ensures a value in one table references a valid row in another table. ON DELETE CASCADE automatically deletes child rows when the parent row is deleted, preventing orphaned records.

Q4

What is the difference between PRIMARY KEY and UNIQUE?

PRIMARY KEY = UNIQUE + NOT NULL + only one per table. UNIQUE allows one NULL value and a table can have many UNIQUE constraints. Both create an index on the column automatically.

Q5

What is a CHECK constraint?

CHECK validates that column values meet a condition at insert/update time. Example: CHECK(age >= 18) prevents inserting users under 18. If the condition is violated, the INSERT/UPDATE is rejected with an error.

🗄️ SQL for QA
NULL Handling
April 2026  |  TechWorld Labs

NULL is not zero, not an empty string — it means unknown or missing. Mishandling NULLs is one of the most common bugs in database-driven applications.

Tutorial
Programs
Interview Q&A

NULL Functions

FunctionPurposeReturns
IS NULLCheck if value is NULLTRUE / FALSE
IS NOT NULLCheck if value has a valueTRUE / FALSE
COALESCE(a,b,c)Return first non-NULL valueFirst non-NULL
IFNULL(val, default)Return default if val is NULL (MySQL)val OR default
NULLIF(a, b)Return NULL if a equals b, else aa OR NULL
ISNULL(val)Returns 1 if val is NULL (MySQL)1 or 0
⚠️
NULL trap: NULL = NULL is not TRUE — it returns NULL. You cannot use = NULL or != NULL. Always use IS NULL or IS NOT NULL.
💡
QA tip: Test how your application handles NULL in inputs, calculations, and display. A SUM with one NULL row returns NULL unless handled with COALESCE(val, 0).

NULL Query Examples

SQL — NULL Handling
-- Find rows where phone is missing
SELECT * FROM users WHERE phone IS NULL;

-- Find rows where phone is provided
SELECT * FROM users WHERE phone IS NOT NULL;

-- COALESCE: use first non-null value
SELECT name, COALESCE(phone, mobile, 'No contact') AS contact
FROM users;

-- IFNULL: replace NULL with a default (MySQL)
SELECT name, IFNULL(discount, 0) AS discount
FROM orders;

-- NULLIF: avoid division by zero
SELECT total / NULLIF(quantity, 0) AS unit_price
FROM order_items;

-- SUM ignores NULLs — use COALESCE to treat NULL as 0
SELECT SUM(COALESCE(discount, 0)) FROM orders;

-- Count non-null values only
SELECT COUNT(phone) FROM users; -- ignores NULLs
SELECT COUNT(*) FROM users;    -- counts ALL rows including NULLs
Q1

What is NULL in SQL?

NULL means unknown or missing — it is not zero, not an empty string, and not false. It represents the absence of a value. Any arithmetic operation with NULL returns NULL, and any comparison with NULL using = returns NULL (not TRUE or FALSE).

Q2

Why can't you use = NULL to check for NULL?

Because NULL = NULL evaluates to NULL (unknown), not TRUE. SQL's three-valued logic (TRUE, FALSE, NULL) means comparisons with NULL are always NULL. Use IS NULL or IS NOT NULL — these are the only operators that correctly detect NULL values.

Q3

What is COALESCE?

COALESCE(a, b, c, ...) returns the first non-NULL argument in the list. It works on any number of arguments and is ANSI SQL standard. Use it to provide fallback values: COALESCE(phone, mobile, 'N/A') returns phone if available, else mobile, else 'N/A'.

Q4

What is NULLIF and when would you use it?

NULLIF(a, b) returns NULL if a equals b, otherwise returns a. The main use case is preventing division by zero: total / NULLIF(quantity, 0) — if quantity is 0, NULLIF returns NULL, so the division returns NULL instead of an error.

🔀
🗄️ SQL for QA
CASE Statement
April 2026  |  TechWorld Labs

CASE is SQL's IF-THEN-ELSE. Use it to classify data, create computed columns, assign labels, and conditionally aggregate results — all inside a single query.

Tutorial
Programs
Interview Q&A

CASE Syntax

TypeSyntaxUse When
Searched CASECASE WHEN condition THEN result ENDDifferent conditions per branch — most flexible
Simple CASECASE col WHEN val THEN result ENDComparing one column to multiple values
1

In SELECT (computed column)

Create a label or category column based on data — classify order amounts as Low/Medium/High without changing the underlying data.

2

In GROUP BY reports

Pivot data by using CASE inside SUM() — count or sum values per category in a single row instead of multiple rows.

3

In ORDER BY

Apply custom sort order — for example, sort statuses as FAILED first, then PENDING, then PASSED, rather than alphabetically.

4

In WHERE (via subquery)

CASE cannot be used directly in WHERE — wrap it in a subquery or use it in a computed column and filter on the alias in an outer query.

CASE in SELECT

SQL — CASE Statement
-- Classify order amounts
SELECT
  id, amount,
  CASE
    WHEN amount < 100               THEN 'Low'
    WHEN amount BETWEEN 100 AND 999 THEN 'Medium'
    ELSE 'High'
  END AS amount_tier
FROM orders;

-- Simple CASE: translate status code
SELECT id,
  CASE status
    WHEN 'P' THEN 'Pending'
    WHEN 'A' THEN 'Approved'
    WHEN 'R' THEN 'Rejected'
    ELSE 'Unknown'
  END AS status_label
FROM orders;

CASE for Pivot Report & Custom Sort

SQL — Pivot & Custom ORDER
-- Pivot: count test results per type in one row
SELECT
  SUM(CASE WHEN status = 'PASS' THEN 1 ELSE 0 END) AS passed,
  SUM(CASE WHEN status = 'FAIL' THEN 1 ELSE 0 END) AS failed,
  SUM(CASE WHEN status = 'SKIP' THEN 1 ELSE 0 END) AS skipped
FROM test_runs;

-- Custom sort: FAIL first, then PENDING, then PASS
SELECT * FROM test_runs
ORDER BY
  CASE status
    WHEN 'FAIL'    THEN 1
    WHEN 'PENDING' THEN 2
    ELSE 3
  END;
Q1

What is a CASE statement in SQL?

CASE is SQL's conditional expression — equivalent to IF-THEN-ELSE. It evaluates conditions in order and returns the result for the first matching WHEN. If no condition matches, it returns the ELSE value (or NULL if ELSE is omitted).

Q2

Difference between Simple CASE and Searched CASE?

Simple CASE compares one expression to multiple values: CASE status WHEN 'A' THEN... — clean and readable for exact matches. Searched CASE evaluates independent boolean conditions per branch: CASE WHEN amount > 100 THEN... — more flexible, supports any condition.

Q3

How do you use CASE to create a pivot table in SQL?

Use SUM(CASE WHEN status = 'PASS' THEN 1 ELSE 0 END) inside a SELECT. Each CASE expression counts or sums rows that match its condition, turning rows into columns — a manual pivot without using PIVOT syntax.

Q4

Can you use CASE in ORDER BY?

Yes. CASE in ORDER BY allows custom sort logic — for example, sort 'FAILED' rows first regardless of alphabetical order. This is useful in QA reports where you want failures at the top for immediate visibility.

🔧
🗄️ SQL for QA
String & Date Functions
April 2026  |  TechWorld Labs

Built-in SQL functions to manipulate text, calculate date differences, format timestamps, and extract date parts — essential for data verification queries.

Tutorial
Programs
Interview Q&A

String Functions

FunctionPurposeExample
UPPER(str)Convert to uppercaseUPPER('hello') → 'HELLO'
LOWER(str)Convert to lowercaseLOWER('HELLO') → 'hello'
LENGTH(str)String length in bytesLENGTH('abc') → 3
TRIM(str)Remove leading/trailing spacesTRIM(' hi ') → 'hi'
CONCAT(a,b,...)Concatenate stringsCONCAT('a','b') → 'ab'
SUBSTRING(str,pos,len)Extract substringSUBSTRING('hello',1,3) → 'hel'
REPLACE(str,old,new)Replace substringREPLACE('a-b','-','_') → 'a_b'
INSTR(str,sub)Find position of substringINSTR('hello','ll') → 3

Date & Time Functions

FunctionPurposeReturns
NOW()Current date and time2026-04-15 10:30:00
CURDATE()Current date only2026-04-15
DATE(datetime)Extract date partDATE('2026-04-15 10:00') → 2026-04-15
DATEDIFF(d1,d2)Days between datesDATEDIFF('2026-04-15','2026-04-01') → 14
DATE_ADD(date, INTERVAL n unit)Add to a dateDATE_ADD(NOW(), INTERVAL 7 DAY)
DATE_FORMAT(date, format)Format a dateDATE_FORMAT(NOW(),'%d-%m-%Y')
YEAR/MONTH/DAY(date)Extract partYEAR(NOW()) → 2026

String Function Examples

SQL — String Functions
-- Case-insensitive email search
SELECT * FROM users WHERE LOWER(email) = 'qa@test.com';

-- Build full name from parts
SELECT CONCAT(first_name, ' ', last_name) AS full_name FROM users;

-- Check email format (must contain @)
SELECT * FROM users WHERE INSTR(email, '@') = 0;

-- Trim whitespace from imported data
UPDATE users SET name = TRIM(name) WHERE name != TRIM(name);

-- Extract domain from email
SELECT email,
  SUBSTRING(email, INSTR(email, '@') + 1) AS domain
FROM users;

Date Function Examples

SQL — Date Functions
-- Orders created in the last 7 days
SELECT * FROM orders
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 7 DAY);

-- Format date for reports
SELECT DATE_FORMAT(created_at, '%d %M %Y') AS order_date
FROM orders;

-- Days since registration
SELECT name, DATEDIFF(NOW(), created_at) AS days_registered
FROM users ORDER BY days_registered DESC;

-- Users registered in April 2026
SELECT * FROM users
WHERE YEAR(created_at) = 2026
  AND MONTH(created_at) = 4;
Q1

How do you find records from the last 7 days?

Use DATE_SUB: WHERE created_at >= DATE_SUB(NOW(), INTERVAL 7 DAY). Alternatively: WHERE created_at >= NOW() - INTERVAL 7 DAY. Both return rows created in the past 7 days relative to the current timestamp.

Q2

Difference between NOW() and CURDATE()?

NOW() returns the full current datetime including hours, minutes, seconds (e.g. 2026-04-15 10:30:00). CURDATE() returns only the date part (e.g. 2026-04-15). Use CURDATE() when you want to compare dates without the time component.

Q3

How do you do a case-insensitive search in SQL?

Use LOWER() or UPPER() on both sides: WHERE LOWER(email) = 'qa@test.com'. In MySQL with utf8_general_ci collation, comparisons are case-insensitive by default. In PostgreSQL, use ILIKE instead of LIKE for case-insensitive pattern matching.

Q4

How do you calculate the number of days between two dates?

Use DATEDIFF(end_date, start_date) — it returns the number of days end_date minus start_date. DATEDIFF('2026-04-15', '2026-04-01') returns 14. Note: DATEDIFF only returns days — use TIMESTAMPDIFF for hours, minutes, or months.

↕️
🗄️ SQL for QA
ORDER BY, LIMIT & Pagination
April 2026  |  TechWorld Labs

Sort result sets with ORDER BY, restrict row counts with LIMIT, and implement efficient pagination using LIMIT + OFFSET — critical for verifying list and feed features.

Tutorial
Programs
Interview Q&A

ORDER BY

SyntaxMeaning
ORDER BY col ASCAscending (A→Z, 1→9, oldest→newest). Default when ASC omitted.
ORDER BY col DESCDescending (Z→A, 9→1, newest→oldest)
ORDER BY col1, col2Sort by col1 first, then col2 for ties
ORDER BY 3Sort by the 3rd column in SELECT list (positional)
ORDER BY RAND()Random order (MySQL) — for test data sampling

LIMIT & OFFSET — Pagination

1

LIMIT n

Return at most n rows. Use it to get the top N results or to test with a small sample from a large table.

2

LIMIT n OFFSET m

Skip m rows then return n rows. Page 2 with 10 per page = LIMIT 10 OFFSET 10. Page 3 = LIMIT 10 OFFSET 20.

3

Pagination Formula

OFFSET = (page - 1) × topicsize. Always pair with ORDER BY — without it, the page order is non-deterministic.

4

QA use case

Verify that the first page of results is the most recent, that pagination boundaries are correct, and that total count matches the UI page count indicator.

⚠️
Always use ORDER BY with LIMIT — without it, the database returns rows in any order it chooses. Your pagination results will be inconsistent and tests will be flaky.
SQL — ORDER BY, LIMIT & Pagination
-- Top 10 most recent orders
SELECT id, user_id, amount, status
FROM orders
ORDER BY created_at DESC
LIMIT 10;

-- Sort by multiple columns: status ASC, then amount DESC
SELECT * FROM orders
ORDER BY status ASC, amount DESC;

-- Page 1 (10 per page)
SELECT * FROM products
ORDER BY name ASC
LIMIT 10 OFFSET 0;

-- Page 2
SELECT * FROM products
ORDER BY name ASC
LIMIT 10 OFFSET 10;

-- Page N formula
-- LIMIT topicsize OFFSET (page-1)*topicsize

-- Get a random sample of 5 test records
SELECT * FROM users
ORDER BY RAND()
LIMIT 5;

-- QA: verify first page = most recent (newest first)
SELECT id, created_at FROM orders
ORDER BY created_at DESC
LIMIT 1; -- should match what UI shows as latest
Q1

What is the default sort order for ORDER BY?

ASC (ascending) — A to Z for text, smallest to largest for numbers, oldest to newest for dates. You must explicitly specify DESC for descending. Always be explicit in test queries to avoid assumptions about sort order.

Q2

How do you implement pagination in SQL?

Use LIMIT and OFFSET: LIMIT topicsize OFFSET (page-1)*topicsize. Always include an ORDER BY clause — without it, row order is non-deterministic and paginated results will be inconsistent across queries, causing flaky test assertions.

Q3

What is the difference between LIMIT 10 OFFSET 10 and LIMIT 10, 10?

They are identical in MySQL. LIMIT 10, 10 is shorthand for LIMIT 10 OFFSET 10 — offset first, then count. However, the LIMIT n OFFSET m syntax is more readable and is the ANSI SQL standard supported across all databases.

Q4

What happens to NULL values in ORDER BY?

In MySQL, NULLs sort first in ASC order and last in DESC order. In PostgreSQL, NULLs sort last in ASC and first in DESC by default, but you can control this with NULLS FIRST / NULLS LAST. Always verify NULL sort behavior when testing sort functionality.

🔗
🗄️ SQL for QA
UNION & Set Operations
April 2026  |  TechWorld Labs

Combine results from multiple SELECT queries using UNION, UNION ALL, INTERSECT, and EXCEPT — essential for comparing datasets, reconciliation, and cross-table QA verification.

Tutorial
Programs
Interview Q&A

Set Operations Compared

OperatorReturnsDuplicatesQA Use
UNIONRows from A OR BRemovedMerge two lists, get distinct combined set
UNION ALLRows from A OR BKeptFaster merge, preserves all rows including dupes
INTERSECTRows in BOTH A AND BRemovedFind records that exist in both tables
EXCEPT / MINUSRows in A but NOT BRemovedFind records missing from target table (ETL testing)

Rules for Set Operations

1

Same column count

Both SELECT statements must return the same number of columns. Column names come from the first SELECT.

2

Compatible data types

Corresponding columns must have compatible types — you cannot UNION a VARCHAR column with an INT column.

3

One ORDER BY at the end

Put ORDER BY at the end of the entire UNION — not inside individual SELECT statements. It sorts the combined result.

4

UNION vs UNION ALL speed

UNION ALL is faster — it skips the duplicate-removal step. Use UNION ALL whenever you know duplicates are impossible or acceptable.

SQL — UNION & Set Operations
-- UNION: all admin and manager emails (no duplicates)
SELECT email FROM users WHERE role = 'admin'
UNION
SELECT email FROM users WHERE role = 'manager';

-- UNION ALL: all orders from two regions (keep dupes)
SELECT id, amount, 'EU' AS region FROM eu_orders
UNION ALL
SELECT id, amount, 'US' AS region FROM us_orders
ORDER BY amount DESC;

-- INTERSECT: users who exist in BOTH tables
-- (PostgreSQL/SQL Server — MySQL use JOIN instead)
SELECT email FROM users
INTERSECT
SELECT email FROM newsletter_subscribers;

-- MySQL equivalent of INTERSECT using JOIN
SELECT DISTINCT u.email
FROM users u
JOIN newsletter_subscribers ns ON u.email = ns.email;

-- EXCEPT: rows in source but NOT in target (ETL QA)
SELECT id, email FROM source_users
EXCEPT
SELECT id, email FROM target_users;

-- MySQL EXCEPT equivalent using LEFT JOIN
SELECT s.id, s.email
FROM source_users s
LEFT JOIN target_users t ON s.id = t.id
WHERE t.id IS NULL;
Q1

What is the difference between UNION and UNION ALL?

UNION combines results from two SELECT statements and removes duplicate rows. UNION ALL does the same but keeps all rows including duplicates. UNION ALL is faster because it skips the deduplication step — use it when duplicates are impossible or you need them preserved.

Q2

What is EXCEPT (or MINUS) used for in QA testing?

EXCEPT returns rows in the first query that do NOT exist in the second query. In QA/ETL testing: SELECT from source EXCEPT SELECT from target shows records that failed to migrate or were lost in a transformation — a powerful data reconciliation query.

Q3

What are the rules for combining queries with UNION?

Both SELECT statements must have the same number of columns, and corresponding columns must have compatible data types. Column names in the result come from the first SELECT. ORDER BY can only appear once at the very end of the combined query.

Q4

Does MySQL support INTERSECT and EXCEPT?

INTERSECT and EXCEPT were added in MySQL 8.0.31+. For older versions, simulate INTERSECT with an INNER JOIN and EXCEPT with a LEFT JOIN + WHERE target_col IS NULL. PostgreSQL and SQL Server support them natively in all modern versions.

🔎
🗄️ SQL for QA
EXISTS & NOT EXISTS
April 2026  |  TechWorld Labs

EXISTS checks whether a subquery returns any rows — it stops as soon as the first match is found. Faster than IN for large datasets and essential for verifying data presence or absence in QA queries.

Tutorial
Programs
Interview Q&A

EXISTS vs IN vs JOIN

ApproachHow it worksBest for
EXISTSReturns TRUE if subquery yields any row — stops at first matchLarge subquery, only checking presence
NOT EXISTSReturns TRUE if subquery yields zero rowsFinding records with no related rows
IN (subquery)Runs full subquery, builds list, checks membershipSmall static value lists
LEFT JOIN + IS NULLJoin and filter where right side is NULLSame as NOT EXISTS — sometimes faster
1

Correlated Subquery

EXISTS subqueries are usually correlated — they reference the outer query's row. The subquery re-evaluates for each outer row, checking a specific condition per row.

2

Short-Circuit Evaluation

EXISTS stops scanning the inner table as soon as one matching row is found — making it highly efficient when you only need to know if something exists, not how many or what data.

3

QA: Verify Record Exists

After a test action (e.g. user registration), use EXISTS to assert the record was created without fetching the full row data.

4

NOT IN vs NOT EXISTS with NULLs

NOT IN fails silently when the subquery contains any NULL — returns no rows. NOT EXISTS handles NULLs correctly. Always prefer NOT EXISTS over NOT IN for subqueries.

SQL — EXISTS & NOT EXISTS
-- EXISTS: users who have at least one order
SELECT u.id, u.name, u.email
FROM users u
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.user_id = u.id   -- correlated: references outer row
);

-- NOT EXISTS: users who have NEVER placed an order
SELECT u.id, u.name, u.email
FROM users u
WHERE NOT EXISTS (
  SELECT 1 FROM orders o
  WHERE o.user_id = u.id
);

-- QA: assert a user was created after registration test
SELECT CASE WHEN EXISTS (
  SELECT 1 FROM users WHERE email = 'new@test.com'
) THEN 'PASS' ELSE 'FAIL' END AS test_result;

-- NOT EXISTS: safer than NOT IN when NULLs may exist
-- BAD (fails if any user_id in orders is NULL):
SELECT * FROM users WHERE id NOT IN (
  SELECT user_id FROM orders
);
-- GOOD:
SELECT * FROM users u
WHERE NOT EXISTS (
  SELECT 1 FROM orders o WHERE o.user_id = u.id
);
Q1

What is EXISTS in SQL?

EXISTS is a boolean operator that returns TRUE if the subquery returns at least one row, FALSE if it returns no rows. It ignores the actual column values in the subquery — only the presence of rows matters. SELECT 1 in the subquery is conventional since the value is never used.

Q2

When is EXISTS faster than IN?

EXISTS short-circuits — it stops at the first matching row. IN evaluates the full subquery and builds a complete list before comparing. When the subquery returns a large result set, EXISTS is significantly faster because it doesn't need to scan all rows.

Q3

Why is NOT EXISTS safer than NOT IN?

NOT IN behaves unexpectedly when the subquery contains any NULL value — it returns zero rows because NULL comparisons are always NULL (unknown), not FALSE. NOT EXISTS handles NULLs correctly and always returns logically expected results. Always use NOT EXISTS over NOT IN for subqueries.

Q4

What is a correlated subquery?

A correlated subquery references columns from the outer query. It re-executes once per outer row — the inner query uses the outer row's values as filter conditions. EXISTS subqueries are almost always correlated: WHERE o.user_id = u.id references the outer u table.

🔄
🗄️ SQL for QA
SELF JOIN
April 2026  |  TechWorld Labs

A SELF JOIN joins a table to itself using two aliases. Essential for querying hierarchical data (employee-manager), finding duplicate rows, and comparing rows within the same table.

Tutorial
Programs
Interview Q&A

When to Use SELF JOIN

1

Hierarchical Data

Employee-manager relationships stored in the same table — join employees to their manager row using manager_id = id.

2

Find Duplicates

Join a table to itself on matching column values with different primary keys — surfaces rows that share the same email, phone, or name.

3

Compare Adjacent Rows

Compare each row to the next or previous row in the same table — e.g. compare two consecutive orders from the same user.

4

Category Relationships

Parent-child category trees stored in one table — join category to its parent using parent_id = id.

💡
SELF JOIN requires aliases — you must give the table two different aliases (e.g. e for employee, m for manager) so SQL knows which instance each column reference belongs to.
SQL — SELF JOIN Examples
-- employees(id, name, manager_id)
-- SELF JOIN: show employee alongside their manager
SELECT
  e.name    AS employee,
  m.name    AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.id
ORDER BY m.name;

-- Find duplicate emails in the users table
SELECT
  a.id   AS id1,
  b.id   AS id2,
  a.email
FROM users a
JOIN users b
  ON  a.email = b.email
  AND a.id < b.id;  -- avoid pairing (1,2) AND (2,1)

-- Find orders from the same user placed on the same day
SELECT
  a.id  AS order1,
  b.id  AS order2,
  a.user_id,
  DATE(a.created_at) AS order_date
FROM orders a
JOIN orders b
  ON  a.user_id = b.user_id
  AND DATE(a.created_at) = DATE(b.created_at)
  AND a.id < b.id;
Q1

What is a SELF JOIN?

A SELF JOIN is when a table is joined to itself. The same table appears twice in the FROM clause with two different aliases. It is used to query hierarchical data (employee-manager), compare rows within the same table, or find duplicates.

Q2

Why are aliases required in a SELF JOIN?

Without aliases, SQL cannot distinguish which instance of the table a column reference belongs to. Two aliases (e.g. a and b) let you write a.column and b.column to reference columns from each "copy" of the table independently.

Q3

How do you find duplicate rows using a SELF JOIN?

JOIN the table to itself ON the duplicate column (e.g. email) AND a.id < b.id. The id condition prevents the same pair appearing twice ((1,2) and (2,1)) and prevents a row pairing with itself. Each returned row shows one duplicate pair.

Q4

When would you use a recursive CTE instead of a SELF JOIN?

A SELF JOIN retrieves only one level at a time (employee → manager). For traversing multiple levels of a hierarchy (all descendants of a node), a recursive CTE is the right tool — it iterates until no more rows are returned, handling unlimited depth.

📦
🗄️ SQL for QA
GROUP BY & HAVING
April 2026  |  TechWorld Labs

GROUP BY groups rows by column values so aggregate functions run per group. HAVING filters those groups — the aggregate-level equivalent of WHERE.

Tutorial
Programs
Interview Q&A

SQL Execution Order

1

FROM / JOIN

Fetch rows from tables and apply joins.

2

WHERE

Filter individual rows before grouping. Cannot use aggregate functions here.

3

GROUP BY

Group the remaining rows by the specified columns.

4

HAVING

Filter groups after grouping. CAN use aggregate functions like COUNT, SUM, AVG.

5

SELECT

Project the requested columns and computed values.

6

ORDER BY / LIMIT

Sort and paginate the final result set.

💡
Rule: Every column in SELECT must either be in GROUP BY or wrapped in an aggregate function (COUNT, SUM, AVG, MIN, MAX). Violating this is a syntax error.

GROUP BY & HAVING Examples

SQL — GROUP BY & HAVING
-- Count orders per user
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id;

-- Users with MORE than 5 orders (HAVING filters groups)
SELECT user_id, COUNT(*) AS order_count
FROM orders
GROUP BY user_id
HAVING COUNT(*) > 5
ORDER BY order_count DESC;

-- WHERE + GROUP BY + HAVING together
-- Only 2026 orders, grouped by month, months with > 100 orders
SELECT
  MONTH(created_at) AS month,
  COUNT(*) AS total,
  SUM(amount)       AS revenue
FROM orders
WHERE YEAR(created_at) = 2026      -- filter rows first
GROUP BY MONTH(created_at)           -- then group
HAVING COUNT(*) > 100               -- then filter groups
ORDER BY month;

-- Detect duplicate emails (QA use case)
SELECT email, COUNT(*) AS count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
Q1

What is the difference between WHERE and HAVING?

WHERE filters individual rows before grouping — it cannot reference aggregate functions. HAVING filters groups after GROUP BY — it can reference aggregate functions like COUNT(), SUM(), AVG(). They operate at different stages of query execution.

Q2

Can you use HAVING without GROUP BY?

Yes — the entire result set is treated as one group. HAVING COUNT(*) > 5 on a table without GROUP BY checks if the total row count exceeds 5. Rarely useful, but syntactically valid.

Q3

How do you find duplicate records in a table?

GROUP BY the column(s) that should be unique, then HAVING COUNT(*) > 1: SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1. This is one of the most common QA database validation queries.

Q4

What is the SQL execution order?

FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY → LIMIT. This order explains why you cannot reference a SELECT alias in a WHERE clause (WHERE runs before SELECT), but you CAN reference it in ORDER BY (ORDER BY runs after SELECT).

👁️
🗄️ SQL for QA
Views
April 2026  |  TechWorld Labs

A view is a saved SELECT query treated as a virtual table. Use views to simplify complex joins, standardize QA verification queries, and restrict column access.

Tutorial
Programs
Interview Q&A

View Benefits

🔁
Reusability
Save a complex JOIN query once — query the view like a table from anywhere.
🔒
Security
Expose only specific columns — hide sensitive fields like passwords or salary.
🧹
Simplicity
Abstract complex multi-table queries into a single named object.
🧪
QA Queries
Create a standard QA verification view shared across the test team.

View Syntax

StatementPurpose
CREATE VIEW name AS SELECT ...Create a new view
CREATE OR REPLACE VIEW name AS ...Update existing view definition
DROP VIEW IF EXISTS nameRemove a view
SELECT * FROM view_nameQuery a view like a table
SHOW CREATE VIEW nameSee view definition

Create & Use Views

SQL — Views
-- Create a QA verification view (orders with user info)
CREATE VIEW vw_order_details AS
  SELECT
    o.id          AS order_id,
    u.name        AS customer_name,
    u.email,
    o.amount,
    o.status,
    o.created_at
  FROM orders o
  INNER JOIN users u ON o.user_id = u.id;

-- Query the view exactly like a table
SELECT * FROM vw_order_details WHERE status = 'failed';

-- Update view definition
CREATE OR REPLACE VIEW vw_order_details AS
  SELECT o.id, u.email, o.amount, o.status, o.created_at
  FROM orders o JOIN users u ON o.user_id = u.id;

-- QA view: all test users (exclude production)
CREATE VIEW vw_test_users AS
  SELECT * FROM users WHERE email LIKE '%@test.com';

-- Drop when no longer needed
DROP VIEW IF EXISTS vw_test_users;
Q1

What is a SQL view?

A view is a stored SELECT query that behaves like a virtual table. It does not store data itself — each time you query the view, the underlying SELECT runs and returns fresh data. Views simplify complex queries and can restrict visible columns for security.

Q2

Can you INSERT or UPDATE data through a view?

Only if the view is "updatable" — it must reference a single base table with no GROUP BY, DISTINCT, aggregate functions, or UNION. Most complex views (with JOINs or aggregates) are read-only. Attempting to INSERT through a non-updatable view throws an error.

Q3

What is a materialized view?

A materialized view physically stores the query results and refreshes them periodically. Unlike a regular view (which re-runs on every query), a materialized view is pre-computed — much faster for complex aggregations but may show stale data. Supported in PostgreSQL and Oracle, not standard MySQL.

Q4

How are views useful in QA database testing?

Create a shared QA view that joins users, orders, and statuses — the entire QA team queries the view instead of rewriting the same complex JOIN every time. If the schema changes, update the view definition in one place rather than fixing 20 test queries.

🧩
🗄️ SQL for QA
CTEs — WITH Clause
April 2026  |  TechWorld Labs

Common Table Expressions define named temporary result sets at the top of a query. They make complex multi-step queries readable, reusable within the query, and debuggable step by step.

Tutorial
Programs
Interview Q&A

CTE Syntax

SQL — CTE Structure
WITH cte_name AS (
  -- your SELECT here
  SELECT ...
),
another_cte AS (
  -- can reference cte_name above
  SELECT ... FROM cte_name
)
-- final query uses any CTE defined above
SELECT * FROM another_cte;

CTE vs Subquery vs View

FeatureCTESubqueryView
Readability✅ Best⚠️ Gets messy✅ Good
Reuse in same query✅ Yes❌ Must repeat✅ Yes
Persists after query❌ No❌ No✅ Yes
Recursive support✅ Yes❌ No❌ No

CTE Examples

SQL — CTE Examples
-- Step 1: find high-value orders
-- Step 2: get those customers
WITH high_value_orders AS (
  SELECT user_id, SUM(amount) AS total
  FROM orders
  GROUP BY user_id
  HAVING SUM(amount) > 5000
)
SELECT u.name, u.email, h.total
FROM users u
JOIN high_value_orders h ON u.id = h.user_id;

-- Multiple CTEs: chained steps
WITH
active_users AS (
  SELECT id, name FROM users WHERE is_active = TRUE
),
recent_orders AS (
  SELECT user_id, COUNT(*) AS cnt
  FROM orders
  WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
  GROUP BY user_id
)
SELECT a.name, r.cnt
FROM active_users a
JOIN recent_orders r ON a.id = r.user_id;

Recursive CTE

SQL — Recursive CTE (org hierarchy)
-- employees(id, name, manager_id)
WITH RECURSIVE org_tree AS (
  -- Anchor: start from the CEO (no manager)
  SELECT id, name, manager_id, 0 AS level
  FROM employees WHERE manager_id IS NULL

  UNION ALL

  -- Recursive: join each employee to their manager
  SELECT e.id, e.name, e.manager_id, o.level + 1
  FROM employees e
  JOIN org_tree o ON e.manager_id = o.id
)
SELECT * FROM org_tree ORDER BY level;
Q1

What is a CTE and how is it different from a subquery?

A CTE (Common Table Expression) is a named temporary result set defined with WITH before the main query. Unlike a subquery, a CTE can be referenced multiple times in the same query, is defined at the top (not inline), and is much more readable for complex multi-step logic.

Q2

Can a CTE reference another CTE?

Yes. When you define multiple CTEs separated by commas, each CTE can reference any CTE defined before it. This enables chaining multiple transformation steps cleanly — like a pipeline of named queries.

Q3

What is a recursive CTE?

A recursive CTE references itself. It has two parts: an anchor (base case — runs once) and a recursive member (references the CTE, runs repeatedly until no more rows). Used for tree/hierarchy traversal like org charts, categories, or bill of materials.

Q4

Does a CTE improve query performance?

Not necessarily. Most databases treat a CTE like an inline view — it may be re-executed each time it is referenced. For performance, consider a temp table if the CTE result is large and referenced many times. Materialized CTEs (persisted) are available in some databases like PostgreSQL.

🪟
🗄️ SQL for QA
Window Functions
April 2026  |  TechWorld Labs

Window functions compute a value for each row using a sliding "window" of related rows — without collapsing rows like GROUP BY. Essential for rankings, running totals, and row comparisons.

Tutorial
Programs
Interview Q&A

Window Function Syntax

SQL — Window Syntax
function_name() OVER (
  [PARTITION BY column]   -- split into groups
  [ORDER BY column]        -- order within each group
  [ROWS/RANGE frame_spec]  -- define the frame (optional)
)

Common Window Functions

FunctionPurposeTies?
ROW_NUMBER()Unique sequential row numberNo tie — always unique
RANK()Rank with gaps after tiesGaps: 1,1,3
DENSE_RANK()Rank without gapsNo gaps: 1,1,2
NTILE(n)Divide rows into n buckets
LAG(col, n)Access previous row's value
LEAD(col, n)Access next row's value
SUM() OVERRunning total
AVG() OVERMoving average

Ranking & Deduplication

SQL — Window Functions
-- Rank users by total spend
SELECT
  name, total_spent,
  RANK()       OVER (ORDER BY total_spent DESC) AS rnk,
  DENSE_RANK() OVER (ORDER BY total_spent DESC) AS dense_rnk,
  ROW_NUMBER() OVER (ORDER BY total_spent DESC) AS row_num
FROM user_totals;

-- Deduplicate: keep latest record per user
WITH deduped AS (
  SELECT *,
    ROW_NUMBER() OVER (
      PARTITION BY user_id
      ORDER BY created_at DESC
    ) AS rn
  FROM orders
)
SELECT * FROM deduped WHERE rn = 1;

-- LAG: compare each order to the previous one
SELECT
  id, amount,
  LAG(amount, 1) OVER (ORDER BY created_at) AS prev_amount,
  amount - LAG(amount, 1) OVER (ORDER BY created_at) AS change
FROM orders;

-- Running total per user
SELECT
  user_id, amount, created_at,
  SUM(amount) OVER (
    PARTITION BY user_id
    ORDER BY created_at
  ) AS running_total
FROM orders;
Q1

What are window functions in SQL?

Window functions perform calculations across a set of rows related to the current row (the "window") without collapsing rows like GROUP BY does. Each row keeps its identity and gets an additional computed column — e.g. its rank, running total, or comparison to the previous row.

Q2

Difference between ROW_NUMBER, RANK, and DENSE_RANK?

All three rank rows. For tied values: ROW_NUMBER always assigns unique numbers (ties broken arbitrarily — 1,2,3). RANK skips numbers after ties (1,1,3). DENSE_RANK never skips (1,1,2). Use DENSE_RANK when you don't want gaps in the ranking sequence.

Q3

What is PARTITION BY in a window function?

PARTITION BY divides the result set into groups (partitions) and the window function resets for each partition — like GROUP BY but without collapsing rows. RANK() OVER (PARTITION BY dept ORDER BY salary DESC) ranks employees within each department independently.

Q4

How do you find duplicate rows and keep only the latest one?

Use ROW_NUMBER() OVER (PARTITION BY unique_col ORDER BY created_at DESC) AS rn in a CTE, then SELECT * WHERE rn = 1 in the outer query. This is the standard deduplication pattern — the partition resets per key, and rn=1 is always the most recent row.

🗄️ SQL for QA
Indexes & Performance
April 2026  |  TechWorld Labs

Indexes speed up SELECT queries by creating a data structure the database can search quickly. Understand when to use them, when not to, and how to use EXPLAIN to diagnose slow queries.

Tutorial
Programs
Interview Q&A

Types of Indexes

TypeDescriptionUse Case
PRIMARY KEYAuto-created clustered indexEvery table — the row lookup anchor
UNIQUEEnforces uniqueness + speeds lookupEmail, username, reference codes
INDEX (B-Tree)General-purpose sorted indexWHERE, JOIN, ORDER BY columns
COMPOSITEIndex on multiple columnsQueries filtering on multiple columns
FULLTEXTOptimised for text searchMATCH...AGAINST on text columns

When NOT to Use Indexes

Low-cardinality columns

A boolean column (true/false) has only 2 values — an index on it provides no benefit. Use indexes on high-cardinality columns like email or user_id.

Small tables

Full table scan is faster than index lookup for tables with fewer than a few hundred rows — the optimizer usually ignores the index anyway.

Heavy write tables

Every INSERT/UPDATE/DELETE must update all indexes on the table. Too many indexes on a frequently-written table hurts write performance significantly.

Create & Drop Indexes

SQL — Indexes
-- Single-column index
CREATE INDEX idx_users_email ON users(email);

-- Unique index (enforces uniqueness)
CREATE UNIQUE INDEX idx_users_email_uniq ON users(email);

-- Composite index (column order matters!)
-- Useful for: WHERE status = ? AND created_at > ?
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Drop an index
DROP INDEX idx_users_email ON users;

-- Show all indexes on a table
SHOW INDEXES FROM orders;

EXPLAIN — Diagnose Slow Queries

SQL — EXPLAIN
-- EXPLAIN shows the query execution plan
EXPLAIN SELECT * FROM orders WHERE user_id = 42;

-- Key columns to check in EXPLAIN output:
-- type: ALL = full scan (bad), ref/eq_ref = index (good)
-- key: which index is being used (NULL = no index)
-- rows: estimated rows scanned (lower = faster)
-- Extra: "Using where", "Using index" (index-only = very fast)

-- EXPLAIN ANALYZE (MySQL 8+): shows actual execution stats
EXPLAIN ANALYZE SELECT * FROM orders
WHERE status = 'pending' AND created_at > '2026-01-01';
Q1

What is an index in SQL?

An index is a separate data structure (typically a B-Tree) that stores column values alongside pointers to the actual rows. It lets the database find rows without scanning the entire table — like a book index vs reading every page to find a topic.

Q2

What is a composite index and why does column order matter?

A composite index covers multiple columns. Column order determines which queries benefit from it — an index on (status, date) can be used for WHERE status = ? AND date > ?, or WHERE status = ?, but NOT for WHERE date > ? alone (the leftmost prefix rule).

Q3

What does EXPLAIN show and why is it useful?

EXPLAIN shows the query execution plan: which indexes are used, how many rows are scanned, and the join type. "type: ALL" means a full table scan (slow). "type: ref" means an index is used (fast). Use EXPLAIN to diagnose slow queries before adding indexes.

Q4

When would you NOT add an index?

On low-cardinality columns (boolean, status with 2-3 values), on very small tables (full scan is faster), or on columns that are written to frequently. Each index adds overhead to every INSERT/UPDATE/DELETE — too many indexes hurt write performance.

🔐
🗄️ SQL for QA
Transactions & ACID
April 2026  |  TechWorld Labs

A transaction groups multiple SQL statements into a single atomic unit — all succeed or all fail together. ACID properties guarantee database integrity even in concurrent and failure scenarios.

Tutorial
Programs
Interview Q&A

ACID Properties

A

Atomicity

All statements in a transaction succeed, or none do. If any statement fails, the database rolls back to the state before the transaction started — no partial writes.

C

Consistency

A transaction brings the database from one valid state to another. All constraints, rules, and cascades are enforced — the database is never left in an invalid state.

I

Isolation

Concurrent transactions cannot see each other's uncommitted changes. The level of isolation is configurable (READ COMMITTED, REPEATABLE READ, SERIALIZABLE).

D

Durability

Once a transaction is committed, the changes are permanent — even if the server crashes immediately after. Data is written to disk (WAL / redo log).

Isolation Levels

LevelDirty ReadNon-repeatable ReadPhantom Read
READ UNCOMMITTED✅ Possible✅ Possible✅ Possible
READ COMMITTED❌ Prevented✅ Possible✅ Possible
REPEATABLE READ❌ Prevented❌ Prevented✅ Possible
SERIALIZABLE❌ Prevented❌ Prevented❌ Prevented

BEGIN / COMMIT / ROLLBACK

SQL — Transactions
-- Transfer $100 from account A to account B atomically
START TRANSACTION;

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- If both succeed: commit permanently
COMMIT;

-- If something went wrong: undo everything
-- ROLLBACK;

SAVEPOINT — Partial Rollback

SQL — SAVEPOINT
START TRANSACTION;

INSERT INTO orders (user_id, amount) VALUES (1, 500);
SAVEPOINT after_order;

INSERT INTO payments (order_id, method) VALUES (LAST_INSERT_ID(), 'card');

-- Payment failed — roll back only to savepoint (keep order)
ROLLBACK TO SAVEPOINT after_order;

-- Retry with different payment method
INSERT INTO payments (order_id, method) VALUES (LAST_INSERT_ID(), 'bank');
COMMIT;

QA: Use Transactions for Safe Test Data

SQL — QA Test Data Pattern
-- Wrap test setup in a transaction -- Roll back after test to leave DB clean START TRANSACTION; INSERT INTO users (email, role) VALUES ('qa@test.com', 'user'); INSERT INTO orders (user_id, amount) VALUES (LAST_INSERT_ID(), 250); -- Run your assertions here... SELECT * FROM orders WHERE amount = 250; -- Undo all test data — DB is clean again ROLLBACK;
Q1

What is a database transaction?

A transaction is a sequence of SQL statements executed as a single unit of work. Either all statements succeed and are committed, or any failure triggers a rollback that undoes all changes — leaving the database in the state it was before the transaction started.

Q2

What does ACID stand for?

Atomicity (all or nothing), Consistency (valid state to valid state), Isolation (concurrent transactions don't interfere), Durability (committed data survives crashes). These four properties together guarantee reliable database operations.

Q3

What is a dirty read?

A dirty read occurs when one transaction reads data written by another transaction that has not yet committed. If the second transaction rolls back, the first has read data that never officially existed — a data integrity problem. READ COMMITTED isolation level prevents dirty reads.

Q4

What is a SAVEPOINT?

A SAVEPOINT marks a point within a transaction you can partially roll back to. ROLLBACK TO SAVEPOINT name undoes only the work done after that savepoint — the work before the savepoint is preserved. COMMIT still commits everything including pre-savepoint work.

Q5

How can QA engineers use transactions in testing?

Wrap test data setup in a BEGIN TRANSACTION. Run all test assertions. Then ROLLBACK — leaving the database in a perfectly clean state with zero cleanup code needed. This is especially powerful in integration tests against a real database.

🔢
🗄️ SQL for QA
Math & Numeric Functions
April 2026  |  TechWorld Labs

Built-in SQL math functions for rounding, truncating, absolute values, modulo, and power calculations — used in financial verifications, boundary testing, and data transformation validation.

Tutorial
Programs
Interview Q&A

Numeric Functions

FunctionPurposeExample → Result
ROUND(n, d)Round to d decimal placesROUND(3.456, 2) → 3.46
FLOOR(n)Round down to nearest integerFLOOR(3.9) → 3
CEIL(n) / CEILING(n)Round up to nearest integerCEIL(3.1) → 4
TRUNCATE(n, d)Truncate to d decimals (no rounding)TRUNCATE(3.99, 1) → 3.9
ABS(n)Absolute valueABS(-42) → 42
MOD(n, d)Remainder after divisionMOD(10, 3) → 1
POWER(base, exp)Raise to powerPOWER(2, 8) → 256
SQRT(n)Square rootSQRT(144) → 12
SIGN(n)Returns -1, 0, or 1SIGN(-5) → -1
RAND()Random float 0–1RAND() → 0.7341...
💡
ROUND vs TRUNCATE: ROUND(3.456, 2) → 3.46 (rounds up). TRUNCATE(3.456, 2) → 3.45 (chops off, no rounding). For financial QA, always check which one the application uses — it affects totals on large datasets.
SQL — Math Functions
-- Verify order total is rounded to 2 decimal places
SELECT id, amount,
  ROUND(amount, 2)    AS rounded,
  TRUNCATE(amount, 2) AS truncated
FROM orders
WHERE amount != ROUND(amount, 2); -- find rounding issues

-- Calculate percentage of passed tests
SELECT
  ROUND(
    SUM(CASE WHEN status = 'PASS' THEN 1 ELSE 0 END) * 100.0
    / COUNT(*), 1
  ) AS pass_rate_pct
FROM test_runs;

-- Find odd-numbered records (MOD)
SELECT * FROM users WHERE MOD(id, 2) = 1;

-- Absolute difference between expected and actual amounts
SELECT order_id,
  expected_amount, actual_amount,
  ABS(expected_amount - actual_amount) AS discrepancy
FROM order_audit
WHERE ABS(expected_amount - actual_amount) > 0.01;

-- Generate random test price between 10 and 100
SELECT ROUND(10 + RAND() * 90, 2) AS random_price;
Q1

Difference between ROUND and TRUNCATE?

ROUND applies standard rounding — ROUND(3.456, 2) = 3.46. TRUNCATE simply drops digits beyond the specified position without rounding — TRUNCATE(3.456, 2) = 3.45. For financial QA, test both scenarios because the difference accumulates over many records.

Q2

Difference between FLOOR and CEIL?

FLOOR(n) always rounds down toward negative infinity — FLOOR(3.9) = 3, FLOOR(-3.1) = -4. CEIL(n) always rounds up — CEIL(3.1) = 4, CEIL(-3.9) = -3. Use FLOOR for integer division scenarios and CEIL for "at least n" calculations like page count.

Q3

What is MOD used for in QA testing?

MOD(n, d) returns the remainder of n divided by d. Common QA uses: find odd/even IDs (MOD(id, 2) = 0), find records at fixed intervals, validate batch processing that runs every N records, or test paging logic.

Q4

How do you find records with rounding errors?

Compare the stored value to its rounded version: WHERE amount != ROUND(amount, 2). This surfaces records where the application stored more than 2 decimal places — a common bug in financial calculations that causes total mismatches.

🗄️ SQL for QA
Triggers
April 2026  |  TechWorld Labs

A trigger is a stored SQL block that automatically fires BEFORE or AFTER an INSERT, UPDATE, or DELETE event on a table — used for audit logging, validation, and cascading business logic.

Tutorial
Programs
Interview Q&A

Trigger Timing & Events

TimingEventCommon Use
BEFORE INSERTBefore row is insertedValidate / transform data before save
AFTER INSERTAfter row is insertedWrite audit log, populate related table
BEFORE UPDATEBefore row is updatedValidate new values, prevent illegal changes
AFTER UPDATEAfter row is updatedAudit trail, sync denormalized data
BEFORE DELETEBefore row is deletedArchive row, prevent deletion of critical records
AFTER DELETEAfter row is deletedClean up related data, write deletion log

NEW and OLD Row References

ReferenceAvailable inMeaning
NEW.columnINSERT, UPDATEThe new value being written
OLD.columnUPDATE, DELETEThe value before the change
💡
QA tip: When testing features that use triggers (audit logs, auto-timestamps, status syncing), always verify the trigger fired by querying the affected tables — the application code may not expose what happened at the database level.

Audit Log Trigger

SQL — Triggers
-- Audit log table
CREATE TABLE user_audit (
  id         INT AUTO_INCREMENT PRIMARY KEY,
  user_id    INT,
  action     VARCHAR(10),
  old_role   VARCHAR(20),
  new_role   VARCHAR(20),
  changed_at DATETIME DEFAULT NOW()
);

-- Trigger: log every role change
DELIMITER //
CREATE TRIGGER trg_user_role_audit
AFTER UPDATE ON users
FOR EACH ROW
BEGIN
  IF OLD.role != NEW.role THEN
    INSERT INTO user_audit (user_id, action, old_role, new_role)
    VALUES (OLD.id, 'UPDATE', OLD.role, NEW.role);
  END IF;
END //
DELIMITER ;

-- Trigger: set updated_at automatically
DELIMITER //
CREATE TRIGGER trg_set_updated_at
BEFORE UPDATE ON users
FOR EACH ROW
BEGIN
  SET NEW.updated_at = NOW();
END //
DELIMITER ;

-- Trigger: prevent deletion of admin users
DELIMITER //
CREATE TRIGGER trg_no_admin_delete
BEFORE DELETE ON users
FOR EACH ROW
BEGIN
  IF OLD.role = 'admin' THEN
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'Cannot delete admin users';
  END IF;
END //
DELIMITER ;

-- View existing triggers
SHOW TRIGGERS FROM your_database;

-- Drop a trigger
DROP TRIGGER IF EXISTS trg_user_role_audit;
Q1

What is a database trigger?

A trigger is a stored SQL block that automatically executes when a specified event (INSERT, UPDATE, DELETE) occurs on a table. Triggers fire without any explicit call — they are driven by DML events and are invisible to the application layer.

Q2

What is the difference between BEFORE and AFTER triggers?

BEFORE triggers fire before the row change is applied — you can modify NEW values or cancel the operation with SIGNAL. AFTER triggers fire after the change is committed to the table — used for audit logging and cascading updates that need the final row state.

Q3

What are OLD and NEW in a trigger?

OLD refers to the row's values before the operation (available in UPDATE and DELETE). NEW refers to the values being written (available in INSERT and UPDATE). In a BEFORE UPDATE trigger, you can change NEW.column to transform data before it is saved.

Q4

How do triggers affect QA testing?

Triggers execute invisibly — the application doesn't know they fired. In QA, always verify trigger side effects directly in the database: check audit tables were populated, timestamps were updated, and constraints were enforced. A test that only checks the UI may miss trigger failures entirely.

{ }
🗄️ SQL for QA
JSON Functions
April 2026  |  TechWorld Labs

Modern databases store semi-structured data as JSON columns. SQL JSON functions let you query, extract, and validate JSON data directly in SQL — essential for APIs that persist JSON payloads to the database.

Tutorial
Programs
Interview Q&A

MySQL JSON Functions

FunctionPurposeExample
JSON_EXTRACT(col, path)Extract value at JSON pathJSON_EXTRACT(data, '$.name')
col->>'$.path'Shorthand for JSON_EXTRACT (unquoted)data->>'$.name'
JSON_OBJECT(k,v,...)Build a JSON objectJSON_OBJECT('id', 1, 'name', 'QA')
JSON_ARRAY(v,...)Build a JSON arrayJSON_ARRAY(1, 2, 3)
JSON_CONTAINS(doc, val, path)Check if doc contains val at pathFind rows with specific tag
JSON_ARRAYAGG(col)Aggregate column into JSON arrayGroup tags into array per user
JSON_VALID(str)Returns 1 if string is valid JSONValidate stored JSON data quality
JSON_LENGTH(col)Number of elements in array/objectCount items in JSON array column
1

JSON Path Syntax

$ = root, $.key = object key, $.arr[0] = first array element, $.a.b = nested key.

2

JSON Column Type

MySQL validates JSON at insert time when the column is declared as JSON type — invalid JSON is rejected. Older schemas store JSON as TEXT — use JSON_VALID() to verify data quality.

3

Indexing JSON

You cannot index a JSON column directly. Create a generated/virtual column from the extracted value, then index that column.

4

QA Use Case

APIs that log request/response payloads as JSON — query the JSON column to verify specific fields were captured correctly without deserializing in application code.

SQL — JSON Functions (MySQL 8+)
-- Table with JSON column
-- orders(id, payload JSON)
-- payload: {"user":"alice","items":[{"sku":"A1","qty":2}],"total":49.99}

-- Extract a value from JSON
SELECT id,
  JSON_EXTRACT(payload, '$.user')  AS user_name,
  JSON_EXTRACT(payload, '$.total') AS total
FROM orders;

-- Shorthand operator (returns unquoted string)
SELECT payload->>'$.user' AS user_name FROM orders;

-- Filter by JSON field value
SELECT * FROM orders
WHERE payload->>'$.user' = 'alice';

-- Count items in JSON array
SELECT id,
  JSON_LENGTH(payload, '$.items') AS item_count
FROM orders
WHERE JSON_LENGTH(payload, '$.items') > 3; -- orders with >3 items

-- Validate all JSON in a TEXT column is valid
SELECT id, payload
FROM api_logs
WHERE JSON_VALID(payload) = 0; -- find broken JSON

-- Build JSON from relational data
SELECT
  JSON_OBJECT(
    'id',    id,
    'name',  name,
    'email', email
  ) AS user_json
FROM users LIMIT 5;

-- Aggregate: all order IDs per user as a JSON array
SELECT
  user_id,
  JSON_ARRAYAGG(id) AS order_ids
FROM orders
GROUP BY user_id;
Q1

What is a JSON column in SQL and when would you use it?

A JSON column stores semi-structured data validated as JSON at insert time. Use it for data with variable structure — API request/response logs, feature flags, user preferences, or product attributes that differ per record. It avoids adding many optional nullable columns to a table.

Q2

How do you extract a specific field from a JSON column?

Use JSON_EXTRACT(column, '$.fieldName') or the shorthand column->>'$.fieldName'. The ->> operator returns the value as an unquoted string (without surrounding quotes), while -> returns a JSON fragment. Use ->> for direct comparisons in WHERE clauses.

Q3

How do you validate that stored JSON is valid in a TEXT column?

Use JSON_VALID(column) — it returns 1 for valid JSON, 0 for invalid. SELECT * FROM table WHERE JSON_VALID(payload) = 0 surfaces all rows with corrupted or malformed JSON data. This is a key QA check on legacy systems that store JSON as TEXT.

Q4

Can you create an index on a JSON column?

Not directly. To index a JSON field, create a generated (virtual or stored) column that extracts the value, then index the generated column: ALTER TABLE t ADD COLUMN user_name VARCHAR(100) AS (payload->>'$.user'); CREATE INDEX idx ON t(user_name). This lets the optimizer use the index for JSON field queries.

Q5

What is JSON_ARRAYAGG and how is it different from GROUP_CONCAT?

JSON_ARRAYAGG aggregates values into a proper JSON array (e.g. [1, 2, 3]). GROUP_CONCAT creates a delimited string (e.g. "1,2,3"). JSON_ARRAYAGG preserves data types (numbers remain numbers, not strings) and produces valid JSON output suitable for API responses or further JSON processing.

❓ FAQ & Resources
SQL FAQs & Best Practices
Expert tips for test data & verification | TechWorld Labs

Master SQL for data verification, test data management, and database testing.

Best Practices

🎯
Use Indexes
Create indexes on frequently queried columns for better performance in QA verification.
📦
Normalize Data
Organize data into normalized tables to avoid redundancy and maintain consistency.
🔍
Use JOINs Correctly
Understand INNER, LEFT, RIGHT, FULL OUTER joins for accurate data verification.
📝
SQL Comments
Document complex queries with comments for team understanding and maintenance.
🚀
Use Transactions
Wrap test data setup in BEGIN TRANSACTION for rollback capability after tests.
⚙️
Parameterized Queries
Use prepared statements to prevent SQL injection in test automation scripts.

Common Pitfalls

⚠️

SELECT * usage

Cause: Retrieves unnecessary columns, slow queries. Solution: Specify exact columns needed for verification.

⚠️

Missing WHERE clause

Cause: Updates/deletes entire table accidentally. Solution: Always verify WHERE conditions before execution.

⚠️

NULL handling

Cause: NULL != NULL in comparisons. Solution: Use IS NULL / IS NOT NULL for proper NULL checking.

⚠️

Case sensitivity in WHERE

Cause: Query returns no results due to case mismatch. Solution: Use LOWER() or UPPER() for case-insensitive comparison.

FAQs

Q1

How do I verify data integrity in tests?

Write SQL queries to check counts, sum values, and validate data relationships. Assert results match expected values.

Q2

How do I clean test data after tests?

Use DELETE with WHERE clause or TRUNCATE for tables. Wrap in transaction and ROLLBACK if needed.

Q3

What's the difference between GROUP BY and DISTINCT?

DISTINCT removes duplicate rows. GROUP BY groups rows and allows aggregate functions. Use based on your requirement.

Q4

How do I find slow queries?

Use EXPLAIN PLAN to analyze query performance. Check indexes, avoid SELECT *, optimize JOINs.

Q5

How do I handle date comparisons in SQL?

Use DATE() function for date-only comparison. Use BETWEEN for date ranges. Handle timezone differences.

Q6

How do I test stored procedures?

Call procedures using CALL syntax. Verify output parameters and result sets. Use ROLLBACK to undo changes.

Advanced Topics

💡
Window Functions: ROW_NUMBER(), RANK(), DENSE_RANK() for analytical queries in test data verification.
CTEs: Common Table Expressions (WITH clause) for complex queries with better readability.
⚠️
Performance: Always COMMIT or ROLLBACK transactions. Monitor database locks during parallel test execution.