Pitfalls
- There is no
==
in SQL, use=
- No trailing commas
- No using of
AND
in ORDER BY and GROUP BY clauses- AJOIN
is followed by aON
not aWHERE
unless intentional - Numerical equality Conditions inside count are always counted as 1 (can be used inside AVG though)
Functions
- Length - of a VARCHAR field
SELECT
tweet_id
FROM
Tweets
WHERE
LENGTH(content) > 15
;
- Count
SELECT
T1.customer_id,
count(T1.visit_id) as count_no_trans
FROM
Visits as T1
LEFT JOIN Transactions as T2
ON
T1.visit_id = T2.visit_id
WHERE
T2.transaction_id IS NULL
GROUP BY
T1.customer_id
You can use
AND
within the count function
- Datediff
Given a field recordDate
of datatype date
,
SELECT
T2.id as id
FROM
Weather as T1
JOIN Weather as T2
WHERE
DATEDIFF(T2.recordDate, T1.recordDate) = 1
AND
T1.temperature < T2.temperature
- avg
# Write your MySQL query statement below
SELECT
T1.machine_id,
avg(T2.timestamp - T1.timestamp) AS processing_time
FROM
Activity as T1
JOIN Activity as T2
ON
T1.machine_id = T2.machine_id
AND T1.process_id = T2.process_id
AND T1.activity_type = 'start'
AND T2.activity_type = 'end'
GROUP BY
T1.machine_id
# Another example
SELECT
ROUND(
AVG(order_date=customer_pref_delivery_date) * 100
, 2) as immediate_percentage
FROM
Delivery
WHERE
(customer_id, order_date) IN (SELECT customer_id, min(order_date) FROM Delivery GROUP BY customer_id)
- Between
SELECT
T1.product_id as product_id,
ROUND(SUM(T1.price * T2.units) / SUM(T2.units), 2) AS average_price
FROM
Prices AS T1
JOIN UnitsSold AS T2
ON T1.product_id = T2.product_id
WHERE
T2.purchase_date BETWEEN T1.start_date AND T1.end_date
GROUP BY
T1.product_id
- CASE WHEN
SELECT query_name, ROUND( SUM(rating / position) / COUNT(query_name) , 2) as quality, ROUND( SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END) * 100 / COUNT(query_name) , 2) as poor_query_percentage
FROM Queries GROUP BY query_name
7. Date_format
```sql
SELECT
DATE_FORMAT(trans_date, "%Y-%m") as month,
country,
COUNT(*) AS trans_count,
SUM(CASE WHEN state="approved" THEN 1 ELSE 0 END) as approved_count,
SUM(amount) AS trans_total_amount,
SUM(CASE WHEN state="approved" THEN amount ELSE 0 END) as approved_total_amount
FROM
Transactions
GROUP BY
country, DATE_FORMAT(trans_date, "%Y-%m")
;
Using WHERE
vs ON
SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
SELECT * FROM table1 JOIN table2 WHERE table1.id = table2.id;
Both the above commands return the same result, but using ON
is more sensible as it’s clear to the reader that the condition is used upon joining the two tables. WHERE
is more suitable when we are trying to apply a filter on the fetched data.
Is there a performance difference though?
No. SQL comes up with a query plan for every user written query, in this case both queries result in the same query plan.
Example usage,
SELECT
T1.machine_id,
round(avg(T2.timestamp - T1.timestamp), 3) AS processing_time
FROM
Activity as T1
JOIN Activity as T2
ON
T1.machine_id = T2.machine_id
AND T1.process_id = T2.process_id
WHERE
T1.activity_type = 'start'
AND T2.activity_type = 'end'
GROUP BY
T1.machine_id
Having
SELECT
T2.name as name
FROM
Employee as T1
JOIN Employee as T2
ON T1.managerId = T2.id
GROUP BY
T1.managerId
HAVING
COUNT(T1.managerId) >= 5
Show of strength
- Use
<>
instead of!=
(Both works though) - Format the query in multiple lines with indentation
Note
JOIN
is by default anINNER JOIN