I was recently asked to look at an issue causing results on two MySQL servers (hosted on different machines: local environment and production environment) to be inconsistent with one another – even though the dataset and query used were identical.
Here is the problematic SQL:
SELECT
COUNT(DISTINCT `order_items`.`id`) as `item_count`,
ROUND(SUM(DISTINCT `products`.`price` * `order_items`.`quantity`), 2) as `sub_total`,
ROUND(`vat_rate`.`rate` * SUM(
DISTINCT `products`.`price` * `order_items`.`quantity`
) / 100, 2) as `vat`,
ROUND(SUM(
DISTINCT `products`.`price` * `order_items`.`quantity`
) + `vat_rate`.`rate` * SUM(
DISTINCT `products`.`price` * `order_items`.`quantity`
) / 100, 2) as `total_incl_vat`,
`vat_rate`.`rate` AS `vat_rate`
FROM `orders`
LEFT JOIN `order_items` ON `order_items`.`order_id` = `orders`.`id`
AND `order_items`.`status` = 1
LEFT JOIN `products` ON `products`.`id` = `order_items`.`product_id`
AND `products`.`status` = 1
LEFT JOIN `vat_rate` on `vat_rate`.`id` = `products`.`vat_rate_id`
WHERE `orders`.`id` = 205861
ORDER BY `products`.`id`;
Firstly, we must establish the differences between the two environments we are executing this query on. It is probably not beneficial to look for similarities (we know what is similar: the dataset and the query) and instead we should be looking for other differences which might not be immediately obvious. Things to look out for are a discrepancy in:
- MySQL versions (different versions may cause different query optimisation or syntax);
- database software itself (MySQL vs. MariaDB); or
- environment variables (particularly sql_mode – you can check this with
SHOW VARIABLES LIKE `sql_mode`
.
Try to ensure the two environments are as the same as possible.
The Fix
In the example query given above, there was a logic error in the query meaning `vat_rate`.`rate` was being used outwith the SUM()
aggregate function. This meant that, due to discrepancies in server software/versions, the query was being executed differently. This is perhaps due to differences in how the query optimiser orders or groups results.
Incorrect SQL:
ROUND(`vat_rate`.`rate` * SUM(
DISTINCT `products`.`price` * `order_items`.`quantity`
) / 100, 2) as `vat`,
The fix was to put the joined table’s columns within the aggregate function so that the `vat_rate`.`rate` value used is specific to the current row (before it is aggregated) rather than a seemingly random one.
Correct SQL:
ROUND(SUM(
DISTINCT `products`.`price` * `order_items`.`quantity` * `vat_rate`.`rate`
) / 100, 2) as `vat`,