Duplicate code – case study

Code duplication is the root of all evil in software development. When working on an e-commerce project, we found that the prices were wrong, sometimes. We ended up locating the problem as being a duplication across two different import jobs that were misaligned.

It is a subtle error that could have been avoided if the developer writing the code initially had focused on removing duplication. Tracking down such an error is very time-consuming and thus expensive.

Project context – multiple import jobs

To understand the issue first, a bit of context on the e-commerce project and the problem.

Suppliers upload files with product and price information.

  • One full export file, once a week with all the product details.
  • One smaller daily upload with just the price and inventory levels.

The shop has two jobs running to import the two files.

  • One importer to update products using the full export file.
    This importer can also be run manually if we need to import products outside the regular cycle.
  • One automated importer to update the price and stock information daily.

Project context – price calculation

The product price is calculated as cost price + profit margin + environmental fee. Both the full export file and the daily file contains the cost price of the product. It makes them able to run independently.

Code walkthrough

The full importer is using the method Parse, and the daily importer uses FetchPrices. Both methods are provided with the data from the respective data file split into lines and fields.

The code has several issues like naked primitives, magic numbers, no tests, and more. But the focus here is only on the duplication that caused the price calculation to break.

Both importers call the CalculatePrice method, respectively, at lines 13 and 18. It is the method that is responsible for adding the profit margin to the cost price and add tax.

Notice that the AddEnvironmentalFee method is only called from the Parse method at line 20. In the FetchPrices method, the AddEnvironmentalFee is not called.

This difference cause prices to be calculated correctly on the weekly import, but they will be missing the environment fee when running the daily import.

public List<List<string>> FetchPrices()
{
var priceData = DB.LoadPriceData();
var parsed = new List<List<string>>();
var costPriceField = 5;
foreach (var row in priceData)
{
var price = row.Value;
parsed.Add(new List<string>
{
BuildProductSku(price),
CalculatePrice(price[costPriceField]).ToString(),
price[costPriceField].ToString(),
});
}
return parsed;
}
public (List<List<string>>, List<string>) Parse(IDictionary<string, List<string>> dbData,
IDictionary<string, List<string>> csvData, IDictionary<string, List<decimal>> prices)
{
var parsedData = new List<List<string>>();
var skipped = new List<string>();
var skuField = 3;
var costPriceField = 5;
foreach (var line in dbData)
{
var product = line.Value;
if (!Validate(product, csvData))
{
skipped.Add(product[skuField]);
continue;
}
var parsed = product.Clone();
var price = CalculatePrice(prices[product[skuField]][costPriceField]);
parsed[8] = AddEnvironmentalFee(price);
parsedData.Add((List<string>) parsed);
}
return (
parsedData,
skipped
);
}
public decimal CalculatePrice(decimal costprice)
{
var price = costprice / (100 - Margin) * 100;
return ApplyTax(price);
}
view raw CalculatePriceOld.cs hosted with ❤ by GitHub

Fixing the problem

The CalculatePrice method signals that it calculates the price. We would expect that calling it with a cost price would return the final price without any additional calculations needed.

I can easily see how a developer could forget to add the environmental fee because the expectation is that the method calculates the final price.

It is relatively easy to fix. The AddEnvironmentalFee should be moved into the CalculatePrice method.

public decimal CalculatePrice(decimal costprice)
{
var price = costprice / (100 - Margin) * 100;
return AddEnvironmentalFee(ApplyTax(price));
}
view raw CalculatePriceNew.cs hosted with ❤ by GitHub

It wouldn’t make sense to have different prices calculated depending on which import job is running. Prices should be stable. If pricing changes, we want to change just the CalculatePrice method and not chase around in the code to find other places to change.