Fail faster, run safer!

Yair Nevet
3 min read · Feb 23, 2022

Following an interesting discussion I had with one of my fellow engineers, I would like to stress why it’s important to fail fast when designing and building a system.

Failing silently is considered an anti-pattern and should be avoided. Let me explain why. Imagine you’re deploying your web app to the production environment. Everything looks fine: your app is running as expected, it responds on its main routes without any issue, and you’re calm and happy.


Now, for some unforeseen reason, a couple of configuration values required by other routes of your app were missing during the app initialization phase (e.g., due to a network issue or deleted keys). No failure occurred, though, because when designing the system you decided to fall back to default or blank values whenever configuration is missing, rather than fail the app load right away, in order to avoid inconvenience during development.


To demonstrate this anti-pattern, let’s take the following code snippet as an example:
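Here is a minimal Python sketch of what such silently failing code might look like. The names `fetch_config`, `PAYMENTS_API_URL`, and `DEFAULT_PAYMENTS_URL` are hypothetical and chosen only for illustration; `fetch_config` stands in for whatever reads your configuration (environment, file, remote store).

```python
from typing import Callable, Mapping

# Hypothetical fallback value, used here only to illustrate the anti-pattern.
DEFAULT_PAYMENTS_URL = "http://localhost:8080"


def load_payments_url_silently(fetch_config: Callable[[], Mapping[str, str]]) -> str:
    """Anti-pattern: never fails, even when the configuration is unusable."""
    try:
        config = fetch_config()
        # 1. A missing key quietly falls back to a default URL.
        return config.get("PAYMENTS_API_URL", DEFAULT_PAYMENTS_URL)
    except Exception:
        # 2. Any loading error (network issue, bad file, ...) is swallowed.
        return DEFAULT_PAYMENTS_URL
```

With this function the app always starts, so nothing at startup tells you whether it is talking to the real service or to the fallback.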

Note how with this code, we’re effectively muting the critical issue:

  1. We fall back to a default value (URL) if the key is not found.
  2. We “mask” the error by swallowing any possible exception.

In such a case, with or without proper monitoring in place, how long would it take, and how many requests would be lost, before it became clear that the configuration was not loaded correctly? What would the impact be? How long would the investigation take?

This situation and those questions can be avoided if you choose the fail-fast direction, in which the stability of your production environment matters more than comfort during development. You fail as soon as something unwanted happens, not when the situation is already irreversible and the impact can no longer be measured. Therefore, I recommend failing fast and loudly, so that we deliver a reliable and stable system without weird and disturbing inconsistencies!


The following code snippet demonstrates how to fail as soon as something unwanted happens, by throwing the relevant error and aborting the app initialization as a result:
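A minimal Python sketch of the fail-fast approach, under the same assumptions as before: `fetch_config`, `REQUIRED_KEYS`, `PAYMENTS_API_URL`, and `MissingConfigError` are hypothetical names used only for illustration, not part of any particular framework.

```python
from typing import Callable, Dict, Mapping


class MissingConfigError(RuntimeError):
    """Raised at startup when required configuration is absent or blank."""


# Hypothetical list of keys the app cannot run without.
REQUIRED_KEYS = ("PAYMENTS_API_URL",)


def load_config_or_fail(fetch_config: Callable[[], Mapping[str, str]]) -> Dict[str, str]:
    """Return the required config, or raise so that app initialization aborts."""
    # No try/except here: errors from fetch_config() propagate to the caller.
    config = fetch_config()

    # Treat both missing and blank values as failures; no defaults.
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise MissingConfigError(
            "Missing required configuration: " + ", ".join(missing)
        )
    return {key: config[key] for key in REQUIRED_KEYS}
```

Calling this once during startup, before the app begins serving traffic, means a missing key stops the deployment instead of surfacing later on a rarely used route.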

Note how, with the code above, we raise an error whenever we fail to load the required config.

Feel free to share your thoughts below.
