Code for Reliability

  • A data center (region) fails
  • A microservice instance fails
  • A machine fails
  • A timeout happens
  • Latency increases
  • A service hangs and never returns
  • A service breaks the data protocol and returns dirty results
  • A downstream component fails and that creates issues for your service
  • Lack of internal validation (e.g., you expect an email and receive a number)
  • A database call fails: you try to persist, but the client or server fails
  • You get wrong input data and your code blows up (parsing issues, e.g., JSON/YAML)
  • A corner case in your code makes you fail (untested corner case)
  • You were expecting one exception and another one happens (bad error handling)
  • An HTTP call times out and you were not counting on that (e.g., no retry)
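
Several of the items above (missing validation, parsing failures, timeouts with no retry) can be guarded against with very little code. The sketch below is one possible approach in Python, not a definitive implementation. The names `InvalidInput`, `parse_signup`, and `call_with_retry`, the `email` field, and the retry parameters are all hypothetical and chosen only for illustration. It also assumes the failing operation signals a timeout by raising the built-in `TimeoutError`.

```python
import json
import time


class InvalidInput(ValueError):
    """Hypothetical error type: payload is malformed or fails validation."""


def parse_signup(raw: str) -> dict:
    """Parse a JSON payload and validate it before any business logic runs."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Turn a parser-specific error into one well-defined error type.
        raise InvalidInput(f"payload is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidInput("payload must be a JSON object")
    email = data.get("email")
    # Deliberately simple check (an assumption, not a full email validator):
    # a string with exactly one "@" and text on both sides of it.
    looks_valid = (
        isinstance(email, str)
        and email.count("@") == 1
        and not email.startswith("@")
        and not email.endswith("@")
    )
    if not looks_valid:
        raise InvalidInput("email must be a string like user@example.com")
    return {"email": email}


def call_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(); on TimeoutError, retry with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the failure to the caller.
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap a remote call as `call_with_retry(lambda: fetch_profile(user_id))`, where `fetch_profile` stands in for any function that may time out. Only `TimeoutError` is retried on purpose: other exceptions propagate immediately, so an unexpected error is not hidden behind a retry loop. Retrying is only safe when the operation is idempotent.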

Diego Pacheco

Brazilian, Software Architect, SWE (Java, Scala, Rust, Go), SOA & DevOps expert, Author. Working with EKS/K8S. diegopacheco.github.io (Opinions are my own)