
Good processes for debugging production environment? Copying data to Dev?

I've been thinking about this a bit. The idea is that something goes wrong in PROD: the data captured there causes the web app to behave differently than it does in other environments. Data in the other environments also gets out of sync with PROD (as expected). Then a bug comes along that, for some reason, only happens in PROD, probably because of the differences in data.

I'm wondering what a good practice is for remedying these kinds of problems. More tests, for sure. But beyond that? One could create new data in dev, but the whole point is that some data point, or some combination of actions, causes a data point to be wrong, perhaps when some other data source is used to arrive at the "actual" data point, which is different than the "expected" data point. Apologies that this isn't a great description; it tries to be both an example and a definition of a general production bug.

I know this isn't a very precise question. Hopefully, there are references that make good suggestions.

lucidquiet asked Oct 04 '22 12:10



1 Answer

This is a very interesting question. One approach I've used before is to deliberately do my final testing in production (TIP).

Before you skewer my effigy with multiple pointy needles, hear me out for a minute while I talk about continuous deployment :-)

The idea is to deploy a new build into production and then use custom routing to direct traffic between the old and new production builds. In principle this is quite simple: you start by routing your current customers to the old build and only your engineering team to the new build. Your customers don't see any change, but your team can start testing the new build, including messy stuff like disaster recovery and stress-testing. You will hopefully discover the type of bugs that you talk about in your question.

If there's a problem, you simply roll back the new build. If your tests don't find any problems, you roll out to, say, 5% of your client base. Then 10% and 20% and so on.
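To make the routing idea a bit more concrete, here is a minimal sketch in Python. It is not how any particular load balancer or framework does it; the names `ENGINEERING_USERS`, `bucket` and `choose_build` are made up for illustration. The assumption is that each user has a stable id, which is hashed into one of 100 buckets so that the same user keeps seeing the same build as the percentage grows.

```python
import hashlib

# Hypothetical set of internal testers; in practice this might come from config.
ENGINEERING_USERS = {"alice", "bob"}


def bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def choose_build(user_id: str, rollout_percent: int) -> str:
    """Return "new" or "old" for this user at the given rollout percentage."""
    if user_id in ENGINEERING_USERS:
        return "new"  # the engineering team always tests the new build
    # Users whose bucket falls below the percentage get the new build.
    return "new" if bucket(user_id) < rollout_percent else "old"
```

With `rollout_percent` at 0, only the engineering team reaches the new build. Raising it to 5, 10, 20 and so on widens the audience without moving anyone back to the old build, and dropping it to 0 again sends every customer back to the old build.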

Whilst simple in principle, there are some issues that you need to plan for from the very start. The first is data and data schemas, which need to function correctly across both old and new builds. As long as the services used by your web app are designed to handle at least one rollback after a new build is deployed, and your new build understands both the old and new data, then you should be okay.
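One way to let the new build understand both old and new data is to tag each record with a schema version and branch on it when reading. The sketch below is only an illustration under assumed names: the `Customer` type, the `schema_version` field and the two record shapes are invented for the example and are not taken from any real schema.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    first_name: str
    last_name: str


def load_customer(record: dict) -> Customer:
    """Read a record written by either the old or the new build."""
    # Assumption: records written by the old build carry no version field.
    version = record.get("schema_version", 1)
    if version == 1:
        # Old shape (assumed): a single "name" field, split at the first space.
        first, _, last = record["name"].partition(" ")
        return Customer(first, last)
    if version == 2:
        # New shape (assumed): separate first and last name fields.
        return Customer(record["first_name"], record["last_name"])
    raise ValueError(f"unsupported schema_version: {version}")
```

This only covers the read side in the new build. For a rollback to be safe, the old build would also have to cope with records written by the new one, for example by having the new build keep writing the old field during the transition.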

The second issue is API/interface changes. Rather than editing or deleting methods or parameters, you need to create a new API/interface that mostly redirects to the old API/interface, except for the new/changed code.
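As a rough sketch of that pattern, the old function stays exactly as it is and a new one sits next to it, delegating to the old code for everything that has not changed. The functions `get_price_v1` and `get_price_v2`, the price table and the `currency` parameter are all hypothetical, chosen only to show the shape of the idea.

```python
def get_price_v1(item_id: str) -> float:
    """Existing interface, left untouched so the old build keeps working."""
    prices = {"widget": 10.0, "gadget": 25.0}  # stand-in data for the example
    return prices[item_id]


def get_price_v2(item_id: str, currency: str = "USD") -> dict:
    """New interface: delegates to v1 and adds only the changed behaviour."""
    amount = get_price_v1(item_id)  # unchanged logic is reused, not copied
    # The only new behaviour here is returning the amount labelled with a currency.
    return {"amount": amount, "currency": currency}
```

Callers in the old build keep using `get_price_v1`, while the new build calls `get_price_v2`, so either build can be live without breaking the other.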

Other issues include incompatible changes to configuration and settings between builds. These issues aren't fatal, but you do need to do some planning and testing beforehand. And the big reward is that you can safely do final testing of your code in production without affecting your customers.

Some links on testing in production:

  • There's no place like production
  • The future of software testing
  • Production is a mixed blessing
  • TIP - malpractice?
  • TIP really happens
  • Why TIP isn't as stupid as it sounds
HTTP 410 answered Oct 10 '22 02:10
