What kind of safeguards do you use to avoid accidentally making unintended changes to your production environment?

StackOverflow https://stackoverflow.com/questions/956428

Question

Because we don't have a good staging environment we often have to debug issues on our production systems. We have web, application, and database servers.

What kind of safeguards do you use to avoid accidentally making unintended changes to your production environment when doing this?


EDIT:

The application is a very complex B2B vertical web application. There is a lot of data involved. Some tables have close to 100 million records.


EDIT:

The staging environment we have in place does not have the capacity to mirror production. There are also hundreds of gigabytes of data files involved besides the actual database data.


EDIT:

We do use source control for the code but not for the stored procedures. There are some old stored procedures in source control but nobody keeps that updated anymore.

The main concerns are the database and data on the file system.

BTW, I am a consultant at this company, not an actual employee.

Solution

The most direct answer is: "Don't do that."

OTHER TIPS

Source control. There's nothing like a rollback when things go irreparably wrong. Also, a diff can help you replicate the changes to other production systems.

New production releases go via our systems guys. Programmers and developers can only request that their new system go live; approval is needed as well, and we show that each change has been tested (by including a snapshot of everything that was tested for this release in the production request).

We keep the previous production releases for fallback in case of issues.

If things do break (which they shouldn't do often with a proper testing procedure and managed releases) then we can either roll back, or hotfix. Often when things are broken in live and the fix is small, we can hotfix, then move the fix to test to do a proper test.

Regardless, sometimes things get by...

Only allow certain accounts write access, so you have to log in differently to make a change.

On the web server, have two directory structures that mirror each other: a production one where only one ID can write, and a staging one where everyone can write.

On the database server, have one production DB where only one ID can write, and a staging DB where everyone can write. The staging DB can have the nightly backup restored to it.

HOWEVER, because staging shares the machine with production, a bad query or some other resource hog in staging will pull resources from production, and the machine could hang.

For Web and Application Servers, I would try to copy the environment to a new location (but on the same environment) and have the affected people reproduce behavior on the copy. This will at least give you a level of separation from accidentally screwing with 100% of your clients.

For Database Servers, I would configure the user accounts used for debugging on the production system with read-only rights.

Read-Only/Guest accounts. Seriously. It's the same reason you don't always log in as root or Administrator.
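To make the read-only idea concrete, here is a minimal sketch in Python. It uses SQLite only because it runs anywhere with no setup; on a real database server you would get the same effect with a login that has been granted read rights only. The file name, the orders table, and the helper names are made up for illustration.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(db_path):
    """Open an SQLite database so that any write attempt fails."""
    # mode=ro in the URI tells SQLite to refuse writes on this connection.
    uri = Path(db_path).as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def demo():
    # Build a throwaway database standing in for production (hypothetical schema).
    db_path = os.path.join(tempfile.mkdtemp(), "prod_standin.db")
    writer = sqlite3.connect(db_path)
    writer.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    writer.execute("INSERT INTO orders (status) VALUES ('open')")
    writer.commit()
    writer.close()

    # Debugging session: reads work, writes are rejected by the database.
    reader = open_read_only(db_path)
    rows = reader.execute("SELECT id, status FROM orders").fetchall()
    try:
        reader.execute("UPDATE orders SET status = 'closed'")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    finally:
        reader.close()
    return rows, write_blocked


if __name__ == "__main__":
    print(demo())
```

The point is that the protection lives in the connection (or account), not in the care of whoever is typing the query.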

This is a tough thing, and it goes with the territory of "no staging environment."

For many reasons, it's best to have a dedicated duplicate of PROD that you can stage deploys to and debug on, but I know that sometimes when you're starting out that doesn't work out as quickly or thoroughly as we'd want.

One thing I've seen work is the use of VMs: aside from the debug environment, you can create a mini-PROD in a VM and use that to debug. This may not be practical given the type of app you're developing, so additional detail in that area would be helpful.

As for avoiding changes to PROD during debugging: is there a reason you'd need to change anything to facilitate debugging? If so, that might be worth looking into solving another way.

Version control is immensely helpful for controlling changes to production environments - just make your production environment a working copy of the appropriate directory or directories from the repository. When you roll out an update, your source control system makes sure that ALL the changed files get copied. When an update breaks something, you can roll the production working copy back to the last revision which wasn't broken. Also, you can check your production WC out from a tag instead of from the trunk; that way you can decide which repository revisions to apply to the production environment by adjusting the tag.

If you're not familiar with the concepts of version control systems, I'd advise you to do some research. They're conceptually complex but incredibly useful and powerful. The Wikipedia article is a good place to start: http://en.wikipedia.org/wiki/Revision_control

I'm sorry, you have to have a staging environment. There's no getting around this. If it means you have to cull the size of your datasets, then that's what you have to do. Use VMware and VMware converter to import the production systems during down-periods, if you have them (this is a many-hour process, so maybe not practical).

There is a certain class of problems you can't solve without full access to production DBs (or a copy); performance is one of these. But you really should build a staging environment, even if it's on someone's desktop machine with a stripped-down dataset.

That aside, I've had to live my life with a few of these in the past, and really, there's nothing you can do except lots of backups. Every change you make should be preceded by an incremental backup. That way, if you fubar something, the amount you lose is not substantial. SQL Server can take differential backups, which limit the amount of disk space used for backups. Oracle can as well.
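As a rough sketch of "back up before every change", the Python below wraps a change in a snapshot-and-restore helper. It is only an illustration under some assumptions: it uses SQLite's backup API, which takes a full copy rather than a differential one, and the helper name backed_up_change and the accounts table are hypothetical. On SQL Server or Oracle you would use their own backup tooling instead.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def backed_up_change(conn):
    """Snapshot the database before a change; restore it if the change fails."""
    snapshot = sqlite3.connect(":memory:")
    conn.backup(snapshot)        # full copy of the current state
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()          # undo anything still uncommitted
        snapshot.backup(conn)    # overwrite the database with the snapshot
        raise
    finally:
        snapshot.close()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts (balance) VALUES (100)")
    conn.commit()
    try:
        with backed_up_change(conn) as c:
            c.execute("UPDATE accounts SET balance = 0")
            c.commit()  # the bad change is committed, so a rollback alone won't help
            raise RuntimeError("noticed the update was wrong")
    except RuntimeError:
        pass
    return conn.execute("SELECT balance FROM accounts").fetchall()


if __name__ == "__main__":
    print(demo())
```

The demo commits the bad update on purpose: that is the case where a plain transaction rollback is too late and only the backup gets you back.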

In case you really have no other choice, and it is likely to be a chronic situation... consider adding some way in the application data (files or database) to flag a set of data as 'please god do not actually change production state with this data'. Combined with data dumps at critical positions in a process when this flag is activated, this may let you exercise most of the production logic without the data actually being acted upon.
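One way such a flag could look, sketched in Python. Everything here (the Job class, the dry_run field, the apply_discount step, the price store) is invented for illustration; the idea is just that the flag travels with the data, the logic runs as usual, and the one line that changes state is skipped while a dump of what would have happened is recorded.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    payload: dict
    dry_run: bool = False                      # hypothetical "do not touch prod" flag
    dumps: list = field(default_factory=list)  # data dumps collected along the way


def apply_discount(job, store):
    """Compute a new price; write it to the store only when not in dry-run mode."""
    item = job.payload["item"]
    new_price = round(store[item] * (1 - job.payload["discount"]), 2)
    # Data dump at a critical position: record what would be written.
    job.dumps.append({"step": "apply_discount", "item": item, "new_price": new_price})
    if not job.dry_run:
        store[item] = new_price                # the only line that changes state
    return new_price


if __name__ == "__main__":
    prices = {"widget": 10.0}
    debug_job = Job({"item": "widget", "discount": 0.25}, dry_run=True)
    print(apply_discount(debug_job, prices), prices, debug_job.dumps)
```

Keeping the state change down to a single guarded line per step makes it easier to check that a flagged run really cannot write anything.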

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow