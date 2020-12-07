Noisy neighbor issues in IT are a major problem that can be difficult to deal with. A noisy neighbor is a workload that's using up one or more resources on a platform in a manner that restricts the availability of such resources for other workloads.

On physical platforms, this is only a problem for a single workload. A memory leak or overly chatty storage setup can bring the workload to a halt but won't affect other workloads, as they have their own dedicated resources to use. The exceptions are resources that are shared even between physical platforms: a SAN or NAS, or a shared WAN link.

It's only when you use more virtualized resources that noisy neighbor workloads become a widespread issue. Because virtual platforms share resources and manage them dynamically, one workload can use up all available resources, leaving none for other workloads when they need them most.

Administrators need a way to prevent noisy neighbor issues in the first place and to deal with them if they become an issue.

For the purposes of this article, we'll consider two different multi-tenant environments: a relatively controlled colocation environment and a public cloud platform.

Managing noisy neighbors in a colocation facility

In a colocation facility, an organization has its own dedicated cage or area where it installs, provisions and manages its own servers, storage and local area networks. This contained environment uses the facility's broader network capabilities to gain access to other third-party services and external access to the internet, as well as external access to the services provided by the managed platform.

Here are the main ways to prevent and manage noisy neighbor issues:

Ensure code is correctly written. Even in a DevOps environment, it's vital that code is tested to check that there are no memory leaks and that network chatter is optimized to be as low as possible. Stress testing to see how the workload deals with peak loads is also necessary to see what resources the workload attempts to use.

Set resource limits. These parameters should define how the platform should react as the workload flexes. For example, how much extra CPU

Prioritize workloads. The last thing an organization needs is for a low-level, non-profit-generating workload to use up resources while highly important workloads are unable to gain the extra resources they require. In addition to setting resource limits, prioritize your workloads to best meet the organization's needs.

Redeploy workloads to alternative platforms. A badly behaving workload might still need to run while the underlying problem is dealt with. It should be possible to spin the workload up on a less-used or less important part of your overall platform using virtual machines or containers.

Enable human intervention. As mentioned, monitoring systems must be in place to identify any issues and flag them. Alongside this, administrators must be able to set off events that can throttle a specific workload so its demands on the platform's resources are minimized to allow other workloads to use the resources they need.
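To make the interplay of resource limits, workload priorities and manual throttling concrete, here is a minimal Python sketch. It is a toy model rather than the API of any real hypervisor or orchestrator: the Workload class, the allocate_cpu and throttle functions, the workload names and the convention that a lower priority number means a more important workload are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload on a shared platform."""
    name: str
    priority: int      # assumption: lower number = more important
    cpu_limit: float   # hard cap on the CPU cores this workload may hold
    cpu_demand: float  # cores the workload is currently asking for


def allocate_cpu(workloads, capacity):
    """Grant CPU in priority order, never exceeding a workload's limit."""
    grants = {}
    remaining = capacity
    for w in sorted(workloads, key=lambda w: w.priority):
        # A workload gets the smallest of: what it wants, what it is
        # allowed to have, and what is still free on the platform.
        grant = min(w.cpu_demand, w.cpu_limit, remaining)
        grants[w.name] = grant
        remaining -= grant
    return grants


def throttle(workload, factor):
    """Manual intervention: scale a workload's limit by factor (0 to 1)."""
    if not 0 <= factor <= 1:
        raise ValueError("factor must be between 0 and 1")
    workload.cpu_limit *= factor


if __name__ == "__main__":
    billing = Workload("billing", priority=0, cpu_limit=8, cpu_demand=6)
    web = Workload("web", priority=1, cpu_limit=8, cpu_demand=8)
    # The noisy neighbor: it asks for far more than its limit allows.
    reporting = Workload("reporting", priority=2, cpu_limit=4, cpu_demand=12)

    print(allocate_cpu([reporting, web, billing], capacity=16))

    # An administrator halves the noisy workload's limit by hand.
    throttle(reporting, 0.5)
    print(allocate_cpu([reporting, web, billing], capacity=16))
```

In this model, the limit stops the low-priority reporting workload from taking the 12 cores it asks for, the priority order ensures the important workloads are served first, and throttle stands in for the event an administrator would trigger by hand. On a real platform these roles would be filled by the platform's own quota, priority and throttling controls.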