Keep your server sharp and upgrade memory techniques
The importance of on-server memory is changing server design and configuration, accommodating more memory and higher memory performance. This means overhauling how servers deal with memory errors and reliability.
How important are on-server memory features now that our data center is consolidated onto virtualized servers?
As virtualization expands across most data centers, fewer hardware platforms are responsible for a greater number of workloads.
Memory capacity and reliability are critical to successful consolidation and workload integrity. Correspondingly, the effect of any server fault is multiplied by the number of workloads running on the server. For example, if a server running 10 workloads experiences a memory fault that causes a system crash or reboot, all 10 workloads are affected until the system restarts or each fails over onto other servers.
New technologies are bolstering memory resilience far beyond error correction code (ECC) and memory sparing. These developments address correctable errors over the long term, tipping off administrators to chronic memory faults. Server administrators can inspect and replace questionable components during routine maintenance before hard failures occur.
An error threshold lets a dual in-line memory module (DIMM) track the location and frequency of correctable errors -- those that ECC can catch and fix on the fly -- using serial presence detect (SPD) error logging and other DIMM capabilities.
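To make "correctable" concrete, here is a toy sketch of single-bit error correction using a Hamming(7,4) code. This is an illustration only: server ECC DIMMs typically protect 64-bit words with 8 check bits in hardware, and the function names `encode` and `decode` are assumptions made for this example, not part of any memory controller interface.

```python
def encode(data):
    """Encode 4 data bits into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Return (data bits, corrected position); position 0 means no error."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 1-based index of the flipped bit
    if position:
        c[position - 1] ^= 1  # fix the single-bit error on the fly
    return [c[2], c[4], c[5], c[6]], position


stored = encode([1, 0, 1, 1])
stored[4] ^= 1  # simulate a single-bit memory fault at position 5
recovered, fixed_at = decode(stored)
print(recovered, fixed_at)
```

The decoder both repairs the data and reports where the fault occurred, which is the kind of location information an error log can accumulate over time.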
If advanced ECC is implemented, the system can detect and recover from multi-bit errors. Advanced ECC splits data words between separate ECC DIMMs, which usually means deploying matched DIMMs with the same capacity and rank. An even number of DIMMs should be installed in the server.
When correctable errors on a DIMM exceed the set threshold, the server treats the problem as chronic, and the error report can alert the systems management tool to flag the DIMM for pre-emptive replacement. Some servers go a step further and effectively remove an entire memory page from use while the remainder of the DIMM stays in service, or use memory sparing to switch operations over to a spare module.
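The threshold logic described above can be sketched roughly as follows. This is a minimal model, not a vendor API: the class `DimmErrorLog`, the slot label, the 4 KB page size and the threshold value are all assumptions chosen for illustration, and real servers implement this in firmware and management controllers.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DimmErrorLog:
    """Hypothetical per-DIMM log of correctable errors."""
    slot: str
    threshold: int          # assumed policy value set by the administrator
    page_size: int = 4096   # assumed page size in bytes
    errors_by_page: Counter = field(default_factory=Counter)

    def record_correctable_error(self, address: int) -> None:
        # Track where the error occurred (by page) and how often.
        self.errors_by_page[address // self.page_size] += 1

    @property
    def total_errors(self) -> int:
        return sum(self.errors_by_page.values())

    def needs_replacement(self) -> bool:
        # Flag the DIMM once correctable errors exceed the threshold.
        return self.total_errors > self.threshold

    def worst_page(self):
        # Page with the most errors: a candidate for removal from use.
        if not self.errors_by_page:
            return None
        return self.errors_by_page.most_common(1)[0][0]


log = DimmErrorLog(slot="DIMM_A1", threshold=3)
for address in (0x1000, 0x1008, 0x1010, 0x9000):
    log.record_correctable_error(address)

if log.needs_replacement():
    print(f"{log.slot}: flag for replacement, worst page {log.worst_page()}")
```

Keeping counts per page rather than per DIMM is what makes both responses possible: the total drives the replacement alert, while the per-page tally identifies a page that could be taken out of use.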
When deploying memory sparing or mirroring, use DIMMs in two channels that are matched in capacity and rank. This ensures the system can switch to a backup DIMM with precisely the same data format as the original memory module. If dissimilar DIMMs are used, the server BIOS may detect the difference and disable these features.
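A pre-installation check along these lines could be sketched as below. The `Dimm` record, its fields and the rule "at least two channels populated, all DIMMs matched" are simplifying assumptions for this example; actual population rules differ by platform and are defined by the server vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimm:
    slot: str          # hypothetical slot label
    channel: int
    capacity_gb: int
    ranks: int


def mismatched_dimms(dimms):
    """Return DIMMs whose capacity or rank differs from the first DIMM."""
    if not dimms:
        return []
    reference = (dimms[0].capacity_gb, dimms[0].ranks)
    return [d for d in dimms if (d.capacity_gb, d.ranks) != reference]


def sparing_or_mirroring_ok(dimms):
    """Assumed rule: two or more channels populated and all DIMMs matched."""
    channels = {d.channel for d in dimms}
    return len(channels) >= 2 and not mismatched_dimms(dimms)


matched = [Dimm("A1", 0, 16, 2), Dimm("B1", 1, 16, 2)]
mixed = [Dimm("A1", 0, 16, 2), Dimm("B1", 1, 8, 1)]

print(sparing_or_mirroring_ok(matched))
print(sparing_or_mirroring_ok(mixed), [d.slot for d in mismatched_dimms(mixed)])
```

Running a check like this against a planned parts list before installation can catch a mismatch that would otherwise only surface when the BIOS disables the feature.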
Server technicians should always refer to the new system's documentation to determine the appropriate number and type of memory modules needed to meet the required level of resiliency.