On The Road to Recovery

Planning for the calamities that can have a negative impact on a data center

Earthquake. Fire. Hurricane. Blackout. Virus. Terror attack. Any of these natural or man-made events can cause the obliteration of business data. And if that isn’t frightening enough, Sarbanes-Oxley, HIPAA and Securities and Exchange Commission requirements will really scare you. Face it: data needs to be kept alive and accounted for to survive and thrive in the testing conditions of modern-day business.

The challenge is to provide solutions that allow a business to continue to operate in the event of any number of calamities that can have a negative impact on a data center. The problem is not knowing which one it will be or how bad it will get—but each scenario involves the data center being wiped out or, if luck prevails, just being offline for a few days.

The two biggest challenges for disaster recovery in IT are the movement of data to the recovery site and the actual process of recovery. Now that virtualization technology is mainstream, and thanks to the emergence of new IT business models, such as Infrastructure as a Service (IaaS), some compelling new ways to approach the problem are becoming available.

Getting Warmer
The ideal scenario for most organizations is a hot site. A hot site is a near replica of the entire production environment at another data center, ideally several hundred miles away in a different geographic region. Unfortunately, hot sites are extremely expensive and can increase production costs by as much as 250 percent. This large cost increase generally drives most companies away from implementing a true hot site.

The majority of companies end up with a cold site—a physical location held in reserve with a promise from the supplier that there will be an adequate number of computers waiting when they arrive. A cold site is a form of insurance. Businesses share the costs with other companies and hope that not everyone has a disaster at the same time. This makes cold sites a much more affordable option.

Clearly, neither of these options—the hot site or cold site—offers sufficient protection for today’s data-dependent businesses. The best available option, however, may be a combination of the two.

Odds are that you’ve already used virtualization technology in the data center to consolidate old servers or perform testing. Using the same concepts, it’s possible to create a warm site—a hybrid between a hot site and a cold site—at a secondary location. Using replication software like Double-Take, one normally needs a one-to-one ratio of servers in the data center and at the recovery site.

Caught on Tape
Many companies have a 36-hour recovery time objective (RTO) because they have determined that’s how much time it will take to get the backup tapes, fly to the recovery site, pull everything off of tape, test the systems and be back online. That is, if it works.

Oftentimes the recovery hardware isn’t exactly the same as the hardware used in production. Tape-based recovery isn’t perfect, either. A tape might contain a good copy of your data, but the information about the network (IP addresses, registry configurations, patch levels) is frequently not on it. Reconstructing that information takes a huge amount of time, even if you have great (and up-to-date) documentation.

With IaaS, instead of purchasing a stock of extra servers and a SAN, it’s possible to rent 60 processor cores, 2 TB of storage and 64 GB of memory, and pay on a monthly or quarterly basis. Most IaaS vendors run VMware or a similar hypervisor that enables virtualization.

This hypervisor approach is the key to putting a shim between the hardware and your environment, allowing your environment to scale, move around and be replicated. It is also what makes an IaaS provider different from a traditional service provider or hosting center.

It All Adds Up
Suppose a company has approximately 100 physical servers in its environment. Forty of these are designated as Tier 1 mission critical. If these servers failed, so would the business.

In the old model, the choices would be a hot, cold or warm site. Assuming a fully loaded cost of $6,000 per server, a true hot site approach means placing approximately $240,000 worth of server equipment at the disaster recovery site. Add to this the cost of co-location at $4,000 a month (including 5 kW of power/cooling load per cabinet) for two cabinets. This figure doesn’t include switches, routers, operating systems, replication software, human resources, bandwidth and recovery testing.
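To make the old-model arithmetic concrete, here is a minimal sketch of the hot-site costs above; the figures are the scenario’s illustrative assumptions, not vendor quotes:

```python
# Minimal sketch of the traditional hot-site cost math. All figures
# are illustrative assumptions from the scenario above.

TIER1_SERVERS = 40          # mission-critical servers to protect
COST_PER_SERVER = 6_000     # fully loaded cost per server (USD)
COLO_PER_MONTH = 4_000      # two cabinets, incl. 5 kW per cabinet (USD)

server_capex = TIER1_SERVERS * COST_PER_SERVER   # $240,000
colo_per_year = COLO_PER_MONTH * 12              # $48,000

print(f"Hot-site server equipment: ${server_capex:,}")
print(f"Co-location, per year:     ${colo_per_year:,}")
# Excludes switches, routers, operating systems, replication
# software, staff, bandwidth and recovery testing.
```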

Given that the remote site is not using resources on the servers, it’s a perfect opportunity for consolidation. Conservatively, consolidate virtual servers onto physical servers at a ratio of 10-to-1, since this is for disaster recovery, not production. That means four physical servers can do the work of 40, using 90 percent less co-location space. If the co-location center bills in minimum increments of half a cabinet, this puts the cost at $1,000 per month. Now beef up the memory, adding 16 GB to each host, and keep in mind that the VMware licenses also cost money. But even with these additional costs, savings will net more than $200,000 per year.

However, there are other costs involved, such as replication software. This can range dramatically in price, depending on features and vendors. Use an agent-based approach, and figure the cost per agent to be $2,500. Even though there are now only four servers at the recovery site, it’s still 40 virtual machines. That means there are 40 agents in production and 40 agents at the recovery site, for a total of 80 agents, amounting to $200,000 in software licenses.
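To see how the warm-site numbers in the two paragraphs above add up, here is a minimal sketch under the same illustrative assumptions (10-to-1 consolidation, half-cabinet billing, $2,500 per agent):

```python
# Minimal sketch of the consolidated warm-site math, using the
# article's illustrative figures. All prices are assumptions.

TIER1_SERVERS = 40
CONSOLIDATION_RATIO = 10        # DR-only, so a conservative 10-to-1
COST_PER_SERVER = 6_000         # fully loaded (USD)
HOT_COLO_PER_MONTH = 4_000      # two cabinets (USD)
WARM_COLO_PER_MONTH = 1_000     # half-cabinet minimum billing (USD)
AGENT_COST = 2_500              # per replication agent (USD)

dr_hosts = TIER1_SERVERS // CONSOLIDATION_RATIO            # 4 hosts
server_capex_saved = (TIER1_SERVERS - dr_hosts) * COST_PER_SERVER
colo_saved_per_year = (HOT_COLO_PER_MONTH - WARM_COLO_PER_MONTH) * 12
agents = TIER1_SERVERS * 2                                 # production + DR
agent_licenses = agents * AGENT_COST

print(f"DR hosts needed:      {dr_hosts}")                       # 4
print(f"Server capex avoided: ${server_capex_saved:,}")          # $216,000
print(f"Co-lo saved per year: ${colo_saved_per_year:,}")         # $36,000
print(f"Replication agents:   {agents} -> ${agent_licenses:,}")  # $200,000
# Extra host memory and VMware licenses offset part of the savings.
```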

Virtual Reality
SAN vendors and customers already know the secret here. They boot all of the servers from the SAN and use SAN replication software to transport data to the recovery site. However, there are two problems with this method. Depending on the SAN vendor, that replication software might require an additional license. It also might mean purchasing one or more additional SANs for the recovery site. Not only does this add expense, but the recovery picture gets complicated. With both the agent-based and SAN-based approaches, recovering the actual data is possible, but the configuration information that’s so important is lost. Although better than tape, it is still not a perfect system.

Here is where virtualization comes to the rescue once again. Whether using SAN-to-SAN replication or an agent-based approach, virtualize the production servers. This makes it possible to replicate at the VMware level instead of within Windows®.

Take the existing scenario of 40 servers. Say these are spread across four very robust physical servers. If a software replication product that runs on ESX is employed, you only need four licenses at the production site and a similar number at the remote site. This means even more significant cost savings, as you are replicating all of the patches, configuration data and permissions by sending fully bootable virtual machines instead of SQL data or an Exchange store.
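Under the hypothetical assumption that an ESX-level replication license costs about the same per host as a per-guest agent, the license arithmetic looks like this; the per-host price is a placeholder, not a quoted figure:

```python
# Hypothetical license arithmetic for host-level vs. per-guest
# replication. HOST_LICENSE_COST is an assumed placeholder price.

VMS = 40                    # protected virtual machines
HOSTS_PER_SITE = 4          # consolidated ESX hosts per site
SITES = 2                   # production + recovery
AGENT_COST = 2_500          # per-guest agent (USD, from the earlier example)
HOST_LICENSE_COST = 2_500   # assumed per-host price (USD)

per_guest_total = VMS * SITES * AGENT_COST                   # 80 agents
per_host_total = HOSTS_PER_SITE * SITES * HOST_LICENSE_COST  # 8 licenses

print(f"Per-guest agents:  ${per_guest_total:,}")   # $200,000
print(f"Per-host licenses: ${per_host_total:,}")    # $20,000
```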

Even though companies are virtualizing more production servers than ever, a company still may not be ready to virtualize certain servers. A new class of replication agents is coming. These will enable the user to take a traditional, non-virtualized server and convert it into a virtual machine during the replication process. This way, the source production server stays the same, but users gain all of the efficiencies of using virtual machines instead of physical servers at the recovery site.

In the new model, recovery and testing work the same way. Since there are fully-replicated, bootable virtual servers at the recovery site, one simply needs to access them remotely and power them up.

In most cases, virtual machines can be left in a heavily consolidated mode if they’re just being tested. During testing, users have the opportunity to make sure they can view tables, access files or even mount a mail store and open a replicated e-mail inbox.

During an actual or simulated recovery, it might be necessary to spread virtual servers around on enough physical servers to provide the performance the production environment normally requires.

This is where IaaS can be a huge help. An IaaS provider already has many racks of servers and network capacity available on demand. The only thing that’s required is a small foothold in that environment that can replicate data. Expensive resources, such as processing power and memory, aren’t necessary until a disaster or a test. Since the IaaS vendor already has the capacity available, this process can generally be completed in a few hours or less. In a declaration or test, the provider can expand the environment from one or two replication servers onto 20, 30 or 100 real physical servers, whatever is appropriate to equal the resources that were there in production.

Bonus Features
Virtualization software features like VMware’s DRS will even allow users to move already-booted (but slow) servers from the consolidated hardware onto new hardware resources as they become available, without any downtime.

As a bonus, push development, staging or other tertiary applications out to run in production at the DR site on the DR equipment. That way, IaaS can actually be considered a production investment. This makes the CFO happy.

An even bigger challenge is to recover people and business processes. Suppose a prolonged power outage in one facility forces a company to shift data to computers in another state. What about its employees? Instead of moving people into a workplace recovery center, with IaaS, all an employee needs is remote Internet access from a home office until the issue is resolved.

Welcome to the brave new world of disaster recovery. These new approaches go a long way toward making business continuity simpler, more affordable and more reliable than ever.
