RedLegg

The IN’s and OUT’s of Information Technology

Archive for the ‘EMC’ Category

Why Data Deduplication

Posted by Laura on July 23, 2009

Data growth is GROWING.  Hopefully not like the ouchy on Sean’s red leg!!!

sean's red leg

Of course the reason for the growth is the dependence on digital assets to conduct business and the need to support a growing mobile workforce.  Collaboration, Web 2.0 applications and the use of messaging systems also contribute to information growth, and so do many other things: digital signatures, less paper and more scanned documents.  As more information is created, more primary storage is of course required.  The increase in capacity may enlarge the primary storage system footprint in the data center and potentially require your company to secure or rent additional floor space.  Operating costs such as power & cooling, additional networking infrastructure, redundancy components and resource management software licensing will also grow with the storage.  And of course an increase in primary storage triggers an increase in secondary storage (disk and/or tape), media management servers, backup software licensing, backup reporting software licensing and offsite media expenses.  Also, you cannot forget about the remote and branch offices.  Their data is also increasing, and distributed data at these locations must also be considered.

Primary data growth is expensive, but the biggest contributor to the cost of information is ALL of the copies made for data protection purposes.  ESG asked 400 IT people what their greatest data protection challenge was, and the top answer was “keeping pace with the capacity of data to protect” (ESG Research Report, Data Protection Market Trends, January 2008).  Almost all organizations have a standard process in place to protect their digital records, which of course means you make a copy of a volume, LUN or file at one or more points in time during the day and save the copy locally for operational recovery and/or at an offsite location for disaster recovery.  But the problem is that data protection operations can be inefficient: backup applications make many backup copies of the same (or slightly modified) file when only a small amount of the data within the file has actually changed.  Dozens of copies of the same data may be made and stored for lengthy periods of time, even when the file is not changing or has lost its usefulness.  Something like this is typical:

  • A file is created and backed up on the same day
  • The file is continually updated & backed up over a week
  • The file is then emailed to a group of people and is backed up again as part of the email application backup
  • One or more of those people modify the file, and each modified version is backed up again
  • In the meantime every on-premises copy of the backup is replicated offsite, doubling the copy instances
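
To get a feel for how quickly those copies pile up, here is a rough back-of-the-envelope sketch in Python.  The function name and every number in it are made up for illustration (seven daily backups, five email recipients, two people who modify the file), and it assumes the email backup keeps one attachment copy per recipient mailbox, which not every mail system does.

```python
def count_copies(daily_backups=7, email_recipients=5, modified_by=2, offsite=True):
    """Count stored instances of one file across the lifecycle in the list above.

    All default values are illustrative assumptions, not measured data.
    """
    copies = 1                   # backed up on the day it is created
    copies += daily_backups      # updated and backed up again each day for a week
    copies += email_recipients   # assumed: one attachment copy per mailbox in the email backup
    copies += modified_by        # each modified version is backed up again
    if offsite:
        copies *= 2              # every on-premises copy is replicated offsite
    return copies


print(count_copies())               # 30
print(count_copies(offsite=False))  # 15
```

Even with these small made-up numbers, one file turns into 30 stored instances, and most of them are nearly identical.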

Highly redundant backup files clog LANs, WANs and SANs and consume on- and off-premises storage capacity.

A lot of the time companies add to the data protection capacity problem by implementing new technologies to solve other IT problems.  For example, lots of organizations are pursuing data center consolidation & GREEN initiatives (yes I hate that word but it is being used a lot) and deploying server virtualization solutions.  These solutions allow you to run multiple servers on a single piece of hardware, which drives up utilization.  HOWEVER, more than a third of the organizations that have deployed virtualization technology have seen an INCREASE in the total amount of data they need to back up.  Since virtual machine disk images contain operating systems, applications and data, there is a high amount of redundant information across the virtual machines on a single physical server.  The .vmdk files for 10 virtual machines running Windows will contain 10 very similar sets of binaries, patches & auxiliary applications.
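
Here is a quick sketch of what that redundancy could mean in numbers.  The sizes are hypothetical (20 GB of shared OS and application bits plus 5 GB of unique data per VM), and it assumes the shared part is byte-for-byte identical across VMs, so treat the result as a best case rather than something you would measure on real .vmdk files.

```python
def vm_backup_sizes(vm_count=10, shared_gb=20.0, unique_gb=5.0):
    """Estimate backup size for similar VMs with and without deduplication.

    shared_gb and unique_gb are illustrative assumptions; the shared part is
    assumed to be fully identical across VMs (best case).
    """
    raw_gb = vm_count * (shared_gb + unique_gb)       # every image stored in full
    deduped_gb = shared_gb + vm_count * unique_gb     # shared part stored only once
    return raw_gb, deduped_gb, raw_gb / deduped_gb


raw, deduped, ratio = vm_backup_sizes()
print(raw, deduped, round(ratio, 2))  # 250.0 70.0 3.57
```

So 10 similar Windows VMs that look like 250 GB to the backup application might only hold about 70 GB of distinct data under these assumptions.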

So it is tricky.  You have lots and lots of data, longer retention policies to KEEP data and less money to spend.

So check out data deduplication… it is a good way to help control storage capacity & cost.  As you have heard, data deduplication identifies and eliminates redundant data.  It can be performed at the file, block or byte level.  With data deduplication, data is not stored twice; instead, a pointer to the already stored data is written (which takes up significantly less space).  Data deduplication rates vary with the type of data, frequency of full backups, retention, inter-file and inter-application redundancy, and whether deduplication is local or global, but a reduction ratio of 20:1 is broadly achievable.  Storing more backup data, whether because of more frequent full backups or longer retention times, leads to higher data deduplication ratios.  Using deduplication is good because the capacity and money savings are likely to improve, while also improving the likelihood that data can be recovered from disk.
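
If you want to see the pointer idea in action, here is a minimal block-level sketch in Python.  It is a toy, not how any vendor's product works: the `DedupStore` class, the fixed 4-byte block size and the use of SHA-256 digests as pointers are all my own assumptions for illustration, and the ratio ignores the space the pointers themselves take up.

```python
import hashlib


class DedupStore:
    """Toy block-level deduplicating store (illustrative only)."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, each unique block stored once
        self.files = {}   # file name -> list of digests (the "pointers")

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # store only if not seen before
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        # Follow the pointers to rebuild the original data.
        return b"".join(self.blocks[d] for d in self.files[name])

    def ratio(self):
        # Logical bytes written vs. unique bytes stored (pointer overhead ignored).
        # Only meaningful once at least one non-empty file has been written.
        logical = sum(len(self.blocks[d]) for p in self.files.values() for d in p)
        physical = sum(len(b) for b in self.blocks.values())
        return logical / physical


store = DedupStore(block_size=4)
full_backup = b"AAAABBBBCCCCDDDD"          # four distinct 4-byte blocks
for night in range(1, 6):
    store.write(f"full-{night}", full_backup)
    print(night, store.ratio())            # 1.0, 2.0, 3.0, 4.0, 5.0

store.write("full-6", b"AAAABBBBCCCCEEEE")  # only the last block changed
print(store.ratio())                        # 4.8
```

Notice that every extra full backup of unchanged data pushes the ratio up, which is the retention effect described above, and that a backup with one changed block only adds that one block to the store.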

Data Domain this week announced its new DD880 enterprise dedupe appliance, which will probably be its last new product introduction before being acquired by EMC.  It is the fastest backup array on a per-controller basis regardless of whether data deduplication is factored in or not.  Throughput on this device is 5.4 TB per hour in aggregate, or 1.28 TB per hour for a single stream, thanks to its new quad Intel processors.  It can replicate from up to 180 remote sites into a single central location, doubling the DD690’s 90:1 replication ratio.  It is scheduled to ship in the 3rd quarter of this year.  I will update this blog post later with the list pricing.
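
To put those throughput numbers in perspective, here is a tiny sketch that turns them into backup window estimates.  The data sizes are made up, and it assumes the quoted rates are sustained for the whole job, which real backups rarely manage.

```python
AGGREGATE_TB_PER_HOUR = 5.4       # quoted aggregate throughput
SINGLE_STREAM_TB_PER_HOUR = 1.28  # quoted single-stream throughput


def backup_hours(data_tb, throughput_tb_per_hour):
    """Hours needed to move data_tb at a sustained rate (an idealized assumption)."""
    return data_tb / throughput_tb_per_hour


for data_tb in (10, 50):  # hypothetical backup sizes in TB
    print(
        data_tb,
        round(backup_hours(data_tb, AGGREGATE_TB_PER_HOUR), 2),
        round(backup_hours(data_tb, SINGLE_STREAM_TB_PER_HOUR), 2),
    )
# 10 1.85 7.81
# 50 9.26 39.06
```

In other words, a 10 TB backup fits in under two hours at the aggregate rate but takes most of a working day on a single stream, so how many streams you can run in parallel matters a lot.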

Posted in Data Domain, EMC

Ok really? Data Domain?

Posted by Laura on June 3, 2009

I posted a news article yesterday about the crazy bids going on between NetApp  & EMC. 

There were 394823904823408 questions yesterday on “Who is Data Domain?”  Are they really a big competitor to EMC?  Really?  And worth all of this money?

Why not just buy Quantum?  They are probably only valued at about $246 million… they are much better known & have a quality product offering.

Something else is going on here… NetApp initially wanted to give Data Domain away free as a feature.  Wow… expensive free feature.  If NetApp wins this, there is no way NetApp prices are not going to go up.  EMC already has a de-duplication piece… they want another one too.  This whole thing is really interesting & weird.  Lots of money being thrown around.  I would back out ASAP if I were either of them.  Nuts.

Posted in Data Domain, EMC

EMC Trumps NetApp’s Offer For Data Domain

Posted by Laura on June 2, 2009

EMC on Monday said it had made an offer to acquire Data Domain for $30 per share in a deal worth about $1.8 billion.

That is about 20 percent over the $1.5 billion offered for Data Domain last month by NetApp.

Data Domain is a pioneer in the development of data deduplication technology, and is by far the best-known vendor of the technology. The company manufactures a series of storage appliances that tightly integrate its dedupe technology with dedicated storage capacity.

Both EMC and NetApp currently offer data dedupe technology, which is why the fight between the two over Data Domain is so interesting, according to solution providers.

One solution provider, who asked to remain anonymous, said that EMC’s offer to acquire Data Domain after trashing that company in its sales presentations will be an issue.

“EMC has been telling everyone in the market that its dedupe offering based on technology from Quantum is essentially the same as Data Domain’s,” the solution provider said. “So why would EMC offer $1.8 billion in cash for the same technology it already has? Either EMC has been making fraudulent statements or something else is happening.”

The EMC offer is a lot of money, and it is hard to see why EMC is coming on so strong, especially with an all-cash offer, Norbie said.

“There are three major undercurrents being validated by these two big giants fighting over this relatively unknown company,” he said. “The first two are brand ownership and market awareness, both of which are normally hard to get. The third is a technology leadership that no one else has been able to meet or beat.”

There could even be a more cynical reason for EMC’s offer, Norbie said. “Given the amount of money EMC has, one could argue EMC wants to buy Data Domain to kill it,” he said.

EMC said it made the offer for Data Domain because of its fast-growing revenue base, its strong data-protection-focused management team and sales force, and its complementary storage software technology.

In a statement about its offer, EMC warned Data Domain that, because its offer is identical to the offer of NetApp in all aspects other than the increase in price, Data Domain’s board of directors would risk breaching its fiduciary duties to its shareholders.

EMC also said that, to speed up the deal, it is waiving its right to review financial data from Data Domain, and that Data Domain does not need to enter into discussions or negotiations with EMC or sign any confidentiality agreements with EMC.

hmm we will see what happens…

Posted in EMC