
Redefining Data Center In A Box

Data center in a box is traditionally defined as a “type of data center in which portable, mobile, and modular information nodes are self-contained within a cargo container. It is designed and packaged for quick deployment and acquisition of data center solutions in organizations or facilities, including remote off-site locations.” A data center in a box usually contains equipment from large storage, compute, and network vendors such as EMC, NetApp, Dell, and Cisco, pieced together to form the IT infrastructure. VCE (Virtual Computing Environment), for instance, offers Vblock, a bundled product containing EMC storage, Cisco servers, and VMware virtualization. NetApp has a similar offering called FlexPod.

But innovative new companies such as SimpliVity, Nutanix, and Scale Computing are changing the definition of data center in a box. They are building purpose-built products from the ground up that incorporate not just compute, storage, and network, but also additional services such as data deduplication, WAN optimization, and backup in a box.

For instance, SimpliVity’s OmniCube is described as “a powerful data center building block that assimilates the core functions of server, storage and networking in addition to a wide range of advanced functionality including: native VM-level backup, WAN optimization, bandwidth efficient replication for DR, cache accelerated performance, and cloud integration.”

These products will further simplify the design, implementation, and operation of IT infrastructure. With these boxes, there is no storage area network (SAN) to manage and no additional appliances such as WAN accelerators to deploy. A few virtual machine (VM) administrators can manage all the boxes in a cluster from the VMware server virtualization management interface.

Data center in a box will continue to evolve and will change how we view and manage IT infrastructure for years to come.

Data Migration Using PowerPath Migration Enabler

One project I recently led was the migration of data from an old EMC CLARiiON to a new EMC VNX. There are two general strategies for migrating block data on a storage area network (SAN): storage-based migration (the data is copied directly between the two storage arrays) or host-based migration (the copy is performed by the host). EMC provides several tools for these tasks; SAN Copy, for instance, is an excellent storage-based migration tool.

There are many factors to consider when choosing a migration strategy – the size of the data, cost, SAN bandwidth, complexity of the setup, and application downtime, among others. One strategy that is relatively simple and requires no downtime is to use the host-based migration tool PowerPath Migration Enabler Hostcopy.

This tool is included when you install the full PowerPath software. In version 5.7 SP2, as long as PowerPath itself is licensed, no additional license is needed for Hostcopy (unlike in older versions).

The migration process is non-disruptive: it does not require shutting down the application, and the host remains operational while the migration runs. In general, the steps for migrating data are:

1. On the Windows or Linux host, make sure PowerPath 5.7 SP2 is installed and licensed.

powermt check_registration

2. Check the source disk and record its pseudo name.

powermt display dev=all

3. On the new storage array, present the target LUN to the host (example commands for steps 3, 4, and 13 are shown after the last step).

4. On the host, rescan for new devices and initialize the target disk.

5. Check that the target disk is present and record its pseudo name.

powermt display dev=all

6. Set up the PowerPath Migration Enabler session and note the handle number it returns (used in the commands that follow).

powermig setup -src harddiskXX -tgt harddiskYY -techType hostcopy

7. Perform the initial synchronization.

powermig sync -handle 1

8. Monitor the status of the session.

powermig query -handle 1

9. The data transfer rate can also be throttled.

powermig throttle -throttleValue 0 -handle 1

10. When ready to switch over to the new storage, enter the following command:

powermig selectTarget -handle 1

11. Commit the changes.

powermig commit -handle 1

12. Clean up and delete the session.

powermig cleanup -handle 1

13. Remove the old storage by removing the source LUN from the old storage group.

14. On the host, rescan the HBAs for hardware changes, then remove the old LUNs from PowerPath.

powermt display dev=all
powermt remove dev=all
powermt display dev=all
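
For reference, here is a minimal sketch of the commands behind steps 3, 4, and 13 for a Linux host attached to a CLARiiON/VNX array. The SP address, storage group name, and HLU/ALU numbers are placeholders, and the SCSI host numbers will vary by server, so treat this as an illustration rather than an exact procedure:

# Step 3 (storage side): present the target LUN to the host's storage group
naviseccli -h <SP_IP> storagegroup -addhlu -gname <StorageGroupName> -hlu 10 -alu 25

# Step 4 (host side): rescan the SCSI buses, then bring the new device under PowerPath control
echo "- - -" > /sys/class/scsi_host/host0/scan
echo "- - -" > /sys/class/scsi_host/host1/scan
powermt config

# Step 13 (storage side): after the commit, remove the source LUN from the old storage group
naviseccli -h <SP_IP> storagegroup -removehlu -gname <StorageGroupName> -hlu 9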

For more information about PowerPath Migration Enabler, visit the EMC website.

EMC VNX2 Storage Array Review

VNX is EMC’s unified enterprise storage solution for block and file. The latest release, called VNX2, uses the more advanced Intel Sandy Bridge processors with more cores, and it also has more memory (RAM).

Its FAST VP technology, which dynamically moves data between SSD (flash), SAS, and NL-SAS tiers, has been improved by decreasing the data “chunk” size from 1 GB to 256 MB, allowing more efficient data placement. Also, using SSD as the top tier is new in VNX2.

Its FAST Cache technology has also been improved. Per EMC, “the warm up time has been improved by changing the behavior that when the capacity of FAST Cache is less than 80% utilized, any read or write will promote the data to FAST Cache.”

VNX2 also boasts an active/active LUN configuration. However, active/active LUNs only work when the LUN is provisioned from a RAID Group; they do not work with Storage Pools. Hopefully, active/active LUNs will become available for Storage Pools in the future, because more and more LUNs are being configured from Storage Pools instead of RAID Groups.

Another improvement is that in Unisphere, storage administrators no longer need to set the storage processor (SP) cache settings – the read and write cache sizes and the high and low watermarks. The cache only needs to be turned on or off; the system now adjusts the settings automatically.

There are also no dedicated hot spare drives anymore. You simply leave some drives unprovisioned, and an unbound drive becomes a hot spare when needed. You can set the hot spare policy for each drive type; the recommended ratio is one spare per 30 drives.

I noticed a couple of shortcomings in this release. I do not like the fact that when creating a LUN in a pool, the “Thin” option is now checked by default. I believe thick LUNs should be the default for performance reasons. In addition, if storage administrators are not careful, they may end up over-provisioning the pool with thin LUNs.

On the file side, there is really no major improvement. I believe there are no updates to the data movers, which still function in active/passive mode. One change, though, is that you can now use a Virtual Data Mover (VDM) for NFS, although configuring this requires the CLI.

Overall, VNX2 is one of the best enterprise storage arrays in terms of performance and functionality.

Avamar Backup Solution Review

I recently completed hands-on training on Avamar management and was impressed by its deduplication technology. Deduplication happens on the client side, which means less bandwidth consumed on the network and less storage space used. Backup jobs run very fast, especially subsequent backups after the initial baseline backup.

The Avamar disk-based backup appliance is based on the Linux operating system, so it comes with an excellent set of command line interface (CLI) commands. Its Redundant Array of Independent Nodes (RAIN) architecture provides failover and fault tolerance across its storage nodes. It can also integrate with Data Domain as back-end storage. It has extensive support for VMware, and for NAS appliances via the NDMP accelerator node. Avamar servers can be replicated to other Avamar servers located at a disaster recovery site. The management GUI is intuitive for the most part, and it is very easy to perform backups and restores.

However, I also found several shortcomings that, if addressed, could improve the product. First, the management GUI does not have an integrated tool to push the agent software to the clients; if you have hundreds of clients, you need to rely on third-party tools such as Microsoft SMS to push the agent software. Second, there is no single, unified management GUI – you have to run several tools to perform management tasks: the Avamar Administrator Console, Avamar Client Manager, Enterprise Manager, and Backup Recovery Manager. Third, support for Bare Metal Restore (BMR) is limited; only Windows 2008 and later are supported. Finally, the system requires a daily maintenance window to perform its HFS checks and other tasks, during which very few backup jobs are allowed to run. This should not be a big deal, though, since a short backup window is usually enough to finish the jobs because, as mentioned earlier, backups run very fast.

Overall, I consider Avamar coupled with a Data Domain appliance to be the leading backup solution out there.

IT Infrastructure Qualification and Compliance

One of the requirements of building and operating an IT infrastructure in a highly regulated industry (such as the pharmaceutical industry, which is regulated by the FDA) is to qualify, or validate, the servers, network, and storage as they are built. Once built, any change to the infrastructure should go through a change control procedure.

Building the infrastructure and making changes to it should be verified and documented so that the work can be easily managed and traced. These activities are really not that different from the best practices for operating an IT infrastructure, or even from the ITIL processes.

The FDA does not dictate how to perform IT infrastructure qualification or validation, as long as you have reasonable, documented procedures.

The problem is that some companies overdo their validation and change control processes. The common problems I’ve seen are: (1) too many signatures required to make a change, (2) no automated workflow for the documentation – many still route paper documents, and (3) the people who perform the checks and balances sometimes do not understand the technology.

The result is that IT personnel get overwhelmed with paperwork and bureaucracy. This discourages them from making critical changes to the infrastructure, such as applying security patches on time. It also makes IT personnel reluctant to implement newer or leading-edge technologies in their infrastructure.

Fortunately, the International Society for Pharmaceutical Engineering (ISPE) has published a Good Automated Manufacturing Practice (GAMP) guidance document on IT Infrastructure Control and Compliance. Companies can create their own IT infrastructure qualification program and procedures based on the GAMP guidance. These procedures should be simple but comprehensive enough to cover all the bases, and they should be periodically reviewed and streamlined.

Data At Rest Encryption

When the Internet was invented several decades ago, security was not on the minds of its pioneers. TCP/IP, the protocol suite used to send data from one point to another, is inherently insecure: data was sent over the wire in clear text. Today, advances in encryption technology allow data to be secured while in transit. When you shop at reputable websites, for instance, you can be sure that the credit card number you send over the Internet is encrypted (you will see https in the URL instead of http). Most web applications now (such as Gmail, Facebook, etc.) are encrypted.

However, most of this data, when stored on servers (data at rest), is still not encrypted. That is why hackers are still able to get hold of valuable data such as personally identifiable information (PII) – credit card numbers, social security numbers, and the like – as well as trade secrets and other proprietary company information. There are many ways to secure data at rest without encrypting it (better authentication, better physical security, firewalls, secured applications, better deterrents to social engineering attacks, etc.), but encrypting data at rest adds another layer of security to ensure the data is not readable when attackers do get hold of it.
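
As a simple illustration of this extra layer (and not a replacement for array- or application-level encryption), a file can be encrypted with a key that only the data owner holds before it is written to disk or shipped to a cloud provider. The file names and key file path below are hypothetical:

# Encrypt the file at rest with AES-256 using a locally held key file
openssl enc -aes-256-cbc -salt -in customers.db -out customers.db.enc -pass file:/secure/keys/customers.key

# Decrypt it only when the data needs to be read
openssl enc -d -aes-256-cbc -in customers.db.enc -out customers.db -pass file:/secure/keys/customers.key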

The demand for encrypting data at rest is growing, especially now that more data is being moved to the cloud. Enterprise data centers are also being required to encrypt data on their storage systems, whether for business or compliance reasons.

Fortunately, storage companies such as EMC, NetApp, and many others now offer data-at-rest encryption on their appliances. However, encryption is still expensive: encrypting and decrypting data takes a lot of processing power, and it may slow down data access. Better key management systems are also needed. For instance, when using the cloud for storage, data owners (as opposed to service providers) should solely possess the keys and should be able to manage them easily.

The Internet will be more secure if data is encrypted not only in transit but also at rest.

IT Infrastructure for Remote Offices

When designing the IT infrastructure (servers, storage, and network) for small remote offices, infrastructure architects at large enterprises often face the question: what is the best infrastructure solution for remote sites? A low-cost, simple, secure, and easy-to-support solution always comes to mind, but a positive end-user experience – network and application performance and user friendliness – should also be a top priority when building the infrastructure.

Most small sites just need access to enterprise applications and to file and print services. Network infrastructure definitely needs to be built – the site’s local area network (LAN), wireless access points, a wide area network (WAN) connection to the enterprise data center, and access to the Internet. The bigger question, though, is: should servers and storage be installed at the site?

Technologies such as WAN accelerators and “sync and share” applications make it possible to forgo installing servers and storage at remote sites without sacrificing the end-user experience. For instance, Riverbed WAN accelerator products dramatically improve access performance to files and applications hosted in the enterprise data center. These products can even serve up remote datastores for VMware farms. “Sync and share” applications are Dropbox-like applications (such as EMC Syncplicity); enterprises can build a storage-as-a-service solution on their internal infrastructure, which eliminates the need to install file servers or storage appliances at remote sites.

The decision to “install servers” or “go serverless” at a remote site still depends on many factors. Each site should be handled case by case rather than with a cookie-cutter solution. Criteria to consider include the number of people at the site and its growth projection, the storage size requirement, available WAN bandwidth, the presence or absence of local IT support, office politics, and country- or region-specific regulations requiring data to remain local. If these issues are factored in, a better solution can be designed for remote offices.

Big Data

There is a lot of hype around big data these days, promising the next big revolution in information technology – one that will change the way we do business. It purports to have a big impact on the economy, science, and society at large. In fact, big data is currently at the “peak of inflated expectations” on the Gartner technology hype cycle.

Big data “refers to our burgeoning ability to crunch vast collections of information, analyze it instantly, and draw sometimes profoundly surprising conclusions from it.” It answers questions that are sometimes not so obvious.

Big data definitely has tremendous potential, and after the hype has subsided, entities that do not take advantage of its power will be left behind. In fact, big data is already being used by technology companies such as Google, Amazon, and Facebook, and IT vendors such as Oracle, EMC, and IBM have started offering big data solutions for enterprises.

There are three drivers making big data possible:

First, robust and cheap IT infrastructure – powerful server platforms that crunch the data, advanced storage systems that hold huge amounts of data, and ubiquitous networks (Wi-Fi, 4G, fiber, etc.).

Second, the explosion of data from mobile devices, social networks, web searches, sensors, and many other sources.

Lastly, the proliferation of powerful analytics and data mining tools suited to big data, such as Hadoop, MapReduce, and NoSQL databases, with much more software yet to be created. These tools will only get better and better.

I recently read the book entitled “Big Data: A Revolution That Will Transform How We Live, Work, and Think” by Viktor Mayer-Schönberger and Kenneth Cukier.

The book is spot on in its predictions. With big data, there will be yet another paradigm shift in how we understand the world: “what” becomes more important than “why.” Big data also means processing complete data sets rather than just samples, and accepting results that are less than perfectly accurate.

The book also discusses the dark side of big data, such as the loss of privacy, how big data predictions can be used to police and punish individuals, and how organizations may blindly defer to what the data says without understanding its limitations.

I highly recommend the book to those who want to fully understand big data and its implications.

Toastmasters Is Also About Leadership

Many people join a Toastmasters club to improve their communication skills. But Toastmasters is not only about communication; it is also about leadership. There is a leadership program that members can take advantage of to improve their leadership skills. In fact, before a member can become a Distinguished Toastmaster – the highest Toastmasters educational award – one needs to complete both the leadership and communication tracks.

It makes sense that communications and leadership skills go hand in hand. Great communicators are great leaders, and great leaders are great communicators. Many areas of our society require leaders. People just need to step up and lead.

In Toastmasters, there are many opportunities to lead at the club, district, and international levels, and thus to improve our leadership skills. When I became a club president a year ago, I learned so many things, including organizing events, motivating people, and managing the club. Now that I am an area governor, I face a new set of challenges, and therefore more opportunities to learn and lead.

Network and Server Monitoring Using Open Source Tools

I am a fan of open source tools. The Internet as we know it today would not exist if not for the open source movement. We owe this to the countless architects and developers who have dedicated their time and effort to writing open source software.

Enterprise IT departments can also take advantage of open source software; numerous companies have been using it for years. One particular area where it can be used is network and server monitoring.

There are a lot of open source network monitoring tools out there, with Nagios, Zabbix, and Cacti leading the pack. My favorite tool, though, is OpenNMS. I particularly like it because it is very easy to set up and administer: it can automatically discover the nodes on your network, and it needed very few tweaks when I first set it up. It provides simple event handling and notifications via email or pager. In addition, its web-based management interface is very easy to use.
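
As an example of how little manual work is involved, once a discovery range is defined in OpenNMS’s discovery configuration, the main prerequisite on the nodes themselves is that they answer ICMP and SNMP. A quick sanity check from the OpenNMS server might look like this (the hostname and community string below are just examples):

# Verify the node responds to ping and to SNMP before OpenNMS discovers it
ping -c 3 server01.example.com
snmpwalk -v 2c -c public server01.example.com system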

I have been using OpenNMS for several years now and it has been rock solid. I definitely recommend OpenNMS to IT departments that do not want to pay a hefty price to monitor their networks and servers.