Author Archives: admin

Interviewing Harvard College Applicants

Two years ago I was asked if I would like to interview students applying for admission to Harvard College in the Worcester, MA area. As a Harvard alum, I was glad to volunteer to interview four to six applicants during the admission period (October to March).

The applicants I interviewed were very impressive: excellent grades, lots of extracurricular activities, and great personalities. But I realized that not all of them would be accepted. In fact, with a 6% acceptance rate (only about 2,000 of more than 33,000 applicants are accepted), getting into Harvard College is a tall order. Only the best of the best get in.

I knew it would take a long time before I interviewed someone who would make it to Harvard College, if ever. But last winter, I was lucky to interview a very promising applicant. Sure enough, she got an acceptance letter. I was very happy when I learned about it. Congratulations, Taylor Benninger of Spencer, MA!

Skiing

Winter has been particularly mild this year in New England. But that did not stop my daughter, Justine, and me from going up the mountains to ski. Most ski resorts have snowmaking capabilities anyway, and the mountains in northern New England usually have natural snow.

Aside from our usual ski trips to Wachusett Mountain, which is only 35 minutes from our house, this year we also went to Bretton Woods in New Hampshire and Okemo Mountain in Vermont with my wife’s friend, Dick.

I used to hate the snow. I don’t like clearing our driveway after a snowstorm. It’s so much more comfortable to curl up in front of the fireplace. 🙂 And I don’t like driving in the snow because it can be dangerous. Several years ago, my car spun 360 degrees on the highway. I was lucky there was no other vehicle near me. And I grew up in the tropics, so warmer weather suits me much better.

But there is no way to avoid the snow in New England, and we need outdoor activities to keep cabin fever away. So, two years ago, my daughter and I signed up for skiing lessons and soon realized that we loved it.

I also realized that it’s a good way to bond with my daughter. I’m glad we tried skiing, and I’m looking forward to many more skiing and bonding trips with her in the years to come.

Toastmasters International Speech Contest

I just won first place at the International Speech Contest at our Toastmasters at Abbott BioResearch Club today, March 13, 2011. I was very honored to compete against three other seasoned Toastmasters.

My speech was about how I lost weight and how my daughter inspired me to do it. The title of my speech was “Persevere, Overcome, Succeed.” The event was very well attended.

I’ve been with the Toastmasters Club for more than five years, and I completed my Competent Communicator (CC) award last year. But this is the first time I have entered a speech contest. Knowing that I was competing, it had a different feel from our regular bi-weekly Toastmasters meetings. But it was a very rewarding experience. I had to write and practice my speech three weeks in advance. I guess it paid off.

On to the Area E, District 31 speech contest on March 27, 2012.

Backup Infrastructure

I have been designing, installing, and operating backup systems for the past several years. I have mostly implemented and managed Symantec NetBackup (formerly Veritas NetBackup) for larger infrastructures and Symantec Backup Exec for smaller ones.

These products work very well, although some features are not very robust. I’m very impressed, for instance, with the NDMP implementation in NetBackup. Backing up terabytes of NetApp data via NDMP works very well. However, I do not like the NetBackup admin user interface, since it’s not very intuitive. The bare metal restore (BMR) implementation is also a pain. Some of its bugs took years to fix, maybe because not many companies use BMR.

Backup Exec works very well with small to medium systems. It has a very intuitive interface, it is relatively easy to set up, and it has very good troubleshooting tools. Lately though, Symantec has been playing catch-up in its support for newer technologies such as VMware. It is so much easier to use Veeam to manage backup and restore of virtual machines. In addition, Backup Exec has been breaking lately. Recent Microsoft patches have caused backups of System_State to hang.

But I think the biggest threat to these backup products is online backup providers. CrashPlan, for instance, was initially developed for desktop backup, but it will not be long before companies use it to back up their servers. Once these providers address security concerns properly, companies will be more inclined to back up their data online. It’s just cheaper and easier to back up online.

NetApp Storage Migration Lessons

One of my latest projects is to consolidate six old NetApp filers and migrate a total of 30 TB of data to a new NetApp filer cluster, FAS 3240C. The project started several months ago and it is almost complete. Only one of the six filers is left to migrate.

I have done several storage migrations in the past, and there are always new lessons to learn in terms of the technology, migration strategy and processes, and the people involved in the project. Here are some of the lessons that I learned:

  1. As expected, computer technology moves very fast, and storage technology is no exception. IT professionals need to keep pace or our skills become irrelevant. I learned storage virtualization, NetApp Flash Cache, and SnapMirror using smtape, among many other new features.
  2. Migration strategy, planning, and preparation take more time than the actual migration itself. For instance, one filer took only an hour and a half to migrate. However, the preparations, such as snapmirroring, re-creating NFS and CIFS shares, changing users’ login scripts, changing several applications, and much other pre-work, were done several days before the actual migration. The actual migration is really just catching up with the latest changes in the files (i.e., a snapmirror update) and flipping the switch.
  3. As in many other big IT projects, people are always the challenging part. The key is to engage the stakeholders (business users, application owners, technical folks) early in the project. Tell them what changes are coming and how their applications and their access to data will be affected. Give them time to understand and make changes to their applications. Explain the benefits of the new technology and communicate the status of the project often.
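To make the second lesson concrete, here is a rough dry-run sketch of how the work splits between the days before the cutover and the cutover window itself. It only prints commands and runs nothing. The filer and volume names are made up, and the command forms are the Data ONTAP 7-Mode SnapMirror commands as I recall them, so treat them as assumptions and check them against the documentation for your release.

```shell
#!/bin/sh
# Dry-run sketch: prints 7-Mode SnapMirror commands for one volume; runs nothing.
# Filer and volume names are hypothetical placeholders.
# Command forms are assumptions recalled from 7-Mode; verify before use.

# Baseline copy, done days ahead of the cutover.
print_prep_plan() {
    src_filer=$1; dst_filer=$2; vol=$3
    echo "# on ${dst_filer}, days before the cutover:"
    echo "vol restrict ${vol}"
    echo "snapmirror initialize -S ${src_filer}:${vol} ${dst_filer}:${vol}"
}

# Final catch-up, then make the destination writable ("flipping the switch").
print_cutover_plan() {
    src_filer=$1; dst_filer=$2; vol=$3
    echo "# on ${dst_filer}, during the cutover window:"
    echo "snapmirror update -S ${src_filer}:${vol} ${dst_filer}:${vol}"
    echo "snapmirror quiesce ${vol}"
    echo "snapmirror break ${vol}"
}

print_prep_plan oldfiler1 newfiler vol_projects
print_cutover_plan oldfiler1 newfiler vol_projects
```

The share re-creation, login script changes, and application changes are left out on purpose, since those depend entirely on the environment.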

Performing maintenance tasks on VMware hosts

There are times when you need to perform hardware maintenance (such as adding a new network interface card [NIC]) on a VMware host, or the host simply disconnects from vCenter. In those cases the only option is to shut down or reboot the host. To minimize disruption, here’s the procedure I use:

  1. Run the vSphere client on your workstation. Do not use the vSphere client on the servers, because a server might be a virtual machine (VM) that will go down.
  2. Using the vSphere client, connect to the VMware host, *not* the vCenter server.
  3. Log in as user root.
  4. Shut down all the VMs by right-clicking each VM and selecting Power, then Shut Down Guest. This is faster than logging in to each machine using RDP and shutting it down. VMware Tools must be installed and up to date, though, or the Shut Down Guest option will be grayed out. If Shut Down Guest is grayed out, you need to log in to the VM to shut it down. Performing “Power Off” on the VM should be the last resort.
  5. Once all the VMs are powered down, right-click the VMware host and select Enter Maintenance Mode.
  6. Go to the console of the VMware host, and press Alt-F1 to get the login prompt.
  7. Log in as root.
  8. Issue the command “shutdown -h now” to power down the host. If you just want to reboot, issue the command “shutdown -r now”.
  9. Wait until the machine is powered off.
  10. Perform the maintenance.
  11. Power on the VMware host. Look for any problems on the screen. The VMware equivalent of a blue screen is a purple screen. A purple screen means there is something very wrong.
  12. When the VMware host has finished booting, go back to your workstation and connect to the VMware host using the vSphere client.
  13. Right-click the VMware host first, and select “Exit Maintenance Mode”.
  14. Power on all the VMs.

If there are multiple VMware hosts, and vMotion is licensed and enabled (i.e., an Enterprise license), you can vMotion the VMs to the other hosts and then perform the maintenance. When the host is back up, you can vMotion the VMs back to it and do the same maintenance on the other hosts.
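If you would rather do the shutdown part of the procedure from the host console than from the vSphere client, the sketch below prints the equivalent command sequence. It is a dry run that executes nothing. It assumes the `vim-cmd` tool (named `vmware-vim-cmd` on the classic ESX service console, as far as I recall), and the VM IDs are made up; on a real host they would come from `vim-cmd vmsvc/getallvms`.

```shell
#!/bin/sh
# Dry-run sketch: prints host-console commands for a planned shutdown; runs nothing.
# VM IDs are hypothetical; list the real ones with `vim-cmd vmsvc/getallvms`.
# Assumes vim-cmd is available (vmware-vim-cmd on classic ESX).

print_shutdown_plan() {
    action=$1; shift    # "halt" or "reboot"
    for vmid in "$@"; do
        # Guest shutdown only works when VMware Tools is running in the VM.
        echo "vim-cmd vmsvc/power.shutdown ${vmid}"
    done
    # Maintenance mode needs every VM powered off first, so check
    # `vim-cmd vmsvc/power.getstate <vmid>` before running this step.
    echo "vim-cmd hostsvc/maintenance_mode_enter"
    if [ "$action" = "reboot" ]; then
        echo "shutdown -r now"
    else
        echo "shutdown -h now"
    fi
}

print_shutdown_plan halt 16 32 48
```

After the maintenance, `vim-cmd hostsvc/maintenance_mode_exit` followed by `vim-cmd vmsvc/power.on <vmid>` for each VM should mirror the last two steps of the procedure.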

 

Reinstalling a Node on a Scyld Beowulf cluster

This write-up describes how to return a node to the cluster after its hard disk has been wiped out by a hardware error.

I was prompted to write these instructions because one of the nodes in our cluster failed. After the hardware was replaced, I tried to put the node back into the cluster but was not able to. I followed the documented instructions to no avail. I also posted a message to the Scyld Beowulf mailing list but did not get any response.

Anyway, I was trying to add the node back to the cluster. Using beosetup, the new MAC address was registered as node 0. I tried to partition the disk using the beofdisk tool, then I restarted the node. Here’s the output:

# beofdisk -w -n 0

Disk /dev/hda: 4865 cylinders, 255 heads, 63 sectors/track
Old situation:
Units = cylinders of 8225280 bytes, blocks of 1024 bytes, counting from 0

Device Boot Start End #cyls #blocks Id System
/dev/hda1 * 0+ 0 1- 8001 89 Unknown
/dev/hda2 1 516 516 4144770 82 Linux swap
/dev/hda3 517 4864 4348 34925310 83 Linux
/dev/hda4 0 - 0 0 0 Empty
New situation:
Units = sectors of 512 bytes, counting from 0

Device Boot Start End #sectors Id System
/dev/hda1 * 63 16064 16002 89 Unknown
/dev/hda2 16065 8305604 8289540 82 Linux swap
/dev/hda3 8305605 78156224 69850620 83 Linux
/dev/hda4 0 - 0 0 Empty
Successfully wrote the new partition table

Re-reading the partition table ...

If you created or changed a DOS partition, /dev/foo7, say, then use dd (1) to zero the first 512 bytes: dd if=/dev/zero of=/dev/foo7 bs=512 count=1
(See fdisk(8).)
The partition table on node 0 has been modified.
You must reboot each affected node for changes to take effect.

# beoboot-install 0 /dev/hda
Creating boot images...
Installing beoboot on partition 1 of /dev/hda.
mke2fs 1.32 (09-Nov-2002)
/dev/hda1: 11/2000 files (0.0% non-contiguous), 268/8001 blocks
Done

rcp: /boot/boot.b: No such file or directory
Failed to copy boot.b to node 0:/tmp/.beoboot-install.mnt

After rebooting, it came out with an ERROR state on the BeoSetup window. Here’s the log:

node_up: Initializing cluster node 0 at Wed Mar 9 15:44:55 EST 2005.
node_up: Setting system clock from the master.
node_up: Configuring loopback interface.
node_up: Loading device support modules for kernel version 2.4.27-294r0048.Scyldsmp.
setup_fs: Configuring node filesystems using /etc/beowulf/fstab...
setup_fs: Checking /dev/hda2 (type=swap)...
chkswap: /dev/hda2: Unable to find swap-space signature
setup_fs: FSCK failure. (OK for RAM disks)
setup_fs: Mounting /dev/hda2 on swap (type=swap; options=defaults)
swapon: /dev/hda2: Invalid argument
setup_fs: Failed to mount /dev/hda2 on swap (fatal).

So, to solve this problem, you have to do two extra steps before rebooting the node. After executing beoboot-install, use bpsh to run mke2fs -j on the data partition and mkswap on the swap partition, such as:

# bpsh 0 mke2fs -j /dev/hda3
# bpsh 0 mkswap /dev/hda2
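Putting it all together, the full sequence that worked for me can be sketched as a small dry-run script. It only prints the commands in order and runs nothing. The node number, disk, and partitions are the ones from my cluster, so adjust them for yours. The filesystem command is mke2fs, the name shown in the beoboot-install output above.

```shell
#!/bin/sh
# Dry-run sketch: prints the node reinstall commands in order; runs nothing.
# Node number, disk, and partition names are taken from the example above.

print_reinstall_plan() {
    node=$1; disk=$2; swap_part=$3; data_part=$4
    echo "beofdisk -w -n ${node}"                  # write the partition table
    echo "beoboot-install ${node} ${disk}"         # install the boot image
    echo "bpsh ${node} mke2fs -j ${data_part}"     # extra step 1: create the ext3 filesystem
    echo "bpsh ${node} mkswap ${swap_part}"        # extra step 2: write the swap signature
    echo "# now reboot node ${node}"
}

print_reinstall_plan 0 /dev/hda /dev/hda2 /dev/hda3
```

The two bpsh lines are the ones that fix the "Unable to find swap-space signature" failure in the log: without them the partitions exist but hold no filesystem or swap signature.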