Protecting Your Company Against Ransomware Attacks

Ransomware attacks are the latest security breach incidents grabbing the headlines these days. Last month, major organizations including Britain’s National Health Service, Spain’s Telefónica, and FedEx fell victim to the WannaCry ransomware attack. Ransomware infects your computer and encrypts your important documents; the attackers then demand a ransom to decrypt your data and make it usable again.

Ransomware attack operations have become more sophisticated, in some cases operating with full helpdesk support.

While the latest operating system patches and anti-malware programs can defend against these attacks to a point, they are largely reactive and often ineffective on their own. For instance, the WannaCry malware relied heavily on social engineering (phishing) to spread, counting on end users to open malicious email or click on malicious websites.

The best defense against ransomware is a good data protection strategy built on backup and disaster recovery. When ransomware hits, you can simply remove the infected, encrypted files and restore the good copies. It is surprising how many companies and end users still do not properly back up their data, even though there are plenty of backup software products and cloud backup services available. Periodic disaster recovery testing is also necessary to make sure you can actually restore data when needed.
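As a simple illustration of the idea, here is a minimal sketch (not a full backup solution) that copies a documents folder to a versioned Amazon S3 bucket using boto3; the bucket name and local path are placeholders. With versioning enabled, files encrypted by ransomware can be rolled back to an earlier, clean version.

    import boto3
    from pathlib import Path

    s3 = boto3.client("s3")
    BUCKET = "my-company-backups"   # hypothetical bucket name

    # Enable versioning once, so every upload preserves prior versions.
    s3.put_bucket_versioning(
        Bucket=BUCKET,
        VersioningConfiguration={"Status": "Enabled"},
    )

    # Upload every file under the local documents folder.
    root = Path("/data/documents")
    for path in root.rglob("*"):
        if path.is_file():
            s3.upload_file(str(path), BUCKET, f"documents/{path.relative_to(root)}")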

A sound backup and disaster recovery plan will help mitigate the impact of ransomware attacks.

Ensuring Reliability of Your Apps on the Amazon Cloud

On February 28, 2017, the Amazon Simple Storage Service (S3) located in the Northern Virginia (US-EAST-1) Region went down due to an incorrect command issued by a technician. A lot of websites and applications that rely on the S3 service went down with it. The full information about the outage can be found here: https://aws.amazon.com/message/41926/

While Amazon Web Services (AWS) could have prevented this outage, a well-architected site should not have been affected by it. AWS allows subscribers to use multiple Availability Zones (and even redundancy across multiple Regions), so that when one goes down, applications can continue to run on the others.

It is very important to have a well-architected framework when using the cloud. AWS provides one that is based on five pillars:

  • Security – The ability to protect information, systems, and assets while delivering business value through risk assessments and mitigation strategies.
  • Reliability – The ability of a system to recover from infrastructure or service failures, dynamically acquire computing resources to meet demand, and mitigate disruptions such as misconfigurations or transient network issues.
  • Performance Efficiency – The ability to use computing resources efficiently to meet system requirements, and to maintain that efficiency as demand changes and technologies evolve.
  • Cost Optimization – The ability to avoid or eliminate unneeded cost or suboptimal resources.
  • Operational Excellence – The ability to run and monitor systems to deliver business value and to continually improve supporting processes and procedures.

For the companies that were affected, applying the reliability pillar (by utilizing multiple Availability Zones, or replicating to different Regions) could have shielded them from the outage.
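For S3 specifically, one way to apply this principle is cross-region replication. The sketch below shows how it might be configured with boto3; the bucket names, IAM role ARN, and regions are placeholders, and both buckets must already have versioning enabled.

    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")

    # Replicate every object in the source bucket to a bucket in another region.
    s3.put_bucket_replication(
        Bucket="my-app-assets-us-east-1",   # hypothetical source bucket
        ReplicationConfiguration={
            "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
            "Rules": [{
                "ID": "replicate-everything",
                "Prefix": "",               # empty prefix = all objects
                "Status": "Enabled",
                "Destination": {"Bucket": "arn:aws:s3:::my-app-assets-us-west-2"},
            }],
        },
    )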

Securing Your Apps on Amazon AWS

One thing to keep in mind when putting your company’s applications in the cloud, specifically on Amazon AWS, is that you are still largely responsible for securing them. AWS has solid security in place, but you should not entrust security entirely to Amazon and assume your applications are safe simply because they are hosted there. In fact, AWS follows a shared security responsibility model, depicted in this diagram:

Source: Amazon AWS

AWS is responsible for physical and infrastructure security, including hypervisor, compute, storage, and network security; the customer is responsible for application security, data security, operating system (OS) patching and hardening, network and firewall configuration, identity and access management, and client- and server-side data encryption.

AWS does, however, provide a slew of security services to make your applications more secure: AWS IAM for identity and access management, Security Groups to shield EC2 instances (servers), Network ACLs that act as firewalls for your subnets, SSL encryption for data in transit, and user activity logging for auditing. As a customer, you need to understand, design, and configure these security settings to make your applications secure.
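As one example, here is a hedged sketch of how a Security Group restricting inbound traffic to HTTPS might be created with boto3; the VPC ID and CIDR range are placeholders for your own environment.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Create a security group that allows only HTTPS from a trusted network.
    sg = ec2.create_security_group(
        GroupName="web-tier-sg",
        Description="Allow HTTPS only",
        VpcId="vpc-0123456789abcdef0",      # hypothetical VPC ID
    )

    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 443,
            "ToPort": 443,
            "IpRanges": [{"CidrIp": "203.0.113.0/24"}],   # trusted range only
        }],
    )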

In addition, AWS provides advanced security services so that you don’t have to build them yourself, including AWS Directory Service for authentication, AWS KMS for encryption key management, AWS WAF (Web Application Firewall) for inspecting and filtering web traffic, and DDoS mitigation.
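To give a feel for one of these services, the following sketch encrypts and decrypts a small secret with AWS KMS via boto3; the key alias is an assumption and would need to exist in your account.

    import boto3

    kms = boto3.client("kms", region_name="us-east-1")

    # Encrypt a small secret under a KMS key.
    ciphertext = kms.encrypt(
        KeyId="alias/my-app-key",       # hypothetical key alias
        Plaintext=b"database password",
    )["CiphertextBlob"]

    # Decrypt it later; KMS identifies the key from the ciphertext.
    plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
    print(plaintext)   # b'database password'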

There is really no perfect security, but securing your infrastructure at every layer tremendously improves the security of your data and applications in the cloud.

Annual New England VTUG Winter Conference

I have been attending the annual New England Virtualization Technology Users Group (VTUG) Winter Warmer Conference for the past couple of years. This year, it was held on January 19, 2017 at Gillette Stadium.

Gillette Stadium is where the New England Patriots football team plays. The stadium has nice conference areas, and the event usually features a chance to meet and get autographs from famous Patriots alumni. This year we got to meet former running back Kevin Faulk and Patrick Pass.

Although the event is sponsored by technology vendors, most of the keynotes and breakout sessions are not sales pitches. They are usually very informative sessions delivered by excellent speakers.

The key takeaways for me from the conference are the following:

  1. Cloud adoption remains a hot topic, but containerization of applications, led by Docker, enables companies to build and deliver microservices applications at lightning speed. Coupled with DevOps practices and support from major software vendors and providers (Windows, Red Hat, Azure, AWS, etc.), containers will be the next big thing in virtualization.
  2. VMware is getting serious about infrastructure security. Security has become the front-and-center focus of the vSphere 6.5 release, and the objective is to make security easy to manage. Significant security features include VM encryption at scale, enhanced logging from vCenter, secure boot support for VMs, and secure boot support for ESXi.
  3. As more and more companies move to a hybrid cloud model (a combination of private and public cloud), vendors are getting more innovative in creating products and services that help companies manage and secure it.
  4. Hyper-converged infrastructure is now being broadly adopted, with EMC VxRail and Nutanix leading the pack. The quest for consolidation, simplification, and software-defined infrastructure is in full swing.
  5. Innovative new companies were present at the event as well. One in particular, Igneous, offers “true cloud for local data.”

Building an Enterprise Private Cloud

Businesses are using public clouds such as Amazon AWS, VMware vCloud, or Microsoft Azure because they are relatively easy to use, fast to deploy, let businesses buy resources on demand, and, most importantly, are relatively cheap (there is no operational overhead in building, managing, and refreshing an on-premises infrastructure). But there are downsides to the public cloud, such as security and compliance concerns, diminished control of data, data locality issues, and network latency and bandwidth constraints. On-premises infrastructure is still the most cost-effective option for regulated data and for applications with predictable workloads (such as ERP, local databases, end-user productivity tools, etc.).

However, businesses and end users are expecting and demanding cloud-like services from their IT departments for the applications that are best suited to run on-premises. So IT departments should build and deliver an infrastructure that has the characteristics of a public cloud (fast, easy, on-demand, elastic, etc.) and the reliability and security of on-premises infrastructure – an enterprise private cloud.

An enterprise cloud is now possible to build because of the following technology advancements:

  1. Hyper-converged solutions
  2. Orchestration tools
  3. Flash storage

When building an enterprise cloud, keep in mind the following:

  1. It should be 100% virtualized.
  2. There should be a mechanism for self-service provisioning, monitoring, billing, and chargeback.
  3. Most operational functions should be automated.
  4. Compute and storage should be able to scale out.
  5. It should be resilient, with no single point of failure.
  6. Security should be integrated into the infrastructure.
  7. There should be a single management platform.
  8. Data protection and disaster recovery should be integrated into the infrastructure.
  9. It should be application-centric instead of infrastructure-centric.
  10. Finally, it should be able to support legacy applications as well as modern apps.

Important Responsibilities of IT Infrastructure Operations

The main function of IT operations is to keep IT services running smoothly and efficiently. It would be nirvana if IT infrastructure services just worked perfectly throughout their lifespan after they were initially installed and configured. The reality, however, is that hardware fails, bugs are found, features need to be added, security flaws are discovered, patches need to be applied, usage fluctuates, data needs to be protected, upgrades need to be done, demand increases, and so on. The following are the important jobs of IT operations:

Monitoring

Monitoring is the only way to keep track of the health and availability of systems. It can be accomplished by looking at a system’s health via a dashboard or console, or via specialized monitoring appliances. One major component of monitoring is alerting via email or pager when there is a major issue. Issues can also surface through incident tickets generated by machines or end users, catching problems that may not be apparent from machine monitoring alone. System logs can also be used for trending and monitoring, as they can bring to light flaws in the system.
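As a bare-bones illustration of check-and-alert, here is a sketch that probes a health endpoint and emails the on-call alias when it fails; the URL, SMTP host, and addresses are placeholders, and a production setup would use a dedicated monitoring tool rather than a script like this.

    import smtplib
    import urllib.request
    from email.message import EmailMessage

    def check_and_alert(url, smtp_host, sender, recipient):
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                if resp.status == 200:
                    return              # healthy; nothing to do
                detail = f"HTTP {resp.status}"
        except Exception as exc:
            detail = str(exc)

        # Send an email alert describing the failure.
        msg = EmailMessage()
        msg["Subject"] = f"ALERT: {url} is unhealthy ({detail})"
        msg["From"], msg["To"] = sender, recipient
        msg.set_content(f"Health check for {url} failed: {detail}")
        with smtplib.SMTP(smtp_host) as smtp:
            smtp.send_message(msg)

    check_and_alert("https://intranet.example.com/health", "mail.example.com",
                    "monitor@example.com", "oncall@example.com")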

Troubleshooting

Once issues are detected, IT operations should troubleshoot and fix them as quickly as possible to bring the service back online. Issues that are more complicated to fix should be escalated to vendors, higher-level support, or engineers and developers.

Change Control

IT operations should not make any changes (such as configuration changes, hardware replacements, or upgrades) without following the proper change control procedure. More than half of outages are caused by changes to systems. IT services are often tightly integrated with other systems, and a change to one system may affect the others. Subject matter experts for the various systems should verify that a change will not affect their systems. Planning and testing are vital steps in performing changes.

Capacity planning

IT operations should monitor and trend the utilization of resources (compute, storage, network) to ensure there is enough capacity to serve demand. They should be able to predict growth and allocate resources ahead of time so that capacity is available when it is needed.
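A crude way to do this is to fit a trend to historical utilization and project when a threshold will be hit. The numbers below are purely illustrative.

    # Last six months of storage usage (TB) on a hypothetical 80 TB array.
    used_tb = [40, 43, 45, 48, 52, 55]
    capacity_tb = 80
    threshold_tb = 0.9 * capacity_tb        # plan expansion at 90% full

    monthly_growth = (used_tb[-1] - used_tb[0]) / (len(used_tb) - 1)
    months_left = (threshold_tb - used_tb[-1]) / monthly_growth
    print(f"~{monthly_growth:.1f} TB growth per month; "
          f"{threshold_tb:.0f} TB threshold reached in ~{months_left:.1f} months")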

Performance optimization

IT operations should optimize IT services and ensure efficient use of resources. The goal is to provide an excellent user experience for these services. Mechanisms such as redundancy, local load balancing, global load balancing, and caching improve utilization, efficiency, and the end-user experience.
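To make the load-balancing idea concrete, here is a toy round-robin selector over a pool of backends; the host names are hypothetical, and real deployments would use a dedicated load balancer rather than application code like this.

    from itertools import cycle

    SERVERS = ["app1.example.com", "app2.example.com", "app3.example.com"]
    _pool = cycle(SERVERS)

    def next_backend(healthy):
        """Return the next healthy backend in round-robin order."""
        for _ in range(len(SERVERS)):
            server = next(_pool)
            if server in healthy:
                return server
        raise RuntimeError("no healthy backends available")

    print(next_backend(set(SERVERS)))    # app1, then app2, then app3, ...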

Backups

In addition to keeping IT services running smoothly, IT operations should also protect these systems and their data by backing them up and replicating them to a remote site. The goal is to bring the systems back online in as little time as possible after a catastrophic failure.

Security

IT operations should also be responsible for securing the systems. Because this is such an enormous task, many companies employ a dedicated Security Operations Center (SOC) that watches for security breaches.

Automation

One of the goals of IT operations is to automate most manual activities via scripting and self-healing mechanisms. This enables the team to focus on higher-value work rather than getting bogged down in repetitive tasks.
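A tiny example of a self-healing script, assuming a Linux host with systemd and a hypothetical service name; a real implementation would add logging, alerting, and retry limits.

    import subprocess

    def ensure_running(service):
        # 'systemctl is-active --quiet' exits non-zero when the unit is down.
        check = subprocess.run(["systemctl", "is-active", "--quiet", service])
        if check.returncode != 0:
            print(f"{service} is down; attempting restart")
            subprocess.run(["systemctl", "restart", service], check=True)
        else:
            print(f"{service} is healthy")

    ensure_running("nginx")    # hypothetical service name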

Mitigating Insider Threats

With all the news about security breaches, we often hear about external cyber attacks, but internal attacks are widely underreported. Studies show that between 45% and 60% of all attacks are carried out by insiders. In addition, insider attacks are harder to detect and prevent because the access and activity come from trusted users and systems.

Why is this so common and why is this so hard to mitigate? The following reasons have been cited to explain why there are more incidents of internal security breaches:

1. Companies don’t employ data protection, don’t apply patches on time, or don’t enforce security policies and standards (such as requiring complex passwords). Some companies wrongly assume that installing a firewall can protect them from inside intruders.

2. Data sits outside the control of IT security, such as when it is in the cloud.

3. The greatest cause of security breaches is also the weakest link in the security chain – people. There are two types of people in this weak link:

a. People who are vulnerable, such as careless users who use USB drives, send sensitive data through public email services, or sacrifice security for convenience. Most of the time, these users are not aware that their accounts have already been compromised via malware, phishing attacks, or credentials gleaned from social networks.

b. People who have their own agenda – what we call malicious users. These individuals want to steal and sell competitive data or intellectual property for money, or they may hold a personal vendetta against the organization.

There are, however, proven measures to lessen the impact of insider threats:

1. Monitor users, especially those with the potential for the greatest damage – top executives, contractors, vendors, at-risk employees, and IT administrators.

2. Learn how they access data, create a baseline, and detect anomalous behavior.

3. When divergent behavior is detected, such as unauthorized downloads or server log-ins, take an action such as blocking or quarantining the user (a minimal sketch of this baseline-and-detect approach follows).
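Here is a deliberately simple sketch of the baseline-and-detect step: compare today’s activity count (file downloads, in this example) against a per-user historical baseline and flag anything far above the norm. The threshold and data are illustrative only.

    import statistics

    def is_anomalous(history, today, sigmas=3.0):
        """Flag today's count if it exceeds the mean by `sigmas` standard deviations."""
        mean = statistics.mean(history)
        stdev = statistics.pstdev(history) or 1.0   # avoid a zero threshold
        return today > mean + sigmas * stdev

    downloads_last_30_days = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12,
                              11, 10, 14, 9, 13, 12, 8, 15, 10, 11,
                              9, 13, 12, 10, 14, 11, 9, 12, 13, 10]
    print(is_anomalous(downloads_last_30_days, today=480))   # True -> investigate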

It should be noted that when an individual is caught compromising security, more often than not the damage has already been done. The challenge is to be proactive so that the breach does not happen in the first place.

An article in Harvard Business Review has argued that psychology is the key to detecting internal cyber threats.

In essence, companies should focus on understanding and anticipating human behavior, for example by analyzing employee language (in email, chat, and text) continuously and in real time. The author contends that “certain negative emotions, stressors, and conflicts have long been associated with incidents of workplace aggression, employee turnover, absenteeism, accidents, fraud, sabotage, and espionage.”

Applying big data analytics and artificial intelligence to employee language in email, chat, voice, and text logs and other digital communications may uncover worrisome content, meaning, language patterns, and deviations in behavior, making it easier to spot indications that a user is a security risk or may perform malicious activity in the future.
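At its crudest, this kind of analysis starts with flagging risky language. The sketch below matches messages against an illustrative word list; a real system would use trained language models, sentiment scoring, and per-employee behavioral baselines rather than hard-coded terms.

    NEGATIVE_TERMS = {"unfair", "revenge", "they owe me", "sabotage", "quit"}

    def flag_message(text):
        """Return the illustrative terms found in a message, if any."""
        lowered = text.lower()
        return [term for term in NEGATIVE_TERMS if term in lowered]

    sample = "Management is unfair and they owe me for all this work."
    hits = flag_message(sample)
    if hits:
        print("Review recommended; matched terms:", hits)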

(ISC)2 Security Congress 2016

I recently attended the (ISC)2 Annual Security Congress (in conjunction with ASIS International) in Orlando, Florida. (ISC)2 Security Congress is a premier 4-day conference attended by hundreds of IT security professionals from around the world. This year featured a line-up of excellent speakers including keynote speeches from journalist Ted Koppel and foreign policy expert Elliott Abrams.

Here are the top IT security topics I gathered from the conference:

  1. Cloud security. As more and more companies migrate to the cloud, IT security professionals are seeking best practices for securing applications and data there.
  2. IoT (Internet of Things) security. It’s still a wild west out there. Manufacturers are making IoT devices (sensors, cameras, appliances, etc.) that are insecure, and there is a lack of standardization. People are putting devices on the Internet with default settings and passwords, which makes them vulnerable. Inside most companies, there is usually no formal process for putting these IoT devices on the network.
  3. Ransomware. Attacks are getting more prevalent and sophisticated. Some perpetrators have built a solid business model around them, including a call center/help desk to help victims pay the ransom and recover their data.
  4. Resiliency. It’s better to build your network for resiliency. Every company will be the victim of an attack at some point, even with the best defenses in place. Resilient networks are those that can recover quickly after a breach.
  5. Common-sense security. There were plenty of discussions on time-tested practices such as hardening devices (replacing default passwords, for instance), patching on time, and constant security awareness training for users.
  6. Cyberwar. Cyber incidents are mounting, and the next big threat is cyberwar. Bad actors (state-sponsored hackers, hacktivists, criminals, etc.) may be able to hack into the industrial systems that control our electricity and water supplies and disrupt or destroy them.
  7. Shortage of cybersecurity experts. The industry is predicting a shortage of cybersecurity professionals in the near future.

Hyper-converged Infrastructure: Hype or For Real?

One of the hottest emerging technologies in IT is hyper-converged infrastructure (HCI). What is the hype all about? Is it here to stay?

As defined by TechTarget, hyper-converged infrastructure (HCI) is a system with a software-centric architecture that tightly integrates compute, storage, networking, virtualization resources (hypervisor, virtual storage, virtual networking), and other technologies (such as data protection and deduplication) in a commodity hardware box (usually x86) supported by a single vendor.

Hyper-convergence grew out of the concept of converged infrastructure, with engineers taking it a step further – a very small hardware footprint, tight integration of components, and simplified management. It is a relatively new technology; on the technology adoption curve, it is still at the early-adopter stage.

Nutanix was the first vendor to offer a hyper-converged solution, followed by SimpliVity and Scale Computing. Not to be outdone, VMware developed EVO:RAIL, then opened it up for hardware vendors to OEM the product. Major vendors, including EMC, NetApp, Dell, HP, and Hitachi, began selling EVO:RAIL products.

One of the best HCI products I’ve seen is VxRail. Jointly engineered by VMware and EMC, the “VxRail appliance family takes full advantage of VMware Hyper-Converged Software capabilities and provides additional hardware and lifecycle management features and rich EMC data services, delivered in a turnkey appliance with integrated support.”

What are the advantages of HCI and where can it be used? Customers looking to start small and scale out over time will find HCI very attractive. It is a perfect fit for small to medium-sized companies that want to build their own data center without spending a huge amount of money. It is simple (because it eliminates a lot of hardware clutter) and highly scalable (because capacity is added in small, standardized x86 nodes). Since it is scalable, it eases the burden of growth. Finally, its performance is comparable to larger infrastructures, because leveraging SSD storage and bringing data close to the compute enables high IOPS at very low latency.

References:

1. TechTarget
2. VMware, “Hyper-Converged Infrastructure: What’s All the Fuss About?”

Replicating Massive NAS Data to a Disaster Recovery Site

Replicating Network Attached Storage (NAS) data to a Disaster Recovery (DR) site is quite easy when using big-name NAS appliances such as NetApp or Isilon. Replication software is already built into these appliances – SnapMirror for NetApp and SyncIQ for Isilon – and just needs to be licensed to work.

But how do you replicate terabytes, even petabytes, of data to a DR site when Wide Area Network (WAN) bandwidth is a limiting factor? Replicating a petabyte of data may take months to complete even on a 622 Mbps WAN link, leaving the company’s DR plan vulnerable.
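The back-of-the-envelope math makes the problem clear; even at the theoretical line rate, with no protocol overhead or contention, the transfer takes roughly five months.

    # 1 PB over a 622 Mbps link at the theoretical line rate.
    data_bits = 1e15 * 8            # 1 PB expressed in bits
    link_bps = 622e6                # 622 Mbps
    seconds = data_bits / link_bps
    print(f"~{seconds / 86400:.0f} days")   # ~149 days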

One way to accomplish this is to use a temporary swing array: (1) replicate data from the source array to the swing array locally, (2) ship the swing array to the DR site, (3) copy the data to the destination array, and finally (4) resync the source array with the destination array.

On NetApp, this is accomplished by using the SnapMirror resync command. On Isilon, it is accomplished by turning on the “target-compare-initial” option in SyncIQ, which compares files between the source and destination arrays and sends only the data that differs over the wire.

With this technique, huge amounts of company data sitting on NAS devices can be protected at the DR site right away.