Cloudera Distribution Including Apache Hadoop on VMware vSAN™

Best Practices for Optimizing Virtualized Big Data Applications on VMware vSphere® 6.7 with vSAN



Executive Summary

This section covers the business introduction and solution overview.

Technology Overview

This section provides an overview of the technologies used in this solution:
- VMware vSphere 6.7
- VMware vSAN 6.7
- Cloudera Enterprise

Solution Configuration

This section introduces hardware and software resources, vSphere and vSAN configuration, Apache Hadoop/Spark configuration, and Hadoop cluster scaling.


The benchmarks used in the solution are:
- Cloudera storage validation
- Hadoop MapReduce
- Spark

Performance Testing and Results

We performed the following tests based on different workload benchmarks:
- Cloudera storage validation
- TeraSort testing
- TestDFSIO testing
- Spark testing
- IoT Analytics testing

Failover Testing

In the failover testing, we ran host failure and disk failure tests under two storage policy settings: FTT=1, and FTT=0 with Host Affinity.

FTT=1 and FTT=0 with Host Affinity Considerations and Comparison

This section lists the considerations for the FTT=1 and FTT=0 with Host Affinity configurations and compares them from the network, capacity, performance, and availability perspectives.

Solution Summary


About the Authors