Overview

Service Overview

Cloud Hadoop is a service for easily and quickly analyzing large-scale data, providing a Hadoop cluster (computing resources, management tools, and applications) used for big data processing and analysis in the SCP environment.

Features

Cloud Hadoop provides automated cluster creation through the Hadoop Manager, together with a Hadoop ecosystem composed of Spark, HDFS (Hadoop Distributed File System), Hive, and other components, so that anyone can easily build, optimize, and flexibly scale infrastructure for big data analysis.

Service Diagram

Figure. Cloud Hadoop Architecture Diagram

Provided features

Cloud Hadoop provides the following features.

  • Hadoop clusters provided as a cloud service

    • Provides a Hadoop cluster through automated cluster installation in the SDS Cloud environment
    • Performs essential operational activities for cluster management (cluster operation and monitoring)
    • Provides a Hadoop ecosystem with verified interoperability and allows users to access the servers (VMs)
  • Hadoop service stack offered as separate products (nodes can be added per product)

    • Allocates a minimum number of nodes per product for stable service operation
    • Offers a range of products so that users can choose what meets their needs and reduce costs
  • User-friendly features for Hadoop services

    • Provides installation and management functions, optimal configuration values, and version management for each Hadoop ecosystem component
    • Provides an integrated monitoring dashboard for system resources
    • Provides a service failure alert feature

Components

We package the major components of the Hadoop ecosystem to deliver an enterprise data cloud.

Service Configuration

Cloud Hadoop provides the following services.

  • Basic Installation Service
    • HDFS 3.3.6
    • YARN 3.3.6
    • HBase 2.4.17
    • Hive 3.1.2
    • Tez 0.9.1
    • Hue 4.11.0
    • Solr 8.11.4
    • Spark 3.4.1
    • ZooKeeper 3.8.5
  • Additional Option Service
    • Data Governance: Atlas 2.1.0, Ranger 2.1.0
    • Analytical Data Warehouse: Iceberg 1.8.0, Kyuubi 1.10.2
    • Data Ingestion: Sqoop 1.4.7, Kafka 3.9.1, Flume 1.11.0

Server type

The server types supported by Cloud Hadoop are as follows.

Category | Example | Detailed description
Server type | Standard | Provided server types. Standard: configured with the commonly used standard specifications (vCPU, memory). High Capacity: large-capacity server specifications of 24 cores or more.
Server size | s1v4m32 | Provided server specifications. vCPU 4.
Table. Cloud Hadoop server type
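
The example server size in the table above can be read programmatically. The following is a minimal sketch, assuming that a size name such as s1v4m32 follows the pattern "family, v + vCPU count, m + memory in GB". The table only confirms the vCPU value (4); the meaning of the "s1" prefix and of "m32" as 32 GB are assumptions, and the `ServerSize` type and `parse_server_size` function are hypothetical helpers, not part of any Cloud Hadoop API.

```python
import re
from typing import NamedTuple


class ServerSize(NamedTuple):
    """Hypothetical container for the parts of a server size name."""
    family: str     # e.g. "s1"; its meaning is assumed, not stated in this guide
    vcpu: int       # number of vCPUs
    memory_gb: int  # memory in GB (assumed unit)


# Assumed naming pattern: <letters + digits>v<vCPU>m<memory>
_SIZE_PATTERN = re.compile(r"^([a-z]+\d+)v(\d+)m(\d+)$")


def parse_server_size(name: str) -> ServerSize:
    """Split a size name such as 's1v4m32' into family, vCPU, and memory."""
    match = _SIZE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized server size: {name!r}")
    family, vcpu, memory = match.groups()
    return ServerSize(family, int(vcpu), int(memory))


if __name__ == "__main__":
    print(parse_server_size("s1v4m32"))
```

If the real naming scheme differs, only the regular expression needs to change; the rest of the sketch stays the same.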

The minimum specifications for using Cloud Hadoop are as follows.

Category | Number of nodes | Instance size (user-selected value)
Master | 2 (fixed) | CPU: 4 cores, Memory: 32 GB
Worker | 3 (minimum) | CPU: 4 cores, Memory: 32 GB
Data Governance | None | -
Analytical Data Warehouse | None | -
Ingestion | 3 (minimum) | CPU: 4 cores, Memory: 32 GB
Table. Cloud Hadoop Minimum Specifications
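
The minimum specifications above can be checked before a cluster is requested. The following is a minimal sketch of such a pre-check. The numeric limits are copied from the table; the `NodeGroup` type and `validate_plan` function are hypothetical names, and treating the Ingestion minimum as applying only when that option is selected is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeGroup:
    """Hypothetical description of one node group in a planned cluster."""
    count: int
    vcpu: int
    memory_gb: int


# Limits copied from the minimum-specification table.
MIN_VCPU = 4
MIN_MEMORY_GB = 32
MASTER_COUNT = 2         # fixed
MIN_WORKER_COUNT = 3
MIN_INGESTION_COUNT = 3  # assumed to apply only when Ingestion is selected


def validate_plan(master: NodeGroup,
                  worker: NodeGroup,
                  ingestion: Optional[NodeGroup] = None) -> List[str]:
    """Return a list of problems; an empty list means the plan meets the minimums."""
    problems: List[str] = []
    if master.count != MASTER_COUNT:
        problems.append(f"master count must be {MASTER_COUNT}")
    if worker.count < MIN_WORKER_COUNT:
        problems.append(f"worker count must be at least {MIN_WORKER_COUNT}")
    if ingestion is not None and ingestion.count < MIN_INGESTION_COUNT:
        problems.append(f"ingestion count must be at least {MIN_INGESTION_COUNT}")
    # Every selected group must meet the minimum instance size.
    for name, group in (("master", master), ("worker", worker), ("ingestion", ingestion)):
        if group is None:
            continue
        if group.vcpu < MIN_VCPU or group.memory_gb < MIN_MEMORY_GB:
            problems.append(f"{name} size is below {MIN_VCPU} vCPU / {MIN_MEMORY_GB} GB")
    return problems


if __name__ == "__main__":
    plan_problems = validate_plan(NodeGroup(2, 4, 32), NodeGroup(2, 4, 32))
    print(plan_problems or "plan meets the minimum specifications")
```

Returning a list of messages rather than raising on the first failure lets a caller report every violation at once.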

Provision status by region

Cloud Hadoop is available in the following environments.

Region | Availability
Korea West (kr-west1) | Available
Korea East (kr-east1) | Available
Korea South 1 (kr-south1) | Not available
Korea South 2 (kr-south2) | Not available
Korea South 3 (kr-south3) | Not available
Table. Cloud Hadoop regional availability status

Prerequisite Services

The following services must be configured before you create Cloud Hadoop. Refer to the guide provided for each service and prepare them in advance.

Service category | Service | Detailed description
Networking | VPC | A service that provides an isolated virtual network in a cloud environment
Networking | Security Group | Virtual firewall that controls server traffic
Storage | Object Storage | Object storage that simplifies data storage and retrieval
Table. Cloud Hadoop prerequisite services

1 - ServiceWatch Metrics

You can view Virtual Server metrics in ServiceWatch for the servers created in Cloud Hadoop. As with Virtual Server, basic monitoring collects metric data at 5-minute intervals. Enabling detailed monitoring in the Virtual Server detail view lets you view data collected at 1-minute intervals. For more details, see Virtual Server > Enable ServiceWatch Detailed Monitoring.

Information
  • Basic monitoring and detailed monitoring for Cloud Hadoop provide the same metrics as Virtual Server, and the metrics are published under the Virtual Server namespace.
Reference
Refer to the ServiceWatch guide for how to view metrics in ServiceWatch.
Reference
Refer to the ServiceWatch Agent guide for how to collect metrics using the ServiceWatch Agent.

Basic Metrics

The following are the basic metrics for the Virtual Server namespace.

Metrics whose names appear in bold below are the key metrics selected from the basic metrics provided by Virtual Server. Key metrics are used to build the service dashboards that ServiceWatch creates automatically for each service.

For each metric, the user guide indicates which statistics are meaningful when viewing it. Among the meaningful statistics, those shown in bold are the primary statistics. The service dashboard displays key metrics using their primary statistics.

Metric | Description | Unit | Meaningful statistics
Instance State | Instance status (1 = Active, 0 = Off) | None | Total
CPU Usage | CPU usage | Percent | Average, Maximum, Minimum
Disk Read Bytes | Bytes read from the block device | Bytes | Total, Average, Maximum, Minimum
Disk Read Requests | Number of read requests on the block device | Count | Total, Average, Maximum, Minimum
Disk Write Bytes | Bytes written to the block device | Bytes | Total, Average, Maximum, Minimum
Disk Write Requests | Number of write requests on the block device | Count | Total, Average, Maximum, Minimum
Network In Bytes | Bytes received on the network interface | Bytes | Total, Average, Maximum, Minimum
Network In Dropped | Number of received packets dropped on the network interface | Count | Total, Average, Maximum, Minimum
Network In Packets | Number of packets received on the network interface | Count | Total, Average, Maximum, Minimum
Network Out Bytes | Bytes transmitted from the network interface | Bytes | Total, Average, Maximum, Minimum
Network Out Dropped | Number of transmitted packets dropped on the network interface | Count | Total, Average, Maximum, Minimum
Network Out Packets | Number of packets transmitted from the network interface | Count | Total, Average, Maximum, Minimum
Table. Virtual Server Basic Metrics
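
To make the four statistics in the table concrete, the following minimal sketch computes Total, Average, Maximum, and Minimum over a series of samples and rolls 1-minute samples up into 5-minute windows. It is purely illustrative: the sample values are invented, the `summarize` and `roll_up` functions are hypothetical, and this is not a description of how ServiceWatch aggregates data internally.

```python
from statistics import mean
from typing import Dict, List, Sequence


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Compute the four statistics listed in the table for one window of samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        "Total": sum(samples),
        "Average": mean(samples),
        "Maximum": max(samples),
        "Minimum": min(samples),
    }


def roll_up(samples: Sequence[float], window: int = 5) -> List[Dict[str, float]]:
    """Group consecutive samples into fixed-size windows and summarize each one.

    With 1-minute samples and window=5, each result covers 5 minutes.
    """
    return [
        summarize(samples[start:start + window])
        for start in range(0, len(samples), window)
    ]


if __name__ == "__main__":
    # Invented 1-minute "Network In Bytes" samples covering 10 minutes.
    network_in_bytes = [100.0, 200.0, 300.0, 400.0, 500.0,
                        50.0, 60.0, 70.0, 80.0, 90.0]
    for stats in roll_up(network_in_bytes):
        print(stats)
```

For a percentage metric such as CPU Usage, the table lists only Average, Maximum, and Minimum as meaningful, so the Total value from a helper like this would simply be ignored.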