Overview

    Service Overview

    Data Flow is a data processing flow tool based on open-source Apache NiFi. It extracts large volumes of data from various data sources and lets you visually build processing flows that transform and transmit stream and batch data. Data Flow can be used on its own in a Kubernetes Engine cluster on Samsung Cloud Platform, or together with other application software.

    Figure. Data Flow architecture diagram

    Provided Features

    Data Flow provides the following functions.

    • Easy installation and management: Data Flow can be installed in a standard Kubernetes cluster environment through the web-based Samsung Cloud Platform Console. Based on open-source Apache NiFi, it automatically configures the architecture required for scalable clustering and automatically installs the ZooKeeper, Registry, and management modules. Through Data Flow, you can also configure and deploy the settings files, NiFi templates, and other items required to connect services.
    • Easy data flow management: You can build processing flows for stream and batch data in a GUI suited to your environment, which makes data extraction, transmission, and processing between systems more efficient.
    • NiFi Template Gallery: You can share and distribute reference NiFi templates. Data Flow provides a gallery of work files for data processing flows that are frequently used in the field, and users can share their own data processing flows.

    Components

    Data Flow is composed of Manager and Service modules, and provides Apache NiFi as a package.

    Data Flow Manager

    Data Flow Manager provides management functions that help you use NiFi more efficiently.

    • Through Data Flow Manager, you can upload NAR files you have created so that they can be used as Processors, and upload settings files to share them.
    • Frequently used NiFi templates are packaged as reusable assets and provided in a gallery, where they can be used with a single click.
    • It provides real-time monitoring and resource status monitoring for the multiple services configured as native NiFi services.
    • You can easily provision the settings for NiFi components within the cluster.

    Data Flow Service

    • It provides a data flow management service based on Apache NiFi.
    • It automatically configures the architecture required for scalable clustering based on Apache NiFi, and automatically installs the NiFi, ZooKeeper, and NiFi Registry modules.
    • When creating a NiFi service, you can set the description, resource size, access ID and password, and Host Alias.
    • After the service is created, you can modify the description, resource size, access password, Host Alias, and other settings, and apply the changes to the service.

    Server spec type

    When creating a Data Flow service, check the following.

    • Recommended service installation specifications: 21 CPU cores, 57 GB of memory, and 100 GB or more of storage
    Reference
    • The Data Flow service needs to be installed before creating the Ingress Controller.
    • Only one Ingress Controller can be installed in a Kubernetes cluster.
    • For more information, refer to Ingress Controller installation.
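
The recommended specification above can be compared against a cluster's capacity before installation. The following is a minimal sketch, assuming you have already collected each worker node's allocatable CPU cores and memory by some other means; the `Node` type, the `check_recommended_spec` function, and the sample numbers are hypothetical and are not part of Data Flow. It compares totals only, so it does not account for workloads already running or for how pods fit onto individual nodes.

```python
from dataclasses import dataclass

# Recommended installation specification quoted in this section.
RECOMMENDED_CPU_CORES = 21
RECOMMENDED_MEMORY_GB = 57
RECOMMENDED_STORAGE_GB = 100


@dataclass
class Node:
    """Allocatable resources of one worker node (hypothetical input type)."""
    name: str
    cpu_cores: float
    memory_gb: float


def check_recommended_spec(nodes, storage_gb):
    """Return a list of shortfall messages; the list is empty if the totals suffice."""
    total_cpu = sum(node.cpu_cores for node in nodes)
    total_memory = sum(node.memory_gb for node in nodes)

    shortfalls = []
    if total_cpu < RECOMMENDED_CPU_CORES:
        shortfalls.append(f"CPU: {total_cpu:g} of {RECOMMENDED_CPU_CORES} cores")
    if total_memory < RECOMMENDED_MEMORY_GB:
        shortfalls.append(f"Memory: {total_memory:g} of {RECOMMENDED_MEMORY_GB} GB")
    if storage_gb < RECOMMENDED_STORAGE_GB:
        shortfalls.append(f"Storage: {storage_gb:g} of {RECOMMENDED_STORAGE_GB} GB")
    return shortfalls


# Example with made-up node sizes: three workers with 8 cores and 32 GB each.
sample_nodes = [Node(f"worker-{i}", 8, 32) for i in range(1, 4)]
problems = check_recommended_spec(sample_nodes, storage_gb=200)
print("\n".join(problems) if problems else "Recommended specification is met.")
```

A check like this is only a first filter; the actual resource size is chosen when the service is created in the Console.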

    Regional Provision Status

    Data Flow is available in the following environments.

    Region                     | Availability
    Korea West (kr-west1)      | Provided
    Korea East (kr-east1)      | Provided
    Korea South 1 (kr-south1)  | Not provided
    Korea South 2 (kr-south2)  | Not provided
    Korea South 3 (kr-south3)  | Not provided
    Table. Data Flow Provision Status by Region
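
If deployment scripts need to guard against unsupported regions, the availability listed above can be encoded as a simple lookup. This is an illustrative sketch only: the dictionary and the `is_data_flow_available` helper are hypothetical names, and the values are copied from this page, so they would need updating whenever the provision status changes.

```python
# Availability by region code, copied from the table in this section.
# True means Data Flow is provided in that region.
DATA_FLOW_AVAILABILITY = {
    "kr-west1": True,
    "kr-east1": True,
    "kr-south1": False,
    "kr-south2": False,
    "kr-south3": False,
}


def is_data_flow_available(region_code):
    """Return True if Data Flow is provided in the given region.

    Region codes that are not listed on this page raise ValueError
    instead of being silently treated as unavailable.
    """
    try:
        return DATA_FLOW_AVAILABILITY[region_code]
    except KeyError:
        raise ValueError(f"Unknown region code: {region_code!r}") from None


for code in ("kr-west1", "kr-south1"):
    status = "provided" if is_data_flow_available(code) else "not provided"
    print(f"{code}: {status}")
```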

    Prerequisite Services

    The following services must be configured before you create this service. Refer to the guide provided for each service and prepare them in advance.

    Service Category | Service           | Detailed Description
    Storage          | File Storage      | Storage that allows multiple client servers to share files through network connections
    Container        | Kubernetes Engine | Kubernetes container orchestration service
    Table. Prerequisite services for Data Flow