Overview
Service Overview
Data Flow is a data processing flow tool, built on open-source Apache NiFi, that extracts large volumes of data from various data sources and lets you visually build processing flows for transforming and transmitting stream and batch data. Data Flow can be used on its own in the Kubernetes Engine cluster environment of Samsung Cloud Platform or together with other application software.
Provided Features
Data Flow provides the following features.
- Easy installation and management: Data Flow can be easily installed through the web-based Samsung Cloud Platform Console in a standard Kubernetes cluster environment. Based on open-source Apache NiFi, it automatically configures the architecture required for extensible clustering and automatically installs the ZooKeeper, Registry, and management modules. Through Data Flow, you can configure and deploy the configuration files, NiFi templates, and other items required for service connection.
- Easy data flow management: You can build processing flows for stream and batch data in a GUI tailored to your environment, which enables efficient data extraction, transmission, and processing between systems.
- NiFi Template Gallery: You can share and distribute reference NiFi templates. Data Flow provides a gallery of work files for data processing flows frequently used in the field, and users can also share their own data processing flows.
Components
Data Flow consists of the Manager and Service modules and provides Apache NiFi as a package.
Data Flow Manager
Data Flow Manager provides management functions that help you use NiFi more efficiently.
- Through Data Flow Manager, you can upload NAR files you have created to use them as Processors, and upload configuration files to share them.
- Frequently used NiFi templates are packaged as assets and provided in a gallery, where they can be used with a single click.
- It provides real-time monitoring and resource status monitoring for multiple services configured for Native NiFi Service.
- You can easily provision configuration information for the NiFi components within the cluster.
Data Flow Service
- It provides a data flow management service based on Apache NiFi.
- It automatically configures the architecture required for extensible clustering based on Apache NiFi, and automatically installs the NiFi, ZooKeeper, and NiFi Registry modules.
- When provisioning NiFi, you can set the Description, resource size, access ID/PW, and Host Alias.
- After creating the service, you can modify the Description, required resource size, access password, Host Alias, and other settings, and apply the changes to the service.
Server Spec Type
When creating a Data Flow service, check the following.
- Recommended service installation specifications: CPU 21 cores, memory 57 GB, storage 100 GB or more
- The Ingress Controller needs to be installed before creating the Data Flow service.
- Only one Ingress Controller can be installed in a Kubernetes cluster.
- For more information, refer to Ingress Controller installation.
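As a rough illustration of the recommended specifications above, the following Python sketch compares a cluster's available capacity with the recommended values. The `ClusterCapacity` class and `find_shortfalls` function are hypothetical names introduced only for this example and are not part of Data Flow or the Samsung Cloud Platform Console. The sketch also assumes that all three recommended values are treated as minimums.

```python
from dataclasses import dataclass
from typing import List

# Recommended installation specifications quoted from this guide.
# Assumption: each value is treated as a minimum.
RECOMMENDED_CPU_CORES = 21
RECOMMENDED_MEMORY_GB = 57
RECOMMENDED_STORAGE_GB = 100


@dataclass
class ClusterCapacity:
    """Hypothetical summary of the capacity available in a cluster."""
    cpu_cores: float
    memory_gb: float
    storage_gb: float


def find_shortfalls(capacity: ClusterCapacity) -> List[str]:
    """Return one message per resource that is below the recommendation."""
    shortfalls = []
    if capacity.cpu_cores < RECOMMENDED_CPU_CORES:
        shortfalls.append(
            f"CPU: {capacity.cpu_cores} cores available, "
            f"{RECOMMENDED_CPU_CORES} recommended"
        )
    if capacity.memory_gb < RECOMMENDED_MEMORY_GB:
        shortfalls.append(
            f"Memory: {capacity.memory_gb} GB available, "
            f"{RECOMMENDED_MEMORY_GB} GB recommended"
        )
    if capacity.storage_gb < RECOMMENDED_STORAGE_GB:
        shortfalls.append(
            f"Storage: {capacity.storage_gb} GB available, "
            f"{RECOMMENDED_STORAGE_GB} GB recommended"
        )
    return shortfalls


if __name__ == "__main__":
    # Example figures are made up for illustration only.
    for message in find_shortfalls(ClusterCapacity(16, 64, 50)):
        print(message)
```

An empty list means the cluster meets all three recommended values. The capacity figures themselves would need to come from your own cluster inventory.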
Regional Provision Status
Data Flow is available in the following environments.
| Region | Availability |
|---|---|
| Korea West (kr-west1) | Provided |
| Korea East (kr-east1) | Provided |
| Korea South 1 (kr-south1) | Not provided |
| Korea South 2 (kr-south2) | Not provided |
| Korea South 3 (kr-south3) | Not provided |
Prerequisite Services
The following services must be configured before you create this service. Refer to the guide provided for each service and prepare them in advance.
| Service Category | Service | Detailed Description |
|---|---|---|
| Storage | File Storage | Storage that allows multiple client servers to share files through network connections |
| Container | Kubernetes Engine | Kubernetes container orchestration service |
