Service Overview
Data Flow is a data processing workflow tool, built on the open-source Apache NiFi, for visually creating processing flows that extract large volumes of data from various data sources and transform and transmit stream or batch data. Data Flow can be used on its own in a Kubernetes Engine cluster on Samsung Cloud Platform, or together with other application software.
Provided features
Data Flow provides the following features.
- Convenient Installation and Management: Data Flow can be easily installed in a standard Kubernetes cluster environment via the web-based Samsung Cloud Platform Console. It automatically configures the architecture required for scalable clustering based on the open-source Apache NiFi, automatically installing ZooKeeper, Registry, and management modules. With Data Flow, you can configure and deploy configuration files, NiFi templates, and other assets needed for service integration.
- Easy Data Flow Management: You can create processing flows for stream and batch data in a GUI suited to the user environment. Authoring flows visually lets you efficiently extract, transmit, and process data between systems.
- NiFi Template Gallery: You can share/distribute reference NiFi templates. Data Flow provides work files for data processing flows commonly used in the field as a gallery, and users can share the data processing flow work they have created.
Components
Data Flow consists of Manager and Service modules, and is provided packaged with Apache NiFi.
Data Flow Manager
Data Flow Manager provides various management functions to enable more efficient use of NiFi.
- You can upload customer-built NAR files through Data Flow Manager so that they can be used as Processors, and upload configuration files to share them.
- Frequently used NiFi templates are packaged as assets and offered in the Gallery, ready for use with a single click.
- Provides real-time monitoring of the multiple services that make up the native NiFi service, along with resource status monitoring.
- You can easily provision configuration information for NiFi components within the cluster.
Data Flow Service
- Provides a data flow management service based on Apache NiFi.
- Automatically configures the architecture required for scalable clustering based on Apache NiFi, and automatically installs the NiFi, ZooKeeper, and NiFi Registry modules.
- When creating the NiFi service, you can set the Description, required resource size, connection ID/PW, and Host Alias.
- After creating the service, you can modify the Description, required resource size, connection password, Host Alias, and other settings, and apply the changes to the service.
Server spec type
When creating a Data Flow service, check the following.
- Recommended Service Installation Specifications: 21 CPU cores, 57 GB memory, and at least 100 GB storage
- Before creating the Data Flow service, you need to install the Ingress Controller.
- Only one Ingress Controller can be installed in a Kubernetes cluster.
- For more details, refer to Ingress Controller Installation.
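The checks above can be sketched as a small pre-flight function. This is only an illustration: the `ClusterCapacity` type and `preflight_problems` function are hypothetical names, not part of any Samsung Cloud Platform API, and the capacity values are assumed to be gathered by you from the cluster beforehand.

```python
from dataclasses import dataclass

# Recommended values taken from the checklist above.
RECOMMENDED_CPU_CORES = 21
RECOMMENDED_MEMORY_GB = 57
MINIMUM_STORAGE_GB = 100


@dataclass
class ClusterCapacity:
    """Hypothetical summary of a cluster; fill it in from your own inventory."""
    cpu_cores: float
    memory_gb: float
    storage_gb: float
    ingress_controller_installed: bool


def preflight_problems(cluster: ClusterCapacity) -> list[str]:
    """Return a list of unmet requirements; an empty list means all checks pass."""
    problems = []
    if cluster.cpu_cores < RECOMMENDED_CPU_CORES:
        problems.append(
            f"CPU: {cluster.cpu_cores} cores, {RECOMMENDED_CPU_CORES} recommended"
        )
    if cluster.memory_gb < RECOMMENDED_MEMORY_GB:
        problems.append(
            f"Memory: {cluster.memory_gb} GB, {RECOMMENDED_MEMORY_GB} GB recommended"
        )
    if cluster.storage_gb < MINIMUM_STORAGE_GB:
        problems.append(
            f"Storage: {cluster.storage_gb} GB, at least {MINIMUM_STORAGE_GB} GB needed"
        )
    if not cluster.ingress_controller_installed:
        problems.append("Ingress Controller is not installed")
    return problems


if __name__ == "__main__":
    # Example values only; replace them with figures from your cluster.
    candidate = ClusterCapacity(
        cpu_cores=16, memory_gb=64, storage_gb=200,
        ingress_controller_installed=True,
    )
    for problem in preflight_problems(candidate):
        print(problem)
```

Returning a list of problems rather than a single boolean makes it easy to report every shortfall at once before attempting to create the service.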
Provision status by region
Data Flow is available in the environments below.
| Region | Provision status |
|---|---|
| Korea West (kr-west1) | Provided |
| Korea East (kr-east1) | Provided |
| South Korea 1 (kr-south1) | Not provided |
| South Korea 2 (kr-south2) | Not provided |
| South Korea 3 (kr-south3) | Not provided |
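For automation scripts, the table above can be encoded as a simple lookup. This is a minimal sketch: the dictionary and the `is_data_flow_available` helper are illustrative names, the values are copied by hand from the table, and they would need updating whenever availability changes.

```python
# Availability copied from the table above, keyed by region code.
DATA_FLOW_AVAILABILITY = {
    "kr-west1": True,
    "kr-east1": True,
    "kr-south1": False,
    "kr-south2": False,
    "kr-south3": False,
}


def is_data_flow_available(region_code: str) -> bool:
    """Return True if Data Flow is listed as provided in the given region.

    Region codes not present in the table are treated as not provided.
    """
    return DATA_FLOW_AVAILABILITY.get(region_code.strip().lower(), False)


if __name__ == "__main__":
    for code in ("kr-west1", "kr-south2"):
        status = "Provided" if is_data_flow_available(code) else "Not provided"
        print(f"{code}: {status}")
```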
Prerequisite Services
The following services must be configured before you create the Data Flow service. Refer to the guide provided for each service and prepare them in advance.
| Service Category | Service | Detailed description |
|---|---|---|
| Storage | File Storage | Storage that enables multiple client servers to share files over a network connection. |
| Container | Kubernetes Engine | Kubernetes container orchestration service |
