The page has been translated by Gen AI.

Resource Selection and architecture Design

Resource selection criteria

Resource selection based on billing model

When designing an architecture, the service’s billing model is an important consideration, and the billing model can vary depending on how cloud resources are deployed.

Provisioning is a method of deploying resources by specifying particular specifications or services.

In this case, charges are incurred according to the billing cycle (hourly, daily, monthly) even if no actual resources are used.

In contrast, the Pay-as-you-go model does not incur any cost until resources are actually used, and billing is based on the amount of usage, the number of calls, and similar factors.

There are services that bill by mixing the provisioning model with the pay-as-you-go model.

service	Deployment method	Billing model
Virtual Server, GPU Server, Bare Metal Server, Multi-node GPU Cluster, Virtual Server DR	Provisioning	Per Hour
Cloud Functions (computing time seconds‑GB calculated together)	Pay-as-you-go	Per Call
Block Storage	Provisioning	Per Hour
Object Storage, File Storage, Archive Storage, Backup	Pay-as-you-go	Per Usage(GB)
DBaaS	Provisioning	Per Hour
Kubernetes Engine	Provisioning	Per Hour
Container Registry	Pay-as-you-go	Per Usage(GB)
Public IP, NAT Gateway, VPC Endpoint, Direct Connect, SASE, Load Balancer, GSLB, Transit Gateway (calculated together with the per-GB transfer fee)	Provisioning	Per Hour
DNS	Provisioning	Per Day
VPN (calculated together with Internet Outbound charges)	Provisioning	Per Month
Internet Outbound Traffic, VPC Peering, Global CDN	Pay-as-you-go	Per Usage(GB)
DDoS Protection, IPS, Secured Firewall	Provisioning	Per Hour
WAF, Config Inspection	Provisioning	Per Month
Secured VPN	Provisioning	Per Year
SingleID	Provisioning	Per User
Secret Vault, Key Management Service (key count + per-call calculation)	Pay-as-you-go	Per Call
Event Streams, Search Engine, Vertica(DBaaS), Data Ops, Data Flow	Provisioning	Per Hour
Data Query	Pay-as-you-go	Per Usage(GB)
API Gateway	Pay-as-you-go	Per Call
AI/MLOps Platform, Cloud ML	Provisioning	Per Hour
DevOps Service (Workflow is calculated per occurrence)	Pay-as-you-go	Per User
Trail	Pay-as-you-go	Per Event
Edge Server	Provisioning	Per 3/5years

Table. Samsung Cloud Platform service-specific billing models

The two architectures shown in the figure below are examples of implementing a mobile application that provides an image service using a provisioning model and a pay-as-you-go model.

Diagram — Figure. Computing architecture based on Pay-as-you-go pricing and Provisioning pricing

The three-tier web application approach based on the provisioning model on the left is a traditional architecture familiar to developers and operators, and it is the configuration typically chosen when migrating an existing application or developing a new one.

Since this architecture incurs a fixed monthly cost regardless of usage, applying it to workloads with low load variability can improve cost efficiency.

The architecture on the right is primarily composed of Serverless Computing services that are billed per request, in contrast to the left’s Virtual Server‑centric configuration.

Serverless Computing architecture requires an understanding of API routing in API Gateway, function processing in Cloud Functions, and the operation of REST APIs in Object Storage, and involves technical challenges.

However, because billing is per call or per volume, costs are charged only when actual usage occurs.

Therefore, in environments where request load periodically increases or load prediction is difficult, applying this architecture allows you to handle the load cost‑effectively.

When selecting resources like this, you must consider both the use case and the organization’s technical capabilities.

Since cloud resources provide various billing models depending on service characteristics, you should design the architecture to appropriately leverage them and optimize costs.

Select resource type, size, quantity

In on-premises environments, hardware sizing must be determined after carefully forecasting future demand during the early stages of system construction, and because changes are difficult once a purchase is completed, the initial design is extremely important.

In contrast, in a cloud environment, resources can be scaled up or down at any time, allowing you to start with minimal initial capital investment.

To leverage these cloud characteristics for greater cost efficiency, it is advantageous to choose technologies with high resource efficiency and agility, such as resources or containers billed per request.

However, adopting Serverless Computing or container technologies requires that the organization’s internal technical capabilities be supported, so the associated costs and risks must also be considered.

When selecting a server type, it is preferable to start on a small scale rather than beginning with a large scale, and then gradually increase specifications or expand the number of deployments.

In a cloud environment, you can change the server type at any time, so it is common to initially deploy a small-capacity server and go through a step of adjusting specifications to ensure the software and code operate reliably.

After optimizing the capacity of the individual server, we perform the task of scaling the number of servers through load testing.

From the perspective of management and application integration, deploying multiple software on one or two servers may be convenient, but in a cloud environment, using many small-capacity servers is more advantageous in terms of cost and scalability.

For example, using three small servers instead of two large servers provides higher availability and resource utilization, and can support rapid scaling.

Select a pricing model

Pricing Plan Selection

In Samsung Cloud Platform, you can select and apply plans such as a committed pricing plan, Cost Savings, and Planned Compute.

The table below describes the plans and the services that correspond to each plan.

Pricing plan	Explanation	Applied Service
Cost Savings	Discount on fees conditional on committing to an hourly usage amount for 1 year or 3 years	Virtual Server, GPU Server, Bare Metal Server, Multi-node GPU Cluster, Database, Event Streams, Search Engine, Vertica(DBaaS)
Planned Compute	Discounted rates for committing to a 1‑year or 3‑year instance type usage.	Virtual Server, GPU Server, Bare Metal Server, Multi-node GPU Cluster, Database, Event Streams, Search Engine, Vertica(DBaaS)

Table. Samsung Cloud Platform pricing plans and target services

The pricing plans for applying discounts consist of Cost Savings and Planned Compute.

When you contract Cost Savings and Planned Compute, discounts are applied to the applicable services, and you can receive discounts by selecting a 1‑year or 3‑year term.

In particular, Planned Compute contracts are made based on the instance type, and discounts can be applied to the corresponding non‑committed instances.

To view rates with no‑contract, 1‑year or 3‑year discounts for each service, you can use the rate calculator.

If the average monthly usage over the past year is as shown below, and we assume the usage for the next year will be the same, we can design the pricing plan as shown in the table below.

Service (server type)	quantity	Operation pattern	Apply pricing plan
Virtual Server (s2v2m16)	2	Continuous operation (730 hours)	1-year contract
Virtual Server (s2v2m4)	2	Irregular (54 hours)	Cost Savings
Virtual Server (s2v4m32)	3	Irregular (365 hours)	Cost Savings
GPU Server (g2v48h4)	1 ~ 4	Regular variation (1,460 hours)	Planned Compute
Virtual Server Auto-Scaling (s2v2m4)	Min(2)	Regular operation (586 hours)	Cost Savings
Virtual Server Auto-Scaling (s2v2m4)	Max(8)	Periodic load (144 hours)	No contract
MySQL(DBaaS) (db2v4m32)	HA(2)	Continuous operation (730 hours) Vertical scalability possible	Cost Savings
MySQL(DBaaS) (db2v4m32)	Replica(1)	Continuous operation (730 hours) Scale replicas as needed	Cost Savings

Table. Compute pricing plan design example

The always-on Virtual Server listed in the table has been discounted through a contract pricing plan.

GPU Server(g2v48h4) is used from a minimum of 1 unit to a maximum of 4 units, and the total monthly usage time is 1,460 hours.

In such cases, you can apply a discount to the server type through a Planned Compute commitment.

For Virtual Servers that run irregularly, Auto-Scaling VMs that handle regular loads, and databases with vertical scaling potential, apply the Cost Savings pricing plan to all.

When applying Cost Savings, the point to note is that the contract amount should be determined conservatively.

If you set the amount excessively, you may end up paying for something you didn’t use, so it is advisable to start conservatively and then optimize gradually.

Software License

As mentioned earlier, the strategy of handling workloads with multiple small servers rather than a few large servers is also cost-effective from a software licensing perspective.

The application server typically has high software usage, and in on-premises environments, it was common to simplify integration by installing multiple software packages on large servers to reduce hardware costs.

However, in a cloud environment, this approach can be inefficient.

If you separate and install the software on a small server, lower server specifications won’t be an issue, and you can also reduce software license purchase costs.

Additionally, an additional factor to consider in software licensing is BYOL (Bring Your Own License).

Samsung Cloud Platform supports BYOL software in services such as Database and Analytics, and you must review the server deployment plan after confirming the licensing conditions of the software to be deployed to avoid unexpected budget overruns in the future.

Additionally, reviewing the license and technical support packs for 3rd‑party software available for purchase in the marketplace is a good way to improve cost efficiency not only in terms of license fees but also in terms of technology implementation.

Architecture selection

Demand-based elastic resource composition architecture design

Samsung Cloud Platform can configure Auto-Scaling to elastically adjust servers.

Through Auto-Scaling, you can build a demand-based cost model that allows resources to be used according to demand.

Auto-Scaling provides threshold-based and schedule-based horizontal scaling and shrinking policies.

Threshold-based scaling policies adjust the number of servers by setting thresholds based on the minimum, maximum, and average values of metrics such as CPU usage, memory usage, and network in/out.

When setting policies, it is common to start scaling up quickly to proactively address load, and to delay scaling down to prepare for a potential load increase.

Serverless Computing handling temporary load through architecture

Auto-Scaling is a suitable computing architecture when the rate of load increase is slower than the VM creation rate.

If the load increase rate is faster than the VM creation speed, you should consider alternatives other than Auto-Scaling-based VMs.

In this case, the primary alternative to consider is the managed service Serverless Computing architecture.

When it is difficult to predict the rate or scale of load increase, using managed services to reduce the burden of operating highly available infrastructure is effective.

Resources provided by the cloud cannot handle unlimited load, but they can serve as an alternative for ensuring stable availability against unpredictable demand.

The figure below shows an architecture that adds a Serverless Computing component to Auto-Scaling to handle temporary load.

During normal operation, the load is handled by the top Auto-Scaling, and sudden load spikes are managed by activating the bottom temporary service.

This architecture is used in situations where, like limited-product promotions on online shopping malls or ticket reservations for popular performances, the load spikes instantly at the start of an event and then decreases once inventory or seats are sold out.

We deploy or run services using IaC tools such as Terraform or the CLI. Some services use a request-based billing model, so they can be pre‑deployed, but services that require provisioning, such as CacheStore (DBaaS), need to be newly deployed in advance and kept in a stopped state for preparation. One thing to note when deploying a service for a temporary workload is that a certain amount of lead time is needed before the load starts. Services such as Cloud Functions require a pre-warming period called a Cold Start to handle requests. When deploying the service, you should take this uptime into account and ensure that preparation is completed before the request start time.
Requests are distributed through the API Gateway.
Web requests are served using web assets stored in an Object Storage bucket.
Inventory checks and order processing are performed via Cloud Functions.
To enable fast queries and order storage, we use CacheStore(DBaaS) as a temporary data store, and after all processing is complete, we batch the data into the primary Database.

After all processing is complete, delete or shut down resources to minimize costs.

Minimizing Data Transfer Cost Network Design

During the early stages of cloud design and deployment, you must pay attention to compute resources.

This is because compute resources account for the largest share of actual cloud costs, and improper design can lead to excessive expenses.

After operations start, unforeseen expenses can occur, with Internet Outbound Traffic costs being a prime example.

Data can be requested from a server inside the VPC over the Internet, and a server within the VPC can also transmit and receive data to and from the Internet.

At this point, you should note that charges apply to data transmitted outside the VPC.

When designing network architecture, especially a multi‑VPC architecture, you must take care to minimize traffic that traverses the Internet.

When connections between servers in multiple VPCs are required, avoid routing through the Internet and instead use VPC Peering (a connection between two VPCs) or a Transit Gateway to establish private communication, which can optimize costs.

Furthermore, private connections enhance the security of the link between the two site networks.

Common Service Area Configuration

When operating multiple information systems based on multiple VPCs or multiple projects, there may be services that are commonly used across systems even though each system’s functions differ.

In such cases, the architecture can be designed by organizing common services in a separate domain and having each information system share and use them.

In particular, servers with commercial software installed can maximize economic efficiency by configuring them as a shared service area.

For example, you can consider a scenario where an organization with multiple branches nationwide integrates the headquarters and branch websites into the cloud through a unified migration.

Each branch’s website server can operate independently, but by consolidating the components required for web services into a common service and designing them as a shared zone, cost savings can be achieved.

The following diagram is an example architecture that places common services in a shared VPC to reduce cloud costs.

Deploy the headquarters, branch A, and branch B systems in separate VPCs, and configure an additional VPC for shared services. The headquarters and branch VPCs establish VPC peering with the common services VPC using a hub-and-spoke (Hub-and-Spoke) architecture.
By installing the commonly required software on VMs in each VPC, we reduce costs by configuring all systems to share it. In the example architecture, the video and map services used by each website were configured as common services.
By setting up separate management servers such as a Bastion Server or Logging server for external access or information security purposes, you can unify external entry points. However, this single point of contact requires preparation against external breaches, and because it can become a single security vulnerability, caution is required.

In projects that host multiple systems with similar workloads, using a common-service-based architecture can prevent duplicate investment in resources and license costs.

Additionally, you can gain further benefits in terms of ease of management and architectural scalability.