BI in the cloud: minimum size AWS instances to get you started
When planning SAP BusinessObjects BI prototype or development systems on AWS, a frequently asked question is what the smallest, or cheapest, virtual machine (Amazon calls them EC2 instances) is that can be used in these scenarios.
The quick, easy and expensive answer is the r3.2xlarge instance. But there are other options, as explained below.
The deployment of BI systems on AWS instances is only partially covered by the Supported Platforms (“PAM”) document. The rest of the story is determined by SAP Note 1656099, which sets out the following minimum requirements:
“The following resource requirements should be fulfilled:
For BI version 4.0, EC2 Instance Types with a SAP performance rating of 7,400 SAPS or higher should be used.
The SAP BusinessObjects BI 4 Sizing Estimator requirements should be met.
The SAP BusinessObjects BI 4 Sizing Companion Guide requirements should be met.”
The complex problem of sizing EC2 instances in larger deployments, which requires one to take into account virtualization, the peculiarities of AWS instances, SAP’s sizing estimator application and its BI 4 Sizing Companion Guide, is tackled in another article. Here, the objective is to determine the minimum size for smaller deployments, where the generic 7,400 SAPS requirement is more likely to be a problem than the other factors.
This minimum requirement of 7,400 SAPS has no exact mathematical conversion to the metric Amazon uses for EC2 instances (the ECU). SAP Note 1656099, however, lets us infer an average ratio of 295 SAPS per ECU. So anything below 25 ECUs is not recommended, and we believe it is a valid assumption that any instance at or above that criterion should have comparable, or at least acceptable, performance. In addition, 16 GB of RAM is another minimum requirement clearly documented in the PAM document.
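The conversion above can be sketched in a few lines. This is only an illustration of the arithmetic: the 295 SAPS-per-ECU figure is the average ratio inferred in the text, not an official conversion, and the function and constant names are made up for this example.

```python
# Rough SAPS-to-ECU floor, using the figures quoted in the text.
SAPS_REQUIRED = 7_400   # minimum rating from SAP Note 1656099 (BI 4.0)
SAPS_PER_ECU = 295      # ASSUMPTION: average ratio inferred from the note


def min_ecus(saps_required: float, saps_per_ecu: float) -> float:
    """Return the ECU count that corresponds to a SAPS requirement."""
    return saps_required / saps_per_ecu


ecu_floor = min_ecus(SAPS_REQUIRED, SAPS_PER_ECU)
print(f"Approximate ECU floor: {ecu_floor:.1f}")
```

The result lands just above 25, which is where the 25-ECU cut-off comes from.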
Now we can take these factors into account and visit the Amazon pricing tables to build a comparison table. We chose the Sydney, Sao Paulo and US California AWS regions to see whether or not location makes any difference. The “Meets Par” field is Y when the instance has at least 25 ECUs and at least 16 GiB of RAM, that is, when it meets both requirements. Finally, the “SAP Validated” field denotes whether or not SAP states explicit support for the instance and associates a SAPS value with it (in SAP Note 1656099):
EC2 instance | vCPU | ECU | RAM GiB | Storage (GB) | Optimization | Sydney | Sao Paulo | US California | Meets Par | SAP Validated |
---|---|---|---|---|---|---|---|---|---|---|
m3.medium | 1 | 3 | 3.75 | 1 x 4 SSD | Generic | $0.10 | $0.10 | $0.08 | N | N |
m3.large | 2 | 6.5 | 7.5 | 1 x 32 SSD | Generic | $0.20 | $0.19 | $0.15 | N | N |
m3.xlarge | 4 | 13 | 15 | 2 x 40 SSD | Generic | $0.39 | $0.38 | $0.31 | N | N |
m3.2xlarge | 8 | 26 | 30 | 2 x 80 SSD | Generic | $0.78 | $0.76 | $0.62 | Y | N |
c4.large | 2 | 8 | 3.75 | EBS Only | CPU | $0.15 | NA | $0.14 | N | N |
c4.xlarge | 4 | 16 | 7.5 | EBS Only | CPU | $0.30 | NA | $0.28 | N | N |
c4.2xlarge | 8 | 31 | 15 | EBS Only | CPU | $0.61 | NA | $0.55 | N | N |
c4.4xlarge | 16 | 62 | 30 | EBS Only | CPU | $1.22 | NA | $1.10 | Y | N |
c4.8xlarge | 36 | 132 | 60 | EBS Only | CPU | $2.43 | NA | $2.21 | Y | N |
c3.large | 2 | 7 | 3.75 | 2 x 16 SSD | CPU | $0.13 | $0.16 | $0.12 | N | Y |
c3.xlarge | 4 | 14 | 7.5 | 2 x 40 SSD | CPU | $0.27 | $0.33 | $0.24 | N | Y |
c3.2xlarge | 8 | 28 | 15 | 2 x 80 SSD | CPU | $0.53 | $0.65 | $0.48 | N | Y |
c3.4xlarge | 16 | 55 | 30 | 2 x 160 SSD | CPU | $1.06 | $1.30 | $0.96 | Y | Y |
c3.8xlarge | 32 | 108 | 60 | 2 x 320 SSD | CPU | $2.12 | $2.60 | $1.91 | Y | Y |
g2.2xlarge | 8 | 26 | 15 | 60 SSD | GPU | $0.90 | NA | $0.70 | N | N |
r3.large | 2 | 6.5 | 15 | 1 x 32 SSD | Memory | $0.21 | NA | $0.20 | N | Y |
r3.xlarge | 4 | 13 | 30.5 | 1 x 80 SSD | Memory | $0.42 | NA | $0.39 | N | Y |
r3.2xlarge | 8 | 26 | 61 | 1 x 160 SSD | Memory | $0.84 | NA | $0.78 | Y | Y |
r3.4xlarge | 16 | 52 | 122 | 1 x 320 SSD | Memory | $1.68 | NA | $1.56 | Y | Y |
r3.8xlarge | 32 | 104 | 244 | 2 x 320 SSD | Memory | $3.36 | NA | $3.12 | Y | Y |
i2.xlarge | 4 | 14 | 30.5 | 1 x 800 SSD | Storage | $1.02 | NA | $0.94 | N | N |
i2.2xlarge | 8 | 27 | 61 | 2 x 800 SSD | Storage | $2.04 | NA | $1.88 | Y | N |
i2.4xlarge | 16 | 53 | 122 | 4 x 800 SSD | Storage | $4.07 | NA | $3.75 | Y | N |
i2.8xlarge | 32 | 104 | 244 | 8 x 800 SSD | Storage | $8.14 | NA | $7.50 | Y | N |
hs1.8xlarge | 16 | 35 | 117 | 24 x 2048 | Storage | $5.57 | NA | NA | Y | N |
Table 1 (prices in USD per hour, based on Linux on-demand instances)
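The “Meets Par” column can be reproduced with a simple filter. The sketch below copies a handful of rows from Table 1 (ECU and RAM only) and applies both thresholds. The thresholds are parameters: 25 ECUs is assumed here from the SAPS-derived floor, and lowering it to 24 does not change the outcome for any of these rows. The `Instance` type and `meets_par` helper are illustrative names, not part of any SAP or AWS tool.

```python
from typing import NamedTuple


class Instance(NamedTuple):
    name: str
    ecu: float
    ram_gib: float


MIN_ECU = 25        # ASSUMPTION: floor derived from 7,400 SAPS / 295 SAPS per ECU
MIN_RAM_GIB = 16    # minimum RAM from the PAM document

# A subset of Table 1: the rows closest to the thresholds.
SAMPLE = [
    Instance("m3.xlarge", 13, 15),
    Instance("m3.2xlarge", 26, 30),
    Instance("c4.2xlarge", 31, 15),
    Instance("c3.2xlarge", 28, 15),
    Instance("g2.2xlarge", 26, 15),
    Instance("r3.xlarge", 13, 30.5),
    Instance("r3.2xlarge", 26, 61),
]


def meets_par(inst: Instance, min_ecu: float = MIN_ECU,
              min_ram_gib: float = MIN_RAM_GIB) -> bool:
    """True when the instance satisfies both the ECU and the RAM minimum."""
    return inst.ecu >= min_ecu and inst.ram_gib >= min_ram_gib


qualifying = [inst.name for inst in SAMPLE if meets_par(inst)]
print(qualifying)  # ['m3.2xlarge', 'r3.2xlarge']
```

Note how the 8-vCPU compute and GPU instances clear the ECU bar but fall 1 GiB short on RAM, which is why they are marked N.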
As the table suggests, there are two clear entry-level instances: m3.2xlarge and r3.2xlarge.
It is interesting to see how the choice of a suitable EC2 instance may vary greatly depending on which region one is in. In the most dramatic case, Sao Paulo simply does not offer the r3.2xlarge.
The Sydney region has both instances available and the m3.2xlarge is cheaper, but the difference is so small ($0.056 per hour or $490 per year) that one would probably prefer the r3.2xlarge for its additional 31 GiB of RAM and for the fact that SAP supports it explicitly.
The conclusion might be different for an Organisation in the US, as the difference between the instances amounts to $0.164 per hour or $1,436 per year, which could tip the scales towards the cheaper option.
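The yearly figures follow directly from the hourly differences. As a minimal sketch, assuming an instance that runs around the clock for a 365-day year (8,760 hours) and taking the hourly differences as quoted in the text:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, assuming continuous running


def annual_cost(hourly_usd: float, hours: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly price (or price difference) into a yearly amount."""
    return hourly_usd * hours


# Hourly price gap between r3.2xlarge and m3.2xlarge, as quoted in the text.
hourly_gap_usd = {"Sydney": 0.056, "US California": 0.164}

for region, gap in hourly_gap_usd.items():
    # int() truncates to whole dollars, as the figures in the text do.
    print(f"{region}: ${int(annual_cost(gap)):,} per year")
```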
The entry-level EC2 instance is, then, a function of the region you are in, combined with your choice between performance (and SAP’s validation) and cost.
If an Organisation still finds this too expensive, for a non-production system for example, it could consider committing some capital expenditure upfront to get a reserved instance (as opposed to an on-demand one) with a reduced per-hour rate.
Another alternative is to schedule the servers to shut down outside business hours, so that an r3.2xlarge instance would cost only $1,747 per year (8h x 5 days x 52 weeks), as opposed to $7,338 per year (24h x 7 days x 52 weeks) if it were running continuously (Sydney prices, Linux OS).
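The schedule comparison is easy to rerun for other instances or working patterns. The sketch below uses the Sydney r3.2xlarge rate from Table 1; the function name and the 52-week year are assumptions made for illustration.

```python
HOURLY_RATE_USD = 0.84  # r3.2xlarge, Linux on-demand, Sydney (Table 1)


def yearly_cost(rate_usd: float, hours_per_day: int, days_per_week: int,
                weeks: int = 52) -> float:
    """Yearly cost of an instance that only runs on the given schedule."""
    return rate_usd * hours_per_day * days_per_week * weeks


business_hours = yearly_cost(HOURLY_RATE_USD, hours_per_day=8, days_per_week=5)
always_on = yearly_cost(HOURLY_RATE_USD, hours_per_day=24, days_per_week=7)
saving = 1 - business_hours / always_on

print(f"Business hours only: ${business_hours:,.2f}")
print(f"Running continuously: ${always_on:,.2f}")
print(f"Saving: {saving:.0%}")
```

Under these assumptions the business-hours schedule costs roughly a quarter of the always-on figure, since the instance runs 2,080 hours instead of 8,736.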
Yet another path, taken by some smaller Organisations, is to leverage the flexibility inherent to AWS: start with an EC2 instance that does not quite match the minimum requirements, have the Developers check whether they can live with that performance with minimum development disruption, and only move up to the minimum size (26 ECUs, 16 GiB RAM) when they cannot. This approach requires some configuration work on the BI side, since one would need to reconfigure the Java heap sizes for the services on the host, so the Organisation should have either a long-term contract with a Consultant or in-house expertise.
There are Organisations running development BI 4.1 on an m2.2xlarge instance (previous generation), which has only 13 ECUs. That instance is very similar to the current generation’s r3.xlarge, so anecdotal evidence suggests one could get away with an r3.xlarge for development, at an annual cost of only $873 (Sydney prices, Linux OS, running during business hours only).
This cost-based aggressive strategy goes very well with AWS, but it is important to highlight that this configuration is below the minimum recommended pre-requisites by any measure or scale, so one should not be surprised if SAP declines to support these environments – especially regarding performance-related issues.
In summary, the minimum size of AWS instance one can choose to deploy BI 4.x varies according to the region the Organisation is based in, its budget, the importance of performance, the availability (and cost) of BI expertise, and its risk tolerance for experimenting with unsupported options. The r3.2xlarge instance is the quick and easy answer because it ticks all the boxes: the metrics, the PAM, SAP Note 1656099, plus that extra RAM you can use for your adaptive processing servers.