AWS S3 Storage Gateway Revisited (Part I)
This Amazon Web Service (AWS) Storage Gateway Revisited posts is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (thus why it's called revisited). As part of a two-part series, the first post looks at what AWS Storage Gateway is, how it has improved since my last review of AWS Storage Gateway along with deployment options. The second post in the series looks at a sample test drive deployment and use.
If you need an AWS primer and overview of various services such as Elastic Cloud Compute (EC2), Elastic Block Storage (EBS), Elastic File Service (EFS), Simple Storage Service (S3), Availability Zones (AZ), Regions and other items check this multi-part series (Cloud conversations: AWS EBS, Glacier and S3 overview (Part I) ).
As a quick refresher, S3 is the AWS bulk, high-capacity unstructured and object storage service along with its companion deep cold (e.g. inactive) Glacier. There are various S3 storage service classes including standard, reduced redundancy storage (RRS) along with infrequent access (IA) that have different availability durability, performance, service level and cost attributes.
Note that S3 IA is not Glacier as your data always remains on-line accessible while Glacier data can be off-line. AWS S3 can be accessed via its API, as well as via HTTP rest calls, AWS tools along with those from third-party's. Third party tools include NAS file access such as S3FS for Linux that I use for my Ubuntu systems to mount S3 buckets and use similar to other mount points. Other tools include Cloudberry, S3 Motion, S3 Browser as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.
AWS S3 Storage Gateway and What's New
The Storage Gateway is the AWS tool that you can use for accessing S3 buckets and objects via your block volume, NAS file or tape based applications. The Storage Gateway is intended to give S3 bucket and object access to on-premise applications and data infrastructures functions including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR) and archiving), along with storage tiering to cloud.
Some of the things that have evolved with the S3 Storage Gateway include:
- Easier, streamlined download, installation, deployment
- Enhanced Virtual Tape Library (VTL) and Virtual Tape support
- File serving and sharing (not to be confused with Elastic File Services (EFS))
- Ability to define your own bucket and associated parameters
- Bucket options including Infrequent Access (IA) or standard
- Options for AWS EC2 hosted, or on-premise VMware as well as Hyper-V gateways (file only supports VMware and EC2)
AWS Storage Gateway Three Functions
AWS Storage Gateway can be deployed for three basic functions:
AWS Storage Gateway File Architecture via AWS.com
File Gateway (NFS NAS) - Files, folders, objects and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectory similar to a regular file system or NAS device as well as configure various security, permissions, access control policies. Data is stored in S3 buckets that you specify policies such as standard or Infrequent Access (IA) among other options. AWS hosted via EC2 as well as VMware Virtual Machine (VM) for on-premise file gateway.
Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one to one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.
AWS Storage Gateway Non-Cached Volume Architecture via AWS.com
AWS Storage Gateway Cached Volume Architecture via AWS.com
Volume Gateway (Block iSCSI) - Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.
Current Storage Gateway volume limits (subject to change) include maximum size of a cached volume 32TB, maximum size of a stored volume 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume, they can not be restored as an EBS volume (via EC2). There are a maximum of 32 volumes for a gateway with total size of all volumes for a gateway (cached) of 1,024TB (e.g. 1PB). The total size of all volumes for a gateway (stored volume) is 512TB. View current AWS Storage Gateway resource and specification limits here.
AWS Storage Gateway VTL Architecture via AWS.com
Virtual Tape Library Gateway (VTL) - Supports saving your data for backup/BC/DR/archiving into S3 and Glacier storage tiers. Being a Virtual Tape Library (e.g. VTL) you can specify emulation of tapes for compatibility with your existing backup, archiving and data protection software, management tools and processes.
Storage Gateway limits for tape include minimum size of a virtual tape 100GB, maximum size of a virtual tape 2.5TB, maximum number of virtual tapes for a VTL is 1,500 and total size of all tapes in a VTL is 1PB. Note that the maximum number of virtual tapes in an archive is unlimited and total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here.
Where To Learn More
- AWS S3 Storage Gateway Revisited (Part I)
- Part II Revisiting AWS S3 Storage Gateway (Test Drive Deployment)
- AWS Storage Gateway site
- AWS Storage Gateway resource limits and specifications and Pricing
- AWS Storage Gateway Concepts, Getting Started, Managing Volumes, Troubleshooting and Local Console
- Cross-Region Replication for Amazon S3
- AWS (Amazon) storage gateway, first, second and third impressions
- Cloud conversations: If focused on cost you might miss other cloud storage benefits
- Data Protection Diaries
- Cloud Conversations: AWS overview and primer
- Eight Ways to Avoid Cloud Storage Pricing Surprises
- Cloud and Object Storage Center
- Are more than five nines of availability really possible?
- How do primary storage clouds and cloud for backup differ?
- Cloud Conversations: AWS S3 Cross Region Replication storage enhancements
- S3motion Buckets Containers Objects AWS S3 Cloud and EMCcode
- AWS EFS Elastic File System (Cloud NAS) First Preview Look
What This All Means
As to which gateway function and mode (cached or non-cached for Volumes) depends on what it is that you are trying to do. Likewise choosing between EC2 (cloud hosted) or on-premise Hyper-V and VMware VMs depends on what your data infrastructure support requirements are. Overall I like the progress that AWS has put into evolving the Storage Gateway, granted it might not be applicable for all usage cases. Continue reading more and view images from the AWS Storage Gateway Revisited test drive in part two located here.
Ok, nuff said (for now...).