Object First OOTBI teardown: PART 1: hardware & installation
Hi all, in this blog post I'll share my experiences with object storage and, more specifically, OOTBI, the 'best storage for Veeam' as Object First calls it.
PS. OOTBI stands for: "Out Of The Box Immutability", a welcome feature on your backup repository to protect it from malware.
First of all, a big shoutout to the Object First team and fellow Veeam Vanguard Geoff Burke for giving us the opportunity to test the device ourselves in our own environment. A true review is one where we can play with the hardware and software ourselves for a decent amount of time and perform several tests and potential hacks.
As Object First mentions, this should be 'The best storage for Veeam'. How do they achieve this?
First of all, we have to say that object storage (aka S3-compatible storage) has become the go-to solution for scalable repositories. Especially when you're looking for storage that grows with your needs, definitely have a look at S3-compatible storage.
S3 object storage is based on the S3 standard, but as a famous XKCD comic says, if you have 14 competing standards and you want to create a universal one, you'll end up with 15.
So there are a lot of 'dialects' of and modifications to the S3 standard, and the OOTBI device has its own.
Before we dive into the SOS (Smart Object Storage) API, which is an extension of the existing S3 standard, let's see what object storage is really about.
What is object storage and how does it work?
What is Object Storage?
Object storage is a type of data storage architecture that manages and manipulates data as objects. Unlike traditional file systems (which organize data into a hierarchy of directories and files) or block storage (which divides data into fixed-size blocks), object storage treats each piece of data as an independent object. Each object contains the data itself, metadata, and a unique identifier.
This architecture is ideal for storing large amounts of unstructured data like multimedia files, backups, documents, and archives.
Key Concepts of Object Storage:
- Object: The basic unit of storage, consisting of:
- Data: The actual data being stored (like an image, video, or file).
- Metadata: Information about the data (such as file type, creation date, permissions).
- Unique Identifier: A globally unique identifier (GUID or UUID) that allows for quick retrieval of the object without needing a directory path or filename.
Flat Namespace:
Unlike hierarchical file systems with directories and sub-directories, object storage uses a flat structure. Each object is stored in a bucket (or container), and the unique identifier is used to retrieve it, removing the need for a directory structure.
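To make the flat namespace a bit more tangible, here is a minimal sketch in Python. It is not how any real object store is implemented: the bucket is just a dictionary, and the keys and the list_keys helper are made up for illustration. It shows that what looks like a folder is only a shared prefix in the key.

```python
# Toy illustration of a flat namespace: a bucket is one flat key -> data map.
# The keys below are hypothetical examples, not anything OOTBI or Veeam creates.
bucket = {
    "photos/2024/cat.jpg": b"...",
    "photos/2024/dog.jpg": b"...",
    "docs/readme.txt": b"...",
}


def list_keys(bucket: dict[str, bytes], prefix: str = "") -> list[str]:
    """Return all keys that start with the prefix, sorted alphabetically."""
    return sorted(key for key in bucket if key.startswith(prefix))


# "photos/" is not a directory, just the beginning of two keys.
print(list_keys(bucket, "photos/"))
```

The slashes have no special meaning to the store itself; filtering on a prefix is what gives the impression of folders.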
Scalability:
Object storage can scale horizontally, meaning it can manage petabytes or even exabytes of data across distributed nodes. This is especially useful for large-scale cloud environments.
Durability and Redundancy:
Many object storage systems are designed with data redundancy in mind. Data is often replicated across multiple locations, making it highly durable and fault-tolerant.
Access via APIs:
Object storage is typically accessed through APIs, like RESTful interfaces (HTTP-based), rather than through traditional file system protocols. This allows easier integration with web-based applications and services.
How Object Storage Works:
Data Upload:
When a file (data) is uploaded to an object storage system, it is stored as an object. The system generates a unique identifier for this object, which can later be used to retrieve it.
Metadata:
As part of the object, metadata (information about the file, such as its size, type, or permissions) is stored alongside the data itself. This metadata can be customized by the user to store additional information.
Storage in Buckets or Containers:
Objects are stored within "buckets" or "containers". These containers hold a collection of objects but do not enforce a hierarchical structure like directories in a file system.
Redundancy and Replication:
Object storage systems often replicate objects across multiple locations or data centers to ensure durability. For example, if one node fails, copies of the objects are available on other nodes or regions, minimizing data loss.
Retrieval:
When an object is requested (using its unique identifier), the system retrieves the object, including its data and metadata, from wherever it is stored. This might involve fetching it from multiple locations if redundancy is in place.
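The upload, metadata, replication and retrieval steps above can be sketched in a few lines of Python. This is a toy, in-memory model under our own assumptions (three replica "nodes", a UUID as identifier, a hypothetical fail_node helper to simulate an outage); a real object store is of course far more involved.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict[str, str]


@dataclass
class ToyObjectStore:
    """In-memory stand-in for an object store with a few replica 'nodes'."""

    replicas: int = 3
    nodes: list[dict[str, StoredObject]] = field(init=False, default_factory=list)

    def __post_init__(self) -> None:
        # One dictionary per simulated node.
        self.nodes = [{} for _ in range(self.replicas)]

    def put(self, data: bytes, **metadata: str) -> str:
        """Store the data on every node and return the generated identifier."""
        object_id = str(uuid.uuid4())
        meta = {"size": str(len(data)), **metadata}
        for node in self.nodes:
            node[object_id] = StoredObject(data, dict(meta))
        return object_id

    def get(self, object_id: str) -> StoredObject:
        """Return the object from the first node that still holds a copy."""
        for node in self.nodes:
            if object_id in node:
                return node[object_id]
        raise KeyError(object_id)

    def fail_node(self, index: int) -> None:
        """Simulate a node failure by wiping its contents."""
        self.nodes[index].clear()


store = ToyObjectStore()
oid = store.put(b"hello", content_type="text/plain")
store.fail_node(0)  # lose one copy
obj = store.get(oid)  # still served from another node
print(obj.data, obj.metadata)
```

Note that there is no path or filename anywhere: the identifier returned by put is all you need to get the object back.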
Use Cases for Object Storage:
Cloud Storage: Object storage is widely used by cloud service providers, like AWS S3, Google Cloud Storage, and Azure Blob Storage, to store massive amounts of data.
Backups and Archives: Its scalability and cost-effectiveness make it ideal for storing backups, logs, and archives.
Media Hosting: Ideal for storing and delivering large unstructured data like videos, images, and audio files.
Big Data Analytics: Large datasets can be stored and analyzed using object storage due to its ability to handle massive volumes of unstructured data.
Content Distribution: Content delivery networks (CDNs) use object storage to cache and deliver media quickly across the globe.
Advantages:
- Scalability: Can handle vast amounts of data without performance degradation.
- Durability: Data replication ensures high availability and protection against hardware failures.
- Cost-Effective: It is generally more affordable, especially for cold storage or long-term retention.
- API-driven: Allows easy access and integration with modern applications through APIs.
Disadvantages:
- Latency: Typically slower compared to block or file storage for frequent, real-time access. According to Object First, OOTBI's SOS API should improve this!
- Not Ideal for Transactional Data: Object storage is not well-suited for use cases requiring frequent updates or low-latency data transactions, like databases.
So object storage seems quite suitable for backup data, which generally consists of large chunks of data. This backup data is divided into blocks. The size of these blocks is a first important parameter that could potentially have a huge impact on the performance of the storage.
We're using Veeam Backup & Replication to write to the OOTBI. In a standard backup job, you can select the block size for your backup. Standard sizes are 4 MB, 1 MB, 512 KB and 256 KB. Note that via a registry key you can configure 8 MB blocks if necessary.
Important things to know about block size:
- Larger block sizes result in bigger incremental backups, especially with 8MB blocks.
- These considerations apply before de-duplication and compression are applied.
- The final size of objects on disk can vary significantly based on de-duplication and compression.
- Increasing the block size reduces the number of API calls (a positive impact on performance!).
So, now that we know something about object storage at a high level, let's look at the Object First device in detail.
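To get a feeling for the effect of block size on the number of API calls, here is a rough back-of-the-envelope calculation in Python. It assumes a hypothetical 1 TiB of source data and one object (and so roughly one write call) per block, before compression and de-duplication. That is a simplification of ours, not a description of how Veeam actually lays out objects.

```python
# Block sizes from the backup job settings, in bytes.
BLOCK_SIZES = {
    "256 KB": 256 * 1024,
    "512 KB": 512 * 1024,
    "1 MB": 1024**2,
    "4 MB": 4 * 1024**2,
    "8 MB": 8 * 1024**2,  # only available via the registry key
}


def blocks_needed(backup_bytes: int, block_bytes: int) -> int:
    """Number of blocks needed to hold the data, rounded up (integer ceiling)."""
    return -(-backup_bytes // block_bytes)


backup_bytes = 1024**4  # hypothetical 1 TiB of source data

for label, size in BLOCK_SIZES.items():
    print(f"{label:>6}: {blocks_needed(backup_bytes, size):>9,} blocks")
```

Going from 256 KB to 4 MB blocks divides the number of blocks by 16, which is why larger blocks mean fewer calls to the storage.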
The storage appliance comes in different sizes.
You can choose between 64 TB, 128 TB and, recently, 192 TB nodes.
Up to 4 nodes can be clustered, which allows you to create clusters of 256 TB, 512 TB or 768 TB.
Nice front, but what's inside the box?
The appliance is built on a Supermicro chassis and motherboard (SYS-6029P-WTRT):
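The cluster sizes follow directly from the node sizes. As a quick sanity check in Python, assuming a cluster is built from identical nodes (the function name and the limit constant are our own):

```python
NODE_SIZES_TB = (64, 128, 192)  # node sizes mentioned above
MAX_NODES = 4  # maximum number of nodes per cluster


def cluster_capacity_tb(node_tb: int, nodes: int = MAX_NODES) -> int:
    """Raw capacity of a cluster of identical nodes, in TB."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"a cluster holds 1 to {MAX_NODES} nodes")
    return node_tb * nodes


for size in NODE_SIZES_TB:
    print(f"4 x {size} TB = {cluster_capacity_tb(size)} TB")
```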
- Dual socket P (LGA 3647) supports Intel Xeon Scalable Processors, Dual UPI up to 10.4GT/s
- Up to 1.5TB ECC 3DS LRDIMM, up to DDR4-2666MHz; 12 DIMM slots
- 1 PCI-E 3.0 x16 (FHHL) slot, 2 PCI-E 3.0 x8 (FHHL) slots, 2 PCI-E 3.0 x8 (LP) slots, 1 PCI-E 3.0 x16 for Add-on-Module, 1 PCI-E M.2 SSD slot
- 12 Hot-swap 3.5" SAS/SATA drive bays (4 NVMe/SAS3/SATA3 hybrid drive bays)
- 2x 10GBase-T ports via Intel C622
- 1 VGA, 4 USB 3.0 (rear)
- 1200W Redundant Power Supplies Titanium Level (96% Efficiency)
Inside we find 10 classic SAS hard drives in a RAID 6 configuration. To allow quick recovery of the redundancy in case of a problem, a hot spare is available.
At the back of the server, you'll find 2 SATA SSD drives in mirror (RAID 1) which hold the operating system.
For performance reasons, a cache disk is allocated to the last slot. This is an Intel 1.6 TB NVMe drive.
That's all for disks and cache.
CPU / memory
The horsepower in this box is provided by a dual-CPU setup of Intel Xeon Silver 4215R processors @ 3.2 GHz and 256 GB of RAM, divided into 8 modules of 32 GB each.
The appliance comes with an easy to follow installation sheet, and you should be up-and-running in 15 minutes.
According to the datasheet, you only need:
- 3 available IP addresses (two physical and one virtual IP for the S3 endpoint)
- A username
- A password
- MFA setup (authenticator app)
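Before walking to the rack, it doesn't hurt to double-check the three addresses you plan to use. A small sketch with Python's standard ipaddress module, which only verifies that they are valid IPv4 addresses and all different. The function name, parameter names and example addresses are ours; the datasheet doesn't prescribe any of this.

```python
import ipaddress


def check_addresses(nic1: str, nic2: str, s3_vip: str) -> list[ipaddress.IPv4Address]:
    """Parse the two interface IPs and the virtual S3 IP; reject duplicates."""
    addresses = [ipaddress.IPv4Address(a) for a in (nic1, nic2, s3_vip)]
    if len(set(addresses)) != 3:
        raise ValueError("the three addresses must be distinct")
    return addresses


# Documentation-range addresses, purely as an example.
print(check_addresses("192.0.2.10", "192.0.2.11", "192.0.2.12"))
```

An invalid address makes ipaddress.IPv4Address raise an error, so typos are caught before you start typing them into the wizard.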
As soon as we booted up the device (with local keyboard, mouse and VGA screen attached), the License Agreement was shown after the boot sequence:
After we accept the EULA, we select Setup New Cluster from a screen that looks very similar to an ESXi interface. 😉
The other options are useful when you want to join the box to, or remove it from, an existing cluster.
You can also set up a proxy for internet access, update the application, check the network status, start the replace disk procedure, check the telemetry service, look up the system info or license agreement and, finally, reset the complete node or enter the command shell.
So we select Setup New Cluster and in the next screen, the configuration wizard asks for the IP configuration on the interfaces.
Pay attention to the NIC names to ensure you're assigning the right IPs to the corresponding interfaces.
As soon as the interfaces are configured, you must enter a hostname (we chose OOTBI, to be very original).
When the node detects internet access, it checks whether a new version is available and updates right away.
Updating to latest version during initial configuration:
We got an error during the setup of the cluster admin: "Firmware crash dump is not available".
Take care of the complexity of your admin password. It must have at least 12 characters and contain at least one uppercase letter, one lowercase letter, one digit and one symbol.
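If you want to test a candidate password against these rules before typing it on the console, the check is easy to sketch in Python. We assume here that "symbol" means any ASCII punctuation character; the appliance may define it differently.

```python
import string


def meets_password_policy(password: str) -> bool:
    """Check the rules above: 12+ characters, upper, lower, digit and symbol."""
    return (
        len(password) >= 12
        and any(c.isupper() for c in password)
        and any(c.islower() for c in password)
        and any(c.isdigit() for c in password)
        # Assumption: a "symbol" is any ASCII punctuation character.
        and any(c in string.punctuation for c in password)
    )


print(meets_password_policy("Sup3r-Secret-Pw"))  # all five rules satisfied
print(meets_password_policy("Short1!a"))  # too short
```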
After the configuration of the interface, the update and the configuration of the admin password, the configuration can be finalized. From this step, the web interface is available for the setup of the first bucket.
In a next blog post we'll go deeper into the web interface and the actual configuration of some buckets, immutability and the integration with Veeam Backup & Replication.