
Copyright © 2026, NVIDIA Corporation.


AIStore: High-Performance, Scalable Storage for AI Workloads


AIStore (AIS) is a lightweight distributed storage stack tailored for AI applications. It’s an elastic cluster that can grow and shrink at runtime and can be ad-hoc deployed, with or without Kubernetes, anywhere from a single Linux machine to a bare-metal cluster of any size. Built from scratch, AIS provides linear scale-out, consistent performance, and a flexible deployment model.

AIS is a reliable storage cluster that can natively operate on both in-cluster and remote data, without treating either as a cache.

AIS consistently shows balanced I/O distribution and linear scalability across an arbitrary number of clustered nodes. The system supports fast data access, reliability, and rich customization for data transformation workloads.

Features

  • ✅ Multi-Cloud Access: Seamlessly access and manage content across multiple cloud backends (including AWS S3, GCS, Azure, and OCI), with fast-tier performance, configurable redundancy, and namespace-aware bucket identity (same-name buckets can coexist across accounts, endpoints, and providers).
  • ✅ Deploy Anywhere: AIS runs on any Linux machine, virtual or physical. Deployment options range from a minimal container-based deployment and Google Colab to petascale Kubernetes clusters. There are no built-in limitations on deployment size or functionality.
  • ✅ High Availability: Redundant control and data planes. Self-healing, end-to-end protection, n-way mirroring, and erasure coding. Arbitrary number of lightweight access points (AIS proxies).
  • ✅ HTTP-based API: A feature-rich, native API (with user-friendly SDKs for Go and Python), and compliant Amazon S3 API for running unmodified S3 clients.
  • ✅ Monitoring: Comprehensive observability with integrated Prometheus metrics, Grafana dashboards, detailed logs with configurable verbosity, and CLI-based performance tracking for complete cluster visibility and troubleshooting. See AIStore Observability for details.
  • ✅ Chunked Objects: High-performance chunked object representation, with independently retrievable chunks, metadata v2, and checksum-protected manifests. Supports rechunking, parallel reads, and seamless integration with Get-Batch, blob-downloader, and multipart uploads to supported cloud backends.
  • ✅ JWT Authentication and Authorization: Validates request JWTs to provide cluster- and bucket-level access control using static keys or dynamic OIDC issuer JWKS lookup.
  • ✅ Secure Redirects: Configurable cryptographic signing of redirect URLs using HMAC-SHA256 with a versioned cluster key (distributed via metasync, stored in memory only).
  • ✅ Load-Aware Throttling: Dynamic request throttling based on a multi-dimensional load vector (CPU, memory, disk, file descriptors, goroutines) to protect AIS clusters under stress.
  • ✅ Unified Namespace: Attach AIS clusters together to provide unified access to datasets across independent clusters, allowing users to reference shared buckets with cluster-specific identifiers.
  • ✅ Turn-key Cache: In addition to robust data protection features, AIS offers a per-bucket configurable LRU-based cache with eviction thresholds and storage capacity watermarks.
  • ✅ ETL Offload: Execute I/O intensive data transformations close to the data, either inline (on-the-fly as part of each read request) or offline (batch processing, with the destination bucket populated with transformed results).
  • ✅ Get-Batch: Retrieve multiple objects and/or archived files with a single call. Designed for ML/AI pipelines, Get-Batch fetches an entire training batch in one operation, assembling a TAR (or other supported serialization formats) that contains all requested items in the exact user-specified order (paper).
  • ✅ Data Consistency: Guaranteed consistency across all gateways, with write-through semantics in the presence of remote backends.
  • ✅ Serialization & Sharding: Native, first-class support for TAR, TGZ, TAR.LZ4, and ZIP archives for efficient storage and processing of small-file datasets. Features include seamless integration with existing unmodified workflows across all APIs and subsystems.
  • ✅ Kubernetes: For production, AIS runs natively on Kubernetes. The dedicated ais-k8s repository includes the AIS K8s Operator, Ansible playbooks, Helm charts, and deployment guidance.
  • ✅ Batch Jobs: More than 30 cluster-wide batch operations that you can start, monitor, stop, and otherwise control. The list currently includes:
    $ ais show job --help

    NAME:
      archive blob-download cleanup copy-bucket copy-objects delete-objects
      download dsort ec-bucket ec-get ec-put ec-resp
      elect-primary etl-bucket etl-inline etl-objects evict-objects evict-remote-bucket
      get-batch list lru-eviction mirror prefetch-objects promote-files
      put-copies rebalance rechunk rename-bucket resilver summary
      warm-up-metadata
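
To make the Get-Batch idea from the feature list concrete, here is a minimal, client-free sketch of what "one TAR containing all requested items in the exact user-specified order" means. It does not call AIS or its SDK: the `store` dictionary, the object names, and the `assemble_batch` and `read_batch` helpers are all hypothetical stand-ins, and the real service may use a different layout or serialization format.

```python
import io
import tarfile


def assemble_batch(store, names):
    """Pack the requested items into one in-memory TAR, preserving request order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = store[name]
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_batch(blob):
    """Yield (name, payload) pairs in the order they were archived."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar:
            yield member.name, tar.extractfile(member).read()


# Hypothetical in-memory "bucket" standing in for stored objects.
store = {"a.jpg": b"AAA", "b.jpg": b"BB", "c.jpg": b"C"}
order = ["c.jpg", "a.jpg", "b.jpg"]

batch = assemble_batch(store, order)
names = [name for name, _ in read_batch(batch)]
```

Because a TAR is a sequential format, a training loop can consume the items as a stream and rely on position alone to match each sample to its place in the batch.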

The feature set continues to grow and also includes: native bucket inventory (NBI); blob-downloader; AuthN - authentication and authorization server; runtime management of TLS certificates; full support for adding/removing nodes at runtime; adaptive rate limiting; and more.
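
As an illustration of the Secure Redirects feature listed above, the following sketch signs and verifies a redirect URL with HMAC-SHA256 and a versioned key. It is a generic example of the technique, not the AIS wire format: the `kv` and `sig` query parameters, the `KEYS` dictionary, the example host, and both function names are assumptions made for this example.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlsplit

# Hypothetical in-memory key ring: key version -> secret.
KEYS = {1: b"old-cluster-key", 2: b"current-cluster-key"}
CURRENT_VERSION = 2


def sign_redirect(url, version=CURRENT_VERSION):
    """Append the key version and an HMAC-SHA256 signature as query parameters."""
    sep = "&" if "?" in url else "?"
    versioned = f"{url}{sep}kv={version}"
    sig = hmac.new(KEYS[version], versioned.encode(), hashlib.sha256).hexdigest()
    return f"{versioned}&sig={sig}"


def verify_redirect(signed_url):
    """Return True only if the signature matches the key version named in the URL."""
    versioned, found, sig = signed_url.rpartition("&sig=")
    if not found:
        return False
    version = parse_qs(urlsplit(versioned).query).get("kv", [""])[-1]
    if not version.isdigit() or int(version) not in KEYS:
        return False
    expected = hmac.new(KEYS[int(version)], versioned.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode())


signed = sign_redirect("http://target.example:8081/bucket/object?provider=ais")
```

Carrying the key version in the URL lets a receiver keep accepting redirects signed with the previous key while a new key is being rolled out.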

For the original white paper and design philosophy, please see AIStore Overview, which also includes a high-level block diagram, terminology, APIs, CLI, and more. For our 2024 KubeCon presentation, please see AIStore: Enhancing petascale Deep Learning across Cloud backends.

CLI

AIS includes an integrated, scriptable CLI for managing clusters, buckets, and objects, running and monitoring batch jobs, viewing and downloading logs, generating performance reports, and more:

    $ ais <TAB-TAB>

    advanced cluster etl ls prefetch search tls
    alias config evict ml put show wait
    archive cp get mpu remote-cluster space-cleanup
    auth create help nbi rmb start
    blob-download download job object rmo stop
    bucket dsort log performance scrub storage

Developer Tools

AIS runs natively on Kubernetes and stores data in an open format, so you are free to copy or move your data out of AIS at any time using familiar Linux tools such as tar(1), scp(1), and rsync(1).

For developers and data scientists, there’s also:

  • Go API used in CLI and benchmarking tools
  • Python SDK + Reference Guide
  • PyTorch integration and usage examples
  • Boto3 support

Quick Start

  1. Read the Getting Started Guide for a 5-minute local install, or
  2. Run a minimal container-based AIS cluster consisting of a single gateway and a single storage node, or
  3. Clone the repo and run make kill cli aisloader deploy followed by ais show cluster

Deployment options

AIS deployment options, as well as intended (development vs. production vs. first-time) usages, are all summarized here.

Prerequisites essentially boil down to having Linux with a disk. Deployment options range from a minimal container-based deployment to petascale bare-metal clusters of any size, and from a single VM to multiple racks of high-end servers. Practical use cases require, of course, further consideration.

Some of the most popular deployment options include:

| Option | Use Case |
| --- | --- |
| Local playground | AIS developers or first-time users, Linux or macOS. Run make kill cli aisloader deploy <<< $'N\nM', where N is the number of targets and M is the number of gateways |
| Minimal container-based deployment | Quick testing and evaluation; single-node setup |
| GCP/GKE automated install | Developers, first-time users, AI researchers |
| Large-scale production deployment | Requires Kubernetes; provided via ais-k8s |

For performance tuning, see performance and AIS K8s Playbooks.

Existing Datasets

AIS supports multiple ingestion modes:

  • ✅ On Demand: Transparent cloud access during workloads.
  • ✅ PUT: Locally accessible files and directories.
  • ✅ Promote: Import local target directories and/or NFS/SMB shares mounted on AIS targets.
  • ✅ Copy: Full buckets, virtual subdirectories (recursively or non-recursively), lists or ranges (via Bash expansion).
  • ✅ Download: HTTP(S)-accessible datasets and objects.
  • ✅ Prefetch: Remote buckets or selected objects (from remote buckets), including subdirectories, lists, and/or ranges.
  • ✅ Archive: Group and store related small files from an original dataset.
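
The "ranges (via Bash expansion)" mentioned under Copy refer to brace ranges such as `shard-{001..003}.tar`. The sketch below shows how such a template expands into object names. It covers only numeric Bash-style ranges and is not the AIS template parser, which may accept a different or broader syntax; the `expand_range` helper and the shard names are hypothetical.

```python
import re

_RANGE = re.compile(r"\{(\d+)\.\.(\d+)\}")


def expand_range(template):
    """Expand every numeric `{start..end}` range in template, Bash-style."""
    match = _RANGE.search(template)
    if match is None:
        return [template]
    lo, hi = match.group(1), match.group(2)
    # Like Bash, zero-pad to the widest bound when either bound has a leading zero.
    width = max(len(lo), len(hi)) if lo.startswith("0") or hi.startswith("0") else 0
    start, end = int(lo), int(hi)
    step = 1 if start <= end else -1
    names = []
    for i in range(start, end + step, step):
        name = template[:match.start()] + str(i).zfill(width) + template[match.end():]
        # Recurse so that templates with several ranges expand fully.
        names.extend(expand_range(name))
    return names


shards = expand_range("shard-{001..003}.tar")
```

Expanding a template on the client side like this is a convenient way to preview which object names a range-based copy or prefetch would cover before starting the job.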

Install from Release Binaries

You can install the CLI and benchmarking tools using:

    ./scripts/install_from_binaries.sh --help

The script installs aisloader and CLI from the latest or previous GitHub release and enables CLI auto-completions.

PyTorch integration

PyTorch integration is a growing set of datasets (both iterable and map-style), samplers, and dataloaders:

  • Taxonomy of abstractions and API reference
  • AIS plugin for PyTorch: usage examples
  • Jupyter notebook examples
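
For readers unfamiliar with the two dataset flavors named above, the sketch below shows the difference in plain Python. Map-style datasets support random access by index (`__getitem__` and `__len__`), while iterable datasets only support sequential streaming (`__iter__`). These classes are hypothetical illustrations backed by a dictionary; they are not the AIS plugin classes, which build on `torch.utils.data.Dataset` and `torch.utils.data.IterableDataset` and read from a cluster.

```python
class MapStyleDataset:
    """Random access: a sampler can request any index in any order."""

    def __init__(self, store):
        self._store = store
        self._names = sorted(store)

    def __len__(self):
        return len(self._names)

    def __getitem__(self, index):
        name = self._names[index]
        return name, self._store[name]


class IterStyleDataset:
    """Sequential access: samples are streamed in one pass."""

    def __init__(self, store):
        self._store = store

    def __iter__(self):
        for name in sorted(self._store):
            yield name, self._store[name]


# Hypothetical object store: object name -> payload.
store = {"img-2.jpg": b"two", "img-1.jpg": b"one"}
map_ds = MapStyleDataset(store)
iter_ds = IterStyleDataset(store)
```

Map-style access pairs naturally with samplers and shuffling by index, while the iterable form suits large sharded datasets that are read as a stream.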

AIStore Badge

Let others know your project is powered by high-performance AI storage:

    [![aistore](https://img.shields.io/badge/powered%20by-AIStore-76B900?style=flat&labelColor=000000)](https://github.com/NVIDIA/aistore)

More Docs & Guides

  • Overview and Design
  • Terminology and Core Abstractions
  • Networking Model
  • Getting Started
  • AIS Buckets: Design and Operations
  • Observability
  • Technical Blog
  • S3 Compatibility
  • Batch Jobs
  • Performance and CLI: performance
  • CLI Reference
  • Production Deployment: Kubernetes Operator, Ansible Playbooks, Helm Charts, Monitoring

How to find information

  • See Extended Index
  • Use CLI search command, e.g.: ais search copy
  • Clone the repository and run git grep, e.g.: git grep -n out-of-band -- "*.md"

License

MIT

Author

Alex Aizman (NVIDIA)