Collection
A collection in the AT Protocol is a named grouping of related records within a user's data repository. Collections serve as namespaces for specific types of content, allowing data to be organized, retrieved, and indexed efficiently across the ATmosphere.
Each record in a user's repository belongs to exactly one collection, which is identified by a Namespaced Identifier (NSID). The collection NSID corresponds to the record's type, creating a direct relationship between the data's structure and its storage location. For example, posts in the Bluesky social application are stored in the app.bsky.feed.post collection.
Collections provide a logical separation of different types of content while maintaining a unified repository structure. This organization enables efficient querying and indexing of specific content types across the network.
Architecture
Structure and Addressing
Within a collection, individual records are identified by a unique key, known as a record key (rkey). The combination of a user's Decentralized Identifier (DID) or handle, a collection NSID, and a record key forms an AT URI that uniquely identifies any record within the network. For example:
at://did:plc:abcdef123456/app.bsky.feed.post/3jui7kdo2ck
This URI points to a specific post (with key 3jui7kdo2ck) in the app.bsky.feed.post collection of the user with the specified DID.
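To make the three-part addressing concrete, here is a minimal sketch that splits an AT URI into its authority, collection, and record key. The names `AtUri` and `parse_at_uri` are hypothetical helpers introduced for illustration, and the function only handles the simple three-segment form shown above; it does not validate the DID, NSID, or record key syntax.

```python
from typing import NamedTuple


class AtUri(NamedTuple):
    """Hypothetical container for the three parts of a record-level AT URI."""
    authority: str   # the user's DID or handle
    collection: str  # the collection NSID
    rkey: str        # the record key


def parse_at_uri(uri: str) -> AtUri:
    """Split 'at://<authority>/<collection>/<rkey>' into its parts.

    Illustrative only: assumes the simple three-segment form and does
    no syntax validation of the individual segments.
    """
    prefix = "at://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an AT URI: {uri!r}")
    parts = uri[len(prefix):].split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected authority/collection/rkey: {uri!r}")
    return AtUri(*parts)


if __name__ == "__main__":
    parsed = parse_at_uri("at://did:plc:abcdef123456/app.bsky.feed.post/3jui7kdo2ck")
    print(parsed.authority, parsed.collection, parsed.rkey)
```

Because a DID contains colons but no slashes, splitting on "/" is enough to separate the three segments in this simplified form.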
Record keys are typically generated as Timestamp Identifiers (TIDs), which embed creation time information, though other formats are supported for specific use cases. The key format requirements are defined by the lexicon for each collection's record type.
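The following sketch illustrates how a timestamp can be embedded in a key. It assumes the TID layout as commonly described for the AT Protocol: a 64-bit integer with a zero top bit, 53 bits of microseconds since the Unix epoch, and a 10-bit clock identifier, written as 13 characters of a sortable base32 alphabet. The function names `encode_tid` and `decode_tid` are hypothetical, and the layout should be checked against the current specification before relying on it.

```python
from typing import Tuple

# Sortable base32 alphabet (assumed): characters are in ascending ASCII order,
# so string comparison of encoded keys matches numeric comparison.
B32_SORTABLE = "234567abcdefghijklmnopqrstuvwxyz"
TID_LENGTH = 13


def encode_tid(timestamp_us: int, clock_id: int) -> str:
    """Pack a microsecond timestamp and a clock id into a 13-character key."""
    if not 0 <= timestamp_us < (1 << 53):
        raise ValueError("timestamp must fit in 53 bits")
    if not 0 <= clock_id < (1 << 10):
        raise ValueError("clock id must fit in 10 bits")
    value = (timestamp_us << 10) | clock_id
    chars = []
    for _ in range(TID_LENGTH):
        chars.append(B32_SORTABLE[value & 0x1F])  # lowest 5 bits first
        value >>= 5
    return "".join(reversed(chars))


def decode_tid(tid: str) -> Tuple[int, int]:
    """Recover (timestamp_us, clock_id) from a key produced by encode_tid."""
    if len(tid) != TID_LENGTH:
        raise ValueError("expected a 13-character key")
    value = 0
    for ch in tid:
        value = (value << 5) | B32_SORTABLE.index(ch)  # ValueError on bad char
    return value >> 10, value & 0x3FF


if __name__ == "__main__":
    key = encode_tid(1_700_000_000_000_000, 5)
    print(key, decode_tid(key))
```

One consequence of this design is that keys created later sort after keys created earlier, so listing a collection in key order approximates chronological order.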
Indexing and Querying
Collections are an indispensable part of the network's indexing infrastructure. Relays and AppViews use collection NSIDs to filter and process specific types of records from the firehose of network activity. This allows them to build specialized indices and provide efficient query capabilities for particular content types. For example, an AppView might specifically index the app.bsky.feed.post and app.bsky.feed.like collections to build a timeline view, while ignoring other collections that aren't relevant to that particular feature.
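Collection-based filtering of this kind can be sketched as follows. The event shape is a simplifying assumption made for illustration: each event is a plain dictionary with a "path" entry of the form "collection/rkey", which is not the actual firehose wire format. The names `WANTED_COLLECTIONS` and `filter_by_collection` are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Set

# Collections a hypothetical timeline indexer cares about.
WANTED_COLLECTIONS = {"app.bsky.feed.post", "app.bsky.feed.like"}


def filter_by_collection(
    events: Iterable[Dict[str, str]], wanted: Set[str]
) -> Iterator[Dict[str, str]]:
    """Yield only the events whose record lives in one of the wanted collections."""
    for event in events:
        # The collection NSID is everything before the first "/" in the path.
        collection, _, _rkey = event["path"].partition("/")
        if collection in wanted:
            yield event


if __name__ == "__main__":
    sample = [
        {"repo": "did:plc:abcdef123456", "path": "app.bsky.feed.post/3jui7kdo2ck"},
        {"repo": "did:plc:abcdef123456", "path": "app.bsky.graph.follow/3jui7kaaaaa"},
        {"repo": "did:plc:abcdef123456", "path": "app.bsky.feed.like/3jui7kbbbbb"},
    ]
    for event in filter_by_collection(sample, WANTED_COLLECTIONS):
        print(event["path"])
```

Because the check is a set lookup on the NSID, an indexer can discard irrelevant events without decoding the record body at all.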
Data Portability
The collection-based organization of repositories facilitates data portability within the AT Protocol. When users migrate between Personal Data Servers (PDSes), their entire repository, including all collections and records, can be exported and imported as a unit, preserving the organizational structure and relationships between records.
Security and Access Control
Collections provide a natural boundary for access control policies. While most collections in the current AT Protocol implementation contain public records, the architecture supports future extensions for collection-level privacy settings. For example, future implementations might include private or group-restricted collections that are only visible to specified users or groups, enabling private messaging and selective sharing features.
Repository Implementation
In the underlying data repository implementation, collections are represented as paths within the Merkle Search Tree (MST) data structure. The MST organizes records hierarchically, with collection NSIDs forming part of the path for each record.
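One way to see why putting the collection NSID at the front of each path is useful: if paths take the form "collection/rkey" and are kept in sorted order, all records of one collection sit next to each other and can be found with a prefix range scan. The sketch below uses a plain sorted list as a stand-in for the tree, which is an assumption made for illustration and not how an MST is actually stored. The helper names `record_path` and `paths_in_collection` are hypothetical.

```python
from bisect import bisect_left
from typing import List


def record_path(collection: str, rkey: str) -> str:
    """Build the repository path for a record: '<collection NSID>/<record key>'."""
    return f"{collection}/{rkey}"


def paths_in_collection(sorted_paths: List[str], collection: str) -> List[str]:
    """Return every path in one collection from a lexicographically sorted list.

    All paths in the collection share the prefix '<collection>/'. The character
    after '/' in ASCII is '0', so '<collection>0' is an exclusive upper bound.
    """
    start = bisect_left(sorted_paths, collection + "/")
    end = bisect_left(sorted_paths, collection + "0")
    return sorted_paths[start:end]


if __name__ == "__main__":
    paths = sorted([
        record_path("app.bsky.feed.post", "3jui7kdo2ck"),
        record_path("app.bsky.feed.like", "3jui7kaaaaa"),
        record_path("app.bsky.graph.follow", "3jui7kbbbbb"),
        record_path("app.bsky.feed.post", "3jui7kccccc"),
    ])
    print(paths_in_collection(paths, "app.bsky.feed.post"))
```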
When a PDS receives a request to create or update a record, it validates that the record's $type field matches the collection NSID where it's being stored, ensuring type consistency within collections and preventing misplaced records.
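The type check described above can be sketched in a few lines. This is a minimal illustration of the comparison only, assuming the record is available as a dictionary; the function name `validate_record_type` is hypothetical and does not correspond to any particular PDS implementation, which would also validate the record against its lexicon schema.

```python
from typing import Any, Dict


def validate_record_type(collection: str, record: Dict[str, Any]) -> None:
    """Raise ValueError unless the record's $type equals the target collection NSID."""
    record_type = record.get("$type")
    if record_type is None:
        raise ValueError("record has no $type field")
    if record_type != collection:
        raise ValueError(
            f"record of type {record_type!r} cannot be stored in collection {collection!r}"
        )


if __name__ == "__main__":
    post = {"$type": "app.bsky.feed.post", "text": "hello"}
    validate_record_type("app.bsky.feed.post", post)  # accepted, returns None
    try:
        validate_record_type("app.bsky.feed.like", post)
    except ValueError as err:
        print("rejected:", err)
```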