Personal Data Server (PDS)

From ATProto Wiki

Revision as of 15:00, 13 March 2025

A Personal Data Server (PDS) is the main entry point and digital home of a user within the AT Protocol. It stores the user's data repository and blobs, manages user identity, and provides the APIs necessary for data queries, cryptographic signing, and other interactions with the broader network. Each PDS also provides an update stream for the repositories it hosts; relays crawl these streams and rebroadcast new records in their firehoses.

Core Functions

PDSes perform several user-centric functions within the AT Protocol:

Repository Hosting

The primary function of a PDS is to store all user-created content in a Merkle Search Tree (MST) structure. This ensures the cryptographic integrity of the user's data while providing access to public records via XRPC APIs. The PDS also streams real-time updates to relays via WebSockets, allowing the network to efficiently aggregate and broadcast updates to user repositories.
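As a minimal sketch of what record access looks like from a client's side, the snippet below builds request URLs in the `/xrpc/<method>` form. The host `pds.example.com`, the DID, and the record key are placeholders, and the helper name `xrpc_url` is hypothetical; only the method names `com.atproto.repo.getRecord` and `com.atproto.sync.subscribeRepos` come from the AT Protocol lexicons.

```python
from urllib.parse import urlencode


def xrpc_url(pds_host: str, nsid: str, **params: str) -> str:
    """Build an XRPC URL of the form https://<host>/xrpc/<nsid>?<params>.

    Hypothetical helper for illustration; it only formats a URL and
    performs no network request.
    """
    base = f"https://{pds_host}/xrpc/{nsid}"
    query = urlencode(params)
    return f"{base}?{query}" if query else base


# Read one public record from a repository (all values are placeholders).
record_url = xrpc_url(
    "pds.example.com",
    "com.atproto.repo.getRecord",
    repo="did:plc:exampleuser",
    collection="app.bsky.feed.post",
    rkey="3kexamplekey",
)

# The repository update stream uses the same path scheme over WebSocket.
stream_url = "wss://pds.example.com/xrpc/com.atproto.sync.subscribeRepos"

print(record_url)
print(stream_url)
```

A real client would pass `record_url` to an HTTP library and open `stream_url` with a WebSocket client; both steps are left out here to keep the sketch free of third-party dependencies.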

Identity Management

Each PDS maintains the connection between a user's handle and their Decentralized Identifier (DID), essentially anchoring their digital identity. It handles authentication and authorization via OAuth, manages account lifecycle events (creation, deactivation, deletion), and securely stores private user preferences and settings that shouldn't be publicly visible.
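The handle-to-DID link can be illustrated by listing where a resolver would look. The sketch below assumes the two handle resolution methods (a DNS TXT record under `_atproto.` and an HTTPS well-known path) and the two DID methods used in atproto (`did:plc` and `did:web`). The function names are hypothetical, the handle and DIDs are placeholders, and no lookup is actually performed.

```python
def handle_resolution_targets(handle: str) -> dict[str, str]:
    """Return the two places a resolver can check to map a handle to a DID."""
    return {
        # DNS TXT record name; its value has the form "did=<did>".
        "dns_txt": f"_atproto.{handle}",
        # HTTPS fallback; the response body is the DID as plain text.
        "https": f"https://{handle}/.well-known/atproto-did",
    }


def did_document_url(did: str, plc_directory: str = "https://plc.directory") -> str:
    """Return the URL where the DID document for `did` can be fetched.

    Assumption: did:web identifiers here are bare hostnames, with no
    path components or ports.
    """
    if did.startswith("did:plc:"):
        return f"{plc_directory}/{did}"
    if did.startswith("did:web:"):
        hostname = did[len("did:web:"):]
        return f"https://{hostname}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")


print(handle_resolution_targets("alice.example.com"))
print(did_document_url("did:plc:exampleuser"))
print(did_document_url("did:web:alice.example.com"))
```

A full resolver would also check the link in the other direction, confirming that the DID document lists the handle, before treating the pair as valid.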

Media Storage

Beyond text-based content, PDSes host blobs: images, videos, and other media files shared by users. The PDS manages the complete lifecycle of these files, from upload through reference in a record to eventual deletion. Blobs are content-addressed: each one is identified by a content identifier (CID) derived from a hash of its bytes, so a client can verify the integrity of whatever it downloads.
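Content addressing means the identifier is computed from the bytes themselves, so anyone holding the file can recompute it. The following sketch derives a CIDv1 string using only the standard library, assuming the `raw` codec, a SHA-256 multihash, and base32 encoding. The function names `blob_cid` and `verify_blob` are hypothetical, and a production system would normally use a dedicated multiformats library instead.

```python
import base64
import hashlib


def blob_cid(data: bytes) -> str:
    """Compute a CIDv1 string (raw codec, SHA-256, base32) for `data`."""
    digest = hashlib.sha256(data).digest()
    # CID bytes: version 1, raw codec (0x55), sha2-256 (0x12), 32-byte length.
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    # Multibase base32: lowercase, no padding, prefixed with "b".
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


def verify_blob(data: bytes, expected_cid: str) -> bool:
    """Check that downloaded bytes match the CID they were requested by."""
    return blob_cid(data) == expected_cid


image_bytes = b"placeholder image bytes"
cid = blob_cid(image_bytes)
print(cid)
print(verify_blob(image_bytes, cid))          # True
print(verify_blob(image_bytes + b"x", cid))   # False
```

Because the identifier depends only on the content, changing a single byte yields a different CID, which is what lets a client detect a corrupted or substituted file.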

Network Communication

PDSes act as intermediaries between users and the broader network, relaying actions to appropriate services and proxying requests when needed. They implement the federation protocols that allow users on different PDSes to interact seamlessly with one another.

Architecture

PDSes are designed to be lightweight and modular. A single PDS can host anywhere from one to hundreds of thousands of user accounts, depending on its resources and configuration. This design lets users self-host their own PDS on modest hardware (even a Raspberry Pi), lets service providers host PDSes for many users efficiently, and lets users migrate between PDSes without losing their identity or data.

Hosting Models

The AT Protocol supports various PDS hosting models. Users can run their own PDS on personal hardware or a virtual private server, giving them complete control over their data and server configuration. Most users use a PDS operated by a service provider, which may offer free or paid tiers with different features and capabilities.

PDS Entryway

For large-scale PDS hosting operations, an "Entryway" service provides centralized account distribution, session management, request routing, and OAuth authorization. Users of such services don't need to know which specific PDS hosts their account; they simply interact with the entryway domain. For example, Bluesky's PDS Entryway service (at bsky.social) distributes users across multiple physical PDSes while presenting a single logical service to users and applications.

When a user creates an account through an entryway like bsky.social, the service assigns them to a specific PDS behind the scenes. The entryway then handles routing requests to the appropriate server and manages authentication across the entire service. This architecture allows for efficient scaling while maintaining a simple user experience.
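The assign-then-route behavior described above can be sketched with an in-memory model. This is an illustration only: the `Entryway` class, its methods, the least-loaded assignment policy, and the host names are all assumptions made for the example, not a description of how Bluesky's entryway is implemented.

```python
class Entryway:
    """Toy model of an entryway: assigns accounts to PDS hosts and routes to them."""

    def __init__(self, pds_hosts: list[str]) -> None:
        if not pds_hosts:
            raise ValueError("at least one PDS host is required")
        self._accounts_per_host = {host: 0 for host in pds_hosts}
        self._host_by_did: dict[str, str] = {}

    def create_account(self, did: str) -> str:
        """Assign a new account to the host with the fewest accounts (assumed policy)."""
        if did in self._host_by_did:
            raise ValueError(f"account already exists: {did}")
        host = min(self._accounts_per_host, key=self._accounts_per_host.get)
        self._accounts_per_host[host] += 1
        self._host_by_did[did] = host
        return host

    def route(self, did: str) -> str:
        """Return the PDS host that should receive requests for this account."""
        return self._host_by_did[did]


entryway = Entryway(["pds-a.example.com", "pds-b.example.com"])
for did in ["did:plc:user1", "did:plc:user2", "did:plc:user3"]:
    entryway.create_account(did)

print(entryway.route("did:plc:user2"))
```

Clients only ever address the entryway; the mapping from account to physical host stays internal, which is what allows the operator to add hosts without users noticing.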

Data Portability

One of the key advantages of the AT Protocol is that users aren't locked into a specific PDS. If a user wants to change providers, they can export their repository and blobs from their current PDS, import this data to a new PDS, update their DID Document to point to the new PDS, and continue using the network with all their content and social connections intact. This portability ensures that users maintain ownership of their data and social graph regardless of which PDS they use.

Technical Requirements

Running a PDS requires HTTP and WebSocket server capabilities, database storage for repositories and metadata, blob storage for media files, cryptographic signing capabilities, and network connectivity for federation. Resource requirements scale with the number of hosted users and their activity levels, but are generally modest compared to traditional social media platforms.

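Further Reading

* [https://github.com/bluesky-social/atproto/discussions/2350 What does a PDS implementation entail?]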