Google Docs is an online document editor from Google, similar to Microsoft Word but cloud-based. It lets users create, edit, share, and collaborate on documents in real time. Since files are saved in Google Drive, people can access them from different devices and work together through comments, suggestions, and live editing.
Product Design Requirements
Functional Requirements
- Users should be able to create new documents and store them in the system.
- Users should be able to invite or share a document with others so multiple people can edit the same document at the same time.
- When one user makes an edit, other users viewing the same document should see that change in real time.
- The system should also support collaborative presence features, such as showing who is currently viewing the document and where each user’s cursor is located.
Non-Functional Requirements
- Documents should be eventually consistent, so all users eventually converge to the same document state.
- Edits should propagate with low latency, ideally becoming visible to other users within 100 ms.
- The system should scale to millions of concurrent users and billions of documents.
- Each document can have up to 100 concurrent editors.
- Documents should be durable, so accepted edits are not lost after server failures.
- The system should remain available even if individual servers restart or crash.
Design Setup
Data Model
- Editor: A user who is actively viewing or modifying a document.
- Document: A shared text file that can be edited by one or more users.
- Edit: A change applied to the document, such as inserting, deleting, or updating text.
- Cursor: The editor’s current position in the document, also used to show user presence.
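To make the four entities concrete, here is a minimal TypeScript sketch. The type names and every field name (docId, version, kind, and so on) are assumptions for illustration, not part of the design above, and the document is modeled as a plain string for simplicity.

```typescript
// Hypothetical shapes for the entities above; all field names are assumptions.
interface EditorRecord {
  userId: string;
  displayName: string;
}

interface DocumentRecord {
  docId: string;
  title: string;
  content: string; // plain text for this sketch
  version: number; // incremented once per applied edit
}

// An edit inserts text, deletes the range [start, end), or replaces that range.
type EditRecord =
  | { kind: "insert"; position: number; text: string }
  | { kind: "delete"; start: number; end: number }
  | { kind: "update"; start: number; end: number; text: string };

interface CursorRecord {
  userId: string;
  docId: string;
  position: number; // character offset into the document content
}

// Apply one edit and return a new document value; the input is not mutated.
function applyEdit(doc: DocumentRecord, edit: EditRecord): DocumentRecord {
  const text = doc.content;
  let next: string;
  if (edit.kind === "insert") {
    next = text.slice(0, edit.position) + edit.text + text.slice(edit.position);
  } else if (edit.kind === "delete") {
    next = text.slice(0, edit.start) + text.slice(edit.end);
  } else {
    next = text.slice(0, edit.start) + edit.text + text.slice(edit.end);
  }
  return { ...doc, content: next, version: doc.version + 1 };
}
```

Keeping a version counter on the document is one simple way to order edits, which becomes useful once several editors send changes at the same time.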
API Design
After defining the requirements, we can move to the API design. For a Google Docs-like system, we can separate the APIs into two groups: document management APIs and real-time collaboration APIs.
Document management can be handled with normal REST APIs because these operations are request-response based. Users need to create documents, open documents, rename them, and update sharing permissions.
POST /docs { title: string } -> { docId: string }
We may also have APIs like GET /docs/{docId} to load document metadata and PATCH /docs/{docId} to update the title or permissions.
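As a rough illustration of these three endpoints, the sketch below routes requests to an in-memory store without any HTTP framework. The class name DocMetadataApi, the handle method, the status codes, the doc-N identifier scheme, and the shape of the permissions field are all assumptions made for the example.

```typescript
// Assumed metadata shape; only title and docId come from the API above.
interface DocMeta {
  docId: string;
  title: string;
  permissions: Record<string, "viewer" | "editor">; // userId -> role (assumed)
}

interface ApiResponse {
  status: number;
  body: unknown;
}

class DocMetadataApi {
  private docs = new Map<string, DocMeta>();
  private nextId = 1;

  handle(method: string, path: string, body: Partial<DocMeta> = {}): ApiResponse {
    // POST /docs { title } -> { docId }
    if (method === "POST" && path === "/docs") {
      if (typeof body.title !== "string") {
        return { status: 400, body: { error: "title is required" } };
      }
      const docId = `doc-${this.nextId++}`;
      this.docs.set(docId, { docId, title: body.title, permissions: {} });
      return { status: 201, body: { docId } };
    }

    // Everything else must target /docs/{docId}.
    const docId = /^\/docs\/([^/]+)$/.exec(path)?.[1];
    const meta = docId === undefined ? undefined : this.docs.get(docId);
    if (!meta) return { status: 404, body: { error: "document not found" } };

    if (method === "GET") return { status: 200, body: meta };

    if (method === "PATCH") {
      if (typeof body.title === "string") meta.title = body.title;
      if (body.permissions) {
        meta.permissions = { ...meta.permissions, ...body.permissions };
      }
      return { status: 200, body: meta };
    }
    return { status: 405, body: { error: "method not allowed" } };
  }
}
```

A real service would put authentication and a durable store behind the same routes; the point here is only that each operation is a single request followed by a single response.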
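The real-time editing path is different. Once users are inside a document, they need to send edits and receive updates from other collaborators with low latency. REST is not a good fit here because it is strictly request-response: the server cannot push changes to clients, so each client would have to poll, which adds latency and wastes requests when nothing has changed. Instead, we can use a WebSocket connection for each active document session, which stays open and carries messages in both directions.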
WS /docs/{docId}
Over this connection, clients can send operations such as insert, delete, and cursor updates.
SEND { type: "insert", position: number, text: string, baseVersion: number }
SEND { type: "delete", start: number, end: number, baseVersion: number }
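These two messages map naturally onto a discriminated union. The sketch below shows one way a server might parse and validate an incoming frame before acting on it. The names ClientOp and parseClientOp and the specific validation rules (non-negative integers, start not greater than end) are assumptions.

```typescript
// Field names follow the SEND examples above; the validation rules are assumed.
type InsertOp = { type: "insert"; position: number; text: string; baseVersion: number };
type DeleteOp = { type: "delete"; start: number; end: number; baseVersion: number };
type ClientOp = InsertOp | DeleteOp;

const isNonNegInt = (v: unknown): v is number =>
  typeof v === "number" && Number.isInteger(v) && v >= 0;

// Returns a typed operation, or null if the frame is malformed.
function parseClientOp(raw: string): ClientOp | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const { type, position, text, start, end, baseVersion } =
    data as Record<string, unknown>;
  if (!isNonNegInt(baseVersion)) return null;

  if (type === "insert" && isNonNegInt(position) && typeof text === "string") {
    return { type: "insert", position, text, baseVersion };
  }
  if (type === "delete" && isNonNegInt(start) && isNonNegInt(end) && start <= end) {
    return { type: "delete", start, end, baseVersion };
  }
  return null; // unknown type or missing fields
}
```

Rejecting malformed frames at the edge keeps the rest of the editing path free of defensive checks.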
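The server validates each operation and uses baseVersion to detect edits that were made against an older version of the document. It resolves any resulting conflicts, assigns the operation an authoritative document version, and broadcasts the update to other connected users.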
RECV { type: "update", operation: object, serverVersion: number }
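The design above does not name a conflict-resolution algorithm, so the sketch below assumes a deliberately simplified, operational-transformation-style approach: the server keeps a log of accepted operations and shifts the positions of a stale operation past everything accepted since its baseVersion. DocSession, rebase, and mapPos are hypothetical names, and the document lives only in memory.

```typescript
// Message shapes follow the SEND/RECV examples; everything else is assumed.
type Op =
  | { type: "insert"; position: number; text: string; baseVersion: number }
  | { type: "delete"; start: number; end: number; baseVersion: number };
type Update = { type: "update"; operation: Op; serverVersion: number };
type Send = (update: Update) => void;

// Map a position from before `applied` ran to after it ran.
// shiftOnTie: whether a position exactly at an insert point moves right.
function mapPos(pos: number, applied: Op, shiftOnTie: boolean): number {
  if (applied.type === "insert") {
    const moves = applied.position < pos || (shiftOnTie && applied.position === pos);
    return moves ? pos + applied.text.length : pos;
  }
  if (pos <= applied.start) return pos;
  if (pos >= applied.end) return pos - (applied.end - applied.start);
  return applied.start; // the position fell inside the deleted range
}

// Rewrite `op` so it applies one version later, after `applied`.
function rebase(op: Op, applied: Op): Op {
  const baseVersion = op.baseVersion + 1;
  if (op.type === "insert") {
    return { ...op, position: mapPos(op.position, applied, true), baseVersion };
  }
  const start = mapPos(op.start, applied, true);
  // Simplification: text inserted strictly inside a deleted range is deleted too.
  const end = Math.max(start, mapPos(op.end, applied, false));
  return { ...op, start, end, baseVersion };
}

const span = (op: Op): [number, number] =>
  op.type === "insert" ? [op.position, op.position] : [op.start, op.end];

class DocSession {
  private text = "";
  private version = 0;
  private log: Op[] = []; // log[i] is the op that produced version i + 1
  private clients = new Map<string, Send>();

  join(clientId: string, send: Send): void {
    this.clients.set(clientId, send);
  }

  snapshot(): { text: string; version: number } {
    return { text: this.text, version: this.version };
  }

  // Returns the accepted update, or null if the operation is rejected.
  submit(clientId: string, incoming: Op): Update | null {
    const [lo, hi] = span(incoming);
    if (lo < 0 || hi < lo) return null;
    if (incoming.baseVersion < 0 || incoming.baseVersion > this.version) return null;

    // Shift the operation past everything accepted since its base version.
    const op = this.log.slice(incoming.baseVersion).reduce(rebase, incoming);
    if (span(op)[1] > this.text.length) return null;

    this.text =
      op.type === "insert"
        ? this.text.slice(0, op.position) + op.text + this.text.slice(op.position)
        : this.text.slice(0, op.start) + this.text.slice(op.end);
    this.version += 1;
    this.log.push(op);

    const update: Update = { type: "update", operation: op, serverVersion: this.version };
    this.clients.forEach((send, id) => {
      if (id !== clientId) send(update); // the sender already has its own edit
    });
    return update;
  }
}
```

For example, if two clients both insert at position 5 against version 1, the second operation to arrive is shifted right by the length of the first, so both insertions survive. A production system would also need to persist the log for durability and to transform operations on the client side, neither of which this sketch attempts.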
Cursor movement and presence can also be sent over the same WebSocket connection, but they usually do not need to be persisted because they are temporary.
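Because presence is temporary, one option is to keep it only in memory and let entries expire when a client goes quiet. The sketch below assumes a hypothetical PresenceTracker with a 30-second timeout; the names, the timeout value, and the explicit nowMs parameter (used here so the behavior is deterministic) are all illustrative choices.

```typescript
interface CursorState {
  userId: string;
  position: number;
  lastSeenMs: number;
}

// In-memory only: nothing here is written to durable storage.
class PresenceTracker {
  private cursors = new Map<string, CursorState>();
  private ttlMs: number;

  constructor(ttlMs: number = 30000) {
    this.ttlMs = ttlMs; // assumed timeout before an idle user is dropped
  }

  // Record a cursor update; a real server would pass Date.now() as nowMs.
  update(userId: string, position: number, nowMs: number): void {
    this.cursors.set(userId, { userId, position, lastSeenMs: nowMs });
  }

  // Called when a client disconnects cleanly.
  leave(userId: string): void {
    this.cursors.delete(userId);
  }

  // Return users seen within the timeout and forget the rest.
  active(nowMs: number): CursorState[] {
    const live: CursorState[] = [];
    this.cursors.forEach((cursor, userId) => {
      if (nowMs - cursor.lastSeenMs > this.ttlMs) this.cursors.delete(userId);
      else live.push(cursor);
    });
    return live;
  }
}
```

If the server holding this state restarts, clients simply resend their cursor positions after reconnecting, so losing it does not affect document durability.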
Overall, REST APIs are suitable for document metadata and permissions, while WebSockets are better for collaborative editing, cursor updates, and presence. This split keeps the API design simple while still supporting the real-time editing experience.