Google News is a news aggregation product from Google that collects news articles from many publishers and organizes them by topic, source, location, and user interest. Instead of writing its own news, Google News crawls and indexes articles from news websites, then ranks and groups related stories so users can quickly see what is happening across different sources.
Product Design Requirements
Functional Requirements
- Users can view a unified feed of news articles from thousands of publishers worldwide.
- Users can keep scrolling to load more articles.
- Users can click an article and open the full story on the publisher’s website.
Non-Functional Requirements
- The system should favor high availability over strong consistency; eventual consistency is acceptable.
- The system should scale to support around 100M daily active users, with traffic spikes up to 500M users.
- The system should serve the news feed with low latency, targeting under 200 ms for feed load time.
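To get a feel for what these numbers mean for capacity, here is a rough back-of-envelope estimate. It rests on two assumptions that are not part of the requirements: each user loads the feed about 5 times per day, and the 500M spike figure is read as a day with 500M active users. Peak traffic within a day would be higher than these daily averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Assumption (not from the requirements): feed loads per user per day.
FEED_LOADS_PER_USER = 5


def average_qps(daily_users: int, loads_per_user: int = FEED_LOADS_PER_USER) -> float:
    """Average feed requests per second over a full day."""
    return daily_users * loads_per_user / SECONDS_PER_DAY


baseline_qps = average_qps(100_000_000)  # normal day: 100M daily active users
spike_qps = average_qps(500_000_000)     # spike day: 500M users (assumed reading)

print(f"baseline: ~{baseline_qps:,.0f} feed requests/s")
print(f"spike:    ~{spike_qps:,.0f} feed requests/s")
```

Under these assumptions the feed endpoint averages a few thousand requests per second on a normal day and roughly five times that during a spike, which is why a read-heavy, cache-friendly design fits the availability and latency targets.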
Design Setup
Data Model
- Article: The core content object in the system. It includes fields such as articleId, title, summary, thumbnailUrl, publishedAt, publisherId, region, and mediaUrls.
- Publisher: A news source that provides articles to the platform. It includes fields such as publisherId, name, websiteUrl, feedUrl, and region.
- User: A reader of the news feed. The user entity can include userId and region, where region may be explicitly set or inferred from signals like IP address. Even for anonymous users, we can still track basic context needed to personalize or localize the feed.
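The three entities above could be sketched as simple record types. This is only an illustration: the field names follow the ones listed, converted to Python naming, while the concrete types, the optional user_id for anonymous readers, and the region_inferred flag are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class Publisher:
    publisher_id: str
    name: str
    website_url: str
    feed_url: str  # where the publisher's articles are fetched from
    region: str


@dataclass(frozen=True)
class Article:
    article_id: str
    title: str
    summary: str
    thumbnail_url: str
    published_at: datetime
    publisher_id: str  # references Publisher.publisher_id
    region: str
    media_urls: Tuple[str, ...] = ()


@dataclass(frozen=True)
class User:
    # Assumption: anonymous readers have no user_id but still carry a region.
    user_id: Optional[str]
    region: str
    # Assumption: marks a region inferred from signals such as IP address.
    region_inferred: bool = False


article = Article(
    article_id="a1",
    title="Example headline",
    summary="Short summary shown in the feed.",
    thumbnail_url="https://example.com/thumb.jpg",
    published_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    publisher_id="p1",
    region="US",
)
anonymous_reader = User(user_id=None, region="US", region_inferred=True)
```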
API Design
We just need a simple GET endpoint to retrieve an aggregated feed of news articles:
GET /feed?pageSize={size}&offset={offset}
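As a minimal sketch of how the pageSize and offset parameters could behave, the function below pages through an in-memory list of article IDs. The names get_feed_page, FeedPage, and MAX_PAGE_SIZE, the default page size of 20, and the next_offset field in the response are all hypothetical; a real service would read from a precomputed feed store rather than a Python list.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

MAX_PAGE_SIZE = 100  # assumed server-side cap on pageSize


@dataclass
class FeedPage:
    items: List[str]            # article IDs for this page
    next_offset: Optional[int]  # offset for the next request, None at the end


def get_feed_page(
    ranked_article_ids: Sequence[str], page_size: int = 20, offset: int = 0
) -> FeedPage:
    """Return one page of the feed using offset-based pagination."""
    if page_size < 1 or offset < 0:
        raise ValueError("pageSize must be >= 1 and offset must be >= 0")
    page_size = min(page_size, MAX_PAGE_SIZE)

    items = list(ranked_article_ids[offset : offset + page_size])
    end = offset + len(items)
    # Only advertise a next offset when more articles remain.
    next_offset = end if end < len(ranked_article_ids) else None
    return FeedPage(items=items, next_offset=next_offset)


# Infinite scroll: the client keeps requesting with the returned next_offset.
feed = [f"article-{i}" for i in range(5)]
first = get_feed_page(feed, page_size=2, offset=0)
second = get_feed_page(feed, page_size=2, offset=first.next_offset)
```

Offset pagination is easy to reason about, but pages can shift when new articles arrive between requests, so a client may see duplicates or miss items while scrolling.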