Request types: User register, user login, user query, user info update.
Assume DAU is 100 million.
User register, user login, and user info update altogether: 0.5/user/day, QPS ≈ 100M × 0.5 / 86,400 ≈ 580 (a few hundred)
User query: 100/user/day, QPS~100,000
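The estimates above come from dividing the total daily request volume by the number of seconds in a day. A minimal sketch of that arithmetic follows; the function and variable names are illustrative, and the result is an average, so real peak traffic would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(dau: int, requests_per_user_per_day: float) -> float:
    """Average requests per second if traffic were spread evenly over a day."""
    return dau * requests_per_user_per_day / SECONDS_PER_DAY


DAU = 100_000_000

write_qps = average_qps(DAU, 0.5)  # register + login + info update
read_qps = average_qps(DAU, 100)   # user query

print(f"register/login/update QPS: {write_qps:,.0f}")  # a few hundred
print(f"user query QPS:            {read_qps:,.0f}")   # on the order of 100,000
```

The gap of more than two orders of magnitude between the two numbers is what makes this a read-heavy system.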
Which implementation is better?
A, because B, C, and D can easily leave the cache and the database mismatched if one of the two operations fails. Case A usually does not produce dirty data, unless setUser() is invoked after the marked place in getUser(), in which case an older version of the user is stored in the cache.
To bound such inconsistencies, which can occur anywhere in a concurrent system, we should set a cache timeout (TTL). A stale entry then expires on its own, so the cache and the database eventually agree again.
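Options A to D are not reproduced here, so the following is only a sketch of one common arrangement: invalidate the cache entry on write, repopulate it on read, and give every entry a TTL. It is not necessarily identical to option A. `TTLCache`, `get_user`, `set_user`, the 60-second TTL, and the in-memory `database` dict are all assumptions made for illustration.

```python
import time
from typing import Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._items: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str, now: Optional[float] = None) -> Optional[dict]:
        now = time.monotonic() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if now >= expires_at:  # entry is too old: drop it and report a miss
            del self._items[key]
            return None
        return value

    def set(self, key: str, value: dict, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._items[key] = (now + self.ttl_seconds, value)

    def delete(self, key: str) -> None:
        self._items.pop(key, None)


database: Dict[int, dict] = {}    # stand-in for the real user table
cache = TTLCache(ttl_seconds=60)  # assumed TTL


def get_user(user_id: int, now: Optional[float] = None) -> Optional[dict]:
    key = f"user:{user_id}"
    user = cache.get(key, now)
    if user is not None:
        return user
    user = database.get(user_id)
    # If set_user() runs at this point, the value read above is already stale
    # but still gets cached below. The TTL limits how long it can survive.
    if user is not None:
        cache.set(key, user, now)
    return user


def set_user(user_id: int, user: dict) -> None:
    cache.delete(f"user:{user_id}")  # invalidate first
    database[user_id] = user         # then write the source of truth
```

If a stale entry does slip into the cache, readers see it for at most `ttl_seconds`; after that the next `get_user` call misses and reloads from the database.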
After a user logs in:
- The server creates a record in the session table (session_key, user_id, expire_time)
- The server returns the session_key as a cookie
- The user's browser stores the session_key
- Every request the user sends carries all cookies from this site (so don’t set too many cookies)
- If the server finds that the session_key in the cookie is valid (it exists and has not expired), it treats the user as logged in
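The steps above can be sketched as follows. This is a minimal illustration, not a production design: the dict stands in for the session table, and `create_session`, `user_for_session`, and the 7-day lifetime are assumptions that do not come from the notes.

```python
import secrets
import time
from typing import Dict, Optional, Tuple

SESSION_LIFETIME_SECONDS = 7 * 24 * 3600  # assumed lifetime

# session_key -> (user_id, expire_time); stand-in for the session table
session_table: Dict[str, Tuple[int, float]] = {}


def create_session(user_id: int, now: Optional[float] = None) -> str:
    """Called after a successful login; the result is sent back as a cookie."""
    now = time.time() if now is None else now
    session_key = secrets.token_urlsafe(32)  # random, hard-to-guess key
    session_table[session_key] = (user_id, now + SESSION_LIFETIME_SECONDS)
    return session_key


def user_for_session(session_key: str, now: Optional[float] = None) -> Optional[int]:
    """Called on every request; returns the user_id or None if not logged in."""
    now = time.time() if now is None else now
    row = session_table.get(session_key)
    if row is None:
        return None
    user_id, expire_time = row
    if now >= expire_time:  # expired: remove the row and reject
        del session_table[session_key]
        return None
    return user_id
```

The browser only ever holds the opaque session_key; the mapping to user_id and the expiry check stay on the server.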
Where should we store the sessions? In the cache? In the database?
Both. If sessions lived only in the cache and the cache were lost, a great number of users would be logged out and would try to log in again at the same time, causing a sudden spike in login traffic.
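A rough sketch of keeping sessions in both places: write to the database and the cache, read from the cache first, and fall back to the database on a miss. The two dicts and the function names are hypothetical stand-ins; expiry handling is left out to keep the example short.

```python
from typing import Dict, Optional

session_cache: Dict[str, int] = {}     # fast, but may be wiped at any time
session_database: Dict[str, int] = {}  # durable copy


def save_session(session_key: str, user_id: int) -> None:
    session_database[session_key] = user_id  # durable write first
    session_cache[session_key] = user_id


def lookup_session(session_key: str) -> Optional[int]:
    user_id = session_cache.get(session_key)
    if user_id is not None:
        return user_id
    # Cache miss (for example after a cache restart): fall back to the database.
    user_id = session_database.get(session_key)
    if user_id is not None:
        session_cache[session_key] = user_id  # refill the cache
    return user_id


save_session("abc", 7)
session_cache.clear()               # simulate losing the whole cache
print(lookup_session("abc"))        # 7, served from the database
```

With this arrangement, losing the cache costs some extra database reads but does not log anyone out.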
One-way or two-way. Usually stored in a NoSQL database, since the structure is simple.
- SQL: ~1k QPS
- NoSQL: ~10k QPS
- In-memory NoSQL (Redis / Memcached): ~100k QPS
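These rough per-machine figures can be turned into a back-of-the-envelope machine count. The sketch below assumes a target of 100,000 QPS (the user query estimate) and ignores replication, headroom for peaks, and data size; the names are illustrative.

```python
import math

# Rough per-machine throughput figures from the list above
QPS_PER_MACHINE = {
    "SQL": 1_000,
    "NoSQL": 10_000,
    "In-memory NoSQL": 100_000,
}


def machines_needed(target_qps: int, qps_per_machine: int) -> int:
    """Smallest number of machines whose combined throughput covers target_qps."""
    return math.ceil(target_qps / qps_per_machine)


query_qps = 100_000
for store, capacity in QPS_PER_MACHINE.items():
    print(f"{store}: {machines_needed(query_qps, capacity)}")
# SQL: 100
# NoSQL: 10
# In-memory NoSQL: 1
```

This is why a read-heavy query path is a natural candidate for an in-memory cache in front of the database.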