Tracking inbox state

Today I worked on the logic for tracking the state of posts in a user’s inbox. I’ve held off on jumping into implementation too quickly, to avoid painting myself into a corner and having to rework data architecture decisions I made prematurely.

It’s worth mentioning a few articles that are helping guide my thinking:

→ Lee Byron’s post on architecting a Facebook-like activity feed
→ The definition of event sourcing and various related articles floating around the internet

Some questions I’ve been pondering:

→ How should prioritization work?
→ Should posts ever automatically be dismissed when read?
→ How should I represent the “reason” a post is in the inbox?
→ Should I use fanout on read or write to assemble the inbox?
→ How will Level prevent inbox bloat for the user?

I’ll admit, I’ve felt a bit paralyzed by the scope of all these decisions over the last few weeks. Rather than let the paralysis continue, I decided to start nibbling away at the problem over the weekend and into today.

So far, I have:

→ Created a log table to track changes in the relationship between a post and a user (marked as unread, marked as read, dismissed from the inbox, subscribed, unsubscribed, etc.)
→ Created a table to track views of posts
→ Added a field to the post-user join table called inbox_state to hold the current state: read, unread, dismissed, or “excluded” (similar to “dismissed,” but means the post was never in the inbox)
→ Refactored the “create post” and “create reply” operations and added a bunch of post-create tasks: subscribe any @-mentioned users, mark the post as “unread” for any subscribed users, insert log entries, and propagate pubsub events for the front-end to intercept
→ Added filtering by inbox state on the posts GraphQL collection
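
To make the moving parts above concrete, here is a minimal in-memory sketch of the write-time bookkeeping. It is not Level’s actual code: the `Store` class, the `create_reply` function, the event names, and the choice to skip the reply’s author when marking posts unread are all assumptions made for illustration, and the pubsub step is left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class InboxState(Enum):
    UNREAD = "unread"
    READ = "read"
    DISMISSED = "dismissed"
    EXCLUDED = "excluded"  # like dismissed, but the post was never in the inbox


@dataclass
class Store:
    # (post_id, user_id) -> current state; stands in for the post-user join table.
    # Assumption: a row existing here means the user is subscribed to the post.
    post_users: dict = field(default_factory=dict)
    # Append-only (post_id, user_id, event) tuples; stands in for the log table.
    log: list = field(default_factory=list)

    def subscribe(self, post_id, user_id):
        # New subscribers start as "excluded" until something puts the post in their inbox.
        if (post_id, user_id) not in self.post_users:
            self.post_users[(post_id, user_id)] = InboxState.EXCLUDED
            self.log.append((post_id, user_id, "subscribed"))

    def set_state(self, post_id, user_id, state):
        self.post_users[(post_id, user_id)] = state
        self.log.append((post_id, user_id, f"marked_as_{state.value}"))

    def posts_in_state(self, user_id, state):
        # Rough equivalent of filtering the posts collection by inbox state.
        return sorted(
            p for (p, u), s in self.post_users.items() if u == user_id and s is state
        )


def create_reply(store, post_id, author_id, mentioned_user_ids):
    """Run the post-create tasks for a reply; return how many state rows were updated."""
    for user_id in mentioned_user_ids:
        store.subscribe(post_id, user_id)
    store.subscribe(post_id, author_id)

    touched = 0
    for (p, u) in list(store.post_users):
        # Assumption: the author does not get their own reply marked unread.
        if p == post_id and u != author_id:
            store.set_state(p, u, InboxState.UNREAD)
            touched += 1
    return touched
```

Calling `create_reply(store, 1, "alice", ["bob", "carol"])` on an empty store subscribes all three users, marks the post unread for bob and carol, and appends one log entry per change, so the amount of work scales with the number of subscribers.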

My primary concern is the sheer amount of work that happens during the create operation. For example, if a post has hundreds of people subscribed to it, then with my current implementation, hundreds of database records would be inserted or updated after every single reply.

This is a challenge of scale that I don’t expect to manifest in the near term (at least not while I’m testing with early users and validating my product assumptions). For now, updating state at write time makes fetching the correct posts at read time very performant, without having to join a bazillion tables together. Down the line, some or all of this fan-out operation may need to shift to read time or to periodic background refresh jobs.
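
As a rough illustration of that trade-off, the sketch below contrasts the write-time cost with one possible read-time alternative: deriving “unread” by comparing each subscribed post’s latest activity against the user’s last recorded view. The function names and the plain-dict data shapes are hypothetical, and this only covers read versus unread; dismissed and excluded posts would still need stored state.

```python
def rows_touched_at_write_time(subscriber_count, reply_count):
    # Fan-out on write: every reply updates at least one state row per subscriber
    # (log entries would add more on top of this).
    return subscriber_count * reply_count


def unread_posts_at_read_time(user_id, subscriptions, last_activity, last_viewed):
    """Fan-out on read: compute the unread list when the inbox is fetched.

    subscriptions: set of (post_id, user_id) pairs
    last_activity: post_id -> timestamp of the newest post or reply
    last_viewed:   (post_id, user_id) -> timestamp of the user's latest view
    """
    unread = []
    for post_id, subscriber in subscriptions:
        if subscriber != user_id:
            continue
        seen_at = last_viewed.get((post_id, user_id))
        # Unread if never viewed, or if there is activity newer than the last view.
        if seen_at is None or seen_at < last_activity[post_id]:
            unread.append(post_id)
    return sorted(unread)
```

With 300 subscribers and 50 replies, `rows_touched_at_write_time(300, 50)` comes to 15,000 state updates, whereas the read-time version writes nothing on reply and instead pays with a comparison per subscribed post on every inbox fetch.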

There’s much more on this topic to come in future posts!

 