Hashnode Creates Scalable Feed Architecture on AWS with Step Functions, EventBridge and Redis

Hashnode created a scalable event-driven architecture (EDA) for composing feed data for thousands of users. The company used serverless services on AWS, including Lambda, Step Functions, and EventBridge, together with a Redis cache. The solution leverages the Distributed Map feature of Step Functions, which enables high-concurrency processing.

The company had previously implemented personalized user feeds but soon discovered two problems: page loads were slower, and composing feeds on the fly required expensive queries that risked destabilizing the database. Florian Fuchs, software engineer at Hashnode, describes the overall idea for optimizing feed calculations:

"To optimize page speed, we found that pre-calculating feeds for users is the best option. This means we don't have to calculate the feed every time a user visits our feed page. Instead, we can return the feed from the cache and make page loading times faster. A crucial enabler for this is using a cache. With the fast access a cache offers, we can directly load the feed from there to be presented for our users."

Engineers implemented the feed calculation logic in AWS Step Functions with two workflows. The first workflow uses three Lambda functions to prepare user data for the feed calculation. These functions extract relevant data from the database and store it in the Amazon ElastiCache (Redis) cache. The second workflow is responsible for the actual feed calculation. Depending on whether cached metadata is found for the user, the calculation either relies entirely on metadata from the Redis cache or has to extract the user's metadata from the database.
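The cache-or-database branch described above can be sketched as follows. This is only an illustration, not Hashnode's code: a plain dict stands in for the Redis cache, and `load_from_db`, `followed_tags`, and the toy filtering rule are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins: a dict plays the role of the Redis cache, and
# `load_from_db` plays the role of the expensive database query.
Metadata = Dict[str, List[str]]


def get_user_metadata(
    user_id: str,
    cache: Dict[str, Metadata],
    load_from_db: Callable[[str], Metadata],
) -> Metadata:
    """Return cached metadata if present; otherwise load it and cache it."""
    cached: Optional[Metadata] = cache.get(user_id)
    if cached is not None:
        return cached  # fast path: everything comes from the cache
    metadata = load_from_db(user_id)  # slow path: fall back to the database
    cache[user_id] = metadata
    return metadata


def compose_feed(metadata: Metadata, posts: List[Dict[str, str]]) -> List[str]:
    """Toy rule (assumed): keep posts whose tag the user follows, in input order."""
    followed = set(metadata.get("followed_tags", []))
    return [post["id"] for post in posts if post["tag"] in followed]
```

With this shape, only the first request for a user reaches the database; later requests are served from the cache until the entry is removed.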

In the new architecture, feed recalculation is triggered either by post creation or update events published to Amazon EventBridge, or periodically by EventBridge Scheduler.
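As a rough sketch of how a handler might tell the two triggers apart, the function below inspects the `detail-type` field of an EventBridge-style event envelope. The detail-type strings `post.created` and `post.updated` are invented for illustration, and treating `Scheduled Event` as the marker for a scheduled run is also an assumption; the article does not give the real event names.

```python
from typing import Any, Dict, Optional

# Assumed detail-type names; the real event names are not given in the article.
POST_EVENTS = {"post.created", "post.updated"}
SCHEDULED_EVENT = "Scheduled Event"  # assumed marker for a scheduled run


def recalculation_trigger(event: Dict[str, Any]) -> Optional[str]:
    """Classify an event as a recalculation trigger, or return None to ignore it."""
    detail_type = event.get("detail-type", "")
    if detail_type in POST_EVENTS:
        return "post-change"
    if detail_type == SCHEDULED_EVENT:
        return "scheduled"
    return None
```

A handler could start the feed workflow whenever the result is not `None` and drop every other event.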

The Hashnode team leveraged the Map state in Step Functions, which orchestrates parallel workloads. The Map state supports two modes, depending on the processing requirements. The default inline mode offers limited concurrency and only accepts a JSON array as input. The distributed mode is suitable for large-scale parallel workloads and can read its input from data sources stored in S3. In distributed mode, Step Functions can run up to 10,000 parallel child workflow executions.

Step Functions with Distributed Map State (Source: AWS Documentation)
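To make the distributed mode more concrete, here is a minimal sketch that builds an Amazon States Language definition of a Map state as a Python dictionary. The bucket, key, Lambda ARN, the `CalculateFeed` state name, and the concurrency value are placeholders chosen for illustration; the article does not show Hashnode's actual definition.

```python
import json
from typing import Any, Dict


def distributed_map_state(bucket: str, key: str, function_arn: str,
                          max_concurrency: int = 1000) -> Dict[str, Any]:
    """Build a Map state definition that runs in distributed mode."""
    return {
        "Type": "Map",
        "MaxConcurrency": max_concurrency,
        # Read the items to process from a JSON array stored in S3.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:getObject",
            "ReaderConfig": {"InputType": "JSON"},
            "Parameters": {"Bucket": bucket, "Key": key},
        },
        # Each item is handled by its own child workflow execution.
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "CalculateFeed",
            "States": {
                "CalculateFeed": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "Parameters": {"FunctionName": function_arn, "Payload.$": "$"},
                    "End": True,
                }
            },
        },
        "End": True,
    }


if __name__ == "__main__":
    # Placeholder values only; substitute real resources before deploying.
    state = distributed_map_state(
        "example-bucket", "users.json",
        "arn:aws:lambda:us-east-1:123456789012:function:calculate-feed",
    )
    print(json.dumps(state, indent=2))
```

Changing `Mode` to `INLINE` and removing the `ItemReader` would give the inline variant, which iterates over a JSON array passed in the state input.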

The solution employs two Step Functions workflows that use the Map state in distributed mode: one for users with cached metadata and one for users for whom no metadata was found. Developers report that, for now, a full recalculation of feeds for thousands of users takes only 26 seconds. The team also implemented periodic cache-purge logic so that old cached data is removed regularly.
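The article does not describe how the purge works. One way such a periodic job could look is sketched below, assuming each cached feed is stored with the time it was written. The cache layout and the `max_age_seconds` parameter are hypothetical, and a dict again stands in for Redis.

```python
import time
from typing import Dict, List, Optional, Tuple

# Hypothetical cache layout: user id -> (stored_at_epoch_seconds, feed).
FeedCache = Dict[str, Tuple[float, List[str]]]


def purge_stale_feeds(cache: FeedCache, max_age_seconds: float,
                      now: Optional[float] = None) -> List[str]:
    """Delete entries older than max_age_seconds; return the purged user ids."""
    current = time.time() if now is None else now
    stale = [
        user_id
        for user_id, (stored_at, _feed) in cache.items()
        if current - stored_at > max_age_seconds
    ]
    for user_id in stale:
        del cache[user_id]
    return sorted(stale)
```

With Redis itself, setting an expiry (TTL) on each key is a common alternative to an explicit purge job; which approach the team chose is not stated.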

About the Author

Rafal Gancarz
