{"id":145848,"date":"2026-03-31T14:30:30","date_gmt":"2026-03-31T14:30:30","guid":{"rendered":"https:\/\/mycryptomania.com\/?p=145848"},"modified":"2026-03-31T14:30:30","modified_gmt":"2026-03-31T14:30:30","slug":"designing-an-aave-v3-indexer-challenges-and-insights","status":"publish","type":"post","link":"https:\/\/mycryptomania.com\/?p=145848","title":{"rendered":"Designing an Aave V3 Indexer: Challenges and Insights"},"content":{"rendered":"<p>I spent the last few months building an <a href=\"https:\/\/github.com\/dlr-a\/aave-v3-tracker\">Aave V3 indexer<\/a> in Rust. It reads on-chain events from Ethereum Mainnet and writes protocol state into PostgreSQL: reserve configs, interest rate data, user supply\/borrow positions, and eMode categories. The stack is Rust with Tokio for async, Alloy for Ethereum interaction, and Diesel for PostgreSQL.<\/p>\n<p>It sounded straightforward at first: listen to events, decode them, write to a database. In practice, almost every assumption I started with turned out to be wrong or incomplete.<\/p>\n<h3>1. Large Numbers Need the Right\u00a0Type<\/h3>\n<p>Solidity uses integers for everything: 1.5 ETH is 1500000000000000000 wei on-chain. Aave&#8217;s internal math goes further: it uses RAY (10^27) as a fixed-point base, so an intermediate calculation like amount \u00d7 RAY, where the amount is already on the order of 10^18, produces a value around 10^45. Rust&#8217;s u128 maxes out at roughly 3.4 \u00d7 10^38, so it overflows. Aave itself uses uint256 in Solidity, and the Rust equivalent is Alloy&#8217;s\u00a0U256.<\/p>\n<p>That handles the math, but U256 isn\u2019t a storage type: PostgreSQL has no native 256-bit integer column. For values that stay large (balances, liquidity indices, supply caps), BigDecimal maps cleanly to PostgreSQL\u2019s NUMERIC and doesn\u2019t lose precision.<\/p>\n<h3>2. Self-Healing: What Happens When an Unknown Reserve Shows\u00a0Up<\/h3>\n<p>During backfill, the indexer sometimes received a ReserveDataUpdated event for a reserve that didn&#8217;t exist in the database yet. 
This can happen when the indexer has missed the ReserveInitialized event for that\u00a0reserve.<\/p>\n<p>The question was: what should the indexer do? Skipping the event is the wrong way to go, because an unknown reserve means something was clearly missed. Throwing an error and stopping lets a single edge case block the entire system, which makes no sense\u00a0either.<\/p>\n<p>I saw it as an opportunity to fix missing data, and I went with self-healing: if an event references a reserve that\u2019s not in the database, the indexer fetches that reserve\u2019s full config from the chain via RPC, inserts it, and then continues processing normally. This prevents a missing reserve from blocking the entire backfill.<\/p>\n<p>The relevant\u00a0code:<\/p>\n<p>if !exists {<br \/>            warn!(<br \/>                asset = %asset_str,<br \/>                block = block_number,<br \/>                &quot;Reserve not found in DB \u2014 fetching from RPC&quot;<br \/>            );<\/p>\n<p>            process_reserve(pool, &amp;provider, asset_addr, data_provider_addr, pool_addr, Some(block_number as u64))<br \/>                .await<br \/>                .wrap_err_with(|| {<br \/>                    format!(<br \/>                        &quot;Self-healing: failed to fetch reserve {} from RPC&quot;,<br \/>                        asset_str<br \/>                    )<br \/>                })?;<br \/>        }<\/p>\n<p>The same pattern applies to eMode categories: if an event references a category that doesn\u2019t exist locally, the indexer pulls the full category definition from the chain before continuing. The idea is simple: never stop, never lose data, and recover automatically when possible.<\/p>\n<h3>3. Reorg, WebSocket, and a Deliberate Tradeoff<\/h3>\n<p>Ethereum can reorganize its recent history. When that happens, events you indexed from the discarded blocks may no longer be valid, but they\u2019re already in your database. 
So, processing events in real time over WebSocket is not acceptable without reorg handling: a reorganization would leave stale data in the database with no way to detect or fix\u00a0it.<\/p>\n<p>Instead of implementing full reorg detection and rollback, which is a significant undertaking on its own, I made a deliberate tradeoff: disable WebSocket entirely and only write data through HTTP-based backfill with a 20-block confirmation delay. On Mainnet, 20 blocks is roughly 4 minutes at 12 seconds per block. At that depth, a reorganization is highly unlikely. Real-time data is sacrificed in exchange for a much lower risk of indexing blocks that later get discarded, and for my indexer, correctness matters more than\u00a0latency.<\/p>\n<h3>4. Multi-RPC Failover<\/h3>\n<p>RPC providers go down, hit rate limits, or return server errors, and when your indexer depends on a single provider, any of these stops everything.<\/p>\n<p>The indexer is configured with several providers and sticks with one until it starts failing. If a chunk fails and the error looks like a rate limit, timeout, or server error, the indexer rotates automatically to the next provider. No manual intervention is needed, and one failing provider no longer stops the indexer.<\/p>\n<h3>5. Cold Start: Subgraph Bootstrap + Event\u00a0Replay<\/h3>\n<p>When you start an indexer from scratch, you need historical state. There are two obvious approaches: replay every event since Aave V3\u2019s deployment, or pull the current state from an external source like The\u00a0Graph.<\/p>\n<p>Replaying from the beginning is complete but extremely slow: Aave V3 on Ethereum has millions of events. Relying entirely on a subgraph is fast but means trusting an external source for accuracy, and subgraphs can have their own indexing\u00a0issues.<\/p>\n<p>I combined both: seed user positions from the subgraph at a recent block height, then switch to event replay from that block forward. 
The subgraph gives you speed for the initial load, and event replay gives you accuracy going\u00a0forward.<\/p>\n<p>I chose this approach because replaying millions of blocks is resource-intensive and time-consuming. But more importantly, I wanted to make sure my event processing and calculations were actually correct before committing to a full historical sync. If there\u2019s an inaccuracy in how events are processed, waiting for millions of blocks to sync just to discover a bug at the end would be a waste of\u00a0time.<\/p>\n<p>Also, starting from a recent snapshot let me improve the design much faster. Instead of spending time waiting for a full sync, I could focus on the other parts of the system, which turned out to be much more instructive. It was a deliberate tradeoff for my own learning\u00a0process.<\/p>\n<p>The tradeoff is real, though: the quality of your starting state depends entirely on the external source. Any inaccuracy in the subgraph\u2019s data propagates into every subsequent position update. I noticed some inconsistencies in practice, which is why the project\u2019s README lists this as a known limitation.<\/p>\n<h3>6. Event Replay Safety and Idempotency<\/h3>\n<p>If the indexer crashes while processing a chunk of events, that chunk gets retried on restart. Without safeguards, events can be processed twice, and for events like Mint or Burn, that means double-counting a user\u2019s balance or incorrectly reducing\u00a0it.<\/p>\n<p>The fix is deduplication: every event on Ethereum can be uniquely identified by its transaction hash and log index (the log index is needed because a single transaction can emit multiple events). Before processing any event, the indexer records this pair. If the same pair appears again during a retry, the event is\u00a0skipped.<\/p>\n<p>This makes the event handler idempotent. You can replay the same block range as many times as you want. 
If an event has already been processed, running it again won\u2019t change the database\u00a0state.<\/p>\n<p>Combined with checkpoint recovery (saving progress per block range so the indexer resumes from where it left off), this gives you crash safety. The indexer can be stopped and restarted at any point without data corruption or double-counting.<\/p>\n<h3>7. Position Tracking: Harder Than It\u00a0Looks<\/h3>\n<p>I initially assumed the Pool contract was the single source of truth for user balances. But because aTokens are real ERC-20 tokens, users can transfer them directly to other addresses without going through the\u00a0Pool.<\/p>\n<p>This means watching Pool events alone isn\u2019t enough: direct aToken transfers between users don\u2019t show up as Pool supply or withdraw events. To keep positions accurate, you also need to track events on each individual token contract.<\/p>\n<p>On top of that, the balances themselves aren\u2019t straightforward. Aave doesn\u2019t store actual token amounts; it stores \u201cscaled\u201d values. The relationship looks like\u00a0this:<\/p>\n<p>actualBalance = scaledBalance \u00d7 liquidityIndex \/\u00a0RAY<\/p>\n<p>The liquidity index grows over time as interest accrues, which means every user\u2019s real balance increases passively, without any individual events being\u00a0emitted.<\/p>\n<p>Not all token events report values the same way. I track BalanceTransfer, Aave&#8217;s custom event for aToken transfers between users, rather than the standard ERC-20 Transfer event. A single user-to-user transfer can emit multiple Transfer events: one for the actual move, and up to two more from address(0) for interest that accrued on each side since their last interaction. All of them carry actual amounts, so you&#8217;d need to fetch the index separately and call rayDiv on each one. 
BalanceTransfer sidesteps all of this: it is emitted exactly once per transfer, includes the scaled amount directly alongside the index, and is much simpler to track and compute\u00a0with.<\/p>\n<p>Mint and Burn are more involved. Both events carry a balanceIncrease field, the interest that accrued on the user&#8217;s position since their last interaction. For Mint, the emitted value is amount + balanceIncrease, so you subtract the accrued interest before dividing: (value - balanceIncrease).rayDiv(index). For Burn, the emitted value is amount - balanceIncrease, so you add it back first: (value + balanceIncrease).rayDiv(index).<\/p>\n<p>Because rayDiv rounds, each of these calculations can drift by up to 1 wei per event. You can even see this on-chain. Here\u2019s an example transaction where the Supply event and the Mint event in the same transaction report values <a href=\"https:\/\/etherscan.io\/tx\/0x5e9eb74f9f6130d951053c5a7fefdae88b229d28e0a53c3604fff66124319e91#eventlog\">that differ by 1 wei<\/a> due to Aave&#8217;s internal fixed-point rounding (rayDiv\/rayMul). This \u00b11 wei drift is a protocol-level property, not a bug, but it means the indexer has to expect and handle small discrepancies in every calculation.<\/p>\n<h3>What\u2019s Still\u00a0Missing<\/h3>\n<p>The project is far from done. Reorg handling isn\u2019t implemented; the 20-block delay is a workaround, not a solution. Asset prices aren\u2019t tracked yet, which means health factor calculation isn\u2019t possible. 
The Pool and PoolConfigurator addresses are hardcoded, so if Aave governance deploys new contracts, the indexer won\u2019t\u00a0notice.<\/p>\n<p>There\u2019s also the subgraph accuracy issue I mentioned: any error in the bootstrapped data carries forward into all subsequent updates.<\/p>\n<p>Most of what I learned from this project didn\u2019t come from the Rust code itself; it came from understanding how the protocol actually works and why simple assumptions break quickly when you\u2019re dealing with real on-chain\u00a0data.<\/p>\n<p>This is an actively evolving project, so code and design decisions may change as I learn\u00a0more.<\/p>\n<p>More details and known tradeoffs are in the\u00a0<a href=\"https:\/\/github.com\/dlr-a\/aave-v3-tracker\">repo<\/a>.<\/p>\n<p><a href=\"https:\/\/medium.com\/coinmonks\/designing-an-aave-v3-indexer-challenges-and-insights-a15b5bcd7961\">Designing an Aave V3 Indexer: Challenges and Insights<\/a> was originally published in <a href=\"https:\/\/medium.com\/coinmonks\">Coinmonks<\/a> on Medium, where people are continuing the conversation by highlighting and responding to this story.<\/p>","protected":false},"excerpt":{"rendered":"<p>I spent the last few months building an Aave V3 indexer in Rust. It reads on-chain events from Ethereum Mainnet and writes protocol state into PostgreSQL: reserve configs, interest rate data, user supply\/borrow positions, and eMode categories. The stack is Rust with Tokio for async, Alloy for Ethereum interaction, and Diesel for PostgreSQL. 
It sounded [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":145849,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-145848","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-interesting"],"_links":{"self":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts\/145848"}],"collection":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=145848"}],"version-history":[{"count":0,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/posts\/145848\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=\/wp\/v2\/media\/145849"}],"wp:attachment":[{"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=145848"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=145848"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mycryptomania.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=145848"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}