Indexing Ethereum data efficiently with The Graph

Being able to consume relevant and concise data is essential to virtually any web application. The same applies to decentralized applications (or dApps), which are built on top of the blockchain and use it as their primary source of information. However, unlike typical data storage systems, some aspects of the blockchain make querying its data time-consuming and challenging. This is the problem The Graph strives to solve. This blog post will go over how and why to use The Graph to index relevant data from the Ethereum blockchain.

A few terms in this article may be unclear for anyone unfamiliar with the Web3 context. So, before we dive in any further, let's go over some key concepts:

<ul><li>Blockchain: A blockchain is essentially a digital ledger of transactions that is duplicated and distributed across the entire network of computer systems on the blockchain.</li><li>Ethereum: Ethereum is a decentralized blockchain platform that executes and verifies application code, called smart contracts.</li><li>dApps: dApps are web applications built on open, decentralized, peer-to-peer infrastructure services. The concept of dApps is meant to take the Web to what some believe to be its next evolutionary stage. The term used to describe this evolution is Web3, meaning the third version of the web.</li><li>Smart contracts: A smart contract is a piece of code running on Ethereum. It’s called a “contract” because this code can control valuable things like ETH or other digital assets.</li></ul>

<h2 id="diving-in">Diving in</h2>

As mentioned before, a blockchain is an immutable, shared, append-only, distributed ledger. It is a database that stores blocks of data in a decentralized network. These blocks are chained together cryptographically: each block contains records of all new transactions processed in that block, plus a hash of the previous block.

Data stored in the blockchain is distributed by nature, making it challenging to read anything other than basic data directly.
Typically, one would need to travel down the chain, block by block, to access simple information such as an account's transaction history. Complexity escalates quickly when developing a dApp that requires large amounts of data processing.

Currently, it is possible to use centralized platforms that facilitate consuming data from the blockchain. That, however, goes against the very nature of a decentralized application. With The Graph, developers can use subgraphs to provide their dApps with relevant data instead of being limited by third-party blockchain data providers.

<h2 id="what-is-the-graph">What is The Graph</h2>

The Graph is a decentralized protocol for indexing and querying blockchain data. It enables collecting, processing, and accessing relevant information that is difficult to query directly, and exposes it through open GraphQL-based APIs. These APIs are called subgraphs and can be tailored to an application's specific needs.

One of the main advantages of using The Graph is that it lowers the initial technological barrier for developers starting to work in the Web3 space, since it provides a way to query data with GraphQL, a widely used Web 2.0 technology that offers ordering and filtering out of the box.

The Graph also provides many ready-to-use subgraphs for projects such as Uniswap, Livepeer, Enzyme, and others. To use these subgraphs, one can simply make requests to their API URL. For the full list of available subgraphs, visit the <a href="https://thegraph.com/explorer">Graph explorer</a>. It is also possible to test public subgraphs directly from the explorer using the playground feature.

<h2 id="starting-with-the-graph">Starting with The Graph</h2>

The Graph documentation already offers thorough examples on <a href="https://thegraph.com/docs/en/developer/create-subgraph-hosted/">how to create a subgraph</a> from start to finish, so we will skip some steps.
For explanation purposes, however, the code samples used in this blog post illustrate a simple subgraph for tracking activity for the <a href="https://meebits.larvalabs.com/">Meebits NFT</a> smart contract.

To learn more about the Meebits contract and other contracts, visit <a href="https://etherscan.io">Etherscan</a>, a web explorer for the Ethereum network. Simply type “Meebits” into the search field and it will display key pieces of information such as the contract address, the latest transactions, and much more.

<h3 id="creating-a-subgraph">Creating a Subgraph</h3>

The first step to creating a subgraph is defining what data we want to index. This is specified in a YAML file called the manifest. The manifest is the main subgraph configuration. It defines the addresses of the smart contracts the subgraph should consider, which contract events or functions to listen to, and which handlers will transform these events into entities to be stored. A subgraph manifest file looks like the following:

<pre><code class="language-yaml">specVersion: 0.0.1
description: Meebits Subgraph
repository: https://github.com/&lt;your-account&gt;/meebits-subgraph
schema:
  file: ./schema.graphql
dataSources:
  - kind: ethereum/contract
    name: Meebits
    network: mainnet
    source:
      address: "0x7Bd29408f11D2bFC23c34f18275bBf23bB716Bc7"
      abi: Meebits
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.5
      language: wasm/assemblyscript
      entities:
        - User
        - NFT
      abis:
        - name: Meebits
          file: ./abis/Meebits.json
      eventHandlers:
        - event: Transfer(indexed address,indexed address,indexed uint256)
          handler: handleTransfer
      file: ./src/mapping.ts</code></pre>

Every time a Meebits token is transferred from one account to another, the contract emits a Transfer event. With this manifest, our subgraph will listen to each of these events and then call the handleTransfer handler, which will be defined later.

The next step is to define <a href="https://thegraph.com/docs/en/developer/create-subgraph-hosted/#the-graph-ql-schema">a GraphQL schema</a>.
This does require some knowledge of how entity types are declared. Fortunately, GraphQL has extensive documentation and an active community around it. The schema defines the data model for the subgraph and the format in which this data can be queried, and it is also used to generate the fields for querying instances of each entity type. It allows indexing data in a format made to suit the application’s UI. For example, to save a list of Meebits NFTs owned by a specific account, here is what the schema would look like:

<pre><code class="language-graphql">type User @entity {
  id: ID!
  nfts: [NFT!]! @derivedFrom(field: "owner")
}

type NFT @entity {
  id: ID!
  owner: User!
  URI: String!
}</code></pre>

Finally, we should define our <a href="https://thegraph.com/docs/en/developer/create-subgraph-hosted/#writing-mappings">mappings</a>. A mapping, in this context, is the code that handles events from the contracts and saves the results to the entities defined in the schema. These mappings are written in AssemblyScript, which is a subset of TypeScript.
They are what glue things together in the subgraph and implement the handlers specified previously in the manifest.

<pre><code class="language-typescript">import { Meebits, Transfer } from "../generated/Meebits/Meebits"
import { NFT, User } from "../generated/schema"

export function handleTransfer(event: Transfer): void {
  // Bind the contract so its read-only functions can be called
  let contract = Meebits.bind(event.address)
  const tokenId = event.params.tokenId
  const userAddress = event.params.to.toHexString()

  // Load existing NFT object
  let nft = NFT.load(tokenId.toHex())
  if (!nft) {
    // Create new NFT in case it does not exist
    nft = new NFT(tokenId.toHex())
  }

  // Load existing User object
  let user = User.load(userAddress)
  if (!user) {
    // Create new user in case it does not exist
    user = new User(userAddress)
    user.save()
  }

  nft.owner = user.id

  // Calls the contract to save the NFT URI
  nft.URI = contract.tokenURI(tokenId)
  nft.save()
}</code></pre>

After that, we can use the following query to retrieve the information from the subgraph:

<pre><code class="language-graphql">{
  user(id: "ACCOUNT_HEX_ADDRESS_HERE") {
    nfts {
      id
      URI
    }
  }
}</code></pre>

The Graph offers a command line <a href="https://thegraph.com/docs/en/developer/quick-start/#2-initialize-your-subgraph">scaffolding tool</a> to walk through the steps for creating a subgraph and generating most files.

<h3 id="deploying-a-subgraph">Deploying a Subgraph</h3>

After creating a subgraph, it needs to be deployed. The Graph offers a hosted service, where subgraphs can be published. Follow the docs on how to deploy a subgraph <a href="https://thegraph.com/docs/en/developer/quick-start/#4-deploy-your-subgraph">here</a>. Before publishing, one can also use a query URL for development purposes.

The demo Meebits subgraph has been deployed and can be found <a href="https://thegraph.com/hosted-service/subgraph/marycaroline/meebits">here</a>.
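Since a subgraph is consumed by making requests to its API URL, the query shown earlier can also be sent from application code. The following is only a minimal sketch, not part of the original project: the endpoint argument is a placeholder for whatever query URL a deployment exposes, and the helper names buildUserNftsRequest, extractNfts, and fetchUserNfts are hypothetical. It assumes a runtime with a global fetch, and that entity ids are lowercase hex strings, which is what toHexString() in the mapping is expected to produce.

```typescript
// Shape of one NFT entity, mirroring the fields declared in the schema.
interface NftRecord {
  id: string;
  URI: string;
}

// Minimal shape of a GraphQL-over-HTTP response for the query below.
interface UserNftsResponse {
  data?: { user: { nfts: NftRecord[] } | null };
  errors?: { message: string }[];
}

// Builds the JSON body for the GraphQL POST request.
// Assumption: ids were stored in lowercase, so the address is normalized.
function buildUserNftsRequest(address: string): {
  query: string;
  variables: { id: string };
} {
  return {
    query: "query UserNfts($id: ID!) { user(id: $id) { nfts { id URI } } }",
    variables: { id: address.toLowerCase() },
  };
}

// Returns the NFT list, or an empty list when the user is not indexed.
// Throws if the response carries GraphQL errors.
function extractNfts(response: UserNftsResponse): NftRecord[] {
  if (response.errors && response.errors.length > 0) {
    throw new Error(response.errors.map((e) => e.message).join("; "));
  }
  return response.data?.user?.nfts ?? [];
}

// Sends the query to a subgraph endpoint (placeholder URL supplied by the caller).
async function fetchUserNfts(
  endpoint: string,
  address: string
): Promise<NftRecord[]> {
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildUserNftsRequest(address)),
  });
  if (!res.ok) {
    throw new Error(`Subgraph request failed with HTTP ${res.status}`);
  }
  return extractNfts((await res.json()) as UserNftsResponse);
}
```

Keeping the request builder and the response parser separate from the network call makes both easy to check without a live endpoint.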
Try it out using the hosted service playground.

<figure class="kg-card kg-image-card kg-width-wide"><img src="https://s3.amazonaws.com/vintasoftware-wagtail-ghost/blog/2022/02/Peek-2022-02-03-16-01.gif" class="kg-image"></figure>

<h2 id="final-thoughts">Final thoughts</h2>

As with any other technology, there are some drawbacks one must consider:

<ul><li>Local deployment is possible, but indexing a subgraph locally can take up to a few hours, slowing the development process.</li><li>Despite being a subset of TypeScript, AssemblyScript is an even newer, still evolving technology, and as such it has limitations. As of now, some important features are not yet supported, such as closures and exception handling. Learn more about some AssemblyScript limitations on <a href="https://thegraph.com/docs/en/developer/assemblyscript-api/">The Graph docs</a> or check <a href="https://www.assemblyscript.org/status.html#language-features">here</a> for the full list of language features.</li><li>Debugging is challenging, especially for errors that occur while indexing, since the logs are not always helpful.</li></ul>

Nevertheless, The Graph is growing and gaining substantial interest from the community as an important tool for developing dApps. As of now, its hosted service is home to over 10k subgraphs, and that number keeps rising, making it, at the very least, worth checking out.

<h4 id="references">References</h4>

<ul><li><a href="https://thegraph.com/docs/en/about/introduction/">https://thegraph.com/docs/en/about/introduction/</a></li><li><a href="https://ethereum.org/en/whitepaper/">https://ethereum.org/en/whitepaper/</a></li><li><a href="https://ethereum.org/en/developers/docs/dapps/">https://ethereum.org/en/developers/docs/dapps/</a></li></ul>
