We are proud to announce the completion of the final milestone of our Web3 Foundation Grant for Validated Streams – a developer-friendly solution for creating reactive oracles built on top of Substrate! As part of that final milestone, we added essential features to support real-life networks, tested them, and documented the results.

Blockchains and decentralized systems have revolutionized the software landscape, offering novel possibilities for automation and eliminating the limitations of trust in centralized systems. However, a common challenge in those trustless systems lies in sourcing and using real-world data. That is where oracles play a vital role, bridging the gap between external data sources and the blockchain.

To address the need for a more general way of creating oracles, our team spent the first half of 2023 developing a framework designed to simplify the process of creating reactive blockchain oracles. Join us as we delve into the story behind Validated Streams and explore its potential for creating decentralized data sources for decentralized applications.

The case for oracles in blockchain systems

Decentralized blockchains like Polkadot and Ethereum excel at processing transactions in a trustless manner. They achieve that by mandating that every piece of data entering the system is cryptographically signed before it can be used. Once data is inside the system, it is straightforward to work with, as all nodes can perform the same sequence of operations to arrive at the same results. However, bringing data from the wider world onto the blockchain is challenging, as it requires nodes to collectively agree on the events occurring in the real world – something that cannot be computed directly.

To address this challenge, decentralized systems rely on so-called “oracles” – programs, contracts, or entities tasked with the responsibility of obtaining, signing, and submitting observed real-world data to the blockchain.

There is a whole realm of oracle solutions out there – from the more popular ones like Chainlink to the specialized ones like TLSNotary to the occasional on-chain oracles. Each solution has its pros and cons; some are locked to a particular blockchain while others require trusting specific nodes to verify the proofs; some opt for higher performance while others settle for stronger security. In general, developers have to carefully pick one based on the needs of their particular applications.

Introducing Validated Streams: a simple yet powerful oracle solution

When we set out to make Validated Streams, we aimed to create a versatile solution that could verify any kind of data – potentially transforming the blockchain into an advanced byzantine-fault-tolerant communications protocol and rendering the need for other smart contracts obsolete. While there are still some refinements to be made, Validated Streams stands out among other oracle-building solutions with the following key features:

  1. Modularity – Through a modular architecture, Validated Streams enables the creation of new oracles with minimal development effort. Developers can leverage their familiarity with a wide range of programming languages to create a “trusted client”, a piece of software that observes events and submits them via gRPC, and Validated Streams handles the rest.
  2. Reactivity – Unlike oracles that require you to first submit a request before they query real-world events, Validated Streams oracles directly react to events and submit them to the chain. This reduces latency and simplifies the development of autonomous applications that operate without human intervention.
  3. Optimized on-chain activity – By default, Validated Streams incorporates a mode that allows its nodes to validate signatures off-chain and submit only the event data to the blockchain. This reduces the blockchain’s size, and, since validators already need to sign each block to achieve finality, security is not impacted by discarding old event signature data.

Validated Streams is a framework for creating decentralized, reactive oracles. If you are interested in building on it, please get in touch! We would love to help bring your vision to life as we continue refining our project.

Creating your own oracle with Validated Streams

With Validated Streams, creating an oracle is as simple as writing a new “trusted client” which connects over gRPC to submit new events.

As an illustrative example, consider the process of creating a trusted client which observes events occurring on an IRC channel. In this example, we will be using C#, although Validated Streams supports any programming language that has gRPC bindings. All the code for this example can be found in its respective folder of the repository.

IRC sample architecture

The goal of this example will be to observe real-world events from a user sending specially-formatted text messages on an IRC channel, then witness those events on a Validated Streams chain. Drawn as an architectural diagram, our example would resemble the following:

A diagram showing a user sending a message to an IRC server which is then received by multiple trusted clients, which then each forward it to its respective validator node as a witnessed event.

Since this is a toy example, we won’t delve into what happens after the event is received. Assume this is just a guestbook where visitors can leave notes. To improve the user experience, however, we will wait for a confirmation that the message has been irreversibly stored in the chain and inform the user. Conceptually, this amounts to running the same diagram in reverse:

The same diagram, modified to show the validator nodes finalizing a block and sending it to the trusted clients listening for validated events, who then attempt to arrange for one of them to message the user back.

At this point, the user will have confirmation that their message has been stored on the Validated Streams chain.

IRC sample implementation – witnessing events

To begin, we set up a C# console project for the trusted client. We modify the project files to reference the Validated Streams protobufs file to connect to the validator node and the IrcDotNet library to connect to IRC.

<Project Sdk="Microsoft.NET.Sdk">

  <ItemGroup>
    <PackageReference Include="Google.Protobuf" Version="3.23.1" />
    <PackageReference Include="Grpc.Net.Client" Version="2.53.0" />
    <PackageReference Include="Grpc.Tools" Version="2.54.0">
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
      <PrivateAssets>all</PrivateAssets>
    </PackageReference>
    <PackageReference Include="IrcDotNet" Version="0.7.0" />
  </ItemGroup>

  <ItemGroup>
    <Protobuf Include="path\to\..\validated-streams\proto\streams.proto" GrpcServices="Client" />
  </ItemGroup>

</Project>

Next, we establish a connection to the IRC server as we would normally:

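using IrcDotNet;

// Arguments: <node URL> <IRC server URI> <channel> <nickname>
var ircUri = new Uri(Environment.GetCommandLineArgs()[2]);
var botChannel = Environment.GetCommandLineArgs()[3];
var botNickname = Environment.GetCommandLineArgs()[4];

var registrationInfo = new IrcUserRegistrationInfo()
{
    NickName = botNickname,
    UserName = botNickname,
    RealName = "Validated Streams bot"
};

var ircClient = new StandardIrcClient();
// Block until the server confirms our registration.
using (var registeredSemaphore = new SemaphoreSlim(0, 1))
{
    ircClient.Registered += (sender, e2) => registeredSemaphore.Release();
    ircClient.Connect(ircUri, registrationInfo);
    await registeredSemaphore.WaitAsync();
}

// Joining is asynchronous, so wait until the server confirms the join
// instead of looking the channel up right away.
var joinedChannel = new TaskCompletionSource<IrcChannel>();
ircClient.LocalUser.JoinedChannel += (sender, e) => joinedChannel.TrySetResult(e.Channel);
ircClient.Channels.Join(botChannel);
var ircChannel = await joinedChannel.Task;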

We also connect to the Validated Streams node:

using Grpc.Net.Client;
using ValidatedStreams;

var validatedStreamsNode = Environment.GetCommandLineArgs()[1];

using var grpcChannel = GrpcChannel.ForAddress(validatedStreamsNode);
var validatedStreamsClient = new Streams.StreamsClient(grpcChannel);

At this point, we are all set to start processing messages from the IRC channel.

We parse each message containing something to be witnessed and submit it as an event. We hash the messages because Validated Streams currently requires each event to be exactly 32 bytes long.

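using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;
using Google.Protobuf;

// Matches "!w <data>" or "!witness <data>".
var witnessCommandRegex = new Regex(@"^!w(?:itness)? (?<data>.+)$");
// Remembers who submitted each event so that we can reply to them later.
var users = new ConcurrentDictionary<ByteString, string>();

ircChannel.MessageReceived += (sender, ev) =>
{
    var witnessCommandMatch = witnessCommandRegex.Match(ev.Text);
    if (witnessCommandMatch.Success)
    {
        var data = witnessCommandMatch.Groups["data"].Value;
        var user = ev.Source.Name;
        // Hash "<user>\0<data>" down to the 32 bytes an event ID requires.
        using var sha256 = SHA256.Create();
        var hashBytes = sha256.ComputeHash(Encoding.UTF8.GetBytes($"{user}\0{data}"));
        var hash = ByteString.CopyFrom(hashBytes);
        validatedStreamsClient.WitnessEvent(new()
        {
            EventId = hash
        });
        users[hash] = user;
    }
};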

In the above code snippet, we also store the user who submitted a particular event in a dictionary. This will allow us to quickly look them up when sending a reply back. A better solution would be to store this data on IPFS and not in memory, but this approach keeps the example simpler.
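The hashing step can also be pulled out into a small helper, which makes the 32-byte requirement easier to see and to test in isolation. The sketch below is ours, written for illustration: the `EventIds` class and its method names are hypothetical and not part of the sample, and it assumes .NET 5 or later for `SHA256.HashData` and `Convert.ToHexString`.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the Validated Streams sample.
public static class EventIds
{
    // Derives a 32-byte event ID from a nickname and a message.
    // The "\0" separator keeps ("ab", "c") and ("a", "bc") from
    // hashing to the same value.
    public static byte[] ForMessage(string user, string data) =>
        SHA256.HashData(Encoding.UTF8.GetBytes($"{user}\0{data}"));

    // Uppercase hex form of an event ID, convenient for chat replies.
    public static string ToHex(byte[] eventId) => Convert.ToHexString(eventId);
}
```

Because SHA-256 always produces 32 bytes, `EventIds.ForMessage("alice", "hello")` fits the event size regardless of how long the original message was, at the cost of the chain only seeing the hash and not the message itself.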

IRC sample implementation – relaying validated events

So far, running the example would enable witnessing events and subsequently observing them on Validated Streams’s internal Substrate chain. However, having to look everything up on-chain is going to be inconvenient for users, so to enhance the experience, we will proceed to relay all finalized (or “validated”) events back to the IRC channel.

To achieve this, we first obtain the stream of validated events. Then, for each event, we send a reply to the user who originally submitted it. To avoid all trusted clients spamming the user, we sort them with a hash function and assign them “reply slots”. We then wait based on the reply slot before sending a message to provide an opportunity for clients with lower reply slots to respond first.

using Grpc.Core;

var replyTimers = new ConcurrentDictionary<ByteString, Timer?>();
var replyFormat = "{1}: {0} validated!";

var validatedEvents = validatedStreamsClient.ValidatedEvents(new()
{
    FromLatest = true,
});
await foreach (var events in validatedEvents.ResponseStream.ReadAllAsync())
{
    foreach (var @event in events.Events)
    {
        var hash = @event.EventId;
        var hashHex = Convert.ToHexString(hash.ToByteArray());
        using var sha256 = SHA256.Create();
        // Our reply slot is our position among the channel's users once they
        // are ordered by SHA-256(event ID + nickname).
        // ByteArrayComparer is a helper (not shown) that orders byte arrays.
        var slotNumber =
            ircChannel.Users
            .Select(channelUser => channelUser.User)
            .OrderBy(client =>
                sha256.ComputeHash(hash.ToByteArray().Concat(
                    Encoding.UTF8.GetBytes(client.NickName)
                ).ToArray()), new ByteArrayComparer())
            .TakeWhile(x => !(x is IrcLocalUser)).Count();
        var timeToWait = TimeSpan.FromSeconds(slotNumber * 0.5f);

        // Schedule the reply unless an entry for this event already exists.
        replyTimers.AddOrUpdate(hash,
            _ => new Timer(state =>
            {
                if (!users.TryGetValue(hash, out var user))
                {
                    user = "<unknown user>";
                }
                ircChannel.Client.LocalUser.SendMessage(ircChannel,
                    String.Format(replyFormat, hashHex, user)
                );
            }, null, timeToWait, Timeout.InfiniteTimeSpan),
            (_, timer) => timer
        );
    }
}
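To make the reply-slot idea easier to follow, here is a minimal, self-contained sketch of the same computation as a pure function. It is an illustration under our own assumptions: `ReplySlots`, `SlotFor`, and `DelayFor` are hypothetical names, and the `ByteArrayComparer` shown here is one plausible implementation (lexicographic byte order) that may differ from the helper used by the actual sample. It assumes .NET 5 or later for `SHA256.HashData`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Assumed implementation: orders byte arrays lexicographically,
// with the shorter array first when one is a prefix of the other.
public sealed class ByteArrayComparer : IComparer<byte[]>
{
    public int Compare(byte[]? x, byte[]? y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x is null) return -1;
        if (y is null) return 1;
        int common = Math.Min(x.Length, y.Length);
        for (int i = 0; i < common; i++)
        {
            int byteOrder = x[i].CompareTo(y[i]);
            if (byteOrder != 0) return byteOrder;
        }
        return x.Length.CompareTo(y.Length);
    }
}

// Hypothetical helper, not part of the Validated Streams sample.
public static class ReplySlots
{
    // Position of localNickname once all nicknames are ordered by
    // SHA-256(eventId + nickname). Slot 0 gets to reply first.
    public static int SlotFor(byte[] eventId, IEnumerable<string> nicknames, string localNickname) =>
        nicknames
            .OrderBy(
                nick => SHA256.HashData(eventId.Concat(Encoding.UTF8.GetBytes(nick)).ToArray()),
                new ByteArrayComparer())
            .TakeWhile(nick => nick != localNickname)
            .Count();

    // Each slot waits half a second longer than the one before it.
    public static TimeSpan DelayFor(int slot) => TimeSpan.FromSeconds(slot * 0.5);
}
```

Since every client hashes the same event ID with the same set of nicknames, they all arrive at the same ordering without exchanging any messages, and because the event ID is part of the hash, the client that replies first changes from event to event.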

Finally, to tie everything together, we cancel the timer for the event if we detect someone replying before us:

var replyRegex = new Regex(@"^(?<nickname>[^!@: ]+|<unknown user>): (?<eventId>[A-Z0-9]+) validated!$");

ircChannel.MessageReceived += (sender, ev) =>
{
    var replyMatch = replyRegex.Match(ev.Text);
    if (replyMatch.Success)
    {
        var hashHex = replyMatch.Groups["eventId"].Value;
        var hash = ByteString.CopyFrom(Convert.FromHexString(hashHex));
        replyTimers.AddOrUpdate(hash,
            _ => null,
            (_, timer) => { timer?.Dispose(); return null; });
    }
};

With this last step, the implementation of the IRC example is complete. It allows events to be sent on the IRC channel, witnessed to the Validated Streams chain, and echoed back to the IRC channel once they are validated.

Running the IRC sample

To execute the code we just wrote, we need to specify an IRC server and a Validated Streams node. For instance, to run it with an IRC server and validator node both running on localhost, we can use the following command:

$ dotnet run http://127.0.0.1:6000 irc://127.0.0.1:6667 '#some-channel' some-nickname

In the Validated Streams framework, each validator node has to operate its own trusted client. Hence, we would need to run the same or similar code with each of the validators in the network.

If we are running a development network through Docker Compose, the configuration for that might look something like this:

version: "3.3"
services:
  ircd:
    image: inspircd/inspircd-docker
    ports:
      - "6667:6667"
  validator1:
    image: comradecoop/validated-streams
    command: --alice
  trustedclient1:
    image: comradecoop/validated-streams-irc-client
    command: 'http://172.19.0.3:6000 irc://172.19.0.2:6667 #validated-stream validator1'
  # .. Same for validator2, validator3, etc. ..

Then we would connect to irc://localhost:6667 with a regular IRC client to test the setup.

Below is a sample interaction demonstrating a user interacting with the IRC sample. Observe that while all bots (trusted clients) observe the messages, only one replies to the user with status updates about the observed events.

Screenshot of a chat between a user and a couple IRC bots. The user sends "!w <text>", one of the bots replies "user: witnessing <hash>..." followed by "user: <hash> validated!"

Note that this interaction showcases the full implementation of the IRC sample, which includes additional features such as a help command and immediate feedback when the witness command is used.

Things to consider before deploying an oracle

While Validated Streams simplifies the creation of blockchain oracles, deploying one in production still requires careful consideration of a variety of factors.

Here are a few questions we think developers should consider before deploying their oracle:

Are there any single points of failure in the way data is sourced?

The example described earlier used IRC as a way for users to input data into the oracle. However, relying on a centralized service, like IRC, introduces potential vulnerabilities: if the server becomes unavailable or starts acting maliciously, the whole oracle’s availability and data integrity will be likewise impacted.

For some oracles that might not be a huge problem (if validators cannot access the IRC network, neither can users), but for others it might compromise the applications built on top (e.g. if a DAO uses IRC as a voting method and the IRC server can fake any message, we are putting the whole DAO at the mercy of that IRC server).

Consider if there are ways to switch to using data from decentralized/peer-to-peer sources (e.g. changing the example to use some kind of p2p chat) or if there are ways to avoid having a single point of failure by combining multiple centralized sources.
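As a rough illustration of combining multiple centralized sources, a trusted client could witness a message only when enough independent sources report it. The sketch below is hypothetical and not part of Validated Streams: `SourceQuorum`, `Agreed`, and the idea of representing each source as a list of observed messages are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Validated Streams sample.
public static class SourceQuorum
{
    // Returns the messages reported by at least `threshold` distinct sources.
    // Each inner sequence holds the messages seen on one source; repeats
    // within a single source are counted only once.
    public static List<string> Agreed(IEnumerable<IEnumerable<string>> reportsBySource, int threshold)
    {
        if (threshold < 1) throw new ArgumentOutOfRangeException(nameof(threshold));
        return reportsBySource
            .SelectMany(messages => messages.Distinct())
            .GroupBy(message => message)
            .Where(group => group.Count() >= threshold)
            .Select(group => group.Key)
            .ToList();
    }
}
```

With three mirrored chat servers and a threshold of two, a single misbehaving server could neither inject a message on its own nor suppress one that the other two servers relay. Picking the threshold is a trade-off between tolerating faulty sources and tolerating unavailable ones.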

Are there enough validators running the oracle?

If the validators running an oracle collude, they can fake just about anything. For an oracle to be truly decentralized and trustworthy, there needs to be a sufficient number of entities running it. Finding and incentivizing enough participants to run the software while also ensuring they do not collude is a challenge that we plan on tackling in the future. Ideas for addressing this include exploring various mechanisms for rewarding validators and ensuring that new people can become validators should they desire to.

Are there any bugs or vulnerabilities in the oracle’s code?

Validators sharing the same code introduce a potential point of failure, since any bug in that code will affect all of them in the same way. Therefore, it is good practice to thoroughly vet that code and encourage the existence of multiple independent implementations to reduce the risk of a catastrophic failure bringing the network down. Running testnets (perhaps with bug bounties) can also help identify issues before deploying the code in production.

The IRC sample described above actually has a bug in the way it processes input data: it never validates that the user with a particular nickname is the person they claim to be before incorporating it into the event data – ideally, we would confirm with the IRC server that they are logged in first. More sophisticated oracles should likewise be cautious about blindly trusting their input data.

Are applications downstream of the oracle hosted securely?

As the saying goes, a chain is only as strong as its weakest link. In a blockchain system involving oracles, smart contracts, inter-blockchain relays, and more moving parts, a vulnerability in any single component can compromise or destabilize the entire system. For the whole application to be trustworthy, users need to have confidence not only in the oracle used but also in the applications that consume the oracle’s data. If an oracle is bridged to another chain over inter-blockchain protocols, then the inter-blockchain layer must also be secure, or vulnerabilities there could enable malicious actors to manipulate the data flowing into the application.

In the IRC example, validator nodes reported data back to users over IRC, a protocol that could allow others to fake bot messages. Adding links to a blockchain explorer that users can use to double-check the bot’s replies – or better yet, instructing users on how they can validate those events with a light node locally – would help improve the security of the example.

The story of Validated Streams

Back in early 2020, we had a vision for Apocryph, a network of proactive, autonomous digital agents that can interact with the whole world around them. We dove headfirst into the problem, having no idea how complicated it would prove. We made a variety of bold, creative decisions, certain we could significantly improve the status quo… and at the same time hoped to deliver a working solution within a few short months.

Two years of writing code later, we were exhausted. We were way past our project’s allotment of “innovation tokens”, way past our initial estimates, and as much as we wished to get the project done, it just wasn’t. But amidst that despair, a glimmer of hope shone through. We realized that we could start delivering smaller pieces of the project one by one, and leave our overwhelmingly big plan in the background for the moment. We took the piece that would have been Apocryph’s way of observing the world and reimagined it as a generic oracle solution. And thus Validated Streams was born.

To build Validated Streams, we needed some kind of blockchain node framework we could easily extend. And we wanted it to be already used in production, as we had seen enough battles getting our own blockchain framework to a testnet state.

Enter Substrate. As soon as we opened the documentation, we realized we were onto something. The concepts were named differently, but they were exactly what we needed to express in a blockchain. It was like someone knew what we had tried to do with Apocryph’s blockchain code and wrapped it in neat Rust code that Just Makes Sense – except, of course, it was simply a case of great minds thinking alike.

Once Substrate was brought up, another team member recalled Web3 Foundation’s Grants program. We decided to fire off a proposal and were quite pleasantly surprised by the way the Foundation’s team handled it: respectfully, with due diligence at every step, yet without any excessive red tape. We are honored to have participated in Web3 Foundation’s Grants program, glad to have chosen to build Validated Streams on top of Substrate, and hopeful that Substrate continues to grow as the Swiss Army, batteries-included blockchain-building framework it is.

Our plans for Validated Streams’s future

Looking back, we are proud to have achieved our initial goals for Validated Streams. It is a practical solution for developing decentralized oracles, enabling the easy creation of new oracles through the development of trusted clients. And to top that off, the oracles are reasonably performant – benchmark results suggest that Validated Streams oracles support processing upwards of 250 events/second.

Moving forward, there are several areas we would like to explore. First, we want to improve the performance further by customizing the gossip protocol and making good use of Merkle trees on-chain. Second, we want to make it possible to run the same trusted client code with existing blockchains directly – eliminating the latency of inter-blockchain communication. Finally, we would like to explore using Validated Streams to observe continuous data (prices, temperature, etc.) and not just discrete events.

In all of that, what excites us is the possibility of exploring more use cases for Validated Streams through projects both from the Cooperative and from third parties. It is use cases that propel the development of frameworks, and we think that what Validated Streams needs next is people using it in production.

Meanwhile, keep an eye on Apocryph. Validated Streams is just the first of a series of small Apocryph-inspired projects, and we hear that work is already underway to unveil the next piece. Stay tuned!


Comrade Cooperative is a cooperative, the legal organization closest to a DAO. If the concepts and ideas resonate with you, feel free to get in touch on our Discord server – we would love to chat with you!

Learn more about Web3 Foundation by visiting their website, and stay up to date with the latest developments by following them on Medium or Twitter.

AI disclaimer: while the first draft of this article was written by a human (hi!), ChatGPT provided many of the subsequent line edits.