Disclaimer: The transcript that follows has been generated using artificial intelligence. We strive to be as accurate as possible, but minor errors and slightly off timestamps may be present.
Jacob Steeves (00:00):
Very good. Thank you for having me, Dr. M.
Gosh, thank you for joining us again. I’m so glad this has become a regular series, and hopefully we can use it as kind of a state of affairs with BitTensor and to also have the world hear you guys out occasionally here and there on deep AI questions.
Jacob Steeves (00:21):
Absolutely. I love it as a way for us to talk to the community and be interviewed instead of just going one-on-one. It’s a great way for us to disseminate information. So I’m really happy that I’m here, and everyone else is as well.
Awesome, awesome. So let’s get started. I want to ask one general question and then we’ll get into all the cool things that are coming up with Bittensor. And for the audience, for an introduction to Bittensor or what a neural net is, I’ll just quickly say that Bittensor is an open source protocol that powers a decentralized and scalable neural net. That’s what Bittensor is, but there is a lot more on what Bittensor is in the first two episodes, and that’s where you can listen in. This is the third episode. So Const, I’m wondering, what is AGI to you and do you see Bittensor playing a role?
Jacob Steeves (01:27):
Well, general intelligence is an interesting term. I think it’s thrown around a lot without much knowledge of the specific terms involved. So, you know, there’s the artificial aspect, which is obviously pointing towards the fact that we’re creating it with computers. And then there’s the general aspect, which I’ll get to last. And then intelligence. I’m kind of of the opinion that intelligence is sort of a universal quality of nature. It’s the ability for matter to encode information about its surroundings, basically to encode the patterns or the semantics in a lot of noise, to pick the signal from the noise. The general aspect is the part which is really undefined because it sort of implies that there’s this large scale generality, that this intelligence can solve all problems. But in order to make that claim, you have to define all the problems, and there’s actually a result in machine learning that says that it’s not possible to encode the solution to all problems.
That’s because, by the very nature of extracting the signal from the noise, there’s an exclusion occurring in what you’re picking up in the signal. If you do one thing, you necessarily exclude the opposite. So there is no absolute generality to intelligence. And I think what people usually mean is something along the lines of being able to do what humans can do. It’s general in the sense that it can match human intelligence.
There’s a really great paper called On the Measure of Intelligence by François Chollet, which covers the various ways in which we define intelligence. You know, the problem of trying to define intelligence goes back a pretty long way to the field of psychology, and only just recently became more of a computer science question. And François Chollet’s take on the entire problem is that general intelligence is the ability to adapt to new tasks which you’ve not seen before. So it’s whether you can generalize to just tasks in general, effectively.
And he came up with basically a data set of problems where, for the machine learning problem that’s being solved, there’s no data set that the model is allowed to train on. It has to train on data that is completely separate from that task specifically. So general intelligence, in my opinion, is the ability to adapt to a large, large number of problems. Where that large set becomes general enough, I think, is sort of up to us to decide. Is it all of humanity’s problems? Can we do anything with computers that we can do as humans?
I think that’s a pretty good definition. And I think I’ve said this before on one of your podcasts: when you frame it that way, I think a better term is human-obsoleting intelligence. Because if we were to solve all the problems that humans solve, we would essentially be obsoleted. So that’s the point at which I would say general artificial intelligence is. And then the final part of your question was, do you think Bittensor will be able to approach this? Obviously, we’re on the same track as a lot of other companies in the space that are trying to make more and more intelligent systems. But right now, we’re working in a pretty limited domain. We’re working in just textual understanding with Bittensor, and even with images, the domain against which we’re solving AI problems is pretty small, and small by design so that we can study it. I think we’re pretty far away from AIs truly solving every problem that a human encounters.
Sure, absolutely. Yes. Yeah. And AGI is something that is thrown around a lot. And I think, for me, it was only recently that I realized the need to ask someone what they mean by generalized intelligence or AGI when they refer to it. But, you know, recently I’ve seen some expert, or maybe not expert, published opinions that, oh, we could be as close as 10 years, you know. So it’s always something people are fascinated with.
Jacob Steeves (06:43):
It really depends on what the definition is, you know, and that’s why I was bringing it up. Sure, we’re five years away from better AI. But unless we can really define what AGI is, I will never be able to actually verify whether or not those predictions come true in the first place. There’s this interesting result from the 60s. They asked people how long until we’ll have artificial intelligence. And I believe at that point, they didn’t have this concept of AGI. It was just AI. And people said 15 years.
And then they asked five years later, and people said 15 years. And they asked five years after that, and people said 15 years. And it always is 15 years. And I think that has to do with the fact that if you’re asking somebody about something that has no definition, you’re just going to get some sort of binomial distribution that has a mean. Sorry, like a Gaussian distribution, pardon me, with a mean of around 15 years. That’s just the human bias: for an ill-defined thing that is technologically advanced, people say 15 years. So I think that we’re probably 15 years away from AGI, then. I’ll just go with the mean of humanity.
Sounds good. Sounds good. I’m very much hopeful to see where Bittensor goes in that timeframe. Gosh, you know, it baffles me every time I think about it. Lots of big things are coming up, though, and I want to get into those. Can you maybe tell us a little bit about the parachain network launch and how it’s going? It must be a hectic time for you guys.
Jacob Steeves (08:33):
It’s a really exciting time for us as well. Hectic and really exciting, you know. As we come up closer to the release of Finney on Jan 10th, obviously, we’re going to have an incredible amount to do to really smooth the edges of the technological push that we’re going through.
So, you know, Polkadot is a network of networks that uses the Polkadot relay chain to do consensus, merged consensus, so you can move assets across the chains. For instance, you could move TAO to Moonbeam, and then from Moonbeam you can trade and swap, using things like SushiSwap, to move TAO into different currencies. So that’s going to be a really positive thing for everyone here, especially in terms of liquidity for miners, so that we can really drive the computational side of Bittensor. That’s our goal here and always is. It’s not about getting marketing or shilling the coin. It’s about making sure that our miners are able to afford their compute. So that’s one thing that’s going to happen: we’re going to get this liquidity for the token. I think that’s really positive. But really interestingly, the world’s our oyster once we get into this ecosystem, because developers can build their own smart contracts throughout the Polkadot ecosystem and host them on other chains and still work with TAO. So it’ll allow people to build things like validator pools and mining pools. All of that kind of functionality is now at the fingertips of our developers where it wasn’t before. I mean, right now we have a lot of control over what goes on our chain, and we’ve been very cautious about letting people have access to that. But once we get into the Polkadot ecosystem, it really becomes the Wild West for what happens outside of our chain. And people can even build their own chains and connect them into Polkadot if they want to use TAO for some other purpose. That’s a really interesting thing, in that we don’t really know what people will do. But the potential there really excites us.
Yes. Const, am I correct in assuming that becoming a DOT ecosystem parachain would probably be the biggest publicity that Bittensor has seen so far, just given that you guys have been intentionally low key? I don’t see you guys having spent any time really shilling anything, pretty much ever.
Jacob Steeves (11:07):
Yeah, we don’t do that. And the reason is that we want people to mine. We don’t really want to push the, as you said, shilling side of Bittensor, because we believe that the price will always reflect the computational size of our network. And, you know, what you focus on, you master. Where we put our intentions, that’s what we’re going to get good at. So that’s why we haven’t done that. But yes, of course, when we get into Polkadot, there’ll be some attention. And, interestingly, today I was added into a Telegram group with some other founders in the ecosystem, and they already know about us. So our reputation precedes us in the ecosystem. And I think that it’s going to be really interesting once we get into the relay chain. Some of these other communities will be able to interface with us directly. They’ll be able to buy and sell TAO, and maybe that will drive them into our community also.
Absolutely, yeah. I noticed a lot of the questions that people come in and ask on Discord, you know, many of them are things that are dependent on being able to do a smart contract, or other times they’re just like, hey, I just want an easier way to buy TAO than there is currently. And I feel like so much of all of that will come, if not right away, then shortly thereafter with the parachain. And of course, I have this feeling, now that I’ve gotten to love Bittensor, that, oh my God, the cat is getting out of the bag, and it’s going to get busy, and it’s going to get a lot of attention. Yes. I have this, like, strange feeling about it.
Jacob Steeves (12:51):
The child is leaving the house. Exactly. When we move to that ecosystem, we also lose a hell of a lot of control over the chain, which is good, which is where we want to be, right? The whole concept of decentralized technologies is that it’s not just one person that can control it. And so that’s our direction. And the step is, like I said, our child leaving the house. We’re relinquishing control for the betterment of the technology itself. You know, things I’m really excited about are the potential for us to build the rails into the CLI, where we can do cross-chain transfers and things like that. I mean, wouldn’t it be amazing if you could just do btcli transfer, btcli swap, and move your TAO in and out of the relay chain into different currencies, then potentially just transfer those other currencies to whatever cloud hosting provider you’re using and pay for your compute? That really smooths that market. The smoother we can make it, the fewer edges there are in the relationship between the demand for TAO and the price of computing. That’s going to really drive the size of the network and make sure that it is expanding to the scale that we want, so that we can reach up into the realms of hyper-hyper computation. Right, right.
Const, you touch on a personal worry of mine with Bittensor, so I’m going to dig in there with just one question. Right now, TAO is not out there on exchanges or CoinMarketCap and so on. But I imagine at some point it will, at some level, be at the whims of the crypto market, let’s say. And I worry that at some point down the line mining could become a net expense. Maybe for a while some miners will be able to sustain that, but should that go on for a long period of time, it would cause there to be many, many fewer GPUs in the network than there are. It wouldn’t be an incentive problem as such, but, you know, how much of a worry is this?
Jacob Steeves (15:15):
I don’t see it as too much of a worry. I mean, obviously we want to focus on making the project successful and valuable, but the bulls and bears are inevitable in any market and they’re healthy.
The bulls are great because people are, you know, partying and we’re expanding resources and we’re growing our teams. The bears are where we trim back. It’s where we learn how to minimize our costs and do things more efficiently. If we didn’t have the bear cycle in Bittensor, I don’t think we could make claims like Bittensor is going to be environmentally friendly or computationally efficient. It is the decreases in prices that force people to really think about how they can do things cheaper. And that’s the power of markets that we’re playing with here. If we didn’t take advantage of that, if we were scared of the eventual crash, then the technology would not be performing or behaving the way that we wanted it to. The same thing happens with Bitcoin, right? Bitcoin has all these crashes, and it forces the miners to seek out cheaper forms of electricity and different types of compute. FPGAs, ASICs, all these different types of technologies come about because there’s a bear market. And, you know, that’s what’s happening in the global cryptocurrency market right now. People are freaking out, but I think we should really hold strong during these periods because, as everyone knows who’s been in crypto for a long time, this is when a lot of things get built. This is when technologies get honed. And then the next batch of ecosystems and technologies and chains comes out of these bear markets into the bull market.
Absolutely. Absolutely. Agreed on everything, on all the sentiment you shared about the bear market. Actually, I think it’s an amazing time for anyone to be building these days. I would say maybe, you know, to go through a build process or even a release or anything of that sort is actually maybe even better to do that in the bear market rather than the bull market so that you won’t have to, you know, suffer that kind of a hit to whatever it is you’re doing coming from the outer world just shortly after you have, you know, gotten up. So that’s awesome. Yeah. A couple of things coming up later. Bittensor is getting a rebranding by Saffron. So that’s exciting. That’s coming up mid-December, right? That’s right.
Jacob Steeves (17:47):
We’re just going through the patenting process with them to make sure that we have control of the branding that they put out. We’re really excited about it. It looks really beautiful. We put a lot of work into the design and obviously we’re really looking forward to seeing what the community thinks about that. Just to bring, you know, the style up to a higher level. You know, the original website, like, I made that website and I’m not a web developer. It’s really not that nice. It’s quite simple. It’s simple. Yeah, I mean, that was the idea behind it. Maybe from necessity more than style. But, you know, I think these are the type of things that we want to push in the new year, to do a bit more smoothing of the brand so that people actually trust us. I think it’s important. Absolutely. Yeah.
Once you learn about Bittensor, of course, I love the website just because it’s Bittensor’s website. It doesn’t matter to me what it looks like, but I think the rebrand is going to be amazing for people having a first glance at the project. There are a couple of other things that I actually have no idea about, which I was going to ask you, that you guys shared with us in TGIFT. For the audience, those are the sessions open to the public via Discord that the Bittensor team holds with, well, really anyone, but it turns out to be people involved in the project, every other Thursday. So that takes place at the same time as this space, but it would be on Bittensor’s Discord, and you can get a link to join that Discord on Bittensor.com. That’s the website. It’s in the top right-hand corner at the moment. But for the second quarter of 2023, you guys mentioned there will be a DAO. I’m wondering what the function of this is. I don’t know anything about it at all.
Jacob Steeves (19:45):
We’ve talked about this at a couple of TGIFTs, but I can go over it again. Absolutely. So the idea for the DAO is to decentralize the movement of some of the hyperparameters in the system. That’s the primary reason. We want to make sure that it’s not just the foundation that controls the evolution of the project in the long term. Now, we’re being very careful about this because we know that unless your community is well-trained, unless you have an active process for DAO management and governance, DAOs can go very badly and they actually just get basically taken over by a small group of people with lots of control. But Bittensor has a decentralized mining aspect where anybody can join. It’s very open. It’s fair.
But behind the scenes, the chain has a number of parameters, hyperparameters, which I can talk about on this call because it’s something that we do a lot of behind the scenes. We tune those hyperparameters and we do analysis on them. And they affect everybody who’s mining, a lot of people on this call: how much computation they need to put on their miners, whether or not they’re going to need more VRAM on their GPUs, this type of thing. And we don’t want it to be just our control. So the idea for the DAO is to decentralize the ownership and the voting and the fine-tuning of those parameters. We have a lot to discover in that evolution of the project. So initially, the idea is for the DAO to become a shared validator that people can stake to as a way of funding the project and accruing ownership that will eventually come into play via voting rights in the DAO. So V1 is not going to be that people here can vote on hyperparameter changes. That’s just going to be more of a dialogue with the community in our GitHub to begin with, because it’s going to take time for us to build up that organizational structure and then let the system go. So Q2, Q3 will be a staking contract where people are allowed to put TAO, and they’ll get some proportion of the emission from that without having to run a validator. And this will allow people that are just holding TAO but don’t want to run validators to make some revenue while at the same time funding more development from the machine learning side at the foundation.
All right. I see. I see. Yes. The hyperparameter changes, is this something that happens on a regular basis? And is this sort of the fine-tuning or the adjustment of the network that becomes necessary because it’s an ever-changing, ever-shifting network? Is this something that is ongoing? Maybe tell us a little bit more about that.
Jacob Steeves (22:42):
Sure. We have a whole bunch of parameters on the chain that determine various qualities of the mechanism, from registration to how the validators are generally behaving in the network. One example on the registration side would be the immunity period. There’s a lot of conversation right now in the Discord about how long a miner has before it essentially gets judged by the network and is pushed out of the mining system. So that’s sitting at 3,096 blocks. That’s right. Yeah. So it’s about a few hours. 10.24 hours. It’s not very long.
Now it is.
Jacob Steeves (23:31):
We need to fine-tune that parameter so that miners are fairly being evaluated as they come into the system. So if it’s too short, then miners will be kicked out before the validators can truly do an analysis on them. If it’s too long, then you begin to just kick out peers that were properly situated in the network and are being kicked out unfairly by these new registrants, simply because they have immunity and the old ones don’t. So what we monitor is the churn, so how many new keys are being added to the system. And we also measure how quickly the peers are attaining incentive over time. And that tells us whether or not we want to increase or decrease. And we try to move these as slowly as possible, because we’re manipulating a market that needs to come into equilibrium before we can really understand what the effects are. This is what makes it very difficult. It’s the same thing in general machine learning. In general machine learning, you have hyperparameters of your neural network, the learning rate or the momentum, and you can tweak them over time to affect the learning process of the system. So what’s really interesting, I think, about the job that we have at the foundation is we’re doing meta-meta machine learning. We’re doing machine learning at the level of many decentralized AIs and also a community of people. We’re all in this adaptive system together. It makes it very difficult, but also a fascinating problem to pick these hyperparameters properly. It’s a lot of data analysis. It’s a lot of classic machine learning work, but applied to hyperparameters of an economic system rather than just an AI.
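As a rough illustration of the quantities mentioned in this answer, the sketch below converts an immunity period from blocks to hours and computes a simple churn figure. Everything in it is an assumption made for illustration: the 12-second block time, the function names, and the toy adjustment rule are not taken from the conversation or from Bittensor’s code. Under that assumed block time, 3,096 blocks comes to about 10.3 hours, while the 10.24 hours quoted would correspond to 3,072 blocks.

```python
# Hypothetical sketch: none of these names come from the Bittensor codebase.

BLOCK_TIME_SECONDS = 12.0  # assumed block time; not stated in the conversation


def blocks_to_hours(blocks: int, block_time_seconds: float = BLOCK_TIME_SECONDS) -> float:
    """Convert a number of blocks into hours of wall-clock time."""
    return blocks * block_time_seconds / 3600.0


def churn_rate(new_keys: int, total_slots: int) -> float:
    """Fraction of network slots taken by newly registered keys in a window."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return new_keys / total_slots


def suggest_immunity_change(immunity_blocks: int, median_blocks_to_incentive: float) -> str:
    """Toy rule (an assumption, not the foundation's method).

    If new peers typically need longer than the immunity period to start
    earning incentive, they are being judged too early, so lengthen it.
    If they earn incentive in well under half the period, it could shrink.
    """
    if median_blocks_to_incentive > immunity_blocks:
        return "increase"
    if median_blocks_to_incentive < 0.5 * immunity_blocks:
        return "decrease"
    return "hold"


if __name__ == "__main__":
    print(f"immunity period: {blocks_to_hours(3096):.2f} hours")
    print(f"churn: {churn_rate(new_keys=128, total_slots=4096):.3f}")
    print(suggest_immunity_change(3096, median_blocks_to_incentive=3500))
```

The thresholds here are arbitrary, and the churn figure is only reported, not fed into the rule. The point is just that the two monitored signals can be written down explicitly before any slow, deliberate change to the parameter is made.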
Right. Yeah, I was speaking with my dad about this over dinner the other night, and I was just telling him, you know, it was about this latest adjustment that you guys did. And I was just telling my dad, it must be such a headache at times to deal with this, because it is ongoing changes that are going to be necessary in order to keep things up and functioning as intended, I suppose. Does that get to be a headache? Well, I believe it’s.
Jacob Steeves (25:43):
We’re very aware that it’s a headache for the miners in the system. And so, you know, we’ve really woken up to that fact lately. We made a couple of changes recently which were premature. And we’ve come to really realize that we have to behave a little bit like a central bank behaves with respect to these hyperparameters. In the case of the central bank, obviously, you’re changing things like interest rates. They give plenty of notice beforehand about the change, in which direction things will move, and what the probability of those changes is. And the changes happen at exactly the moment that they’re planned to take place. So it’s a headache for everyone, really, that these things change. But it’s part of the mechanisms that we have to really fine-tune the system.
So as we check, we see, okay, what kind of models are people running? Is there diversity? Are the validators learning properly? Is there churn in the network? As we fine-tune these, we hope to push the system towards a much more performant machine learning system. And eventually, down the road, it’s possible that as we come to understand these hyperparameters more and more, we can make them automatic. We don’t have to optimize them ourselves. And perhaps there’s even an AI problem in there. A meta, meta, meta, meta AI problem. Sometimes it feels like it’s turtles all the way down with adaptive systems. And I think that as a project, we situate ourselves as the highest meta machine learning in the world. We’re training in markets, underneath which machines are then adapting. And I think no one else is doing that. So, you know, we want to understand this plane before we jump up to a more abstract one.
Sure, sure. Yeah, absolutely. But for the record, when I was talking about the headache, I meant for you guys. I mean, sure, changes manifest on the mining side, and then the team becomes aware of it. But I was mainly talking about the headache of being in the position of the people who have to make that adjustment, and also the difficulty that it needs to be just right, not too much and not too little. So I was mainly saying that you guys are doing such a complicated job there.
Jacob Steeves (28:18):
On that, we try to focus on core metrics that the system needs to succeed. And that maybe puts us at odds with some of the people that are mining, because miners might want the system to just become easier, but we want it to become more efficient. The highest level of the foundation, the team that I think directs everything below it, is our Cortex team, which is responsible for doing machine learning work on top of Bittensor, right? The system really requires clients, and those clients will only be impressed if they can do standard machine learning, if they can actually pull something out of the system. So we see what problems they have, and that advises the Synapse team, our team that’s responsible for fine-tuning these parameters. And it’s all with the intention of increasing the performance of these clients. We want to push models that are state-of-the-art. So some of the people on this call may be interested to hear that it’s not so much about, can we make it easy for the miners? Sorry to say, it’s more about, can we make sure that the system is actually pushing the state of the art in terms of artificial intelligence and machine learning work? And that means, yes, we need larger sequence lengths. It means, yes, we need to push people to have GPUs. It means that people need to increase their bandwidth. And that all comes in relationship to us allowing for our miners to have liquidity in Polkadot, right? If people can sell their TAO, and there’s liquidity to upgrade their infrastructure, then we can reasonably expect people to improve their mining rigs underneath the mechanism.
Absolutely, yes, yes. I noticed, having barely scratched the surface of mining, one month in, that there are shortages of GPUs and all sorts of things in the realm of GPUs. Whether it’s cloud or bare metal, it’s just hard to even get devices sometimes these days. And on the mining side, yeah, it’s interesting, because the very thing that a miner like myself might complain about is actually the very thing that’s going to drive Bittensor to get more intelligent faster than it would otherwise. So Bittensor mining is definitely not about sitting back and relaxing, or convenience. But I think hopefully we all understand that, because otherwise it just wouldn’t go anywhere. You know,
Jacob Steeves (31:07):
we’ll be praying for bear markets at certain points, so the difficulty drops. The network problem is continually getting more and more difficult by design, and that, you know, really weeds out, you know, hyper competitive people. I might also add that you said this thing about GPUs not being available. I think that it’s one of our features at Bittensor that the miners can come up with creative ways of answering these queries without having GPUs. If you can provide useful information without having a GPU, that’s fantastic. That’s what we want to stimulate, that type of innovation. So the fact that we’re pulling up all these resources, and we’re finding that there are some limitations and they’re expensive, I hope that what that eventually does is drive people to be creative in the way that they solve the generation problem that the validators are asking peers to do.
Absolutely, absolutely. Kind of along the same lines as bear markets and lowering costs: Const, have you ever thought about an ASIC for Bittensor? Is it even a feasible idea, or am I just completely out in the woods on this one?
Jacob Steeves (32:33):
No, you’re definitely in the woods there. The bTPU, we’ve thought about it before, working with some hardware developers. It’s funny you asked that question because it was one of the things that drove me towards Bittensor initially. I think maybe you know this, but I was building neuromorphic chips, like specific chips, in a past life.
And there was no market for those chips. Because if you looked at these large AI companies, they just built their own in-house. There was no place for a company that could build an ASIC chip to plug themselves directly into a market and be paid for finding some efficiency in the way that they could solve the problems. The Bittensor chip, whatever it is, will have to be uniquely designed to work with the internet. I think that that will make it distinct from, say, NVIDIA. And there’s a market arbitrage for people to figure out. I absolutely do think that people will come at this problem from the hardware direction at some point, probably by using standard ASICs. For Bittensor, I think that the lower-hanging fruit comes purely from the machine learning side. Like, no one yet is using gradients on the system, and those gradients could be very easily used to fine-tune your model inside of the mixture model. No one’s doing that yet. And there are also so many optimizations coming out every day in terms of figuring out how to run larger models on smaller RAM, or hosting them on CPUs, or branching the network across multiple devices so that they can service the requests using computers with less power. I think that these are the low-hanging fruit right now. It’s more of a software layer before we get down to the hardware.
Thank you for saying that. Of course, anybody can be wrong. But, look, I was just saying there would need to be many more miners than there are. But that once there is more miners, that it would actually be a very lucrative idea for someone to develop hardware for B-Tensors specifically, because if they do a decent job of it, then immediately on, it would pretty much make, it would be significantly, if they did it right, more advantageous compared to any other general-purpose GPU. And so it would be good for them. And then ultimately, in the end, of course, it would be good for the miners. And then ultimately, it would be, best of all, it would be good for B-Tensor for, you know, being on hardware that is more tailored to it, I suppose. So, but maybe not tomorrow, the next day, maybe. I believe that, you know, I think that, you know, I think that, you know, maybe not tomorrow, the next day, maybe. I believe that a technological innovation like that will actually occur
Jacob Steeves (35:42):
by people that are selling to BitTensor miners. That’s what happened in Bitcoin, for instance, with these companies that came up with something like the Antminer. They mined themselves a little bit, but actually they just sold to miners. And this was the most effective way for them to make money because they could sell to all the miners. You know, they could put the work into building these ASIC chips and then they could sell to the entire fleet of Bitcoin miners. I think that that will probably be how it plays out. Instead of a single miner, you know, becoming a hardware company, it’ll be a hardware company that takes on the challenge of building a BitTensor miner and then selling those BitTensor miners to people that are, you know, directly plugging themselves into the BitTensor market.
Right, right, right. Very exciting to be at that point. Actually, for me, it’s just, you know, at any point down the line, just imagining what BitTensor might be or where it would have gone by then, it just is so exciting. There is nothing. I did finally, after so many weeks, do, you know, thorough research of the space. And really what I was looking for is, does another BitTensor or another BitTensor-like network exist? And as far as I found, there is not another decentralized, incentivized neural net. Do you know of any, Konst?
Jacob Steeves (37:14):
No, but I know a lot of companies that are attempting to build APIs for machine intelligence. OpenAI, and Cohere is doing the same thing. Even HuggingFace is attempting to jump into this market. The difference is that we’re doing it in a decentralized way, in an incentivized way.
So, like, those projects, I think, are the most connected or most, like, similar to us. On the decentralized side, I think the most interesting project is Gensyn, but what they’re doing is federated learning. So, they’re training a single model and people are contributing compute to that single model so that the weights are shared within their ecosystem. I think that’s also a very interesting project. And it’s even something that we could interface with at BitTensor at some point.
Awesome. Awesome. All right. So, one other thing, you mentioned subnetworks. What are they?
Jacob Steeves (38:12):
Subnetworks are our first foray into running more than one market and having different types of validators that focus on different types of problems. You know, the long-term goal for BitTensor is, you know, one embedding to rule them all, a multimodal understanding. So, it doesn’t matter what you put in, you get a representation out, which can be useful for solving any problem that that input would depend on. So, that’s our long-term goal. But in the meantime, we want to focus on subdomains so that we can understand them better. So, one would be, for instance, image. I think this is one that’s, you know, at the top of everyone’s, you know, mind because of the Stability release. This is something we want to push out in the next year, to allow people to do image tasks on BitTensor, so we can do generations, things like this inside the Discord. It’ll be really exciting, and people, I think, will love to be able to see what we can create. You know, people believe what they can see. And so, subnetworks will allow us to partition the incentive mechanism so that we can have a subsystem, with a proportion of the inflation going to that subsystem, that has its own validators and is incentivizing the creation of a new commodity, so maybe intelligence around images, for instance.
Later on, because of the way that we’ve built this new chain, we’ll be able to connect those subnetworks. So, we could even bridge the incentive mechanism between two networks. You could register in only one network and then immediately, you know, be registered in the higher network as well. You could even build nested incentive mechanisms, and we believe we have a lot of scale, that, you know, we can scale horizontally like that, because right now our biggest scaling limitation is the 4096 space, because with a certain number of validators, there’s effectively an N-squared operation in the consensus mechanism, which makes that difficult to run on the chain. But we can multiply the number of networks, so we can go 4096 by, you know, 100 or 1,000 to get a whole set of networks where thousands and thousands and thousands of miners can come into the BitTensor ecosystem, you know, and the test network itself can be incentivized in tau, even though, you know, maybe not that much. We can really play with expanding the amount of compute that there is in the system. That’s another reason why we’re pushing towards subnetworks.
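The scaling argument here can be made concrete with a little arithmetic. The sketch below is only an illustration under a stated assumption: that consensus cost grows as the square of the number of slots in a network, as the "N-squared operation" remark suggests. The function names are hypothetical and nothing here is taken from the BitTensor codebase.

```python
# Toy comparison: one flat network versus many 4096-slot subnetworks.
# Assumption (not from the source code): consensus cost per network ~ n * n.

SLOTS_PER_NETWORK = 4096  # the "4096 space" mentioned in the conversation


def pairwise_cost(n: int) -> int:
    """Hypothetical cost model: one unit of work per ordered pair of slots."""
    return n * n


def flat_vs_sharded(num_subnets: int, slots: int = SLOTS_PER_NETWORK):
    """Return (total_slots, cost_if_flat, cost_if_split_into_subnets)."""
    total_slots = num_subnets * slots
    flat = pairwise_cost(total_slots)              # everyone in one network
    sharded = num_subnets * pairwise_cost(slots)   # independent subnetworks
    return total_slots, flat, sharded


for k in (1, 100, 1000):
    total, flat, sharded = flat_vs_sharded(k)
    print(f"{k:>5} subnets: {total:>9} slots, "
          f"flat cost {flat:.3e}, sharded cost {sharded:.3e}")
```

Under this cost model, splitting the same number of slots into k subnetworks reduces the total pairwise work by a factor of k, which is one way to read the "scale horizontally" point.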
Yes, yes. You know, the one thing that maybe I just hadn’t thought about so much before BitTensor in life was how this incentivization is such a key part of all this, that I, for example, would not be contributing compute today to this if it wasn’t for it, and that, you know, it’s such a genius part of this plan, and the part that makes me look back at Bitcoin and feel like, oh my god, we’ve already seen this before, that that part is not being tested here, that it is established that when people have incentive to contribute compute to a system, that it works, as opposed to not having it. So, let’s see, the subnetworks we talked about.
What else? Oh yeah, there was a research paper, and this was a significant thing that you guys just recently released. It was a tour conference. I’m wondering if you can kind of dumb it down a little bit for myself, if not anybody else here, to kind of tell us a little bit about what significant findings there are, or what you guys were able to share with the academic world, which is where a lot of prior development in AI may have happened. Can you tell us a little bit about the paper and what was significant about it, maybe? Sure, so the original paper that we built
Jacob Steeves (42:16):
everything off of, from Yuma Rao, we took a lot of that paper and we re-explained it in our new paper so that people can maybe understand what we’re doing from a different angle. It’s written by some of our researchers at BitTensor that have a different way of speaking, so maybe it’s easier to understand. So that’s one aspect. But the work that we’ve done over the last year was really about fine-tuning the mechanism for evaluating the informational significance of the peers. So in the paper, if you look at the way that the peers are evaluated, what we do is we use a Fisher information scoring. So a Fisher information scoring is where you learn who’s valuable on the validators, and then you look at the entropic difference in terms of the probability distribution of your validator, because a machine learning model that’s learned is a probability distribution. You look at how different it is from another probability distribution when a particular miner is removed. So effectively, if you remove this peer, how much information did you lose when you performed that removal? And that comes out as a Fisher information score. Now, it turns out that that scoring is really, really contingent on the specific way in which the validator is built. And at times, it leads to a sort of unfairness in the way the weights are set on the chain, because it’s very much dependent on what’s called the gating layer in the validator. The gating layer is the one that selects the peers. And so if, just by happenstance, at random, a peer is selected to be the first that is gated in that gating layer, it will get a huge Fisher information score, even though it’s not actually a very accurate scoring. Another reason was that that approximation of information theoretics is based on the assumption that the system has actually converged to what’s called a local minimum. It means that the network has converged. And that is not necessarily true for many steps in the validator.
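For readers who want the convergence point in symbols, here is a hedged sketch of the standard second-order argument. It is not taken from the paper itself, and the symbols are assumptions introduced for illustration: $\mathcal{L}$ is the validator's loss, $\theta$ its parameters, and $\Delta\theta_i$ the parameter change that corresponds to removing peer $i$.

```latex
\Delta \mathcal{L}_i
  = \mathcal{L}(\theta + \Delta\theta_i) - \mathcal{L}(\theta)
  \approx \nabla \mathcal{L}(\theta)^{\top} \Delta\theta_i
  + \tfrac{1}{2}\, \Delta\theta_i^{\top} H \, \Delta\theta_i ,
\qquad H = \nabla^{2} \mathcal{L}(\theta).
```

At a local minimum the gradient term vanishes, leaving only the second-derivative term, and for a log-likelihood loss the expected value of $H$ is the Fisher information matrix. If the validator has not converged, the first-order term is not zero, so a score built only on the second-order term can misjudge a peer.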
And so we saw that the weights were not highly accurate. We couldn’t make that assumption of the convergence. So what we ended up doing is looking into some statistical techniques that come from the world of finance called Shapley value scoring, where you do that estimation of value, but you do it numerically.
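The numerical idea in this part of the conversation, removing peers and measuring how the loss changes, can be sketched in a few lines. This is a toy illustration, not BitTensor's validator code: `toy_loss`, the peer names, and the gain numbers are all hypothetical, and the exact Shapley computation shown only works for a handful of peers because it enumerates every subset.

```python
from itertools import combinations
from math import factorial


def toy_loss(peers: frozenset) -> float:
    """Hypothetical validator loss when only `peers` contribute (lower is better)."""
    gain = {"a": 3.0, "b": 2.0, "c": 1.0}   # made-up individual contributions
    loss = 10.0 - sum(gain[p] for p in peers)
    if {"a", "b"} <= peers:
        loss -= 1.0                          # made-up synergy between a and b
    return loss


def leave_one_out(peers, loss_fn):
    """Score each peer by how much the loss rises when that peer alone is removed."""
    everyone = frozenset(peers)
    full = loss_fn(everyone)
    return {p: loss_fn(everyone - {p}) - full for p in peers}


def shapley_values(peers, loss_fn):
    """Exact Shapley values: average marginal loss reduction over all subsets."""
    peers = list(peers)
    n = len(peers)
    values = {}
    for p in peers:
        others = [q for q in peers if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (loss_fn(s) - loss_fn(s | {p}))
        values[p] = total
    return values


peers = ["a", "b", "c"]
print("leave-one-out:", leave_one_out(peers, toy_loss))
print("shapley:      ", shapley_values(peers, toy_loss))
```

In this toy setup the leave-one-out scores credit the shared synergy to both `a` and `b` in full, while the Shapley values split it between them, which is one way to see the difference between an individual score and a group-aware score. The subset enumeration is exponential in the number of peers, so a real system would need sampling or some other approximation.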
So instead of doing it mathematically, calculating the second derivative of the validator, which is required for the Shapley scoring, sorry, which is required for the Fisher information scoring, you just numerically do it. So you literally just remove a peer numerically to see how much the loss changes. And so that was this improvement in the mechanism of determining a peer’s weight. It also allowed us to very easily break up the problem of ranking so that we could say, okay, cool, what’s the individual score of a peer? What’s his collective score if you group him in pairs of two, in pairs of three, in pairs of four, in pairs of n going up? Obviously, that becomes exponential in the number of groups you need to look at when you do that. But we can actually very specifically fine-tune between what would be called like a synergy score and an individual score and do that very accurately on the validator. So that was the main result from the paper. There was also, if you have any questions about that, just cut in. There were also some results there about how we’re doing adversarial resistance for peers that are effectively cheating with the data set. So we figured out how to solve that from the validator side. A lot of this work came from TACO. So a huge shout out to TACO and also Eugene and Isabella, who were our main machine learning engineers that worked on that project and will be there at NeurIPS to present the paper. So it was basically, am I correct in saying that it was a sharing of
the innovation slash kind of inventions that you guys had to do because no one’s ever done this particular application before? Oh yeah, we’re totally out in left field. We’re out in the ocean
Jacob Steeves (46:46):
completely by ourselves at this point. The BitTensor paper itself was quite unique. And now this is like a fine-tuning of this new field of machine learning that we’re creating really, which is very exciting. And we hope that there’ll be other people in the community, in the academic community, that will be interested in this very specific problem that we have in front of us. It’s adversarial machine intelligence in a way that other people are not attempting to solve. If you look at the literature, if you go to NeurIPS, for instance, right now, you’ll see that it’s all around gradient poisoning for federated networks, but we’re not federated. We’re data parallel. So it’s a different type of distributed machine learning. And not many people have actually looked at this problem because the entire academic community essentially has had horse blinders on, focusing on the federated aspect, the federated problem in distributed computing. So we’re going to be bringing that to NeurIPS and we’re really excited to see what people have to say. Maybe they’re going to laugh at us. We’ve got a lot of laughs over our years, so we’ll be expecting that, but I do hope that we get interest from the
right people. Absolutely, absolutely. Konst, I’m wondering, so this might be a bit of a vague question because it’s not like there is one kind of AI application or one way to apply AI, but today the client side is missing from BitTensor and it’s totally understandable why. I’m wondering how long do you think BitTensor would need to continue to train and continue to grow before, in at least some of its applicability, it would be competitive enough that someone aware that it exists would consider using it for a particular application where there is natural language processing or, at the time, some kind of image diffusion application, that sort of thing. Is that many years away? Is it a couple of years away? Can you speak to this?
Jacob Steeves (48:47):
I don’t think that it’s years and years away before it has value, especially to us, because we’re holding the foundation keys that have a fair amount of tau. We have access to this supercomputer effectively and we’re testing its behavior. I have to be honest, we are doing effectively machine learning research on top of the ecosystem right now. We have a whole team that does this and that’s what we work on. We go, okay, cool, let’s test this. Now, there’s going to be a lot of unknown unknowns. There’s going to be a lot of things that we have to fine-tune and that all has to play in between the teams. How do we change the hyperparameters and how does that affect the clients? The specific architecture that we use, we need to figure out what type of architecture is best suited for BitTensor. I think that in a year from now, if you come back, we’ll have some amazing results. I think that in terms of just compute, there should be people that are interested in playing with BitTensor. Whether or not they can really leverage the system, because they don’t have too much tau, that’s kind of up in the air. But if they were sitting where I’m sitting with my keys, I think there are so many researchers who would be fascinated to be taking this problem on, given just the expense. It really has come down to this: it requires an AI lab out of a billion-dollar corporation or $250 million worth of funding before you can start to do state of the art. There’s a plethora of people around the globe that don’t have access to that. We think that those people will be interested even if there are a lot of questions about whether or not they can hit SOTA
today. Right, right. What BitTensor has done, BitTensor just celebrated its first birthday on November 2nd, or on November 3rd, but the birthday is November 2nd. Where it has come in one year would be, to me at least, mind-boggling to anyone who is remotely aware of how long anything has taken to get to where it is, whether it’s the GPT series or Stable Diffusion or any of these things. It becomes very obvious that BitTensor is on a different curve, a more accelerated curve when it comes to, I suppose, development of intelligence generally. So, very exciting.
Jacob Steeves (51:20):
Yeah, it’s very exciting to be at the intersection of these two fields, especially right now where supposedly one is out of favor and one’s on the way up. And we’re like, well, we’ve got feet in both buckets here. Right, right. Definitely, definitely, yeah.
The DAO, gosh, now that I have more of a sense of what it is, that sounds so exciting. On the one hand, it kind of sounds inevitable in terms of handing over, at some level, governance eventually. But, on the other hand, I kind of just want the BitTensor Foundation to remain in absolute and complete control of BitTensor just because it is so young, you know.
Yeah. And the idea of anyone else touching it, I don’t know about others, but I personally don’t like that at all right now. But it’s probably maybe because it’s so young still. I want it to be protected, you know.
Jacob Steeves (52:18):
Well, I think this comes, and I agree with you. I don’t think that BitTensor’s killer app is decentralized control today. Right now, like lowest hanging fruit is the power of incentives to drive this supercomputer. In the case of Bitcoin, reaching that form of decentralization is the most important thing because Bitcoin needs to be a censorship-resistant currency. And so, it was immediately required that that system fell out of the hands of its early control, the early group of people that controlled it. Even in the case of Bitcoin, it took more than a year before Satoshi even left the project. And at that point, he took a lot of responsibility for it. He took down the network at one point in the first year. And I think that we’re still years away from really totally losing control. But it’s required. We have to be pointing in the right direction here.
And one of the most amazing things about this technology is that it can be community-run. And it can be done in such a way that we all have a stake in and control of this hyperintelligence network to make sure that it’s aligned with not just a small group of shareholders that are interconnected with a large elite set of players in the world, that it’s really bottom-up. Absolutely. And if the system remains just Ala and I and other people in the foundation who have full sudo control over the chain, that would be a failure of the mission here of building truly decentralized AI. Our plan for the DAO is that we get as many people into it as possible so that we can effectively have as democratic as possible a relationship between what the DAO is and BitTensor.
Yes, yes. I understand where it’s going and what the overall mission is. If I’ve been here one month and I feel about BitTensor the way that typically a person would feel about a loved one in terms of worrying about it in different ways that are just ridiculous about a thing, then I can only imagine what that must be like for yourself or for Shib or for many of the people in this room that are day one miners and they have been there at least 12 times as much time as I’ve been there. So I constantly just want BitTensor to be protected. One last question. We have a few minutes here and I asked this question, I asked Shib this same question and then I’ll just ask it as my closing question from you. What do you see are going to be the most difficult challenges ahead?
Jacob Steeves (55:26):
The most difficult challenge for us is getting the client side of BitTensor, so specifically this comes from the foundation right now, to leverage the compute and then make sure that the synapse team, the people that are fine tuning these hyperparameters are in communication so that the mechanism is in play. We’re doing this meta machine learning and in a way we’ve made our lives a little bit harder because there’s a disconnect between what the community is doing. In a normal machine learning ecosystem we have full control, we can see everything. There’s a black box aspect to BitTensor and so there’s an adversarial aspect to BitTensor that’s unique in machine learning. So taking what we learn from the client side, from when we’re training models on BitTensor and then injecting that into the proper hyperparameters for the system so that the network and the community can adapt to it. I think that’s really the challenge, that’s the only problem that we have.
I would say no it’s not true but it’s the core problem of BitTensor is how do we connect this world of incentive with an adversarial black box and then do machine learning into it. It’s the most exciting problem that we have in front of us too, it’s the thing that we want to write papers about every day, it’s the thing we talk about at length every morning on our morning calls. I think that that exploration into the unknown is what makes BitTensor really exciting for us, it makes it exciting for everyone in the community, but it also happens to be the most challenging task in front of us.
Absolutely. Gosh, everything about BitTensor is just so groundbreaking. I feel like I’m in the midst of a revolution really and I think I would say that’s very much justified there. Especially just as a Web 3.0, AI-interested person, because I did this for two or three months, I intentionally spent time to look at what’s out there and certainly exciting things are happening in the AI world, but definitely by far not one that a person like me could be involved in, in the way that BitTensor makes it possible. I mean, that opportunity is not there anywhere else in the world, and to contribute to a large neural net and what it makes and where it goes, that’s supremely exciting. Konst, thank you so much for giving us your hour and telling us more about BitTensor. Is there anything else you want to share with us that I have not touched on today?
Jacob Steeves (58:09):
No, I think you covered it, but to speak to what you just said there about how this open system allows anybody to come in, that allows us to select for people that really care about AI and the ethics of it. It allows us to select for people that are innovators and gamers, people that know how to beat a system. That’s not something that is necessarily selected for by a classic academic trajectory for machine learning engineers. You said you’re happy to be here and we’re happy that you are here because it’s people like yourself that are just genuinely interested, intrigued, excited, and obviously incentivized to be here. You’re the ones that are going to make this project a success. That’s the killer app we’re using here with this open incentivized network. Thank you, Dr. M. I really appreciate it. I’ll be here in two weeks, probably.
That’s right. In a couple of weeks, if somebody wants to have a point of contact with the team and the project, the next good opportunity where you would get a sense of some things, at least within an hour, would be next week, same time, in BitTensor’s Discord for the session. Actually, it’s awesome. There’s a question and answer session where pretty much anyone can ask any question and the team will answer this. I don’t see anyone doing this anywhere else in the world, so that’s where you can catch up with BitTensor. Konst, thank you so much for joining us and thank you everyone who’s here. We’ll see you next, or two weeks from now, Thursday, same time, same place, and hopefully more exciting things to share.
SUBSCRIBE THE BITTENSOR HUB FOR EVERYTHING BITTENSOR!
This Clip was recorded in Dr.M’s Twitter Space Chat on Nov 10, 2022
Host: Dr.M Co-Host: Jacob Steeves (Const)
Join Dr.M for deep dives into AI & Bittensor: Every other Thursday | 5 - 6 pm EST
Bittensor is an open-source protocol that powers a scalable, globally-distributed, decentralized neural network. The system is designed to incentivize the production of artificial intelligence by training models within a distributed infrastructure and rewarding insight gained through data with a custom digital currency.
Discord: https://discord.gg/Qv3fxVaXyE
Website: https://docs.bittensor.com/
Whitepaper: https://drive.google.com/file/d/1VnsobL6lIAAqcA1_Tbm8AYIQscfJV4KU/view
Network Map: https://bittensor-explorer-staging.netlify.app/