Stupid question probably - is computing power what is holding back general AI? I’ve not heard that.
What’s holding back AGI is a complete lack of progress toward anything like intelligence. What we have now isn’t intelligent, it’s multi-variable probability.
It’s not that it’s not intelligent, it’s that predictive language models are obviously just one piece of the puzzle, and we’re going to need all the pieces to get to AGI. It’s looking incredibly doable if we figured out how to make something that’s dumb but sounds smarter than most of us already. We just need to connect it to other models that handle other things better.
You don’t speak predictively. It’s not one of the pieces, it’s a parlor trick.
Oh god yes. This is going to be pretty simplified, but: the sheer compute required to run something like ChatGPT is mind-boggling. We’re talking thousands of A100 GPUs (roughly $10k apiece, each with 80GB of VRAM) networked together, and probably petabytes of SSD storage for the DB. Most neural networks require a bunch of GPUs working in parallel because they need a lot of very fast memory to hold all the data they sift through, and a lot of parallel compute to sift through that data as quickly as possible. That’s why GPUs are good for this - you can think of a CPU like a human, very versatile but there’s only so much one person can do at a time. Meanwhile GPUs are like bug swarms, a bunch of much simpler brains, but specialized, and they “make it up on volume”. It’s only because of advances in computing power, specifically in the number of compute cores and the amount of VRAM on GPUs, that the current level of AI became possible. Try downloading GPT4All and compare free models that run on your machine to ChatGPT - you’ll certainly see the speed difference, and if you ask the free ones for code or logic you’ll see the quality difference too.
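To put a rough number on the “lots of very fast memory” point, here is a minimal sketch that estimates how many 80GB GPUs it takes just to hold a model’s weights. The 175-billion-parameter model and the 16-bit weights are assumptions for illustration (ChatGPT’s actual size is not stated anywhere in this thread), and `gpus_needed` is a made-up helper name.

```python
import math

def gpus_needed(n_params: float, bytes_per_param: int = 2, vram_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined VRAM can hold the weights alone.

    Assumptions (hypothetical): weights are stored at `bytes_per_param` bytes each
    (2 bytes = 16-bit floats) and every GPU offers `vram_gb` gigabytes of memory.
    """
    weight_bytes = n_params * bytes_per_param
    vram_bytes = vram_gb * 1e9
    return math.ceil(weight_bytes / vram_bytes)

# Hypothetical 175-billion-parameter model with 16-bit weights
print(gpus_needed(175e9))
```

This only counts the weights for a single copy of the model. Working memory during inference and serving many users at once come on top of that, which is how you end up at thousands of GPUs rather than a handful.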
This is all to say that superconducting traces and transistors mean no heat is generated by dumping power through them, so you can fit them closer together - even right next to and on top of each other, it doesn’t matter, because they don’t need to be cooled. And because you lose no power to heat, it can all go to compute instead, so it’ll be perfectly efficient. It’ll bring down the cost of everything, but especially computer components, and thus OpenAI will be able to bring more servers online to improve the capabilities of their models.
There is still heat generated by the act of computation itself, unless you use something like reversible computing but I don’t believe there’s any current way to do that.
And even then, superconducting semiconductors are still going to be some ways off. We could have superconductors in power transmission for the next decade and still have virtually no changes to processors. I don’t doubt that we will eventually do something close to what you describe, but I’d say it’s easily a long way off still. We’ll probably only be seeing cheaper versions of things that already use superconductors, like MRI machines.
Edit: my first draft was harsher than it needed to be, sorry, long day.
First of all, nobody’s saying this is going to happen overnight. Secondly, traditional computing systems generate heat due to electrical resistance and inefficiencies in semiconducting transistors; the process of computation does not inherently require the generation of heat, nor cause it through some other means than electrical resistance. It’s not magic.
Superconduction and semiconduction are mutually exclusive - it’s in the name. A semiconductor has resistance properties midway between a conductor and an insulator. A superconductor exhibits no electrical resistance at all. A material can be a superconductor in one “direction” and a semiconductor in another, or a semiconductor can be “warped” into being a superconductor, but you can’t have electrons flowing in the same direction with some resistance and no resistance at the same time. There’s either resistance, or there’s not.
Finally, there is absolutely no reason that a transistor has to be made of a semiconducting material. They can be made of superconducting materials, and if they are then there’s no reason they’d generate heat beyond manufacturing defects.
Yes, I’m talking about a perfectly superconducting system and I’m not allowing for inefficiencies where components interface or component imperfections resulting in some small amount of resistance that generates heat; that would be a manufacturing defect and isn’t relevant. And of course this is all theoretical right now anyway; we don’t even know for sure if this is actually a breakthrough yet (even if it’s really beginning to look like it). We need to better understand the material and what applications it’s suited to before we can make concrete predictions on what impacts it will have. But everything I suggest is grounded in the way computer hardware actually works.
I appreciate you revising your reply to be less harsh. I wasn’t aiming to correct you on anything, I was just offering some thoughts; I find this stuff interesting and like to chat about it. I’m sorry if I made your day worse, I hope things improve.
I said superconducting semiconductors as just a hand-wavy way to refer to logic gates/transistors in general. I’m aware that those terms are mutually exclusive, but that’s on me, I should have put it in quotes to mark it as a loose analogy or something.
The only thing I disagree with is your assessment that computation doesn’t create heat; it does, albeit an entirely negligible amount. Traditional computation involves deleting information, which necessarily causes an increase in entropy, so heat is created. It’s called Landauer’s principle. It’s an extremely small proportion compared to resistive loss and the like, but it’s there nonetheless. You could pretty much deal with it by just absorbing the heat into a housing or something. We can, of course, design architectures that don’t delete information, but I’m reasonably confident we don’t have anything ready to go.
All I really meant to say is that while we can theoretically create superconducting classical computers, a room temperature superconductor would mostly still be used to replace current superconductors, removing the need for liquid helium or nitrogen cooling. Computing will take a long time to sort out, there’s a fair bit of ground to make up yet.
Okay, you’re kind of reaching with that one 😋 I didn’t mention Landauer’s Principle because it’s so negligible as to be irrelevant (seriously, the minimum heat generated by erasing a bit is about 0.018 eV at room temperature, hundreds of times smaller than the 13.6 eV that binds the single electron in a hydrogen atom), and superconductors will reduce even that. I kind of wish we had another word, for when “negligible” doesn’t do the insignificance justice.
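For anyone who wants to check the ~0.018 eV figure, here is a minimal sketch of the Landauer limit, E = k·T·ln(2) per erased bit. The function names are made up for illustration, and 300 K is an assumed value for “room temperature”.

```python
import math

K_B = 1.380649e-23               # Boltzmann constant in J/K (exact SI value)
JOULES_PER_EV = 1.602176634e-19  # joules per electronvolt (exact SI value)

def landauer_limit_joules(temp_kelvin: float) -> float:
    """Minimum heat released by erasing one bit at the given temperature."""
    return K_B * temp_kelvin * math.log(2)

def landauer_limit_ev(temp_kelvin: float) -> float:
    """Same limit, expressed in electronvolts."""
    return landauer_limit_joules(temp_kelvin) / JOULES_PER_EV

# Assumed room temperature of 300 K
print(f"{landauer_limit_ev(300.0):.4f} eV per erased bit")
```

Note that the limit is proportional to temperature, so it is the operating temperature that sets this floor.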
I do appreciate the clarification on the point of superconducting semiconductors - and the concern for my day haha! It really wasn’t anything to do with you, hence the edit. And, your point here is absolutely correct - LK-99 isn’t some magical material that can be all things to all people. Its other properties may make it unsuitable for use with existing hardware manufacturing techniques or in existing designs, and we may not find superconductors that can fill every role that semiconductors currently occupy.
Edit: lol, looks like its “other properties” include not being a fucking superconductor. Savage.
I think “rounding error” is probably the closest term I can think of. A quick back-of-the-envelope estimate says erasing 1 byte per cycle at 1 GHz will warm an average silicon wafer by 1 K in ~10 years. That’s hilariously lower than I’m used to these things turning out to be, but I’m normally doing relativistic stuff so it’s not really fair to assume they’ll be even remotely similar.
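A back-of-the-envelope version of that estimate can be sketched like this. Everything here is an assumption for illustration: one byte (8 bits) erased per cycle at 1 GHz, 300 K, a specific heat for silicon of about 0.7 J/(g·K), no heat escaping at all, and two guessed masses (about 125 g for a whole 300 mm wafer, about 10 mg for a small sliver of silicon).

```python
import math

K_B = 1.380649e-23                  # Boltzmann constant in J/K
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def erasure_power_watts(bits_per_second: float, temp_kelvin: float = 300.0) -> float:
    """Landauer-limit heat output for a given bit-erasure rate."""
    return bits_per_second * K_B * temp_kelvin * math.log(2)

def temperature_rise_kelvin(power_watts: float, seconds: float,
                            mass_grams: float, specific_heat: float = 0.7) -> float:
    """Warming of a thermally isolated mass; specific_heat is in J/(g*K)."""
    return power_watts * seconds / (mass_grams * specific_heat)

power = erasure_power_watts(8 * 1e9)   # 8 bits erased per cycle at 1 GHz
ten_years = 10 * SECONDS_PER_YEAR

# Assumed masses: a whole 300 mm wafer vs. a ~10 mg sliver of silicon
for label, grams in [("whole wafer (~125 g)", 125.0), ("small sliver (~10 mg)", 0.0104)]:
    rise = temperature_rise_kelvin(power, ten_years, grams)
    print(f"{label}: {rise:.2e} K after 10 years")
```

The answer is very sensitive to how much silicon you assume is being heated: under these assumptions roughly 10 mg warms by about 1 K in a decade, while a whole wafer warms by far less. Either way it stays firmly in “rounding error” territory.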
Really appreciate the write up! I didn’t know the computing power required!
Another stupid question (if you don’t mind) - adding superconductors to GPUs doesn’t really seem like it would make a huge difference to the heat generation. Sure, some of the heat generated is through trace resistance, but the overwhelming majority is the switching losses of the transistors, which will not be affected by superconductor technology. Are we assuming these superconductors will be able to replace semiconductors too? Where are these CPU/GPU efficiencies coming from?
I sort of covered this in my other reply, but yes, switching losses are also due to electrical resistance in the semiconducting transistor, and yes I’m assuming that semiconductors are replaced with superconductors throughout the system. Electrical resistance is pretty much the only reason any component generates heat, so replacing semiconductors with superconductors to eliminate the resistance will also eliminate heat generation. I’m not sure why you think superconductors can’t be used for transistors though? Resistance isn’t required for semiconductors to work, it’s an unfortunate byproduct of the material physics rather than something we build in, and I’m not aware of any reason a superconductor couldn’t work where a semiconductor does in existing hardware designs.
Then again I’m also not an IC designer or electrical engineer, so there may be specific design circumstances that I’m not aware of where resistance is desired or even required, and in those situations of course you’d still have some waste heat to remove. I’m speaking generally; the majority of applications, GPUs included, will benefit from this technology.
Semiconductors are used for transistors because they give us the ability to electrically control whether they conduct or resist electrical current. I don’t know what mechanism you’d use to do that with superconductors. I agree you don’t ‘have’ to have resistance in order to achieve this functionality, but at this time semiconductors or mechanical relays are the only ways we have to do that. My focus is not in semiconductor / IC design either, so I may be way off base, but I don’t know of a mechanism that would allow superconductors to function as transistors (or “electrically controlled electrical connections”). I really hope I’m wrong!
Simply throwing computing power at the existing models won’t get us general AI. It will let us develop bigger and more complex models, but there’s no guarantee that’ll get us closer to the real thing.