24 Comments
austonia - Wednesday, November 10, 2021 - link
i'mma need a bigger motherboard to stick this in

brucethemoose - Wednesday, November 10, 2021 - link
But can it run AND play Crysis? Between Google's DeepMind and Nvidia's interactive demo, I suspect the answer is "Yes."
DougMcC - Wednesday, November 10, 2021 - link
That's not what this kind of thing is for. It can design and implement Crysis.

brucethemoose - Wednesday, November 10, 2021 - link
Also, Moore's law is not dead. Cost is simply scaling with performance now.

Yojimbo - Wednesday, November 10, 2021 - link
Moore's law will never die as long as we keep redefining what it is. Here is Gordon Moore's paper: https://newsroom.intel.com/wp-content/uploads/site...

Gordon Moore made his prediction in terms of component costs. Reading what he wrote, he seems to use "components" because he is comparing integrated circuits (ICs) against simple circuits built from discrete components; at the time, ICs were new. As I interpret it (I may be wrong, I am not an electrical engineer), we have basically replaced "components" in the law with "transistors" because everything uses ICs today and it's simpler to talk about. But more importantly, he talked about costs. He wrote: "The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph on next page)." I don't see how that can still be true when we are pursuing advanced packaging to lash smaller ICs together. Certainly the current situation doesn't follow his graph, which he seems to interpret as a log-log linear relationship between the number of components per integrated circuit and the cost per component (with an inflection, I believe, because the rate of change was in flux at the time). In other words, you get a straight line by plotting, for each year, a curve of the logarithm of the cost per component against the logarithm of the number of components per IC, and then joining the minimum-cost points of those curves (the graph in his paper makes this easier to understand).
Now the cost per component is not decreasing nearly as fast as it used to, so the number of components per IC is not increasing as fast as it used to. In fact, die sizes that used to land relatively close to the complexity for minimum component cost apparently no longer do, which is why chip companies are splitting designs into chiplets/tiles. I am sure that if we plotted the same type of graph Moore plotted in his seminal paper for the last 10 years, and projected it 10 years into the future, we would see a much different trend in the minimum component costs than the one Moore noted in 1965 that led to the formulation of "Moore's Law".
By the way, this discussion in Moore's paper was under a section titled "Costs and curves" and immediately preceding the quote I included above he wrote:
"Reduced cost is one of the big attractions of integrated electronics, and the cost advantage continues to increase as the technology evolves toward the production of larger and larger circuit functions on a single semiconductor substrate. For simple circuits, the cost per component is nearly inversely proportional to the number of components, the result of the equivalent piece of semiconductor in the equivalent package containing more components. But as components are added, decreased yields more than compensate for the increased complexity, tending to raise the cost per component. Thus there is a minimum cost at any given time in the evolution of the technology. At present, it is reached when 50 components are used per circuit. But the minimum is rising rapidly while the entire cost curve is falling (see graph below). If we look ahead five years, a plot of costs suggests that the minimum cost per component might be expected in circuits with about 1,000 components per circuit (providing such circuit functions can be produced in moderate quantities.) In 1970, the manufacturing cost per component can be expected to be only a tenth of the present cost."
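The tradeoff Moore describes there (spreading a fixed cost over more components pushes the cost per component down, while falling yield pushes it back up, giving a minimum at some complexity) can be sketched with a toy model. The numbers below are invented for illustration only; they are not Moore's data.

```python
import math

def cost_per_component(n, wafer_cost=10.0, defect_penalty=0.01):
    """Toy model of one of Moore's per-year cost curves: spreading a fixed
    cost over n components lowers the per-component cost, but an
    exponentially falling yield raises it again, so a minimum exists."""
    yield_fraction = math.exp(-defect_penalty * n)
    return (wafer_cost / n) / yield_fraction

# Sweep complexities and find the bottom of the curve, i.e. the
# "complexity for minimum component costs" for this made-up year.
best_n = min(range(10, 500, 10), key=cost_per_component)
print(best_n)  # → 100
```

In this toy model the minimum sits at n = 1/defect_penalty; improving the process (lowering the defect penalty) moves the whole curve down and its minimum to the right, which is the year-over-year motion Moore's graph traces.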
So he was clearly mostly concerned with cost. From what I remember, this idea of cost was still part of "Moore's Law" in the 1990s, and slowly, as it has become more and more obvious that it no longer holds, that element has been dropped from the law. Another change we seem to have made is to talk only about logic transistor density, whereas Moore was clearly concerned with components, which, I would think, should also include consideration of SRAM transistor scaling and dark transistors. But even if you look only at logic transistor density, it now seems to be under 2x every 2 years, so even that very weak formulation of the law seems to be "slowing down".
Yojimbo - Wednesday, November 10, 2021 - link
By the way, the graph on the "next page" he is talking about is labeled "log base 2 of the number of components per integrated function" on the y axis and "year" on the x axis, but from the discussion it's clear he constructed that graph by taking the complexity for minimum component cost for each year.

mode_13h - Wednesday, November 10, 2021 - link
I think the Wikipedia page is more informative. There are similar trends which were noted (e.g. Dennard Scaling), and logical consequences (House's 18-month performance doubling period), all of which sort of get bundled under the heading of Moore's Law (correctly or not).

https://en.wikipedia.org/wiki/Moore%27s_law
Yojimbo - Thursday, November 11, 2021 - link
The Wikipedia page on Moore's Law is not very good. It does give Moore's original quote, and it also says "Moore posited a log-linear relationship between device complexity (higher circuit density at reduced cost) and time," but it doesn't explain the situation well enough to convey what Moore was actually talking about. Nor does it discuss the practical difference between counting functional components and measuring logic transistor density.

mode_13h - Wednesday, November 10, 2021 - link
Anyway, what really matters is looking at the trendlines that we're actually on, and trying to understand when & how those break down. Looking backwards is mostly just for historical interest.

Yojimbo - Thursday, November 11, 2021 - link
In other words, Moore's Law can continue forever if we forget what it meant in the past and redefine it continually to apply to the future...

Oxford Guy - Saturday, November 13, 2021 - link
Speaking of minimum cost... One question I’ve had from the start concerning wafer scale is what its minimum cost is. What node, what wafer size, etc.
Just how cheap can it be, taking the margin of the wafer chip seller out of the equation?
And, beyond some awful ancient node, how much would it cost for the venerable 28nm?
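For what it's worth, here is a hedged back-of-the-envelope for that question. The $3,000 wafer price and 0.1 defects/cm² below are assumed ballparks, not quoted figures, and for a wafer-scale part the silicon floor is roughly the whole-wafer price itself, since defects are absorbed by redundant cores rather than by discarding dies.

```python
import math

def good_die_cost(wafer_price, wafer_diameter_mm, die_area_mm2, defects_per_cm2):
    """Cost per good die under a simple Poisson yield model.
    Ignores edge losses, packaging, and test costs."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    gross_dies = wafer_area / die_area_mm2
    yield_fraction = math.exp(-defects_per_cm2 * die_area_mm2 / 100)
    return wafer_price / (gross_dies * yield_fraction)

# Assumed numbers: a 100 mm^2 die on a $3,000, 300 mm wafer at 0.1 defects/cm^2.
print(round(good_die_cost(3000, 300, 100, 0.1), 2))  # → 4.69
```

A wafer-scale device inverts this: there is exactly one "die," so the relevant floor is just wafer_price, and the engineering problem becomes routing around defects instead of binning them out.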
Wereweeb - Monday, November 15, 2021 - link
"Moore's Law is not dead. It's just that *says Moore's Law is dead*"

eSyr - Wednesday, November 10, 2021 - link
“arm+leg” sounds incredibly cheap for such a beast.

Oxford Guy - Saturday, November 13, 2021 - link
The arm of Tom Brady and the leg of Naomi Osaka?

Wrs - Wednesday, November 10, 2021 - link
Hmm, yields are almost 100% but they throw away 36% of the wafer. If only we could process round chips...

SteinFG - Wednesday, November 10, 2021 - link
I think at best they can recover about 11 of those 36 percentage points, since rectangular masks that are projected partially onto the wafer will be unusable.

evanh - Wednesday, November 10, 2021 - link
I'm gonna guess Tr, as in mTr/mm2, means Transistor. Using a capital M for Mega-Transistors, rather than milli-Transistors, would be a good idea, methinks: 56.246 MTr/mm2.

mode_13h - Wednesday, November 10, 2021 - link
I'm a bit fuzzy on this stuff, but I seem to recall that valuations are traditionally about 10x expected annual revenue. So, if a single system costs $2M, then they're expecting to sustain delivery of about 200 per year? That seems simultaneously like a lot, and also not very much.

nandnandnand - Thursday, November 11, 2021 - link
Compared to other startups you hear about, this one really made a splash. Keep up the good work.

mode_13h - Thursday, November 11, 2021 - link
It's certainly neat tech & quite an achievement.

Sets a bad precedent, though. We certainly don't need lots of copycats following in their footsteps, sopping up whatever tidbits of fab capacity remain.
teshy.com - Thursday, November 11, 2021 - link
I wonder how this compares to Tesla's D1 chip for the Dojo supercomputer?

mode_13h - Friday, November 12, 2021 - link
Well, I'm reading on Wikipedia that each Tesla D1 has 362 TFLOPS (presumably of BFloat16). They combine 25 of those into a "tile" (9.05 PFLOPS), and the machine includes 120 tiles = 1.09 EFLOPS.

I just checked the linked article on the WSE2, and they pointedly don't mention any performance numbers. However, if you figure 850k cores per wafer, each delivering something like 32 FLOPs per cycle @ 1 GHz, that would work out to 27.2 PFLOPS per wafer. So, about 3x Tesla's "tile".
Of course, I could be off by some multiple. For instance, if WSE2 cores are each 1024 bits wide and use BFloat16, then that would be 4x the above figure (i.e. 108.8 PFLOPS). In any case, it's probably safe to say WSE2 is in the range of tens to hundreds of PFLOPS.
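Those figures are easy to sanity-check; all of the inputs below come from the guesses in the comments above, not from official specs.

```python
# Tesla D1 / Dojo figures, as quoted in the comment above.
d1_tflops = 362
tile_pflops = d1_tflops * 25 / 1000        # 25 chips per "tile"
machine_eflops = tile_pflops * 120 / 1000  # 120 tiles per machine

# WSE2 guess: 850k cores, each doing ~32 FLOPs/cycle at ~1 GHz.
wse2_pflops = 850_000 * 32 * 1e9 / 1e15

print(round(tile_pflops, 2), round(machine_eflops, 2), round(wse2_pflops, 1))
# → 9.05 1.09 27.2
```

Under those assumptions the comparison holds up: one wafer lands at roughly 3x a Dojo tile, and the hypothetical 4x-wider core variant would simply scale the last line accordingly.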
Farfolomew - Wednesday, November 24, 2021 - link
Any word on them going public with an IPO?