wewbull

joined 1 year ago
[–] wewbull@feddit.uk 16 points 4 months ago (1 children)

The AI numbers are pretty solid. Papers published on Hugging Face list training times and platform, and convert that into CO2. Those runs will be at full load for weeks or months across arrays of GPUs.

In this case, I don't see why you'd need that kind of hardware for this application. You might be right that it's not running at maximum load. If so, then somebody has been mis-sold the hardware. Whatever you're doing, it will be at a consistent load though. They are always doing the same thing.

[–] wewbull@feddit.uk 8 points 4 months ago

...and sky donkeys.

[–] wewbull@feddit.uk 3 points 4 months ago

Weirdly, I think it all got me my first job. I interviewed at a graphics card manufacturer and the interviewer placed one of their cards on the table and said "tell me what's on that card".

I picked it up and pointed out all the components, because I knew them all by the part numbers written on them. I hadn't seen them before, but I knew they were options in XFree86. Then add in that the regular array of chips was likely VRAM, and that the chips with the same logo as the one above the door were the company's own video processors.

I didn't know how any of it worked, but he didn't know that. All he saw was a fresh graduate that just effortlessly identified some quite esoteric components of a design he'd personally made.

[–] wewbull@feddit.uk 7 points 4 months ago (6 children)

The thing that sticks with me is video card support. Back then (before Nvidia, 3dfx, etc) you had VGA cards that had one of a number of chipsets on, but it would be paired with a video timing chip and a RAMDAC. Buying a card required knowing which combination of parts it used and which combinations had support in XFree86. Then writing the configuration required knowing the video timings supported by your monitor. Not just frequencies, but blanking periods and such like.

EDID solved that last problem.
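
To give a feel for the timing numbers you had to put in the config by hand, here is a rough sketch using the standard 640x480 at 60 Hz VGA timing. The variable and function names are my own for illustration, not anything from XFree86. The point is that the refresh rate falls out of the pixel clock and the totals, and the totals include the blanking periods.

```python
# Standard VGA 640x480@60 timing, in pixels (horizontal) and lines (vertical).
# Each direction is: active area, front porch, sync pulse, back porch.
h_active, h_front, h_sync, h_back = 640, 16, 96, 48
v_active, v_front, v_sync, v_back = 480, 10, 2, 33

pixel_clock_hz = 25_175_000  # 25.175 MHz dot clock


def refresh_rate_hz(pixel_clock_hz, h_total, v_total):
    """Frames per second: one frame takes h_total * v_total pixel clocks."""
    return pixel_clock_hz / (h_total * v_total)


# Totals include the blanking periods, not just the visible picture.
h_total = h_active + h_front + h_sync + h_back
v_total = v_active + v_front + v_sync + v_back

line_rate_hz = pixel_clock_hz / h_total
refresh_hz = refresh_rate_hz(pixel_clock_hz, h_total, v_total)

print(f"h_total={h_total} v_total={v_total}")
print(f"line rate {line_rate_hz / 1000:.2f} kHz, refresh {refresh_hz:.2f} Hz")
```

Get the porches or sync widths wrong and the monitor would either refuse to sync or show a shifted picture, which is why you needed the monitor's manual.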

[–] wewbull@feddit.uk 2 points 5 months ago (1 children)

If the 8088 had used all but one of the 256 8-bit values as legal instructions, all your new instructions after that point would need to start with that unused value, and then you could add a maximum of 256 instructions by using the next byte. The end result is that 511 instructions can be encoded in 16 bits.
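
The arithmetic, as a quick sketch. The function name and parameters are made up for illustration, and this models only the hypothetical above, not the real 8088 opcode map.

```python
def encodable_instructions(byte_values=256, escape_prefixes=1):
    """Count instructions when some first-byte values are reserved as
    escape prefixes that are followed by one more opcode byte."""
    one_byte = byte_values - escape_prefixes   # first bytes that are complete instructions
    two_byte = escape_prefixes * byte_values   # escape byte + any second byte
    return one_byte + two_byte


total = encodable_instructions()
print(total)  # 255 one-byte + 256 two-byte
```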

[–] wewbull@feddit.uk 5 points 5 months ago (3 children)

So "instruction encoding length".

I don't think that works though. For something like RISC-V, RV64 has a maximum 32-bit instruction encoding. For x86-64 those original 8-bit instructions still exist and take up a huge part of the encoding space, cutting the number of n-bit instructions to more like 2^(n-7).

[–] wewbull@feddit.uk 7 points 5 months ago

Yes, because 256 memory locations is a bit limiting.

[–] wewbull@feddit.uk 4 points 5 months ago (1 children)

Even then, at what point do you measure it? DDR interface is likely very much narrower than the interfaces between cache levels. Where does the core end and the memory begin?

[–] wewbull@feddit.uk 5 points 5 months ago (2 children)

I expect the engineers are telling the marketing people "No! You can't do that. You'll scare everyone that it's incompatible."

[–] wewbull@feddit.uk 1 points 5 months ago

...but they're not in 100 percent correlation in this case, and you're naive if you think they are.

[–] wewbull@feddit.uk 142 points 5 months ago* (last edited 5 months ago) (20 children)

We do, depending on how you count it.

There are two major widths in a processor: the data register width and the address bus width. But even that is not the whole story. If you go back to a processor like the 68000, the classic 16-bit processor, it has:

  • 32-bit data registers
  • 16-bit ALU
  • 16-bit data bus
  • 32-bit address registers
  • 24-bit address bus

Some people called it a 16/32-bit processor, but really it was the 16-bit ALU that classified it as 16-bit.
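
To show why the address numbers alone don't settle it, here is a small sketch of what the 68000's two address widths mean in bytes. The helper name is hypothetical, and it assumes byte-addressed memory.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits


MIB = 1024 ** 2
GIB = 1024 ** 3

bus_bytes = addressable_bytes(24)       # what the 24-bit address bus can reach
register_bytes = addressable_bytes(32)  # what a 32-bit address register can express

print(f"24-bit address bus:       {bus_bytes // MIB} MiB")
print(f"32-bit address registers: {register_bytes // GIB} GiB")
```

So the registers can describe far more memory than the pins can actually reach, and which of the two you quote changes the headline number.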

If you look at a Zen 4 core it has:

  • 64-bit data registers
  • 512-bit AVX data registers
  • 6 x 64-bit integer ALUs
  • 4 x 256-bit AVX ALUs
  • 2 x 128-bit data bus to DDR5 (dual edge 64-bit)
  • ~40-bits of addressable physical RAM

So, what do you want to call this processor?

64-bit (integer width), 128-bit (physical data bus width), 256-bit (widest ALU) or 512-bit (widest register width)? Do you want to multiply those numbers up by the number of ALUs in a core? ...by the number of cores on a piece of silicon?

Me, I'd say Zen 4 was a 256-bit core, but you could argue any of the above numbers.

Basically, it's a measurement that lost all meaning so people stopped using it.

[–] wewbull@feddit.uk 15 points 5 months ago (1 children)

We can, but it's awkward to do so. By having everything work with powers of 2 you don't need to have everything the same size, but can still pack things in memory efficiently.

If your registers were 48 bits long, you could use one to store 6 bytes or 3 short ints, but only one int, with 16 bits going unused. If they are powers of two in size, you can always fit smaller things in them with no wasted space.
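
A quick sketch of that packing argument. It assumes the usual sizes of 8 bits for a byte, 16 for a short and 32 for an int, and the function name is just for illustration.

```python
def pack(register_bits, item_bits):
    """Return (how many items fit, how many bits are left unused)."""
    count = register_bits // item_bits
    wasted = register_bits - count * item_bits
    return count, wasted


ITEM_SIZES = {"byte": 8, "short": 16, "int": 32}

for register_bits in (48, 64):
    for name, item_bits in ITEM_SIZES.items():
        count, wasted = pack(register_bits, item_bits)
        print(f"{register_bits}-bit register: {count} x {name}, {wasted} bits unused")
```

With 64 bits every line reports zero unused bits, while the 48-bit register loses 16 bits as soon as you store an int.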
