A Requiem for SPARC with Tom Lyon
Length:
93 minutes
Released:
May 10, 2021
Format:
Podcast episode
Description
Oxide and Friends Twitter Space: May 10, 2021
A Requiem for SPARC with Tom Lyon
We’ve been holding a Twitter Space weekly on Mondays at 5p for about an hour. In addition to [@bcantrill](https://twitter.com/bcantrill) and [@ahl](https://twitter.com/ahl), speakers included special guest Tom Lyon plus Joshua Clulow, Dan McDonald, Dan Cross, Tom Killalea, Theo Schlossnagle, Antranig Vartanian, and [@perlhack](https://twitter.com/perlhack).
We recorded the space; the recording is here.
Some of the topics we hit on, in the order that we hit them:
[@2:06](https://youtu.be/79NNXn5Kr90?t=126) SPARC 30th anniversary dinner
> SPARC was an amazing achievement for its time, but there were some nasty trade-offs made.
[@2:56](https://youtu.be/79NNXn5Kr90?t=176) illumos announcement on the end of SPARC support
SPARCstation 2
[@4:37](https://youtu.be/79NNXn5Kr90?t=277) “There is no photography allowed in the bring-up lab” story
SPARCstation 1 (code-named Campus)
> They bricked their first CPU..
[@6:23](https://youtu.be/79NNXn5Kr90?t=383) UltraSPARC-II E-cache parity error
[@8:51](https://youtu.be/79NNXn5Kr90?t=531) Register windows
> Most people don’t know, about that first SPARC, there was no integer multiply or divide.. It would trap on the instructions.
I feel so decadent, I’ve just been sprinkling multiplications around my code for years.
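For readers wondering what software has to do when the hardware offers no integer multiply, the following is a minimal sketch of the generic shift-and-add technique. It is an illustration only, not the routine any SPARC runtime actually used; the name `soft_mul32` and the choice of an unsigned 32-bit result are assumptions made for this example.

```python
MASK32 = 0xFFFF_FFFF  # assumption: model unsigned 32-bit registers


def soft_mul32(a: int, b: int) -> int:
    """Return the low 32 bits of a * b using only shifts, adds, and masks."""
    a &= MASK32
    b &= MASK32
    product = 0
    while b:
        if b & 1:
            # This bit of the multiplier is set: add the shifted multiplicand.
            product = (product + a) & MASK32
        a = (a << 1) & MASK32  # multiplicand * 2 for the next bit position
        b >>= 1                # move on to the next multiplier bit
    return product
```

The loop runs once per significant bit of the multiplier, which is why a hardware multiply instruction feels decadent by comparison.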
[@9:55](https://youtu.be/79NNXn5Kr90?t=595) popc instruction (also called Hamming Weight)
IBM Stretch 1961, and the one-of-a-kind IBM Harvest made for the NSA
Henry Warren’s 2002 Hacker’s Delight Ch. 5 shows a ~20 instruction algorithm (no branches, only adds/shifts/masks by constants)
> Warren: According to computer folklore, the population count function is important to the National Security Agency. No one (outside of NSA) seems to know just what they use it for, but it may be in cryptography work or in searching huge amounts of material.
According to Agner Fog, Ice Lake performs popcnt with a 3 cycle latency, and Zen 3 with just 1 cycle latency.
Phil Bagwell’s 2001 Ideal Hash Trees depend on pop count
> Bagwell: Note that the performance of the algorithm is seriously impacted by the poor execution speed of the POPCT emulation in Java, a problem the Java designers may wish to address.
Persistent versions of Bagwell’s trees are used for the built-in hash maps of Clojure, and in libraries for Scala etc.
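To make the popc items above concrete, here is a hedged sketch of a branch-free population count built only from adds, shifts, and masks by constants, in the style the notes attribute to Hacker's Delight (it may not match the book's listing line for line). It is followed by the way a bitmap-compressed tree node can use that count to find a child's position in a dense array, which is the idea Bagwell's hash trees rely on. The names `popcount32` and `sparse_index` are hypothetical, chosen for this example.

```python
from typing import Optional

MASK32 = 0xFFFF_FFFF


def popcount32(x: int) -> int:
    """Count the set bits of a 32-bit value without branches or loops."""
    x &= MASK32
    x = x - ((x >> 1) & 0x5555_5555)                  # 2-bit partial sums
    x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333)  # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F_0F0F                  # 8-bit partial sums
    x = x + (x >> 8)                                  # fold bytes together
    x = x + (x >> 16)
    return x & 0x3F                                   # result fits in 6 bits


def sparse_index(bitmap: int, slot: int) -> Optional[int]:
    """Map a logical slot (0..31) to its position in a dense child array.

    Returns None when the slot's bit is clear, i.e. no child is stored.
    """
    bit = 1 << slot
    if not bitmap & bit:
        return None
    # The child's position is the number of occupied slots below it.
    return popcount32(bitmap & (bit - 1))
```

With a bitmap of `0b10110`, slots 1, 2, and 4 are occupied, so `sparse_index(0b10110, 4)` returns 2: two occupied slots precede it in the dense array. Every lookup performs one such count, which is why a slow emulated pop count hurts these trees.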
[@11:39](https://youtu.be/79NNXn5Kr90?t=699) This was the debate between Roger Faulkner and Jeff Bonwick: register windows
Roger Faulkner (RIP) thought they were horrific
[@12:35](https://youtu.be/79NNXn5Kr90?t=755) Register fishing: Bryan’s version and Adam’s version
> When you want to know the state of some other process, you have to flush those register windows to memory to be able to recover the stack trace.
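The point about flushing windows before recovering a stack trace can be illustrated with a deliberately simplified toy model. This sketch assumes a hypothetical `WindowedCPU` class that tracks only frame names; it ignores the real mechanism's details (overlapping in/out registers, the current window pointer, the window invalid mask) and is meant only to show why frames that still live on-chip are invisible to anyone reading the in-memory stack.

```python
class WindowedCPU:
    """Toy model: each call frame occupies one on-chip register window."""

    def __init__(self, nwindows: int = 4):
        self.nwindows = nwindows  # assumption: small fixed number of windows
        self.resident = []        # frames held on-chip, oldest first
        self.memory = []          # frames spilled to the stack, oldest first

    def save(self, frame: str) -> None:
        """Function call: claim a window, spilling the oldest on overflow."""
        if len(self.resident) == self.nwindows:
            self.memory.append(self.resident.pop(0))
        self.resident.append(frame)

    def restore(self) -> None:
        """Function return: release the window, refilling on underflow."""
        self.resident.pop()
        if not self.resident and self.memory:
            self.resident.append(self.memory.pop())

    def flush_windows(self) -> None:
        """Spill every window except the current one to memory."""
        self.memory.extend(self.resident[:-1])
        self.resident = self.resident[-1:]


cpu = WindowedCPU(nwindows=4)
for frame in ("main", "parse", "lookup"):
    cpu.save(frame)

before_flush = list(cpu.memory)  # callers are still on-chip only
cpu.flush_windows()
after_flush = list(cpu.memory)   # callers are now visible in memory
```

In this model `before_flush` is empty even though three frames are live, while `after_flush` holds `main` and `parse`, the callers a stack walker would need to find.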
[@14:30](https://youtu.be/79NNXn5Kr90?t=870) Delay slot
> We sat around the lunch table talking about how crazy it would be to have a branch that executed right after a branch.
DCTI couple (delayed control transfer instruction)
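A small interpreter can make the delay slot and the DCTI couple easier to picture. The sketch below assumes a made-up three-instruction machine (`emit`, `ba` for branch always, `halt`) and models control flow with a program counter and a next program counter, so a branch only changes where execution goes after the following instruction. It is a toy under those assumptions, not a SPARC emulator, and it leaves out annulled branches entirely.

```python
def run(program, max_steps=100):
    """Execute a toy program with delayed branches; return the emitted labels."""
    trace = []
    pc, npc = 0, 1  # current instruction and the one queued after it
    for _ in range(max_steps):
        if pc >= len(program):
            break
        op, arg = program[pc]
        next_npc = npc + 1
        if op == "emit":
            trace.append(arg)
        elif op == "ba":
            # Delayed transfer: the instruction at npc still runs first.
            next_npc = arg
        elif op == "halt":
            break
        pc, npc = npc, next_npc
    return trace


# A branch followed by an ordinary instruction in its delay slot.
delay_slot = [
    ("emit", "before"),
    ("ba", 4),
    ("emit", "delay slot"),  # executes even though the branch is taken
    ("emit", "skipped"),
    ("emit", "target"),
    ("halt", None),
]

# A branch sitting in the delay slot of another branch (a DCTI couple).
dcti_couple = [
    ("ba", 4),
    ("ba", 6),
    ("emit", "skipped"),
    ("halt", None),
    ("emit", "first target"),   # exactly one instruction runs here
    ("halt", None),
    ("emit", "second target"),
    ("halt", None),
]
```

Running `run(delay_slot)` yields `before`, `delay slot`, `target`. Running `run(dcti_couple)` yields `first target` then `second target`: one instruction executes at the first branch's destination before control moves to the second's, which is the behavior that made the lunch-table idea sound crazy.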
[@15:31](https://youtu.be/79NNXn5Kr90?t=931) “Well, the instruction set doesn’t allow that..” story
> Bedlam. As far as Solaris kernel discussions go, bedlam.
Leibniz vs. Newton
[@20:14](https://youtu.be/79NNXn5Kr90?t=1214) Annulled branches
[@22:17](https://youtu.be/79NNXn5Kr90?t=1337) Praise for SPARC
SPARC address space identifiers
> When we were porting Solaris to x86, and deciding what fraction of the address space would belong to the kernel vs the user, it felt disgusting to me.
[@25:26](https://youtu.be/79NNXn5Kr90?t=1526) Software-filled TLB
> They just didn’t have the room to cram a hardware page table walk into the chip.
MIPS would give you a trap on a VAC conflict (virtual address cache)
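As a rough picture of what "software-filled" means, the sketch below models a TLB whose misses are handled by a software routine rather than a hardware page table walker. Everything here is an assumption made for illustration: the `SoftTLB` class, the flat dictionary standing in for the page table, the 8 KiB page size, and the FIFO eviction policy. Real miss handlers are short, hand-tuned trap routines and look nothing like this.

```python
from collections import OrderedDict

PAGE_SHIFT = 13  # assumption: 8 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class SoftTLB:
    """Toy TLB that traps to a software handler on every miss."""

    def __init__(self, page_table, capacity=4):
        self.page_table = page_table  # vpn -> pfn, owned by the "OS"
        self.capacity = capacity
        self.entries = OrderedDict()  # cached vpn -> pfn, oldest first
        self.misses = 0

    def miss_handler(self, vpn):
        """Software trap handler: find the mapping and install it."""
        self.misses += 1
        if vpn not in self.page_table:
            raise KeyError(f"page fault: no mapping for vpn {vpn:#x}")
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (FIFO)
        self.entries[vpn] = self.page_table[vpn]

    def translate(self, vaddr):
        """Translate a virtual address, filling the TLB on a miss."""
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK
        if vpn not in self.entries:
            self.miss_handler(vpn)  # the "hardware" only raises the trap
        return (self.entries[vpn] << PAGE_SHIFT) | offset


tlb = SoftTLB({0: 5, 1: 9})
first = tlb.translate(0x10)   # miss: handler installs vpn 0
second = tlb.translate(0x20)  # hit: no handler call
```

The design choice this illustrates is the trade the quote describes: the chip only has to detect a miss and trap, and the page table format is left entirely to software.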
[@27:34](https://youtu.be/79NNXn5Kr90?t=1654) It was slow, it was late, and it had a lot of problems, it was wrong.
UltraSPARC-III, code-named “Cheetah”
> It’s weird, I compile this thing over and over, and every 80th time when I compile and run it, i