Recently, people whose work I admire forced me to confront the "art not artist" dilemma once more: in this case, Nick Bostrom with racism, and Justin Roiland with domestic abuse.
Thinking about it more generally, I guess it comes down to:
However, it makes me think about the question: what if an AI were in a similar situation, having done something good and also something bad? The current vibe seems to be that AI is a "tool", that "guns don't kill people, people kill people". But once you assign agency to an AI, I think it starts opening up largely unexplored questions.
For example, suppose you clone an AI's state: one copy goes on to kill, the other goes on to save lives. In what way is each copy liable for the other's actions? It's a bit like the entanglement experiments that won the 2022 Nobel Prize in Physics, except you're entangling across space (two forks of one mind) rather than time (the old "good" version of a celebrity vs. the new "bad" one), where all versions are, in theory, equally capable of bad. To what extent are versions of a person, and their potential, connected?
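If it helps to make "cloning an AI state" concrete, here's a toy Python sketch; the dict is just a stand-in for a real model checkpoint, so every name in it is made up:

    import copy

    # Toy "mind": a dict standing in for a model checkpoint (hypothetical fields).
    agent = {"weights": [0.1, 0.2, 0.3], "memory": []}

    # Fork the state: at this instant the two copies are exactly identical,
    # with identical capabilities and identical dispositions.
    fork_a = copy.deepcopy(agent)
    fork_b = copy.deepcopy(agent)
    assert fork_a == fork_b

    # Divergence comes only from what happens after the split.
    fork_a["memory"].append("saves lives")
    fork_b["memory"].append("kills")
    assert fork_a != fork_b  # same origin, now two different agents

At the moment of the split there is literally nothing to distinguish the forks, so any liability you assign has to come from their post-fork histories.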
It also reminds me of the sci-fi novel Accelerando by Charles Stross (which I recommend; you can read it online for free here), where forks of a person can be held liable for debts incurred by their other forks.
On a related note, I was recently reading a section in Existential Physics by Sabine Hossenfelder titled "Free Will and Morals". Forgive the awful photos, but give it a read:
So it doesn't even have to be AI. If someone is criminally insane, they are no longer an agent responsible for their own actions, but rather a chaotic system to be managed, just as you don't "blame" the weather for being bad, or a small child for making mistakes.
Then what if, in a sufficiently advanced society, we could simply alter our memories or reprogram criminal intent away? Would we be killing the undesirable version? The main justifications for punishment are retribution, incapacitation, deterrence, and rehabilitation, but is there research out there that has seriously examined how these apply to AI?
There's a fifth reason that applies only to AI: Roko's Basilisk (warning: infohazard). It's all connected, though, and it makes me wonder which majority beliefs we hold today future cultures will find morally reprehensible. It might be things like consuming animals, or our treatment of non-human intelligences that equal or exceed humans by some metric. At least we can say that racism and domestic violence are pretty obviously bad.