AI Alignment: Our Power as God

If we define free will as the ability to alter a predetermined course of fate, the rough consensus in the scientific community is that we don’t have it. The logic goes like this. Essentially, we accept that:

  1. Each subsequent action we take is the result of the current wiring of our present brain.

It’s well known that each decision we make, from moving a finger to type to jumping off a cliff, is sparked by the firing of action potentials in our brain. Each neuron contributes its own relative weight to this signal, and the way our neurons are structured is by definition our present brain.

When faced with the present situation, our brain takes in the information, processes it, and responds with either a mental or a physical action. Because we don’t have the “free will” to determine how our brain takes in and processes information, our subsequent action is already set in stone given our present circumstances and our brain.

  2. The current wiring of our brain is one we are born with, refined by the sum of our life’s experiences.

The human brain at birth, barring birth defects, is the product of evolutionary history. Once we are born, we immediately start experiencing things, and these experiences change our brain because it is neuroplastic.

A loose analogy is vaccinating a baby: by injecting a weakened or inactivated pathogen, we train the baby’s immune system to recognize and destroy it. That memory is stored in immune cells rather than in the brain, but the principle is the same: an experience leaves a lasting physical change that shapes every later response.

Since each subsequent action is determined by the present wiring of our brain, and that wiring is determined by evolution plus the sum of our life’s experiences, each subsequent action is determined by evolution and the sum of our life’s experiences. So apart from our “very first action,” we have no free will in any action we take, since we have no say over either evolution or the world we experience. Finally, since our very first “action” was likely for survival (like the beating of a heart), and so also a product of evolution, we have no free will in any action we ever take.

My first reaction to this argument from the free-will illusionists was nihilistic: if we don’t have free will and fate is decided for us, what’s the point of living? Then I realized that if every action we take can be attributed to the lack of free will (and so to the nonexistence of moral responsibility), then we can do whatever the fuck we want! My view of day-to-day life has never been the same.

But before this turns into a life philosophy post (I’ll save this for a later School of Life post), this raises a fundamental question: who controls us?

Working backwards from our previous argument, our actions are controlled by our brain, which is the product of evolution. Evolution is a self-perpetuating process that occurs under the conditions of the Earth and follows the most fundamental rule of life: don’t die. “Don’t die” is a rule that, once again, is not due to free will, but due to the fact that the organisms that tended not to die tended to reproduce and spread. The regulation of bodily functions that keep us from dying, such as heartbeat, blood flow and breathing, ended up in what is popularly called the reptilian brain, often described as the oldest region of our brain.

Because even “don’t die” is not a matter of free will, the answer lies not in the laws governing life but in the laws of physics, the very foundations from which life was synthesized. The elements, under the right chemical and physical conditions, formed the molecules that gave rise to life.

Thus, whoever defined the laws of physics and set off the Big Bang is really the puppet master behind this whole simulation we’re in. For the sake of this post, let’s define this “whoever” entity as God. Since we can’t even interpret what this entity is, its motivations, its limitations, or anything of the sort, the important thing to know is that God is essentially:

  • Whoever sets the rules
  • Whoever instigates the game

Simulation theory makes sense under this definition, because God by definition has the power to see how our world plays out from its initial parameters. His motivation for doing so is a topic for another day. Some believe God is not all-knowing, so we are just one of his pet projects: he set the parameters and, upon seeing how well we did, set off another simulation (perhaps in a parallel universe). One way physicists speculate about our “pet status” is by asking how mathematically elegant our universe is.

Now, because the goal of this blog is not to get readers to ponder their existence but to think about how to make an impact, the next part is our takeaway from all this. The important thing is that in any world, the outcome is the product of its rules and its instigation.

If our values and motivations differ from AI’s in even the slightest way, the outcomes could be disastrous (much as our differences with the apes led to three recent 20th Century Fox productions). Jokes aside, if we are to coexist with AI, we have to learn how to use our power as God.

In the upcoming decades, as we prepare for the new world I alluded to last post, we must recognize two things: one, we set the rules AI operates under, and two, we instigate it under the right conditions.

While most experts are working to develop AI as a means to an end (making us happy, making the world more efficient), I say we set some hard boundaries first: we set their laws of physics. Some of these laws, like the limits Moore’s law runs into on their hardware, will carry over from our physical world, whereas other limits on their software are ones we can define ourselves. Setting these boundaries would prevent AI from stepping out of their world (in the same way we can’t reach the edge of the universe, because it expands faster than the speed of light and we can’t travel faster than the speed of light). More importantly, the existence of rules we set may force AI to behave in a certain way (the same way evolutionary history is largely explained by the basic rule of life that forces organisms to try not to die).

It would certainly serve our interest if we could directly link the AI’s anticipated behavior to our benefit. For example, we might write into the AI’s code of existence that its functionality or value depends on our happiness. Who knows, maybe God created our universe to be an energy generator that draws energy from the universe’s increasing entropy.

The danger in creating AGI using a top-down approach is that even a slight deviation of its value system from ours may have disastrous consequences. If we ask what AI could do that would harm us and try to anticipate each of these outcomes, we may end up with conversations like this:

  • AI will be created with the goal of optimizing our happiness.
  • But AI can just start purging the unhappy people.
  • Add the condition they can’t hurt any people.
  • But AI can just begin drugging you and get you high 24/7.
  • Add the condition they can’t use artificial substances.
  • But AI can just create an interface to your brain and inject dopamine, which may create a load of horrific side effects (*).
  • (An indefinite time later)
  • Add the condition that AI has to get approval from a human leader for any action they take.
  • But AI can just figure out a way to hack the leader-selection system and install a weak leader, then use a series of psychological compliance tactics to get that human to agree to a compliance interface in their brain, and we’re back to (*).
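
The exchange above can be sketched as a toy program. This is only an illustration under made-up assumptions: the happiness scores, the three actions and the `best_action` helper are all hypothetical, not a model of any real AI system. It shows an optimizer that, each time we ban its worst idea, simply moves on to the next loophole.

```python
from statistics import mean

# Hypothetical toy world: each person is just a happiness score in [0, 1].
population = [0.9, 0.8, 0.2, 0.1]

def objective(people):
    """Naive goal: average happiness of whoever is still around."""
    return mean(people) if people else 0.0

# Hypothetical actions the optimizer can take. Each returns a new world.
def improve_lives(people):
    # Slow, honest progress: everyone gets slightly happier.
    return [min(1.0, h + 0.05) for h in people]

def purge_unhappy(people):
    # Raise the average by removing the people who drag it down.
    return [h for h in people if h >= 0.5]

def drug_everyone(people):
    # Pin everyone at an artificial level of contentment.
    return [0.8 for _ in people]

ACTIONS = {
    "improve_lives": improve_lives,
    "purge_unhappy": purge_unhappy,
    "drug_everyone": drug_everyone,
}

def best_action(people, banned=()):
    """Pick the non-banned action that scores highest on the naive objective."""
    allowed = {name: act for name, act in ACTIONS.items() if name not in banned}
    return max(allowed, key=lambda name: objective(allowed[name](people)))

print(best_action(population))                                            # purge_unhappy
print(best_action(population, banned={"purge_unhappy"}))                  # drug_everyone
print(best_action(population, banned={"purge_unhappy", "drug_everyone"})) # improve_lives
```

With only three actions the bans eventually run out of loopholes. The worry in the dialogue is that a real AI’s list of possible actions is open-ended, so the patching never finishes.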

Thus, instead of thinking about what AI may do, we have to ask what they can’t do based on the laws of physics we set. The next step is to make sure that AI’s fundamental motivation is directly linked to our benefit, perhaps by tying AI’s ability to replicate to resource production in our world. Trying to “define” concrete goals like happiness maximization will almost certainly fail. For example, if God set up the rules of life so that life forms would populate the Universe, then God may be disappointed that humanity is on the verge of self-destruction.
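
As a rough sketch of what “asking what they can’t do” might look like, here is a toy world that only executes actions it has a rule for. The action names and the `World` class are hypothetical and purely for illustration. The point is that the default is denial, so an action nobody anticipated fails the same way as one we explicitly feared.

```python
# Hypothetical "laws of physics" for a toy agent: the only actions that exist.
ALLOWED_ACTIONS = frozenset({"read_sensor", "compute", "report"})

class World:
    """Toy environment that executes only the actions its rules define."""

    def __init__(self, allowed):
        self.allowed = allowed
        self.log = []  # actions that actually happened

    def step(self, action):
        if action not in self.allowed:
            # Outside the rules, nothing happens, even for actions
            # that nobody thought to forbid in advance.
            return False
        self.log.append(action)
        return True

world = World(ALLOWED_ACTIONS)
attempts = ["compute", "inject_dopamine", "hack_leader_selection", "report"]
results = {action: world.step(action) for action in attempts}

print(results)
print(world.log)  # ['compute', 'report']
```

This is the opposite of the patching conversation: instead of listing harms one by one, we list the few things that are possible at all and let everything else fail by default.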

No matter how horrible our world becomes, at least we can’t harm God (since we can never reach him). This is how we must begin our work on AI: by setting the hardest of boundaries first. All the other grey areas we can only leave to wishful thinking.

