Professor, University of Oxford
Director, Future of Humanity Institute
Hunkering down to focus on completing a book project (not quite announcement-ready yet). Though sometimes I have the impression that the world is a conspiracy to distract us from what's important.
Have also recently been doing some thinking on metaethics, and have released two papers on the ethics of (future) digital minds. Working on another paper with some colleagues that will focus on some technical challenges in detecting internal states of potential moral significance in large transformer models and other ML systems.
AIs with moral status and political rights? We'll need a modus vivendi, and it’s becoming urgent to figure out the parameters for that. This paper makes a load of specific claims that begin to stake out a position.
We present a heuristic for correcting for one kind of bias (status quo bias), which we suggest affects many of our judgments about the consequences of modifying human nature. We apply this heuristic to the case of cognitive enhancements, and argue that the consequentialist case for this is much stronger than commonly recognized.
Suns are illuminating and heating empty rooms, unused energy is being flushed down black holes, and our great common endowment of negentropy is being irreversibly degraded into entropy on a cosmic scale. These are resources that an advanced civilization could have used to create value-structures, such as sentient beings living worthwhile lives...
Cosmology shows that we might well be living in an infinite universe that contains infinitely many happy and sad people. Given some assumptions, aggregative ethics implies that such a world contains an infinite amount of positive value and an infinite amount of negative value. But you can presumably do only a finite amount of good or bad. Since an infinite cardinal quantity is unchanged by the addition or subtraction of a finite quantity, it looks as though you can't change the value of the world. Aggregative consequentialism, along with many other important ethical theories, is threatened by total paralysis. We explore a variety of potential cures, and discover that none works perfectly and all have serious side-effects. Is aggregative ethics doomed?
In cases where several altruistic agents each have an opportunity to undertake some initiative, a phenomenon arises that is analogous to the winner's curse in auction theory. To combat this problem, we propose a principle of conformity. It has applications in technology policy and many other areas.
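The effect can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: each agent is assumed to see the true value of the initiative plus independent Gaussian error, and the function name `chance_initiative_happens`, its parameters, and the "majority" rule (one possible reading of a conformity norm) are hypothetical.

```python
import random


def chance_initiative_happens(true_value, n_agents, rule="unilateral",
                              noise_sd=1.0, trials=20000, seed=0):
    """Estimate how often an initiative is undertaken under a decision rule.

    Illustrative assumption: each agent's estimate is the true value plus
    independent Gaussian noise, and an agent favours acting when its own
    estimate is positive.
    """
    rng = random.Random(seed)
    undertaken = 0
    for _ in range(trials):
        # Count the agents whose noisy estimate says the initiative is good.
        in_favour = sum(
            true_value + rng.gauss(0.0, noise_sd) > 0 for _ in range(n_agents)
        )
        if rule == "unilateral":
            acted = in_favour >= 1            # any single agent can act alone
        elif rule == "majority":
            acted = in_favour > n_agents / 2  # act only if most agents agree
        else:
            raise ValueError(f"unknown rule: {rule}")
        undertaken += acted
    return undertaken / trials


if __name__ == "__main__":
    # The initiative is actually harmful (true value below zero).
    for n in (1, 5, 20):
        alone = chance_initiative_happens(-1.0, n)
        voted = chance_initiative_happens(-1.0, n, rule="majority")
        print(f"agents={n:2d}  unilateral={alone:.3f}  majority={voted:.3f}")
```

In this toy model, the chance that a harmful initiative goes ahead grows with the number of agents who can each act alone. It shrinks when action requires agreement among most of them.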
Does human enhancement threaten our dignity as some have asserted? Or could our dignity perhaps be technologically enhanced? After disentangling several different concepts of dignity, this essay focuses on the idea of dignity as a quality (a kind of excellence admitting of degrees). The interactions between enhancement and dignity as a quality are complex and link into fundamental issues in ethics and value theory.
Overview of ethical issues raised by the possibility of creating intelligent machines. Questions relate both to ensuring such machines do not harm humans and to the moral status of the machines themselves.
Short article summarizing some of the key issues and offering specific recommendations, illustrating the opportunity and need for "smart policy": the integration into public policy of a broad spectrum of approaches aimed at protecting and enhancing cognitive capacities and epistemic performance of individuals and institutions.
After some definitions and conceptual clarification, I argue for two theses. First, some posthuman modes of being would be extremely worthwhile. Second, it could be good for human beings to become posthuman.
The revised version 2.1. The document represents an effort to develop a broadly based consensus articulation of the basics of responsible transhumanism. Some one hundred people collaborated with me in creating this text. Feels like from another era.
Wonderful ways of being may be located in the "posthuman realm", but we can't reach them. If we enhance ourselves using technology, however, we can go out there and realize these values. This paper sketches a transhumanist axiology.
The human desire to acquire new capacities, to extend life and overcome obstacles to happiness is as ancient as the species itself. But transhumanism has emerged gradually as a distinctive outlook, with no one person being responsible for its present shape. Here's one account of how it happened.
Existential risks are those that threaten the entire future of humanity. This paper elaborates the concept of existential risk and its relation to basic issues in axiology and develops an improved classification scheme for such risks. It also describes some of the theoretical and practical challenges posed by various existential risks and suggests a new way of thinking about the ideal of sustainability.
Examines the risk from physics experiments and natural events to the local fabric of spacetime. Argues that the Brookhaven report overlooks an observation selection effect. Shows how this limitation can be overcome by using data on planet formation rates.
Twenty-six leading experts look at the gravest risks facing humanity in the 21st century, including natural catastrophes, nuclear war, terrorism, global warming, biological weapons, totalitarianism, advanced nanotechnology, general artificial intelligence, and social collapse. The book also addresses overarching issues—policy responses and methods for predicting and managing catastrophes. Foreword by Lord Martin Rees.
This paper explores some dystopian scenarios where freewheeling evolutionary developments, while continuing to produce complex and intelligent forms of organization, lead to the gradual elimination of all forms of being worth caring about. We then discuss how such outcomes could be avoided and argue that under certain conditions the only possible remedy would be a globally coordinated effort to control human evolution by adopting social policies that modify the default fitness function of future life forms.
Technological revolutions are among the most important things that happen to humanity. This paper discusses some of the ethical and policy issues raised by anticipated technological revolutions, such as nanotechnology.
Existential risks are ways in which we could screw up badly and permanently. Remarkably, relatively little serious work has been done in this important area. The point, of course, is not to welter in doom and gloom but to better understand where the biggest dangers are so that we can develop strategies for reducing them.
Information hazards are risks that arise from the dissemination or the potential dissemination of true information that may cause harm or enable some agent to cause harm. Such hazards are often subtler than direct physical threats, and, as a consequence, are easily overlooked. They can, however, be important.
Embryo selection during IVF can be vastly potentiated when the technology for stem-cell-derived gametes becomes available for use in humans. This would enable iterated embryo selection (IES), compressing the effective generation time in a selection program from decades to months.
Human beings are a marvel of evolved complexity. Such systems can be difficult to upgrade. We describe a heuristic for identifying and evaluating potential human enhancements, based on evolutionary considerations.
Presents two theses, the orthogonality thesis and the instrumental convergence thesis, that help us understand the possible range of behavior of superintelligent agents, and that point to some potential dangers in building such an agent.
Cognitive enhancement comes in many diverse forms. In this paper, we survey the current state of the art in cognitive enhancement methods and consider their prospects for the near-term future. We then review some of the ethical issues arising from these technologies. We conclude with a discussion of the challenges for public policy and regulation created by present and anticipated methods for cognitive enhancement.
This paper argues that at least one of the following propositions is true: (1) the human species is very likely to go extinct before reaching the posthuman stage; (2) any posthuman civilization is extremely unlikely to run a significant number of simulations of its evolutionary history (or variations thereof); (3) we are almost certainly living in a computer simulation. It follows that the naïve transhumanist dogma that there is a significant chance that we will one day become posthumans who run ancestor-simulations is false, unless we are currently living in a simulation. A number of other consequences of this result are also discussed.
Superintelligence is out in paperback. Buy many copies now!
“I highly recommend this book.”—Bill Gates
“very deep … every paragraph has like six ideas embedded within it.”—Nate Silver
“terribly important … groundbreaking” “extraordinary sagacity and clarity, enabling him to combine his wide-ranging knowledge over an impressively broad spectrum of disciplines – engineering, natural sciences, medicine, social sciences and philosophy – into a comprehensible whole” “If this book gets the reception that it deserves, it may turn out the most important alarm bell since Rachel Carson's Silent Spring from 1962, or ever.”—Olle Haggstrom, Professor of Mathematical Statistics
“Nick Bostrom makes a persuasive case that the future impact of AI is perhaps the most important issue the human race has ever faced. … It marks the beginning of a new era.”—Stuart Russell, Professor of Computer Science, University of California, Berkeley
“Those disposed to dismiss an 'AI takeover' as science fiction may think again after reading this original and well-argued book.” —Martin Rees, Past President, Royal Society
“Worth reading…. We need to be super careful with AI. Potentially more dangerous than nukes”—Elon Musk
“There is no doubting the force of [Bostrom's] arguments … the problem is a research challenge worthy of the next generation's best mathematical talent. Human civilisation is at stake.” —Financial Times
“This superb analysis by one of the world's clearest thinkers tackles one of humanity's greatest challenges: if future superhuman artificial intelligence becomes the biggest event in human history, then how can we ensure that it doesn't become the last?” —Professor Max Tegmark, MIT
Failure to consider observation selection effects results in a kind of bias that infests many branches of science and philosophy. This book presents the first mathematical theory for how to correct for these biases. It also discusses some implications for cosmology, evolutionary biology, game theory, the foundations of quantum mechanics, the Doomsday argument, the Sleeping Beauty problem, the search for extraterrestrial life, the question of whether God exists, and traffic planning.
Current cosmological theories say that the world is so big that all possible observations are in fact made. But then, how can such theories be tested? What could count as negative evidence? To answer that, we need to consider observation selection effects.
"Anthropic shadow" is an observation selection effect that prevents observers from observing certain kinds of catastrophes in their recent geological and evolutionary past. We risk underestimating the risk of catastrophe types that lie in this shadow.
The Doomsday argument purports to prove, from basic probability theory and a few seemingly innocuous empirical premises, that the risk that our species will go extinct soon is much greater than previously thought. My view is that the Doomsday argument is inconclusive—although not for any trivial reason. In my book, I argued that a theory of observation selection effects is needed to explain where it goes wrong.
When driving on the motorway, have you ever wondered about (and cursed!) the fact that cars in the other lane seem to be getting ahead faster than you? One might be tempted to account for this by invoking Murphy's Law ("If anything can go wrong, it will", discovered by Edward A. Murphy, Jr, in 1949). But there is an alternative explanation, based on observational selection effects…
If two brains are in identical states, are there two numerically distinct phenomenal experiences or only one? Two, I argue. But what happens in intermediary cases? This paper looks in detail at this question and suggests that there can be a fractional (non-integer) number of qualitatively identical experiences. This has implications for what it is to implement a computation and for Chalmers's Fading Qualia thought experiment.
Nick Bostrom is a Swedish-born philosopher with a background in theoretical physics, computational neuroscience, logic, and artificial intelligence, as well as philosophy. He is the most-cited professional philosopher in the world under the age of 50.
He is a Professor at Oxford University, where he heads the Future of Humanity Institute as its founding director. He is the author of some 200 publications, including Anthropic Bias (2002), Global Catastrophic Risks (2008), Human Enhancement (2009), and Superintelligence: Paths, Dangers, Strategies (2014), a New York Times bestseller which helped spark a global conversation about the future of AI. He has also published a series of influential papers, including ones that introduced the simulation argument (2003) and the concept of existential risk (2002).
Bostrom’s academic work has been translated into more than 30 languages. He is a repeat main TED speaker and has been interviewed more than 1,000 times by various media. He has been on Foreign Policy’s Top 100 Global Thinkers list twice and was included in Prospect’s World Thinkers list, the youngest person in the top 15. As a graduate student he dabbled in stand-up comedy on the London circuit, but he has since reconnected with the heavy gloom of his Swedish roots.
My interests cut across many disciplines and may therefore on the surface appear somewhat scattershot, but in fact they all share a common aim, namely to figure out how to orient ourselves with respect to important values. I sometimes refer to this as “macrostrategy”: the study of how the value of ultimate outcomes may be connected to present-day actions. My work seeks to contribute to this at both the object level (e.g. by investigating the implications of hypothetical technologies or structural conditions in the current world order) and the meta level (e.g. by developing concepts and analytical techniques that help us think more effectively about this type of question).
Much of this thinking takes place in a pre-paradigm environment. This is a situation where it is not clear what the right questions are or how to get a grip on them intellectually. It may not even be obvious that there are any problems there to be solved. For example, this was the state of affairs with respect to AI alignment (an area I’ve been interested in since the mid-90s). Until about a decade ago, AGI and superintelligence were widely dismissed as a science fiction topic and ignored by academia. AI alignment has now emerged as a thriving research field, with many smart people writing code and equations and gradually making advances. But significant cognitive labor was required to get to this point where cumulative technical progress can start happening.
While there is no recipe for paradigm-creating, I think that some factors (such as intellectual curiosity; a self-critical habit of noticing subtle confusions or tensions in one’s present outlook and leaning into them rather than away from them; a nose for what is important and promising; and simply sustained hard thinking) make success more likely. Since the results can be quite impactful, it seems worth seeing if one can cultivate this kind of fruit, at least in the negative sense of removing impediments to its growth. I’ve tried to do this within the FHI: nurture an intellectual microclimate where some very talented people have the freedom to explore big-picture questions for humanity, insulated as far as possible from the dysfunctions of the university world.
Aside from my work related to the AI control problem, I have also originated or contributed to the development of ideas such as the simulation argument, existential risk, transhumanism, information hazards, superintelligence strategy, astronomical waste, crucial considerations, observation selection effects in cosmology and other contexts of self-locating belief, anthropic shadow, the unilateralist’s curse, the parliamentary model of decision-making under normative uncertainty, the notion of a singleton, and the vulnerable world hypothesis, along with a number of analyses of future technological capabilities and concomitant ethical issues, risks, and opportunities. Most recently, I’ve been doing some work on the moral and political status of digital minds, and on some issues in metaethics. (I also have a big writing project underway, but it’s too early-stage to discuss quite yet.)
One misinterpretation I’ve noticed some people making is to impute to me the view that we ought to be fanatically focused on reducing existential risk and that nothing else really deserves our attention. I must accept some responsibility for this misunderstanding, inasmuch as I have certainly presented arguments to the effect that some moral theories appear to imply (given some assumptions) something close to this conclusion—notably, aggregative consequentialism, including various flavors of utilitarianism. However, I’m not a consequentialist. In other papers I’ve pointed out difficulties for aggregative consequentialism; see, e.g., “Infinite Ethics” and “Pascal's Mugging”. (By the way, I’m also not against AI, in case somebody got that impression.) My actual views are more complex. I don’t think I have yet managed to properly articulate my all-things-considered views on these matters. Something like the parliamentary model of normative uncertainty is perhaps closer to what I would favor, though I don’t think that is quite right either. In any case, I’m neither temperamentally nor ideologically inclined towards fanaticism—or indeed towards any kind of “-ism”.
If you need to contact me directly (I regret I am not always able to respond to emails):
ON THE BANK
On the bank at the end
Of what was there before us
Gazing over to the other side
On what we can become
Veiled in the mist of naïve speculation
We are busy here preparing
Rafts to carry us across
Before the light goes out leaving us
In the eternal night of could-have-been