The third source of scientific knowledge is simulation. While there are many historically important analog simulations run on mechanical models, I'm going to focus on computer simulations. At its core, a simulation consists of a model and initial conditions. We can think of a model as a representation of some real or imagined part of the world, and the initial conditions as the starting point.
Let's take a simple and famous example to begin with. The economist Thomas Schelling presented a very simple model that shows how racial segregation can occur in cities even when little racism is present. To be clear, he didn't argue that cities don't have racism, only that extensive racial segregation can happen even in cities that are very, though not completely, racially tolerant. His model was agent-based, meaning that individuals are explicitly represented in the model. The model represents a city as a grid and people as colored dots on the grid. So let's use blue and green people. We also need to define neighborhoods. Each individual has a neighborhood consisting of the eight adjacent squares, what's called the Moore neighborhood in this literature. Every individual prefers that at least 30% of its neighbors be of the same type: the blue ones want to have 30% or more blue neighbors, and the green ones want to have 30% or more green neighbors. No one particularly cares about being in a homogeneous neighborhood or even having a majority of neighbors look alike.
Now, we need to know two more things to understand the simulation. First, how do the agents move in this model? Remember, their preference is for having 30% or more like neighbors. In each time step of the model, every agent considers its neighbors and decides whether or not this threshold has been met. If it has, the agent does nothing; if it hasn't, the agent moves to the nearest open space. The second thing we need is the initial conditions: how the agents are initially distributed on the grid. The simplest way to work with this model is to randomize the initial placement of agents and then repeatedly run the simulation, each time with a different random configuration.
Let's watch what happens. As we saw in these examples, Schelling's model shows that the non-segregated state is very unstable. Even when most agents are initially satisfied with their neighbors, they can become disgruntled as soon as a neighbor leaves or a new one moves in, and their moves in turn make other agents disgruntled. So a small patch of dissatisfaction results in widespread dissatisfaction, widespread movement, and ultimately segregation. Though there are a few integrated grid configurations in which every agent is happy, these are extremely rare and virtually impossible to find at random. That is Schelling's major result: small preferences for similarity can lead to massive segregation. And this result is quite robust, meaning it occurs across many changes to the model, including different tolerance thresholds, different rules for movement, different neighborhood sizes, different spatial arrangements, and so forth. In fact, it's extremely hard to avoid segregation when agents have any preference at all for neighbors like themselves. Now, Schelling's simulation is very simple, but it illustrates very well how simulations are put together and what we can learn from them.
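To make the mechanics concrete, here is a minimal sketch of a Schelling-style simulation in Python. It is an illustration of the rules just described, not Schelling's original code; the grid size, the fraction of empty cells, and the use of Manhattan distance to pick the "nearest" open cell are simplifying assumptions of mine.

```python
import random

SIZE = 20          # 20 x 20 grid (an illustrative choice)
THRESHOLD = 0.3    # each agent wants at least 30% same-type neighbors
EMPTY, BLUE, GREEN = 0, 1, 2


def random_grid(fill=0.8):
    """Randomized initial conditions: scatter blue and green agents, leave some cells empty."""
    grid = [[EMPTY] * SIZE for _ in range(SIZE)]
    for r in range(SIZE):
        for c in range(SIZE):
            if random.random() < fill:
                grid[r][c] = random.choice([BLUE, GREEN])
    return grid


def satisfied(grid, r, c):
    """True if at least 30% of the agent's Moore neighbors are its own type."""
    me = grid[r][c]
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == dc == 0:
                continue
            nr, nc = r + dr, c + dc
            if 0 <= nr < SIZE and 0 <= nc < SIZE and grid[nr][nc] != EMPTY:
                total += 1
                same += grid[nr][nc] == me
    return total == 0 or same / total >= THRESHOLD


def step(grid):
    """One time step: every dissatisfied agent moves to the nearest open cell."""
    moved = False
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE) if grid[r][c] == EMPTY]
    for r in range(SIZE):
        for c in range(SIZE):
            if grid[r][c] != EMPTY and not satisfied(grid, r, c) and empties:
                # "nearest" is measured here by Manhattan distance, as a simplification
                dest = min(empties, key=lambda e: abs(e[0] - r) + abs(e[1] - c))
                grid[dest[0]][dest[1]], grid[r][c] = grid[r][c], EMPTY
                empties.remove(dest)
                empties.append((r, c))
                moved = True
    return moved


if __name__ == "__main__":
    random.seed(0)
    grid = random_grid()
    for t in range(200):
        if not step(grid):
            print(f"No one wants to move after {t} steps")
            break
```

Repeated runs of a model like this, started from different random grids, typically settle into large single-color clusters, which is exactly the instability of the integrated state described above.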
All of the same principles apply to something more complex. For example, I talked in the past about the phenomenon of global warming. When scientists want to make predictions about changes to future climates, they of course can't do experiments that bear directly on the subject. So instead, they use simulations, much more complicated than Schelling's but still fundamentally the same. Specifically, they start with a model of the climate. This has to take into account the air in the atmosphere, the oceans, the incoming solar radiation, the landmasses, and many, many other things that influence the weather and the climate. And then this model has to represent how energy is transferred through the system, and how this in turn changes climatic variables including temperature, rainfall, and sea level. If you follow the discussions of climate change, you might know that there is no single simulation or model in use. Why is that? The reason is the enormous complexity of the climate. The climate is what scientists call a complex system, which means it's composed of a very large number of interacting parts. This means that there is no way to fully and realistically simulate the system, so approximations have to be made. And while there's no disagreement about the physical structure of the Earth or the physics of energy transfer, there are different ways to make approximations. The result is multiple closely related but distinct models that differ in how the complexity is simplified. I want to come back to this point.
As in the Schelling case, we also need to include initial conditions and other inputs to the model. Climate models are often initialized with past climate records. But to make predictions about the future, climate scientists need to know how much carbon we will continue to burn. Since that's of course unknown, they run the model under different scenarios: for example, increased emissions over the next 50 years, steady emissions over the next 50 years, a 25% reduction, and so forth. Like the Schelling simulation, climate simulations are run multiple times, randomizing some variables and looking for general trends in the outputs. Moreover, since there is no one master model behind all the simulations, each of the major models has to be tried out. This is the basis for graphs like this one, which show massive increases in the global mean temperature unless there are drastic reductions in carbon emissions in the near future. Notice that there are multiple lines in this graph. They correspond to results from the different models and from different assumptions about how much carbon we end up emitting, among other variables.
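To make that workflow vivid, here is a deliberately toy sketch in the same spirit. It is not a real climate model: the scenario names, the update rule, and every constant in it are invented for illustration. What it shows is only the structure just described: the same model run under several emissions scenarios, many times each with a randomized variable, with the outputs averaged to look for general trends.

```python
import random

# Hypothetical emissions scenarios: annual fractional change in emissions.
# The names and numbers are illustrative stand-ins, not real projections.
SCENARIOS = {
    "increased emissions": +0.02,
    "steady emissions": 0.0,
    "25% reduction": -0.00575,   # compounds to roughly a 25% cut over 50 years
}


def toy_run(emissions_trend, years=50):
    """A single run of a toy 'climate model' with made-up constants.

    The point is the structure: a model (the update rule), initial conditions
    (a zero temperature anomaly), a scenario (emissions_trend), and a
    randomized variable (the noise term).
    """
    temp, emissions = 0.0, 1.0            # initial conditions (arbitrary units)
    for _ in range(years):
        emissions *= 1 + emissions_trend  # the scenario drives emissions
        temp += 0.02 * emissions + random.gauss(0, 0.05)  # invented sensitivity + variability
    return temp


def ensemble_mean(emissions_trend, runs=200):
    """Average the final 'warming' over many randomized runs of one scenario."""
    return sum(toy_run(emissions_trend) for _ in range(runs)) / runs


if __name__ == "__main__":
    random.seed(0)
    for name, trend in SCENARIOS.items():
        print(f"{name:>20}: mean warming after 50 years = {ensemble_mean(trend):.2f} (arbitrary units)")
```

Real climate simulations differ in almost every detail, of course, but the pattern is the same: model plus initial conditions plus scenario, repeated with randomized variables, and compared across the major models.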
Having now talked about two simulations, a simple one and a highly complex one, let's pull back and ask some philosophical questions about them. Specifically, are they really empirical, and how can we be sure they're reliable? First, let's talk about the extent to which simulations are empirical. I've argued that science gets a lot of its power from its empiricism, its ultimate foundation in public observations of the world. But simulations look different. They start from a model, which is often represented in a computer, and they don't carry out any conventional observation. Instead, scientists use simulations to compute what happens in a model given some initial conditions. Ultimately, it looks like we're doing math here, not observation. Something's clearly right about this, and there's certainly a difference in procedure. Simulations don't require us to go out into the field to observe or to manipulate anything in the laboratory. But there are some important similarities between experiments and simulations.
In both cases, scientists create conditions to test what happens when one or just a few variables are manipulated. Moreover, although experimental systems are physically in the world in a way that most models are not, experimental manipulations happen on experimental systems, which are often modified versions of what we'd find in the world. For example, the Levine lab fruit flies are real fruit flies, but they've been bred for laboratory purposes. We would never find these particular flies in nature unless one escaped. So in both experiments and simulations, we need to make an additional inferential step about the world from what we've learned by manipulating a system, be it an experimental system like a laboratory fruit fly or the model at the heart of a simulation.
There's another way that many simulations are grounded in public observations. The components of a model and its initial conditions are of course based on prior observations about the world. In some cases, like climate simulations, the long accumulated knowledge of many different scientific fields is used to create the model, and observations are used to give the simulation its initial conditions. In other kinds of models, like the Schelling simulation, very simple descriptions of how people might behave are used to construct the model. So if we want to draw inferences on the basis of such a simulation, we have to investigate whether the social and psychological assumptions in the model bear any relationship to real communities. So in the end, despite possibly seeming not to be empirical, simulations really are thoroughly empirical. We can think of simulations as a tool for exploring the consequences of facts that we have determined empirically, or of facts that we would need to investigate further empirically.
One final question about simulations: are they reliable? Can we count on them as a guide when making decisions about the future, or when using them to intervene in matters of health or public policy? There are two versions of this question. The first concerns particular simulations. It's always a good idea to ask about the quality of a simulation. As non-experts, usually the best we can do is evaluate the reliability of the research team that performed the simulation, or maybe the venue in which it's published. Do they have a track record? Do they have an ulterior motive or agenda? But the philosophical question is a deeper one. Simulations often take what is known about the past behavior of systems and use it to make predictions about new scenarios, including new domains, new people, or the future. Whether that kind of extrapolation is justified is a real question, but it's part of a nexus of quite general questions about scientific inference. These questions include: how do we know we haven't neglected some important process or factor that will be important in the future? How do we know that the future will resemble the past in the important respects? And when we move from one system, say fruit flies in the lab, to another, say fruit flies in nature, how do we know that there isn't an additional consideration we've neglected? These are deep and important questions, the answers to which are never definitive. The epistemic power of science does not come from having definitive ways to deal with all these issues once and for all. Rather, it comes from a general posture of openness. Openness to the possibility that some as-yet-undiscovered factor will prove to be important.
That the future may differ in some important respect from the past, and so forth. And this is why scientists say that their results should always be taken to be provisional even when they're well supported. And this is the theme that we'll continue to return to when we discuss objectivity.