Take the case of Bob and Susie, who have been armed with rocks to throw at a glass bottle. Bob is standing a little closer to the bottle than Susie is, so Susie aims and throws her rock a little earlier than Bob throws his, but their rocks hit the glass simultaneously, breaking it shortly after impact. It is assumed that once each child aims and throws their rock, it hits the glass with probability one and the glass breaks with probability one (they have excellent aim and a strong desire to destroy glassware).

Why should one be the genuine cause of the glass breaking, simply because it was earlier? This case can also be altered slightly so that one event is clearly responsible, making it a case of preemption. There are two senses in which something may be spurious, which correspond to looking for particular earlier events that explain the effect better than the spurious cause versus making a partition and looking at kinds of events.

### Eells

A second major advance in probabilistic causality comes from the work of Ellery Eells, who proposed separate theories of type and token-level causality.

Recall that type causation refers to relationships between kinds of events, factors, or properties, while token causation refers to relationships between particular events that actually occur. This approach gives a new way of measuring the strength of a causal relationship and probabilistically analyzing token causality. While Mackie and Lewis explicitly addressed token causality, this is one of the few probabilistic accounts.

### Type-level causation

At the type level, Eells focuses not on finding a single factor that renders a relationship spurious (as Reichenbach and Suppes do), but rather on quantifying the difference a potential cause makes to the probability of its effect. To do this, the probability difference is calculated while holding fixed a set of background contexts, averaging over all of these. A background context is a particular assignment of truth values for a set of variables, so with $n$ factors other than the cause, there are $2^n$ ways of holding these fixed.

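Before quantifying the importance of a cause for an effect, Eells defines that C is a positive causal factor for E iff for each background context $K_i$:

$$P(E \mid K_i \wedge C) > P(E \mid K_i \wedge \neg C).$$

Negative and neutral causal factors are defined analogously, with the inequality replaced by $<$ and $=$ respectively. Lastly, C may also have mixed relevance for E, where it is not entirely negative, positive, or neutral. Eells defines that C is causally relevant to E if it has mixed, positive, or negative relevance for E, that is, if it is not causally neutral.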

Consider how this would work for smoking and lung cancer. The background factors here may be genetic predispositions (G) and asbestos exposure (A). For smoking to be a positive causal factor, the probability of lung cancer (LC) would have to be greater when smoking (S) is present than absent, with respect to each context.
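The smoking example can be sketched numerically. The following is a minimal illustration, not Eells's own formulation: every probability in it is a made-up value, and weighting each context by its probability when averaging is an assumption about how the "averaging over all of these" is carried out.

```python
from itertools import product

# Hypothetical values of P(LC | S, G, A); keys are (S, G, A).
# These numbers are illustrative assumptions, not data from the source.
P_LC = {
    (True, False, False): 0.10, (False, False, False): 0.01,
    (True, False, True): 0.25, (False, False, True): 0.05,
    (True, True, False): 0.30, (False, True, False): 0.08,
    (True, True, True): 0.50, (False, True, True): 0.20,
}

# Hypothetical probability of each background context (G, A); sums to 1.
P_CONTEXT = {
    (False, False): 0.70, (False, True): 0.10,
    (True, False): 0.15, (True, True): 0.05,
}


def background_contexts(n):
    """Return all 2**n truth assignments for n background factors."""
    return list(product([False, True], repeat=n))


def differences(p_effect, contexts):
    """P(E | C, K) - P(E | not C, K) for every background context K."""
    return {k: p_effect[(True, *k)] - p_effect[(False, *k)] for k in contexts}


def classify(diffs):
    """Label the cause as positive, negative, neutral, or mixed."""
    values = list(diffs.values())
    if all(d > 0 for d in values):
        return "positive"
    if all(d < 0 for d in values):
        return "negative"
    if all(d == 0 for d in values):
        return "neutral"
    return "mixed"


contexts = background_contexts(2)
diffs = differences(P_LC, contexts)
relevance = classify(diffs)
# Average difference, weighting each context by its (assumed) probability.
average_difference = sum(P_CONTEXT[k] * d for k, d in diffs.items())
print(relevance, round(average_difference, 3))
```

With these assumed numbers smoking raises the probability of lung cancer in all four contexts, so it is classified as a positive causal factor; changing a single entry so that smoking lowers the probability in one context would make it mixed.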

This requirement that a causal relationship must hold in all background contexts is called context unanimity, and is the subject of ongoing debate. Context unanimity is not assumed in this book during inference, discussed in section 4. There may be factors that bring about the effect in some scenarios even though they lower its probability in conjunction with other conditions.

Similarly, many of the things that determine whether a cause a. Wearing a seatbelt generally lowers the probability of death from a car accident, but in some cases may injure a person on impact or may prevent them from escaping a vehicle that is on fire. Thus, it will have mixed causal relevance for death from car accidents, even though the majority of the time it is a negative cause.

In addition to determining whether C is causally relevant to E, we may want to describe how relevant C is to E. However, among other details, Eells omits factors that are causally intermediate between X and Y, so effects of X would be excluded. Thus, as before, the causes inferred may.

### Token-level causation

While some theories link type and token causation by relating known type-level causes to single cases or trying to draw type-level conclusions from observations of token-level cases, Eells proposes two unconnected theories.

This poses some problems if taken as a methodological recommendation, but the key point is that type-level relationships do not necessitate token-level ones and, whether or not the theories are connected, information about one level is insufficient. Recall the case of a lit match and a house fire, which was discussed in terms of INUS conditions.

Regardless of the general relationship between lit matches and house fires, we need to know more about the individual situation to determine whether a lit match caused a particular fire. For example, it is highly unlikely that a match lit days before a fire should be a token cause of it. Even more challenging is a case where a type-level positive cause is a negative token-level cause. (Examples of this type are discussed in depth in section 6.) Here there are two actually occurring events, specified by their locations in time and space, that are instances of two general types.

This could be a particular person, Anne, driving a car while drunk, and her car crashing. To make these determinations, Eells suggests examining not a single probability, but rather how the probability of the effect changed over time. This so-called probability trajectory details the probability of y being of type Y over time, starting before x is of type X and ending with the actual occurrence of y. Then, it is said that y is of type Y because of x if the following are all true:

1. The probability of Y changes at the time x occurs;
2. Just after x the probability of y is high;
3. The probability is higher than it was before x; and
4. The probability remains high until the time of y.

Continuing with the example of Anne, we would deem the car crash occurring at time t to be because of her drunk driving if the probability of the crash increased after she got behind the wheel and remained increased until the crash actually occurred.
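The four conditions on a probability trajectory can be written as a small check. This is only a sketch under stated assumptions: time is discretized into steps, the trajectory values are invented, and the `high` threshold is a hypothetical parameter, since the text does not say what counts as a high probability.

```python
def because_of(trajectory, t_x, t_y, high=0.5):
    """Check the four conditions on a probability trajectory.

    trajectory[t] is the probability of Y at time step t, t_x is the step
    at which x occurs, and t_y is the step at which y occurs (1 <= t_x < t_y).
    The `high` threshold is an assumption; the text does not fix a value.
    """
    before, after = trajectory[t_x - 1], trajectory[t_x]
    changes = after != before                                  # condition 1
    high_after = after >= high                                 # condition 2
    higher = after > before                                    # condition 3
    stays_high = all(p >= high for p in trajectory[t_x:t_y])   # condition 4
    return changes and high_after and higher and stays_high


# Hypothetical trajectories for the probability of a crash.
steady = [0.05, 0.05, 0.60, 0.70, 0.80]  # rises at step 2 and stays high
dips = [0.05, 0.05, 0.60, 0.20, 0.80]    # drops again before the crash
print(because_of(steady, t_x=2, t_y=4), because_of(dips, t_x=2, t_y=4))
```

The second trajectory fails the fourth condition: the probability falls after the candidate cause, so something else must have raised it again before the effect.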

The probability of the effect changed once the cause occurred, becoming higher than it was before, and did not decrease before the effect finally occurred (which would make it seem that something else must have occurred to raise it again). Then, x is causally relevant to y if y happened either because of or despite x. These are 1. Factors that token occur in the particular case, are token uncaused by x being X and that interact with X with respect to Y holding fixed what is actually true before $x_t$. These factors may occur at any time before $y_t$ and the causal background context is obtained by holding positively fixed all factors of these two kinds.

However, holding fixed these factors does not improve the classification of all relationships. Consider an event z where z occurs at some time after x and before y. In an example given by Eells, there is a patient who is very ill at time t1, who is likely to survive until t2 but not until a later t3.

Now assume that at t1 a treatment is administered that is equally. At t2 a completely effective cure is discovered and administered and the only remaining chance of death is due to the first ineffective treatment — not the disease. However, the probability of death did not change after the first treatment, so death was token causally independent of it. But, the relation should actually be despite, as the treatment put the patient at unnecessary risk due to its severe side effects which remain unchanged by the second treatment that cured the underlying disease.

In this example, the second drug is causally relevant to Y (survival) and is not caused by the administration of the first drug. When holding fixed the second drug being given, using the first kind of factor described, the first drug again has no effect on the probability of Y. Using the second kind of factor has no effect in this case, as the two drugs do not interact, so the probability of survival after the first drug does not change dependent on the presence or absence of the second drug. In what cases will we be able to actually separate the factors that were caused by x on a particular occasion from those that were interacting with it?

Further, outside of examples in physics, it is unlikely that we could find the probability trajectories.

## Causality, probability, and time

Remember, this is a probability based not on some relationship between type-level probabilities and token-level information but rather how the probability changes over time in a specific scenario. Eells seeks to elucidate the concept of causality itself, but the counterexamples to his theory suggest that it does not cover all possibilities. On the other hand, such theories may be useful as the basis for inference (suggesting the types of evidence to be amassed), but the stringent requirements on knowledge make this difficult. To summarize, Eells proposes two probabilistic theories of causality.

At the type level, causes must either positively or negatively produce their effects in all background contexts (or be deemed mixed causes if they do both), where these contexts include events earlier than the effect. For the second, at the token level, Eells analyzes how the probability of the token effect changes in relation to the occurrence of the cause.

### Causal Inference Algorithms

The philosophical approaches described so far aim to tell us what it is for something to be a cause, or how we can learn of causes, but to do this from data we need automated methods.

Much of this work involves field-specific methods that are designed to work with particular data types. Computational approaches generally aim to infer causal relationships from data (with significantly less work on extending this to the case of token causality), and can be categorized into two main traditions along with extensions to these. First, the majority of work on characterizing what can be inferred in general and how it can be inferred has been using graphical models (Pearl; Spirtes et al.).

The theories are technically probabilistic, but it is usually assumed that the relationships themselves are deterministic and the probabilities are due to the limits of what may be observed. It has also been used by many researchers in finance and neuroscience, so it will be useful to examine how the approach works and exactly what it is inferring.

### Bayesian networks

One of the first steps toward causal inference was the development of theories connecting graphical models to causal concepts, formulated in parallel by Pearl and by Spirtes, Glymour and Scheines (hereafter SGS), as described in Spirtes et al. (For a more detailed introduction to Bayesian networks, see ... and Friedman, Korb and Nicholson, and Neapolitan.)

In these graphs, variables are represented as vertices, and edges between the vertices indicate conditional dependence. The techniques do not require temporal data but rather are designed to take a set of observations (that may or may not be ordered) and produce one or more graphs showing the independence relations that are consistent with the data.

Three main assumptions are needed: the causal Markov condition (CMC), faithfulness, and causal sufficiency. CMC says that a variable is independent of all of its non-descendants conditional on its direct causes (its parents in the graph). For example, in a chain where X causes Z and Z causes Y, the influence of X on Y is entirely mediated by Z, so once Z is known, X is no longer relevant for predicting Y. With CMC, if two events are dependent and neither one is a cause of the other, then there must be some common causes in the set of variables such that the two events are independent conditional on these common causes.

The graphs are not necessarily complete, as there may be causes of some variables or variables intermediate between cause and effect that are left out. Thus, vertices are connected if one is a direct cause of the other, relative to the set of variables in the graph. The graphs are assumed to be complete though in the sense that all common causes of pairs on the set of variables are included. In the structure shown in figure 2. These independence relations allow the conditional probability of any variable to be calculated efficiently, and represented in a compact way.
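To make the compact representation concrete, the sketch below builds the joint distribution of a three-variable chain X → Z → Y as a product of each variable's probability given its parents, and then checks that Z screens X off from Y. The chain and all of its parameters are hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical parameters for a chain X -> Z -> Y over binary variables.
P_X = 0.3
P_Z_GIVEN_X = {True: 0.9, False: 0.2}
P_Y_GIVEN_Z = {True: 0.8, False: 0.1}


def bernoulli(p_true, value):
    """Probability of `value` for a binary variable with P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(x, z, y):
    """P(x, z, y) as the product of each variable given its parents."""
    return (bernoulli(P_X, x)
            * bernoulli(P_Z_GIVEN_X[x], z)
            * bernoulli(P_Y_GIVEN_Z[z], y))


def prob(event):
    """Probability of an event, given as a predicate over (x, z, y)."""
    return sum(joint(*w) for w in product([False, True], repeat=3) if event(*w))


def conditional(event, given):
    """P(event | given) computed from the joint distribution."""
    return prob(lambda *w: event(*w) and given(*w)) / prob(given)


y_true = lambda x, z, y: y
p_y_given_x = conditional(y_true, lambda x, z, y: x)
p_y_given_not_x = conditional(y_true, lambda x, z, y: not x)
p_y_given_z_x = conditional(y_true, lambda x, z, y: z and x)
p_y_given_z_not_x = conditional(y_true, lambda x, z, y: z and not x)

print(p_y_given_x, p_y_given_not_x)      # X and Y are dependent
print(p_y_given_z_x, p_y_given_z_not_x)  # equal once Z is held fixed
```

Only five numbers are needed to specify this joint distribution, rather than the seven an unstructured distribution over three binary variables would require.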

In general, the probability distribution for a set of variables, $x_1, \ldots, x_n$, can be factored as

$$P(x_1, \ldots, x_n) = \prod_{i=1}^{n} P(x_i \mid pa(x_i)),$$

where $pa(x_i)$ denotes the parents of $x_i$ in the graph. CMC is perhaps the most debated portion of the theory, leading to a multitude of papers criticizing (Cartwright; Freedman and Humphreys; Humphreys and Freedman) and defending it (Hausman and Woodward). The main issue is that depending on how accurately a system is represented, common causes may not always render their effects independent.

One example Spirtes et al. When the TV does turn on, both the sound and picture turn on as well. Thus, even after knowing that the switch is on, knowing that the sound is on still provides information about whether the picture is on as well. Here the picture is not independent of the sound, conditioned on the state of the switch, violating CMC since there is no edge between picture and sound and the switch fails to screen them off from one another.
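This failure of screening off can be reproduced with a small calculation. The sketch assumes a mechanism the example only implies: flipping the switch on closes the circuit with some probability below one, and sound and picture are on exactly when the circuit is closed. Both probabilities are invented for illustration.

```python
from itertools import product

# Hypothetical parameters; the text gives no numbers.
P_SWITCH_ON = 0.5
P_CLOSED_GIVEN_ON = 0.9  # the switch does not always turn the TV on


def joint(switch, circuit, sound, picture):
    """P(switch, circuit, sound, picture) under the assumed mechanism."""
    p = P_SWITCH_ON if switch else 1.0 - P_SWITCH_ON
    p_closed = P_CLOSED_GIVEN_ON if switch else 0.0
    p *= p_closed if circuit else 1.0 - p_closed
    # Sound and picture are on exactly when the circuit is closed.
    p *= 1.0 if sound == circuit else 0.0
    p *= 1.0 if picture == circuit else 0.0
    return p


def prob(event):
    return sum(joint(*w) for w in product([False, True], repeat=4) if event(*w))


def conditional(event, given):
    return prob(lambda *w: event(*w) and given(*w)) / prob(given)


picture_on = lambda sw, ci, so, pi: pi

# Conditioning only on the switch: sound still carries information.
p_pic_given_switch = conditional(picture_on, lambda sw, ci, so, pi: sw)
p_pic_given_switch_sound = conditional(picture_on, lambda sw, ci, so, pi: sw and so)

# Conditioning on the circuit: sound adds nothing further.
p_pic_given_circuit = conditional(picture_on, lambda sw, ci, so, pi: ci)
p_pic_given_circuit_sound = conditional(picture_on, lambda sw, ci, so, pi: ci and so)

print(p_pic_given_switch, p_pic_given_switch_sound)
print(p_pic_given_circuit, p_pic_given_circuit_sound)
```

Given only the switch, learning that the sound is on raises the probability that the picture is on, while given the state of the circuit the two probabilities coincide.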

Adding a variable indicating when there is a closed circuit, shown in figure 2. Previously, the picture or sound gave information about the circuit that was not provided by knowledge of the status of the switch. This case was clear with common knowledge about how circuits work, but it is less clear how such a scenario can be resolved in cases where the structure must be inferred from data. Next, the faithfulness condition says that exactly the independence relations in the graph hold in the probability distribution over the set of variables. The implication is that the independence relations obtained are due to the causal structure, rather than coincidence or latent unmeasured variables.

This is only true in the large sample limit, as with little data, the observations cannot be assumed to be indicative of the true probabilities. There are other cases where faithfulness can fail, though, even with sufficient data. Recall the case shown in figure 2. There were two paths from living in the country to lung cancer: one where this directly lowered the probability of lung cancer, and another where it raised the probability of smoking, which raised the probability of lung cancer.

Probability distributions generated from this structure would be said to be unfaithful if the health effect of living in the country exactly balances that of smoking, leading to living in the country being independent of lung cancer. Faithfulness can fail even without exact independence, and in fact one cannot verify exact independence from finite sample data (Humphreys and Freedman). In practice, one must choose a threshold at which to call variables conditionally independent, so there can be violations of this condition in a wider variety of cases than one may think, such as if the threshold for independence is too low or the distributions come close to canceling out without quite balancing.
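An exactly canceling case is easy to construct. In the sketch below, all probabilities are hypothetical and were chosen so that the two paths balance: living in the country lowers the probability of lung cancer for smokers and non-smokers alike, yet raises the probability of smoking by just enough that the overall probability of lung cancer is unchanged.

```python
# Hypothetical parameters, chosen so the two paths cancel exactly.
P_SMOKE_GIVEN_COUNTRY = {True: 0.6, False: 0.2}

# P(lung cancer | smokes, lives in country); keys are (smokes, country).
P_CANCER = {
    (True, True): 0.15, (False, True): 0.025,
    (True, False): 0.30, (False, False): 0.05,
}


def p_cancer_given_country(country):
    """Marginalize smoking out of P(cancer | smokes, country)."""
    p_smoke = P_SMOKE_GIVEN_COUNTRY[country]
    return (p_smoke * P_CANCER[(True, country)]
            + (1.0 - p_smoke) * P_CANCER[(False, country)])


p_country = p_cancer_given_country(True)
p_city = p_cancer_given_country(False)
print(p_country, p_city)  # equal up to rounding: no marginal dependence

# Within each smoking stratum, the direct protective effect is visible.
print(P_CANCER[(True, True)] < P_CANCER[(True, False)],
      P_CANCER[(False, True)] < P_CANCER[(False, False)])
```

A method that relies on faithfulness would read the marginal independence as the absence of any causal connection between country living and lung cancer, even though both paths are present by construction.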

Biological systems have features designed to ensure exactly this type of behavior, so this can potentially pose a more serious practical problem than it may seem. Another way faithfulness can fail is through selection bias. This is a considerable problem in studies from observational data, as it can happen without any missing causes. As discussed by Cooper, using data from an emergency department (ED), it may seem that fever and abdominal pain are statistically dependent.

However, this may be because only patients with those symptoms visit the ED, while those who have only a fever or only abdominal pain stay home. Finally, causal sufficiency means that the set of measured variables includes all of the common causes of pairs on that set. This differs from completeness in that it assumes that the true graph includes these common causes and that they are part of the set of variables measured.
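The effect of selection can be seen by comparing the whole population with the subset that visits the ED. In this sketch fever and abdominal pain are independent in the population by construction, and every number, including the small assumed chance of visiting for unrelated reasons, is hypothetical.

```python
# Hypothetical population: fever and abdominal pain are independent.
P_FEVER = 0.1
P_PAIN = 0.1

# Hypothetical probability of visiting the ED; keys are (fever, pain).
# People with both symptoms usually go; others rarely do.
P_VISIT = {
    (True, True): 0.90, (True, False): 0.02,
    (False, True): 0.02, (False, False): 0.02,
}


def weight(fever, pain, ed_only):
    """P(fever, pain), or P(fever, pain, visits ED) when ed_only is True."""
    p = (P_FEVER if fever else 1.0 - P_FEVER) * (P_PAIN if pain else 1.0 - P_PAIN)
    return p * P_VISIT[(fever, pain)] if ed_only else p


def p_pain_given_fever(fever, ed_only):
    """P(pain | fever status), in the population or among ED visitors."""
    with_pain = weight(fever, True, ed_only)
    without_pain = weight(fever, False, ed_only)
    return with_pain / (with_pain + without_pain)


population = (p_pain_given_fever(True, False), p_pain_given_fever(False, False))
ed_sample = (p_pain_given_fever(True, True), p_pain_given_fever(False, True))
print(population)  # the same with or without fever
print(ed_sample)   # much higher with fever than without
```

No cause is missing from the model; the dependence in the ED sample arises only because the chance of being observed depends on both symptoms.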

If in measurements of the variables in figure 2. Without this assumption, two common effects of a cause will erroneously seem dependent when their cause is not included. When sufficiency and the other assumptions do not hold, a set of graphs that are consistent with the dependencies in the data will be inferred, along with vertices for possible unmeasured common causes. The main idea is to find the graph or set of graphs that best explain the data, and for this there are two primary methods: (1) assigning scores to graphs and searching over the set of possible graphs while attempting to maximize a particular scoring function; (2) beginning with an undirected fully connected graph and using repeated conditional independence tests to remove and orient edges in the graph.

In the first approach, an initial graph is generated and then the search space is explored by altering this graph. The primary differences between algorithms of this type are how the search space is explored e. In the second approach, exemplified by the PC algorithm Spirtes et al. After removing these edges, the remaining ones are directed from cause to effect.
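The edge-removal phase of the second approach can be sketched as follows. This is a simplified illustration rather than the PC algorithm as published: it tests independence exactly against a known joint distribution instead of running statistical tests on data, it draws conditioning sets from the union of both endpoints' neighbors, and it omits edge orientation. The three-variable chain and its parameters are hypothetical.

```python
from itertools import combinations, product

NAMES = ["X", "Z", "Y"]  # variable indices 0, 1, 2


def bernoulli(p_true, value):
    return p_true if value else 1.0 - p_true


# Hypothetical joint distribution generated by the chain X -> Z -> Y.
JOINT = {
    (x, z, y): bernoulli(0.3, x)
    * bernoulli(0.9 if x else 0.2, z)
    * bernoulli(0.8 if z else 0.1, y)
    for x, z, y in product([False, True], repeat=3)
}


def marginal(assignment):
    """P(assignment), where assignment maps variable index -> value."""
    return sum(p for world, p in JOINT.items()
               if all(world[i] == v for i, v in assignment.items()))


def independent(i, j, given, tol=1e-9):
    """Exact test of whether i and j are independent conditional on `given`."""
    for values in product([False, True], repeat=len(given)):
        context = dict(zip(given, values))
        p_context = marginal(context)
        if p_context == 0:
            continue
        for vi, vj in product([False, True], repeat=2):
            p_ij = marginal({**context, i: vi, j: vj}) / p_context
            p_i = marginal({**context, i: vi}) / p_context
            p_j = marginal({**context, j: vj}) / p_context
            if abs(p_ij - p_i * p_j) > tol:
                return False
    return True


def skeleton(n):
    """Start fully connected; drop an edge when a separating set is found."""
    adjacent = {i: set(range(n)) - {i} for i in range(n)}
    separating = {}
    for size in range(n - 1):  # conditioning sets of growing size
        for i, j in combinations(range(n), 2):
            if j not in adjacent[i]:
                continue
            candidates = (adjacent[i] | adjacent[j]) - {i, j}
            for given in combinations(sorted(candidates), size):
                if independent(i, j, given):
                    adjacent[i].discard(j)
                    adjacent[j].discard(i)
                    separating[(i, j)] = set(given)
                    break
    edges = {frozenset((NAMES[i], NAMES[j])) for i in adjacent for j in adjacent[i]}
    return edges, separating


edges, separating = skeleton(len(NAMES))
print(sorted(sorted(e) for e in edges), separating)
```

On this distribution the X–Y edge is removed once Z is conditioned on, leaving the two edges of the chain. With real data the exact test would be replaced by a statistical one with a chosen threshold.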

When some edges cannot be directed, the result is a partially directed graph. The primary criticism of this approach is, as discussed, with regards to its assumptions. While it is not explicitly stated, one assumption is that the variables are correctly specified. This is more critical than one might think. Since BNs do not include temporal information, cases with a strong temporal component will lead to erroneous results if this is not somehow encoded into the variables. However, it is unlikely that, without knowing there is a relationship between two variables, we know its timing exactly. (Chapter 7 presents empirical results quantifying this.)

In practice, the three primary assumptions (CMC, faithfulness, causal sufficiency) all fail in various scenarios, so the question is whether one can determine whether they hold or if they are true in the majority of cases of interest.

### Dynamic Bayesian networks

While Bayesian networks can be used for representation and inference of causal relationships in the absence of time, many practical cases involve a lag between the cause and effect. We want to know not only that a stock will eventually go up after certain news, but exactly when, so that this information can be traded on.

However, BNs have no natural way of testing these relationships. The simplest case, a system that is stationary and Markov, is shown in figure 2. Recent work has extended DBNs to non-stationary time series, where there are so-called changepoints when the structure of the system (how the variables are connected) changes. Some approaches find these times for the whole system (Robinson and Hartemink), while others find variable-specific changepoints (Grzegorczyk and Husmeier). Variables can be defined arbitrarily, but there is no structured method for forming and testing hypotheses more complex than pairwise ones between variables.

One could not automatically determine that smoking for a period of 15 years while having a particular genetic mutation leads to lung cancer in 5 to 10 years with probability 0. Relationships between all variables at all lags in a range being tested are assessed simultaneously (leading to a score for the entire graph), requiring searching over a large sample space (all pairs of variables connected in all possible ways across a range of times). As a result, one must use heuristics, but these can be sensitive to the parameters chosen and overfit the data. Even more critically, few relationships involve discrete lags and, even in cases where the timing is precise, it is unlikely that it would seem that way from observational data.

Some researchers choose specific timepoints and create variables that group events occurring in a time range, but again one would need to know the timing of relationships before knowing of the relationships.

### Granger causality

Clive Granger developed a statistical method to take two time series and determine whether one is useful for forecasting the other. Granger did not attempt to relate this to philosophical definitions of causality but rather proposed a new definition that is most similar to correlation.

However, it is one of the few methods that explicitly include time and has been used widely in finance (Granger) and neuroscience (Bressler and Seth; Ding et al.). It has also been used by physicists to model information flow (Hlavackova-Schindler et al.). Thus, it will be useful to discuss the basic idea and its limitations, as well as some misconceptions about the approach.

It is also included in the empirical comparisons in chapter 7. The notation used here is as follows. Then, Granger defined:

$X_2$ may not be the best or only predictor of $X_1$, rather it is simply found to be informative after accounting for other information. This definition has been debated in both philosophy (Cartwright) and economics (Chowdhury; Jacobs et al.).

One cannot truly use all possible variables over an infinitely long timescale, so later work focused on making this approach feasible. While there are a number of methodological choices e. When researchers say they are using a Granger test, it is usually the bivariate test that is meant. It has the advantage of being simple and computationally efficient, though it does not capture the intention of the original definition, which is to use all information.

In the bivariate test, only two time series are included: that of the effect, $X_1$, and the cause, $X_2$. One bivariate method is to use an autoregressive model with the two variables, where if the coefficients of the lagged values of $X_2$ are nonzero, then $X_2$ is said to Granger-cause $X_1$. Each lagged value is weighted by a coefficient, so that a variable may depend more strongly on recent events than those that are more temporally distant.
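A bare-bones version of the bivariate idea is sketched below, using a single lag and ordinary least squares with no intercept. The simulated series, their coefficients, and the helper names are all hypothetical, and where a real test would assess whether the lagged coefficient differs significantly from zero, this sketch only prints the estimates.

```python
import random


def simulate(n, seed=0):
    """Simulate two series in which x2 drives x1 at lag 1, but not the reverse."""
    rng = random.Random(seed)
    x1, x2 = [0.0], [0.0]
    for _ in range(n - 1):
        new_x2 = 0.5 * x2[-1] + rng.gauss(0, 1)
        new_x1 = 0.3 * x1[-1] + 0.8 * x2[-1] + rng.gauss(0, 1)
        x1.append(new_x1)
        x2.append(new_x2)
    return x1, x2


def lag1_coefficients(target, other):
    """Least-squares fit of target[t] = a*target[t-1] + b*other[t-1] + noise."""
    s_tt = s_oo = s_to = s_ty = s_oy = 0.0
    for t in range(1, len(target)):
        u, v, y = target[t - 1], other[t - 1], target[t]
        s_tt += u * u
        s_oo += v * v
        s_to += u * v
        s_ty += u * y
        s_oy += v * y
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s_tt * s_oo - s_to * s_to
    a = (s_ty * s_oo - s_oy * s_to) / det
    b = (s_oy * s_tt - s_ty * s_to) / det
    return a, b


x1, x2 = simulate(5000)
_, b_x2_to_x1 = lag1_coefficients(x1, x2)  # weight of lagged x2 in the model of x1
_, b_x1_to_x2 = lag1_coefficients(x2, x1)  # weight of lagged x1 in the model of x2
print(round(b_x2_to_x1, 2), round(b_x1_to_x2, 2))
```

The estimated weight of lagged x2 in the model of x1 should come out near the simulated 0.8, while the weight in the reverse direction should be near zero.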

There are many implementations of this including the granger. Further, it cannot distinguish between causal relationships and correlations between effects of a common cause. This can be seen in equation 2. A more accurate approach is the multivariate one, which includes other variables in the model of each time series. The system is represented as:

$$X(t) = \sum_{j=1}^{m} A_j X(t-j) + E(t),$$

where $X(t)$ is the vector of all variables at time $t$, each $A_j$ is a matrix of coefficients for lag $j$, $m$ is the number of lags, and $E(t)$ is a vector of error terms. Using this representation, $X_2$ Granger-causes $X_1$ if at least one of $A_{12}(j)$ is nonzero. While this comes closer to causal inference than the bivariate test does, it has practical problems. Such a model quickly becomes computationally infeasible with even a moderate number of lags and variables.

Outside of finance (where influence is often assumed to drop off steeply as m increases), many areas of work involve influence over long periods of time, such as in epidemiological studies, but these would be prohibitively complex to test. Similarly, even if there were only a few lags, much work involves dozens to hundreds of variables. To illustrate the complexity of this approach, the multivariate test was applied to the same set of data as the approach discussed in this book. That method took 2. The same method was implemented in R, but required more than the available. (Chapter 7 gives results for the bivariate test.)

Thus, while the multivariate test has been shown to perform better in comparisons Blinowska et al. Nevertheless, researchers should be aware of its limitations. Much data in economics and finance is of this form, and information may be lost by binning such variables or treating them as binary (such as only increasing or decreasing). Similarly, the approach of this book has been extended to continuous variables, with a measure of causal significance based on conditional expected value and a new logic that allows representation of constraints on continuous variables (Kleinberg). Thus, there are other approaches that can be applied to handle continuous-valued time series.

### Probability

Whether we want to determine the likelihood of a stock market crash or if people with a given gene have a higher risk of a disease, we need to understand the details of how to calculate and assess probabilities. But first, what exactly are probabilities and where do they come from? There are two primary views. The frequentist view says that probabilities relate to the proportion of occurrences in a series of events.

The probability then corresponds However, we also discuss the probability of events that may only happen once. We may want to know the probability that a recession will end if a policy is enacted, or the chances of a federal interest rate change on a particular date. In the frequentist case, we can get close to inferring the true probability by doing a large number of tests, but when an event may only occur once, we must instead rely on background knowledge and belief.

There is another interpretation of probability, referred to as the Bayesian or subjectivist view. Here the probabilities correspond to degrees of belief in the outcome occurring. In this case, one must have what is called a prior, on which the belief is based. For example, if you bet that the Yankees will beat the Mets in the World Series, you are basing this on your knowledge of both teams, and given that information, which team you think is likelier to prevail. How closely the subjective probability corresponds to the actual probability depends heavily on the prior, and can differ between individuals.

In the following sections, I review some basic concepts in probability that are needed for the following chapters. Readers seeking a more thorough introduction should consult Jaynes, while readers familiar with these details may move on to the next section on logic.

### Basic definitions

We will begin with some basic concepts in probability. First, probabilities are defined relative to the set of possible outcomes, called the sample space. The sample space can be represented as a set, with events defined as subsets of this set. Here all outcomes are equally likely, so we can find the probability of an event by taking the number of favorable outcomes as a fraction of all outcomes.

We write the probability of an event x as P(x), where P is the function that assigns this probability. In general, the outcomes in the sample space can have varying probabilities (this will be the case if a coin is biased toward heads or tails), but there are some properties that a probability function must have.

First, the value of a probability must be greater than or equal to zero and less than or equal to one. An event with probability zero is usually considered impossible, although this is not the case when there is an infinite number of events. Second, the probabilities of all events in the sample space must add up to one.

This ensures that the probability that some outcome in the set will occur is one. Finally, if events are mutually exclusive (meaning they cannot both occur), the probability of either event occurring is the sum of their individual probabilities. For example, the events of flipping two heads and flipping two tails cannot both be true at the same time. On the other hand, events such as increasing unemployment and decreasing interest rates are not mutually exclusive, as one occurring does not preclude the other from occurring, so the probability of either happening needs to be calculated differently.

For events that are not mutually exclusive, the probability of either occurring is

$$P(A \vee B) = P(A) + P(B) - P(A \wedge B).$$

This is known as the addition rule. As the figure shows, the region where A and B overlap would otherwise be counted twice, so it is subtracted. Thus to find the probability of increasing unemployment or decreasing interest rates, we can sum their individual probabilities and then subtract the probability of both occurring. It is also useful to be able to calculate the probability of the negation of an outcome.

Going back to flipping a coin twice, let A be the event HH (heads twice in a row). Now if we are interested in the probability that A does not occur, we want the probability of all the other events that can happen: HT, TH, and TT. In general, $P(\neg A) = 1 - P(A)$. This means, for example, that the probability that someone does not have the flu is one minus the probability that they do have the flu.
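These rules can be checked by enumerating the sample space for two flips of a fair coin. The sketch below is only an illustration; the events `first_heads` and `second_heads` are hypothetical names chosen here to give two events that are not mutually exclusive.

```python
from fractions import Fraction
from itertools import product

# Sample space for two flips of a fair coin: HH, HT, TH, TT.
sample_space = set(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a subset of equally likely outcomes)."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {("H", "H")}                                      # heads twice in a row
first_heads = {o for o in sample_space if o[0] == "H"}
second_heads = {o for o in sample_space if o[1] == "H"}

# Addition rule for events that are not mutually exclusive.
either = prob(first_heads | second_heads)
by_rule = prob(first_heads) + prob(second_heads) - prob(first_heads & second_heads)

# Negation: all outcomes other than HH.
not_A = prob(sample_space - A)

print(either, by_rule, not_A)
```

Using `Fraction` keeps the arithmetic exact, so the two sides of the addition rule can be compared with `==` rather than with a tolerance.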

See figure 3. Two other concepts needed are dependence and independence of events. If I flip a coin twice, the outcome of the second flip is unrelated to the outcome of the first flip. This is an example of events that are independent. Imagine the full sample space as a rectangle containing the circles showing where A and B are true in figure 3.

Then P(A) tells us if we pick a random point in the rectangle what the chances are that point will be inside the circle marked A. When A and B are independent, whether a point is in A has no bearing on whether it is also in B, which has probability P(B). For example, it being Wednesday would not change the probability of a patient having a heart attack. However, if we are interested in the probability that someone smokes and has lung cancer, these events will likely be dependent.

If we compare this to the equation for the independent case, we see that when A and B are independent it implies:

$$P(A \mid B) = P(A).$$

Recalling the coin flipping case, this says that the probability of tails is unchanged by the prior flip being heads. In another case, this might be the probability of a particular politician being chosen as vice president, given who the nominee is. Clearly this will differ significantly between candidates. This corresponds to using the shaded areas of figures 3. A partition is defined as a group of disjoint sets whose union is the entire sample space.

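With an arbitrary partition $B_1, B_2, \ldots, B_n$, the probability of A is

$$P(A) = \sum_{i=1}^{n} P(A \wedge B_i).$$

More generally, in terms of conditional probabilities, with a set $B_1, B_2, \ldots, B_n$ of this kind,

$$P(A) = \sum_{i=1}^{n} P(A \mid B_i) P(B_i).$$

The resulting probability of A is also called the marginal probability, and the process of summing over the set $B_1, \ldots, B_n$ is called marginalization. In some cases, it is convenient to calculate a conditional probability using other conditional probabilities, as in Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}.$$

Then we want to know the probability that a patient has the disease given a positive test result. In this case, we have prior information on the probability of each of these occurrences. Observe that in equation 3.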
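The disease and test example can be worked through with concrete numbers. The prevalence, sensitivity, and false positive rate below are hypothetical values chosen for illustration; the calculation marginalizes over the partition {disease, no disease} to obtain the probability of a positive result and then applies Bayes' theorem.

```python
# Hypothetical inputs; none of these values come from the source.
P_DISEASE = 0.01             # prior probability of having the disease
P_POS_GIVEN_DISEASE = 0.95   # probability of a positive test if diseased
P_POS_GIVEN_HEALTHY = 0.05   # probability of a positive test if healthy


def total_probability(conditionals, priors):
    """P(A) = sum_i P(A | B_i) * P(B_i) over a partition B_1, ..., B_n."""
    return sum(c * p for c, p in zip(conditionals, priors))


def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


p_positive = total_probability(
    [P_POS_GIVEN_DISEASE, P_POS_GIVEN_HEALTHY],
    [P_DISEASE, 1.0 - P_DISEASE],
)
p_disease_given_positive = bayes(P_POS_GIVEN_DISEASE, P_DISEASE, p_positive)
print(round(p_positive, 4), round(p_disease_given_positive, 4))
```

Even with a fairly accurate test, the probability of disease given a positive result stays low under these assumptions, because the disease is rare in the assumed population.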

This is one of the key theorems used throughout probabilistic causal inference, in which relationships are described using variants of the conditional probability of the effect given the cause. In the beginning of this chapter, I briefly mentioned the idea of Bayesian probabilities, where the probability of an outcome takes into account prior beliefs. This is useful for formally taking into account a frame of reference. Most doctors would not take this information and then hire teams of people to pray for the recovery of their past patients.

In fact, such a study was indeed done, and showed that this temporally and spatially remote prayer intervention did yield shorter hospital stays (Leibovici), but there has been (as far as I know) no sudden upsurge of prayer in hospitals. The randomized controlled trial conformed to current standards and the results would normally be accepted given the usual convention of rejecting the null hypothesis at a significance level of 0. Yet no matter how low the p-value, this conflicts with many potential beliefs (that the present cannot affect the past, that there should be a physical connection between the patient and treatment, and that prayer is not an effective treatment, to name a few).

This disconnect can be accounted for by incorporating beliefs on the efficacy of such an intervention, showing that these are so low that essentially no study, no matter what the significance level, would change them.

Logic

The inference and explanation approach discussed in the rest of this book is based on a probabilistic temporal logic, but before we can discuss its details, we need to begin with a review of propositional logic followed by an introduction to modal and temporal logic.

Propositional and modal logic

The previous discussion of probability made use of some logical operators and their set theoretic equivalents, but did not discuss their details. We can construct statements such as:. This is because if p is true, then q must be true. The statements described so far have been facts with a single truth value. However, we may wish to distinguish between things that must be true and those that could be true. Instead of the propositional case where statements were either true or false, the use of possibility and necessity in modal logic means that we can describe what could have been or must be or may be in the future.

Truth values of statements are then determined relative to a set of possible worlds, where a possible world is simply a collection of propositions that are true in the world (possible worlds are also discussed in chapter 2). Then with a set of worlds, W, we can define necessity and also possibility: a statement is necessary if it is true in every world in W, and possible if it is true in at least one world in W. One of the possible worlds, called the actual world, represents what is actually true.
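The possible-worlds reading of necessity and possibility can be sketched as follows. Each world is modelled as the set of propositions true in it. The world names, the propositions, and the explicit `accessible` mapping are assumptions made for illustration; when every world is accessible, the functions simply quantify over all of W.

```python
def holds(prop, world):
    """A world is modelled as the set of propositions true in it."""
    return prop in world


def necessary(prop, worlds, accessible, current):
    """True if prop holds in every world accessible from `current`."""
    return all(holds(prop, worlds[w]) for w in accessible[current])


def possible(prop, worlds, accessible, current):
    """True if prop holds in at least one world accessible from `current`."""
    return any(holds(prop, worlds[w]) for w in accessible[current])


# Hypothetical worlds; "actual" plays the role of the actual world.
worlds = {
    "actual": {"raining"},
    "w1": {"raining", "cold"},
    "w2": {"raining", "windy"},
}
# Hypothetical accessibility: here every world is accessible from "actual".
accessible = {"actual": ["actual", "w1", "w2"]}

print(necessary("raining", worlds, accessible, "actual"))
print(possible("cold", worlds, accessible, "actual"))
print(holds("cold", worlds["actual"]))
```

In this toy setup "cold" is false in the actual world yet still possible, since it holds in one of the accessible worlds.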

Some possible worlds will be accessible from the actual world, while others will not. Possibility and necessity are usually defined relative to a particular world: if all worlds accessible from it satisfy a statement, then it is necessary, while if there is at least one accessible world where a statement holds, then it is possible. This means that a formula can be false in the actual world, but still possible.

Temporal logic

In some cases, the truth value of a formula may be time dependent. For example, at one point it was possible that Al Gore would be the 43rd president of the United States, but this is no longer possible once a different person has been elected as the 43rd U.S. president.

While I am not currently tired, I will eventually be tired and go to sleep. Someone who is born will always die. For a gun to be a useful weapon, it must remain loaded until it is fired. Temporal logic, introduced by Arthur Prior (Prior), modified modal logic to describe when formulas must hold, or be true. Amir Pnueli built upon these ideas to develop computational methods for checking these formulas in computer systems (Pnueli). In general, temporal logics can express whether a property is true at, say, the next point in time, or at some future timepoint, although there are also different ways of thinking about the future.

In branching time logics, such as computation tree logic (CTL) (Clarke et al.), there may be many possible paths into the future from each state. This means it is possible to express whether a property should hold for all possible paths (in every possible future, I will eventually eat and sleep) or if there simply exists a path where it is true (it is possible that in some futures, I will eventually study physics again). In contrast, in linear time logics such as linear temporal logic (LTL), from each state there is only one possible path through time. The logic used in this work is a branching time one, so I will only review CTL before moving on to probabilistic extensions of this logic.

The goal of temporal logic has been to reason about computer systems to determine if they satisfy desirable properties or can guarantee that they will avoid serious errors. For this reason, temporal logics are usually interpreted relative to specifications of systems that are encoded by graphs called Kripke structures, which describe how the systems work in a structured way (Kripke). In chapter 5, the interpretation of formulas relative to observation sequences is discussed.

Such properties are verified by testing the states a system can occupy. When using observations, the task is more similar to trying to learn the properties of the microwave by watching people use it. Here we are seeing only a subset of the possible states and possible paths through the system, and will not observe low probability sequences without a very long sequence of observations.

More formally, a Kripke structure is defined by a set of reachable states (nodes of the graph), labels (propositions) that describe the properties true within each state, and a set of edges denoting the transitions of the system (which states can be reached from other states) (Clarke et al.). The system in figure 3.

Note that once the system is in state 4, it can only keep transitioning to this state via the self loop. Definition 3. Let $AP$ be a set of atomic propositions. The function $R$ being a total transition function means that for every state, there is at least one transition from that state to itself or to another state. The function $L$ maps states to the truth value of propositions at that state. Since there are $|AP|$ propositions, there are $2^{|AP|}$ possible combinations of truth values for these propositions and $L$ maps each state to one of these.
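A Kripke structure of this kind can be sketched as a small data structure. The class name `Kripke`, its field names, and the four-state example are assumptions made for illustration (the example is not the system from the figure); the grouping into states, initial states, transitions, and labels follows the common presentation of such structures.

```python
from dataclasses import dataclass


@dataclass
class Kripke:
    states: set        # S, the set of states
    initial: set       # initial states, a subset of S
    transitions: set   # R, a set of (source, target) pairs
    labels: dict       # L, maps each state to the propositions true in it

    def successors(self, s):
        """States reachable from s in one transition."""
        return {t for (u, t) in self.transitions if u == s}

    def is_total(self):
        """R is total if every state has at least one outgoing transition."""
        return all(self.successors(s) for s in self.states)


# Hypothetical four-state system; state 4 has only a self loop.
k = Kripke(
    states={1, 2, 3, 4},
    initial={1},
    transitions={(1, 2), (1, 3), (2, 4), (3, 4), (4, 4)},
    labels={1: {"a"}, 2: {"b"}, 3: {"c"}, 4: {"d"}},
)
ap = {"a", "b", "c", "d"}

print(k.is_total())
print(k.successors(4))
print(2 ** len(ap))  # number of truth-value combinations over AP
```

Representing `labels` as the set of true propositions per state is one way to encode a truth assignment over AP: any proposition absent from a state's set is taken to be false there.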

In the structure shown in figure 3. This means, for example, that s5 can transition to itself, while it cannot transition to s1. Notice that there is at least one transition possible from each state. Thus, a path in the Kripke structure is an infinite sequence of states. This says that the series of transitions described by the sequence is possible. To find the properties that are true in these systems, we need a formal method of representing the properties to be tested. There are a number of temporal logics that each express slightly different sets of formulas (see Clarke et al. for an introduction to temporal logic).

CTL allows expression of properties such as:

- For all (A) paths, at all states (G), the sky is blue (b).
- There exists (E) a path where eventually (F) food in the microwave will be cooked (c).
- There exists (E) a path where at the next time (X) the subway doors will open (s).

Notice that in all cases we begin by stating the set of paths the formula pertains to (all paths, A, or at least one path, E). We then constrain the set of states that must satisfy the formula: all states along the paths (G), at least one state at some point (F), or the next state (X).

What has been described by example is that formulas in CTL are composed of paired path quantifiers and temporal operators. The temporal operators describe where along the path the properties will hold. More formally, the operators are:

- F (finally), at some state on the path the property will hold.

Now we describe how formulas in the logic are constructed (the syntax of the logic). First, there are two types of formulas: path formulas, which are true along specific paths, and state formulas, which are true in particular states.

Path formulas are specified by:

- If f and g are state formulas, then F f, G f, X f, f W g, and f U g are path formulas.

Note that each tree continues infinitely beyond the states shown. Rather than depicting the structure of the system, these show how the future may unfold. For example, the computation tree associated with the system in figure 3.

[Figure: partial computation tree relating to the structure of figure 3.]

Once we can represent how a system behaves and express properties of interest, we need a method of determining when these properties are satisfied. In the microwave example, once we have a model and method for constructing formulas such as that there is no path to cooked food where the door does not eventually close, we want to determine if this is satisfied by the microwave model.

This is exactly the goal of model checking, which allows one to determine which states of a system satisfy a logical formula. If the initial states of the system are in that set of states, then the model satisfies the formula. We work from the innermost formulas, first finding states where f and g are true, then labeling them with these formulas. The basic principle is that states are labeled with subformulas that are true within them, and in each iteration more complex formulas are analyzed. The rules for labeling states with the formulas in the previous six cases are as follows.

Recall that all states begin labeled with the propositions true within them due to the labeling function, L. The final two cases are slightly more complex. As the formulas are built incrementally beginning with those of size one, the states satisfying f and g have already been labeled at this point. For example, let us check this formula in the structure shown in figure 3.

Initially, states s2 and s3 are labeled with f , and states s2 and s5 are labeled with g. Note that this means the formula can be true without f ever holding. States satisfying f that can transition to states known to satisfy j are labeled with the until formula. Here this is true for s3. The initial state, s1 , is labeled with h and thus the structure satisfies the formula.
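The labeling step for an until formula of the form E[f U g] can be sketched as below. The function name `label_eu` and the transition relation are hypothetical (the structure from the figure is not reproduced here); only the initial labels, f in s2 and s3 and g in s2 and s5, follow the example in the text.

```python
def label_eu(transitions, sat_f, sat_g):
    """Return the set of states satisfying E[f U g].

    States satisfying g are labeled first. Then any state satisfying f
    that can transition to an already labeled state is labeled too,
    repeating until no further state can be added.
    """
    labeled = set(sat_g)
    changed = True
    while changed:
        changed = False
        for (u, v) in transitions:
            if u in sat_f and v in labeled and u not in labeled:
                labeled.add(u)
                changed = True
    return labeled


# Hypothetical transition relation over states s1..s5.
transitions = {
    ("s1", "s2"), ("s1", "s3"), ("s2", "s4"),
    ("s3", "s5"), ("s4", "s4"), ("s5", "s5"),
}
sat_f = {"s2", "s3"}   # states labeled with f
sat_g = {"s2", "s5"}   # states labeled with g

print(sorted(label_eu(transitions, sat_f, sat_g)))
```

In this sketch s5 is labeled although f never holds there, and s3 is added because it satisfies f and can transition to a labeled state.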

Since each state has at least one transition, this requires that there is a transition to a strongly connected component where each state in the component is labeled with f. A strongly connected component (SCC) of a graph is a set of vertices such that for any pair v, u in the set, there is a path from u to v and from v to u. Note that a self loop is a trivial SCC. Finally, let us discuss the complexity of this approach to checking formulas.
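As a rough illustration, the states where f can hold forever along some path can also be found without computing SCCs explicitly. The sketch below swaps in a simple fixpoint iteration: start from all states labeled with f and repeatedly discard any state with no successor left in the set. This is a different technique from the SCC decomposition described above, though it labels the same states; the function name `label_eg` and the example structure are hypothetical.

```python
def label_eg(transitions, sat_f):
    """Return the states from which some path stays within sat_f forever."""
    current = set(sat_f)
    while True:
        # Keep only states that still have a successor inside the set.
        keep = {
            s for s in current
            if any(u == s and v in current for (u, v) in transitions)
        }
        if keep == current:
            return current
        current = keep


# Hypothetical structure: s3 and s4 each have only a self loop.
transitions = {
    ("s1", "s2"), ("s2", "s3"), ("s2", "s4"),
    ("s3", "s3"), ("s4", "s4"),
}

print(sorted(label_eg(transitions, {"s1", "s2", "s3"})))
print(sorted(label_eg(transitions, {"s1", "s3"})))
```

In the second call s1 is discarded because its only successor, s2, is not labeled with f, while s3 survives through its self loop.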

Probabilistic Temporal Logic

CTL allows us to answer questions about various futures and to determine what is possible, what must occur, and what is impossible. While it may be useful to know that a train will eventually arrive using a model of the subway system, most riders would want to know the time bounds on when the next train will arrive, and how likely it is that it will come in the specified interval. Probabilistic computation tree logic (PCTL) allows precisely this type of reasoning, so that one could determine that with probability greater than 0. It extends CTL by adding deadlines, so that instead of a property holding eventually, it must hold within a particular window of time, and quantifies the likelihood of this happening by adding probabilistic transitions to the structures used.

There are a variety of logics that quantify both time and probability, but in this work I build on that introduced by Hansson and Jonsson. The structures used are probabilistic Kripke structures (Clarke et al.). There is an initial state from which paths through the system can begin and a transition function that defines, for each state, the set of states that may immediately follow it and the probability of each of these transitions.

This is a total transition function, which means that each state has at least one transition to itself or another state in S with a nonzero probability. Note that this is a Kripke structure with a transition function that tells us which transitions are possible and the probability with which they will occur.

The sum of the transition probabilities from a state must equal one, as at each timepoint a transition must be made either to a different state or to the same state. Finally, $L(s)$ is used to denote the labels of a particular state, s. As in CTL, there are two types of formulas: path formulas and state formulas.

State formulas express properties that must hold within a state, such as it being labeled with certain atomic propositions. Then, the syntax of the logic tells us how valid PCTL formulas can be constructed. All atomic propositions are state formulas. In the third item are the until (U) and unless (weak until, W) operators.

Unless is defined the same way, but with no guarantee that g will hold. If g does not become true within time t, then f must hold for a minimum of t time units. Continuing with the drug example, we can now say that either a drug stops seizures in 10 minutes, which happens with at least probability 0. Take the following structure, where the start state is indicated with an arrow and labeled with a.

Remember that each state has at least one transition, but this can be satisfied by a self loop with probability one, as is true of the states in the previous diagram labeled with d and e. Now say we are calculating the probability of the set of paths of length two from a to e. There are two such paths: one through b and one through c. The probability of this set of paths is the sum of the individual path probabilities: 0. A structure K satisfies a state formula if the initial state satisfies it.
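The path calculation can be sketched as follows: the probability of one path is the product of its transition probabilities, and the probability of a set of paths is the sum over its members. The transition probabilities below are hypothetical, since the values from the diagram are not reproduced here, and the helper names are introduced only for illustration. States are named after their labels.

```python
def path_probability(path, prob):
    """Multiply the transition probabilities along one path."""
    p = 1.0
    for u, v in zip(path, path[1:]):
        p *= prob.get(u, {}).get(v, 0.0)
    return p


def paths_of_length(prob, start, end, length):
    """All paths with exactly `length` transitions from start to end."""
    if length == 0:
        return [[start]] if start == end else []
    result = []
    for nxt, p in prob.get(start, {}).items():
        if p > 0:
            for rest in paths_of_length(prob, nxt, end, length - 1):
                result.append([start] + rest)
    return result


# Hypothetical probabilities; d and e have self loops with probability one.
prob = {
    "a": {"b": 0.6, "c": 0.4},
    "b": {"d": 0.5, "e": 0.5},
    "c": {"d": 0.75, "e": 0.25},
    "d": {"d": 1.0},
    "e": {"e": 1.0},
}

# The transition probabilities out of each state must sum to one.
rows_ok = all(abs(sum(row.values()) - 1.0) < 1e-9 for row in prob.values())

paths = paths_of_length(prob, "a", "e", 2)
total = sum(path_probability(p, prob) for p in paths)
print(rows_ok, sorted(paths), total)
```

With these assumed values the two paths, one through b and one through c, contribute 0.6 × 0.5 and 0.4 × 0.25 respectively.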

Let us now discuss the truth values of PCTL formulas more formally. Then, we have the following path relations:. As defined in equation 3. Instead, it will later be necessary to stipulate that there must be at least one transition between f and g.

Thus, we aim to construct formulas such as:. In appendix B.

PCTL model checking

Model checking in the probabilistic case is similar to that for CTL, labeling states with subformulas true within them, beginning with all states labeled with the propositions true within them. This section describes how formulas can be checked relative to a model, but in many cases we will have only observations of a system, not a model of how it functions.
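One way to check a time-bounded until formula against a model is to compute, for every state, the probability that g becomes true within t time units while f holds at each state before that. The sketch below uses a standard step-by-step recurrence for this; whether it matches the exact procedure the author uses is an assumption, as are the function name `bounded_until`, the structure, and the probabilities. Note that this version counts a state already satisfying g as success at time zero, without the additional stipulation of at least one transition between f and g.

```python
def bounded_until(prob, sat_f, sat_g, t):
    """Probability, per state, that g holds within t steps and f holds
    at every state before g first becomes true."""
    # With zero steps available, only states already satisfying g succeed.
    p = {s: (1.0 if s in sat_g else 0.0) for s in prob}
    for _ in range(t):
        nxt = {}
        for s in prob:
            if s in sat_g:
                nxt[s] = 1.0
            elif s in sat_f:
                # Take one transition, then succeed within the remaining steps.
                nxt[s] = sum(pr * p[v] for v, pr in prob[s].items())
            else:
                nxt[s] = 0.0
        p = nxt
    return p


# Hypothetical probabilistic structure.
prob = {
    "a": {"b": 0.6, "c": 0.4},
    "b": {"d": 0.5, "e": 0.5},
    "c": {"d": 0.75, "e": 0.25},
    "d": {"d": 1.0},
    "e": {"e": 1.0},
}
sat_f = {"a", "b", "c"}
sat_g = {"e"}

within_two = bounded_until(prob, sat_f, sat_g, 2)
# Label the states where the probability meets a hypothetical bound of 0.3.
labeled = {s for s, value in within_two.items() if value >= 0.3}
print(within_two["a"], sorted(labeled))
```

States whose computed probability meets the bound in the formula are then labeled with it, mirroring the labeling approach used for CTL.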

I discuss how PCTL formulas can be checked against such observations without first inferring a model. Kim et al.