Bayes Theorem and the Justice System November 22, 2009
Posted by Stephen Godfrey in Probability. Tags: Bayes theorem, general, justice, law, Maths, Probability
I have been reading a back issue of New Scientist and came across an article by Angela Saini, which you can find here, on how probability is used in the courtroom. If you go to the page you can also take an online test to determine whether the article is relevant for you the next time you get stuck on jury duty.
It appears that to be a good jury member you need to have a good understanding of conditional and Bayesian Probability. So firstly what do I mean by conditional probability?
Let us consider two events $A$ and $B$; then the conditional probability of $A$ occurring given that $B$ has already happened is denoted by $P(A|B)$. Let us consider a fairly simple example. Consider rolling two distinct six-sided dice, let $A$ be the event where the sum of the dice is 8 and let $B$ be the event where one die landed on 4. So here we have that $P(A) = 5/36$ and $P(B) = 1/6$, however here we have that
$$P(A|B) = \frac{1}{6}$$
because we already know that one die is a 4, so the other die also needs to be a 4, and that only happens one time in six.
Technical aside: To those probability nerds out there, I know I should have labelled these dice die 1 and die 2 to avoid any complications in finding these probabilities; however, I am just trying to give a simple overview and will skim over this.
The actual definition of conditional probability is given as follows. If $P(B) > 0$, the conditional probability of $A$ given $B$ is
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
where we read $P(A \cap B)$ as the probability of both $A$ and $B$ occurring at the same time. You can check that we could have used this expression in the above example.
The key point here is that the probability of an event occurring can change if we are given some extra information. Also note that if the two events are independent (i.e. are not related to each other) then $P(A|B) = P(A)$.
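The dice example can be checked by brute-force enumeration. The sketch below is only an illustration: it assumes the target sum is 8 (the only sum for which the second die must also be a 4), and the names `prob`, `cond_prob`, `first_is_4` and `some_die_is_4` are made up for this example. It also shows why labelling the dice matters: conditioning on "die 1 is a 4" and on "at least one die is a 4" give different answers.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (die 1, die 2) of two labelled six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on (d1, d2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

sum_is_8 = lambda o: o[0] + o[1] == 8
first_is_4 = lambda o: o[0] == 4      # labelled reading: die 1 shows a 4
some_die_is_4 = lambda o: 4 in o      # unlabelled reading: at least one 4

print(prob(sum_is_8))                      # 5/36
print(cond_prob(sum_is_8, first_is_4))     # 1/6
print(cond_prob(sum_is_8, some_die_is_4))  # 1/11
```

The one-in-six figure quoted in the text corresponds to the labelled reading.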
Now what has this got to do with court cases? It has everything to do with how some evidence is presented in court: normally you should consider it as a conditional probability. We have all watched some TV show involving a court case where an expert witness has said that “only 3% of the population has an AB blood type and so does the defendant”. So what would we make of this in terms of probability?
Let $I$ be the event that the defendant is innocent and let $E$ be the event that some evidence is being used for or against the defendant. So what our expert witness has given us is $P(E|I)$. We view it this way because any person from that 3% of the population could have left that blood, and secondly because we assume innocence and have to prove guilt.
What we want to know is $P(I|E)$, so we need some way to relate these two probabilities. This is where Bayes' formula comes into the picture.
Let $I$ and $E$ be the same events as above, then
$$P(I|E) = \frac{P(E|I)\,P(I)}{P(E)}.$$
Here you, the jury, would have a gut feel for what $P(I)$ should be, based for instance on motive, past record and even the way he looks. The probability $P(E)$ would be given to you by the person that gives the evidence and can be calculated as
$$P(E) = P(E|I)\,P(I) + P(E|\neg I)\,P(\neg I)$$
where $\neg$ stands for not, i.e. $\neg I$ means not innocent. Also note that $P(E|\neg I) = 1$, as here the defendant actually committed the crime. Let's have a look at an example.
Suppose that you are 80% certain that the defendant is innocent. So we have $P(I) = 0.8$. A forensics expert gives evidence that blood of type AB was found at the scene, that only 3% of the population have that type of blood, and that the defendant has type AB blood. This means that we take $P(E|I) = 0.03$. To find $P(I|E)$ we substitute into the equation
$$P(I|E) = \frac{0.03 \times 0.8}{0.03 \times 0.8 + 1 \times 0.2} = \frac{0.024}{0.224} \approx 0.107.$$
So once this evidence is given to you the defendant's probable innocence plummets from 80% down to about 10.7%. In this example you could still not convict using just this piece of evidence. What you can do is keep on adjusting $P(I)$ during the entire trial (where relevant) using the above reasoning.
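The arithmetic of this update is easy to reproduce. The following is a minimal sketch, assuming (as in the example) that the evidence is certain to be present if the defendant is guilty; the function name `posterior_innocence` and its parameter names are invented for illustration.

```python
def posterior_innocence(prior_innocent, p_evidence_if_innocent,
                        p_evidence_if_guilty=1.0):
    """Bayes' formula for P(innocent | evidence)."""
    # Total probability of seeing the evidence at all.
    p_evidence = (p_evidence_if_innocent * prior_innocent
                  + p_evidence_if_guilty * (1.0 - prior_innocent))
    return p_evidence_if_innocent * prior_innocent / p_evidence

# 80% prior belief in innocence, 3% of the population has type AB blood.
post = posterior_innocence(0.8, 0.03)
print(round(post, 4))  # 0.1071
```

With these inputs the probability of the evidence is 0.224 and the updated probability of innocence is 0.024/0.224, roughly 10.7%. The output of one update can be fed back in as the prior for the next piece of evidence, provided the pieces of evidence are independent.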
I will finish this post with four common problems in understanding probabilities in court cases (in no particular order):
- 1) Prosecutor’s or Defendant’s Fallacy
- 2) Ultimate Issue Error Explicitly equating a small probability of the evidence given innocence, $P(E|I)$, with the defendant's likelihood of innocence.
- 3) Base-Rate Neglect
- 4) Dependent Evidence Fallacy This is related to the independence or dependence of events. In terms of court cases this would pop up in genetic effects. For instance, we all know that certain physiological problems run in the family, be it breast cancer or some other disease. If two events are independent of each other then the probability of both of these events happening can be found by multiplying the two probabilities together.
However, using the breast cancer example, there is a 1/8 chance of a woman having breast cancer during her life. So what is the probability of a mother and daughter both developing cancer during their lives? Well, if the mother has breast cancer then it is more likely that the daughter will develop breast cancer sometime during her life. I don’t know what that chance is, so for the sake of argument let's just say that it is twice as likely as for the average person, that is a 1/4 chance. So we would find that there is a (1/8)(1/4)=1/32 chance that both mother and daughter will have cancer some time during their lives, rather than the (1/8)(1/8)=1/64 that we would get by wrongly treating the two events as independent.
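To make the size of the error concrete, here is a small sketch comparing the two calculations. The 1/4 figure is the post's own for-the-sake-of-argument assumption, not a medical statistic, and the variable names are illustrative only.

```python
from fractions import Fraction

p_mother = Fraction(1, 8)                 # lifetime risk quoted in the post
p_daughter_given_mother = Fraction(1, 4)  # assumed: twice the average risk

# Correct joint probability: P(M and D) = P(M) * P(D | M).
dependent = p_mother * p_daughter_given_mother

# Joint probability if the events are wrongly treated as independent.
independent = p_mother * p_mother

print(dependent, independent, dependent / independent)  # 1/32 1/64 2
```

Under this assumption, treating the events as independent understates the joint probability by a factor of two.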
So this raises the question: if everyone can be called up for jury duty, should we be teaching more probability in schools so that everyone can understand trials that can include many confusing probabilities?
Integral Transforms and Partial Differential Equations October 21, 2009
Posted by Stephen Godfrey in Mathematics. Tags: Integral Transform, Maths
1. The Fourier and Laplace transform
This started off as a quick little post on solutions of PDE’s but my fingers took over and it has grown to become what it is now.
Everyone who has done a maths degree will have seen both of these before: normally the Laplace transform while doing a course on ODE’s, and the Fourier transform later, once you know some measure theory. At least that was how it was for me.
If you have not heard of them before, or have not looked at them in the past few years and need a refresher, please have a look at the above links. An interesting discussion of the Fourier transform is on Terry Tao’s blog here, especially if you have seen LCA groups before.
There are many different definitions of the Fourier transform, all the same except that the factor of $2\pi$ appears in different places. It can be very annoying in the literature if someone uses a Fourier transform without stating which one. I shall be using the following definition of the Fourier transform
I will be taking this integral in the Lebesgue sense; however, if you have never heard of this integral before you can think of it as a normal integral. The difference between Lebesgue and Riemann integrals is that the former has a nicer theory under the hood.
The reason that you normally see the Fourier transform after doing some measure theory (the Lebesgue integral) is because of the particular spaces that $f$ can live in so that the integral converges. Applying the triangle inequality it is easy to see that the Fourier transform is defined if $f \in L^1(\mathbb{R}^n)$. This is a particular case of an $L^p$ space. When you get into more theory you can extend the underlying domain of the transform to other spaces such as the Schwartz functions; for now I would recommend you have a look at Tao’s blog on the subject here.
The inverse Fourier transform is given by
Notice that the kernel of the inverse transform is just the complex conjugate of the kernel of the Fourier transform. The trouble here is in what sense we should take this integral: what do we know about $\hat{f}$? Using the same argument as above we see that the inverse Fourier transform is defined if $\hat{f} \in L^1(\mathbb{R}^n)$; however, this need not be the case. So what path should we take? We can either restrict $f$ so that $\hat{f} \in L^1(\mathbb{R}^n)$, or we can extend the inversion formula to cover all possible situations. Both of these approaches have pros and cons. Here we shall not concern ourselves with this technical problem and just assume that the maths gods are on our side! For this post we shall assume that $f$ lies in a space on which the inversion formula holds. If you want to look at this in more detail please see….
The Laplace transform is much more standard as it really only has one formulation, which we will take to be
$$F(s) = \int_0^\infty e^{-st} f(t)\,dt.$$
First a note on notation: I am using the classical mathematical notation for these transforms. At some stage someone decided that the Fourier transform would be denoted by $\hat{f}$ and that the Laplace transform would be denoted with a capital letter.
Secondly, take note that I defined the Fourier transform over $\mathbb{R}^n$ and the Laplace transform over $[0,\infty)$. While we can take the Laplace transform over $[0,\infty)^n$ by making the exponent of the Laplace transform an inner product of $s$ and $t$, generally speaking we only take the Laplace transform in the time variable when we are solving a PDE, so we don’t need the multidimensional version. I shall mention why the Laplace transform is well suited to these types of problems soon.
Thirdly there is a connection between the Laplace transform and the Fourier transform. Namely that if we assume that for less than zero, then setting gives
Indeed the Laplace and Fourier transforms share many similar properties. The reason that we don’t just study and use the Fourier transform is that the Laplace formulation is of great importance in its own right and is worth looking at in this form.
The inversion of the Laplace transform is a little bit more problematic as there are several different approaches. I will not go into the details here, however I will provide a list of the most commonly used methods.
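- 1) Partial Fraction decomposition This is normally the method first shown to undergrad students. It is heavily dependent on having a large table of Laplace transforms.
- 2) Convolution Same as above but useful for products of transforms. The Laplace transform of a convolution of two functions can be shown to be
$$\mathcal{L}\{f * g\}(s) = F(s)\,G(s), \qquad (f * g)(t) = \int_0^t f(\tau)\,g(t-\tau)\,d\tau.$$
If we apply the inverse Laplace transform to this identity we see that the product of two Laplace transforms can be inverted via a convolution,
$$\mathcal{L}^{-1}\{F\,G\}(t) = (f * g)(t).$$
- 3) Contour Integral Suppose that $F(s)$ exists for all $\mathrm{Re}(s) > a$ and is the Laplace transform of a piecewise continuous function $f$. Then the Laplace transform can be inverted by the following contour integral
$$f(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{st} F(s)\,ds$$
where $\gamma$ is greater than the real part of any of the singularities of $F$.
- 4) Post-Widder If the Laplace transform $F(s)$ converges for some $s$, where $f$ is a locally integrable function, then at points of continuity of $f$
$$f(t) = \lim_{k\to\infty} \frac{(-1)^k}{k!}\left(\frac{k}{t}\right)^{k+1} F^{(k)}\!\left(\frac{k}{t}\right).$$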
This formulation is more useful if you only want to know about the asymptotics of the solution.
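As a concrete illustration of the last method, the usual statement of the Post-Widder formula approximates $f(t)$ by $\frac{(-1)^k}{k!}(k/t)^{k+1}F^{(k)}(k/t)$ for large $k$. The sketch below assumes that form and uses the test pair $F(s) = 1/(s+1)$, $f(t) = e^{-t}$ (chosen for this example because the derivatives of $F$ are known in closed form); the function name is hypothetical.

```python
import math

def post_widder_exp(t, k):
    """k-th Post-Widder approximant of f(t) for F(s) = 1/(s + 1).

    For this F, (-1)^k / k! * F^(k)(s) = (s + 1)^-(k + 1), so the
    approximant collapses to (s / (s + 1))^(k + 1) with s = k / t.
    """
    s = k / t
    return (s / (s + 1.0)) ** (k + 1)

t = 1.0
for k in (10, 100, 10000):
    # Approximant next to the exact value exp(-t).
    print(k, post_widder_exp(t, k), math.exp(-t))
```

The approximants approach $e^{-t}$ only at a rate of roughly $1/k$, which is one reason this method tends to be used for qualitative or asymptotic information rather than for accurate numerical inversion.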
2. Transform of Derivatives
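Let's start by looking at the connection between the Fourier transform and derivatives, and then differential equations.
I have already mentioned that I will be looking at applying integral transforms to solve differential equations. It will be convenient to introduce multi-index notation for derivatives. A multi-index is an $n$-tuple $\alpha = (\alpha_1, \dots, \alpha_n)$ of non-negative integers. If $x \in \mathbb{R}^n$, define $x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. We define $D^\alpha$ for any multi-index $\alpha$ by
$$D^\alpha = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$$
and for polynomials $P(x) = \sum_\alpha c_\alpha x^\alpha$, where $|\alpha| = \alpha_1 + \dots + \alpha_n$, we define $P(D)$ to be
$$P(D) = \sum_\alpha c_\alpha D^\alpha.$$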
The importance of the Fourier Transform for solving Differential Equations lies in the following result.
Theorem Let $f$ be an integrable function on $\mathbb{R}^n$. Let $\alpha$ be a multi-index. Assume that $f$ is $|\alpha|$ times differentiable. Then
- .
- .
In (2) we also need the condition that .
Proof: The proof of these results on $\mathbb{R}^n$ can be found in many places. On $\mathbb{R}$ it is a simple case of induction.
The thing that you should take out of this is the following. Consider the Fourier transform as an operator acting on some space of functions that have Fourier transforms. It is interesting to note that differentiation on that space turns into multiplication by polynomials in the transform space. Furthermore, multiplication by polynomials in the original space turns into differentiation in the transform space.
An interesting question to ask is what is the Fourier transform of the operator
Observe that when the Fourier transform of is taken, we get an operator that is of a similar form to back
The operator is almost a fixed point of the Fourier transform. When the operator is called the Harmonic oscillator.
From this observation we see that using the Fourier transform to solve a differential equation involving a differential operator like this one is unlikely to be a fruitful approach. In general the Fourier transform is not well suited to solving differential equations with non-constant coefficients. There are, however, some problems of this type that can be solved by the Fourier transform. The example we will present later is a Fokker-Planck equation.
The Laplace transform has similar properties. If $F(s)$ exists and $f$ is $n$ times differentiable then
$$\mathcal{L}\{f^{(n)}\}(s) = s^n F(s) - \sum_{k=0}^{n-1} s^{\,n-1-k} f^{(k)}(0).$$
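For a single derivative the Laplace rule reads $\mathcal{L}\{f'\}(s) = sF(s) - f(0)$, which is easy to sanity-check numerically. The sketch below assumes the usual one-sided definition $F(s) = \int_0^\infty e^{-st}f(t)\,dt$ and uses a plain midpoint rule on a truncated interval; the helper name `laplace` and the test function $f(t) = \cos t$ are choices made for this example.

```python
import math

def laplace(f, s, upper=60.0, n=60000):
    """Midpoint-rule approximation of the integral of exp(-s t) f(t) on [0, upper]."""
    dt = upper / n
    return sum(math.exp(-s * (i + 0.5) * dt) * f((i + 0.5) * dt)
               for i in range(n)) * dt

s = 2.0
F = laplace(math.cos, s)                   # exact value is s / (s^2 + 1)
lhs = laplace(lambda t: -math.sin(t), s)   # transform of f'(t) = -sin(t)
rhs = s * F - math.cos(0.0)                # s F(s) - f(0)

print(F, lhs, rhs)
```

Both `lhs` and `rhs` should agree with the exact value $-1/(s^2+1)$ up to the quadrature error, which shows how differentiation in $t$ has become multiplication by $s$ plus an initial-value term.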
3. Transform Solutions of PDE
We shall solve the classic PDE’s, namely the heat, wave and Laplace equations, by Fourier transforms. We shall also solve the heat equation with different conditions imposed. The general method of solution will be the same. That is, we shall take the Fourier transform of the PDE and its initial and boundary conditions to reduce it to an ODE. We then solve this ODE for the transformed function. We invert this function to determine the solution to our PDE.
This is not just a method that is specific to the Fourier transform. This method also works for the Laplace transform and in general for many integral transforms. One condition on this is that the domain of the variable you transform must match the range of integration of the integral transform. The type of boundary and initial conditions that are given can also play a role in which transform should be used. Once again I will devote a later post to this.
3.1. Example 1
We consider the Cauchy problem for the heat equation on . That is we solve
with and where
is the Laplacian on . Observe that,
Therefore . We now calculate the Fourier transform of the left hand side,
So if we take the Fourier transform of the Cauchy problem we get,
Taking the Fourier transform of the initial conditions gives,
We solve the ordinary differential equation above for . The ODE is separable, so that
where is an arbitrary factor of integration and is a function of . Hence , where . But . Therefore,
Thus our solution is
Now taking the inverse Fourier transform to determine
Now as
we can rewrite as
where and could be rewritten as
The function is called the heat kernel on . It is also called the Fundamental solution of the heat equation. It is easy to see that for all , solves the heat equation. It is possible to show that the fundamental solution satisfies the initial condition
hence . So
solves the heat equation with the given initial condition. Hence we have solved the Cauchy problem for the heat equation.
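The properties claimed for the heat kernel can be checked numerically in one dimension. The sketch below assumes the equation is normalised as $u_t = u_{xx}$, for which the kernel is $K(x,t) = (4\pi t)^{-1/2} e^{-x^2/4t}$; if the post's equation carries a diffusion constant the kernel changes accordingly. The function names are illustrative.

```python
import math

def heat_kernel(x, t):
    """1-D heat kernel for u_t = u_xx (assumed normalisation)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def pde_residual(x, t, h=1e-3):
    """Central-difference estimate of K_t - K_xx, which should be near zero."""
    k_t = (heat_kernel(x, t + h) - heat_kernel(x, t - h)) / (2.0 * h)
    k_xx = (heat_kernel(x + h, t) - 2.0 * heat_kernel(x, t)
            + heat_kernel(x - h, t)) / (h * h)
    return k_t - k_xx

def total_mass(t, half_width=20.0, n=20000):
    """Midpoint-rule integral of K(., t) over [-half_width, half_width]."""
    dx = 2.0 * half_width / n
    return sum(heat_kernel(-half_width + (i + 0.5) * dx, t)
               for i in range(n)) * dx

print(pde_residual(0.7, 0.5))  # close to 0
print(total_mass(0.5))         # close to 1
```

The unit mass for every $t > 0$, together with the kernel concentrating at the origin as $t \to 0$, is what lies behind the statement that the fundamental solution satisfies the initial condition.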
3.2. Example 2
Solve the heat equation
where , and with the conditions that
Taking the Laplace transform with respect to and denoting the Laplace transform with a capital we have
Solving this ODE we have
Now the Laplace transform of the conditions are
Using the boundary conditions we find that we have to solve the following system of equations
This system is easily solved as so . So our DE has the full solution of the form
Now to find we apply the complex inversion formula for the Laplace transform
Notice that all the singularities occur at . Each singularity is a simple pole.
It can be shown by the calculus of residues that the final solution is
3.3. Example 3
Let us start by considering the wave equation in . It is here that we start to run into some trouble about what the space should really be. If we were to take the space we would have a rather restrictive space as it does not include all of the solutions we can get from d’Alembert’s solution of the wave equation, where the solution does not even need to be continuous or integrable. However if we were to find solutions of this form using transform methods we would need to let be the space of tempered distributions, to do this would involve a great deal of preliminary work.
We shall concern ourself with the wave equation on -space
In fact when $n = 3$ the wave equation determines the behaviour of electromagnetic waves in a vacuum, where $c$ will be the speed of light. It also models sound waves and many other forms of wave motion.
We also note that we can set $c = 1$ without any loss of generality, as we can rescale the equation.
We will solve the wave equation with Cauchy data. That is solve the problem
with the initial conditions
where . Normally the variable $t$ is considered to be time. However, we shall not restrict $t$ to be positive. In fact our solution to the wave equation will make sense for any real value of $t$. That is, the wave equation can be reversed in time.
Upon applying the standard technique for solving a partial differential equation by transform methods, and treating $x$ and $\xi$ as vectors, we find that the Fourier transform of (39) is
Solving this ODE, we find the solution to be
where and are to be determined from the transformed initial conditions. These are
It is easy to see that
Therefore we find the solution of the ODE is
The solution of the wave equation is given by applying the inverse Fourier transform. As this has been a formal derivation let us be more precise. A solution of the Cauchy problem for the wave equation is
Proof: The proof is relatively simple. All you need to do is show that satisfies the PDE and the extra conditions. The solution (46) is in fact a unique solution to the wave equation however we shall not prove this.
3.4. Example 4
Solve the Laplace equation in the upper half plane
With the boundary condition . We will take the Fourier transform in the $x$ variable, as it matches the domain of our transformation.
Once again we find that . Now solving the ODE for ,
To recover we have to take the inverse Fourier transform. But the inverse Fourier transform of does not exist. So we have to set . Hence,
but from the initial condition,
so,
Now evaluating the inner integral,
Hence the solution to our problem is given by
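The result of this derivation is usually written as the Poisson integral $u(x,y) = \frac{y}{\pi}\int_{-\infty}^{\infty} \frac{f(s)}{(x-s)^2 + y^2}\,ds$, where $f$ denotes the boundary data (the symbol $f$ is an assumption of this sketch). The code below evaluates that integral numerically for the test data $f(s) = 1/(1+s^2)$, whose harmonic extension is known in closed form, $u(x,y) = (1+y)/(x^2 + (1+y)^2)$; the names used are illustrative.

```python
import math

def poisson_solution(f, x, y, half_width=200.0, n=40000):
    """Midpoint-rule evaluation of the Poisson integral on a truncated line."""
    ds = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        s = -half_width + (i + 0.5) * ds
        total += f(s) / ((x - s) ** 2 + y * y)
    return y / math.pi * total * ds

boundary = lambda s: 1.0 / (1.0 + s * s)
exact = lambda x, y: (1.0 + y) / (x * x + (1.0 + y) ** 2)

x, y = 0.3, 0.5
print(poisson_solution(boundary, x, y), exact(x, y))
```

The two printed values should agree to several decimal places, the small discrepancy coming from truncating the integral to a finite interval.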
3.5. Example 5
We already mentioned that solving a non-constant coefficient DE via the Fourier transform is not usually a useful approach. The same can be said for the Laplace transform. At the time we did mention that there are certain non-constant coefficient PDE’s that can be solved by transform methods. One example we mentioned was a Fokker-Planck equation.
In many applications, such as in finance, a diffusion process is of importance. Often these diffusions are specified by a Fokker-Planck equation.
Let us solve the following Cauchy problem for a particular Fokker-Planck equation.
When we take the Fourier transform, this time we will not reduce the problem to that of solving an ODE. We will in fact end up with a first order partial differential equation.
Simplifying we have
We now have two options on how to proceed. We can solve this first order partial differential equation directly. Alternatively, we can take the Laplace transform in the $t$ variable and reduce the problem to that of solving an ODE.
We shall first solve the first order PDE. By the method of characteristics we find the solution to be
where is unknown. To recover we apply the inverse Fourier transform. So
where we made a substitution . Now , so . So we have that
This implies that . Thus
Now taking the inverse Fourier transform we have
After completing the square and setting
we find that
where we applied Cauchy’s Theorem in the last line. Evaluating the inner integral we find that
3.6. Example 6
Here we will look at solving a non-constant coefficient PDE that is in cylindrical co-ordinates.
where and such that and with initial condition . The Laplace transform of this PDE is
the boundary conditions turn into and . The general solution of the ODE is
where and are the first order modified Bessel functions of the second kind. Now as is unbounded at the origin we set . Applying the Boundary conditions gives
To find out what is we need to compute the integral
After some calculations and the application of Cauchy’s integral formula at the zeros of we have
where are the zeros of the Bessel function for .
4. Final Remarks
There are a couple of important points to take away from these examples. The first is that we need to match the domain of the variable we are going to transform to the range of integration, and pay attention to what happens at the boundaries of the transform. Recall that the Laplace transform required that you know initial values and the Fourier transform required decay at the ends of its domain.
The second point comes from the comparison of solving the constant coefficient (CC) and non CC PDE’s. When we were solving PDE’s with CC’s we were able to reduce the problem to solving a differential equation in one variable. In the non CC’s case we were able to reduce the order of the PDE by one.
This might lead you to think that we can blindly apply an integral transform to a PDE to reduce the problem to something simpler. This is incorrect. Think of it this way: our inversion formulas are also integral transforms. So if we were to take the inverse Fourier transform of
we would get Laplace’s equation. So here we have transformed an ODE to a second order PDE. Not much of a simplification.
What would be correct to think is that there is some connection between the kernel of the integral transform and the differential operator. Here the key is that the kernel is generally a solution of a second order differential equation with constant coefficients.
So when you are solving a PDE using integral transforms you need to be mindful of both the domain (in particular the boundary) and the differential operator. In a later post I will show how you can construct the “nicest” integral transform to solve a certain IVP or BVP. This makes use of Sturm-Liouville theory.
Introducing Integral Transforms October 14, 2009
Posted by Stephen Godfrey in Mathematics. Tags: Integral Transform, Maths
As I spent the year studying I slowly found out that the current research in the field was rather different to what was done in the classics; it appeared that I was born about 60-70 years too late. Most of the current research looks at new ways to invert Laplace transforms, deals with certain classes of distributions (also called Generalised functions), or plugs a Hypergeometric function into the kernel. Not easy work for an undergrad.
Now enough of this chit chat, let's get into some maths!
The idea of using transformations in Mathematics is an old one. The idea is to change your problem into a simpler but equivalent problem, then change back to get the solution of the original problem. In this post I shall briefly explain why integral transforms are useful when solving differential equations. The simplest answer is that one integral undoes one lot of differentiation, so if we integrate an ordinary differential equation we should be left with only an algebraic one.
The key point to remember is that we have mapped differentiation to something simpler like multiplication.
Integral transforms have been in wide use during the past two centuries as a tool to solve various problems in pure and applied mathematics. Many integral transforms were originally introduced to solve specific problems, but over the course of time have been found to be of use in the solution of other problems as well.
Let and let Let
Then the mapping
is an integral transform with domain , kernel and transform variable .
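To make the definition concrete, here is a minimal numerical sketch of a generic integral transform, i.e. a map sending $f$ to the function $s \mapsto \int_a^b K(s,t)\,f(t)\,dt$. The helper name `integral_transform`, the midpoint quadrature, the truncated interval and the choice of the Laplace kernel $K(s,t) = e^{-st}$ are all assumptions made for illustration, not the post's notation.

```python
import math

def integral_transform(kernel, f, a, b, n=20000):
    """Return s -> midpoint approximation of the integral of kernel(s, t) f(t) on [a, b]."""
    dt = (b - a) / n
    nodes = [a + (i + 0.5) * dt for i in range(n)]

    def transformed(s):
        return sum(kernel(s, t) * f(t) for t in nodes) * dt

    return transformed

laplace_kernel = lambda s, t: math.exp(-s * t)
f = lambda t: math.exp(-t)
g = lambda t: t

# Truncating [0, infinity) to [0, 40] is harmless for these decaying integrands.
Tf = integral_transform(laplace_kernel, f, 0.0, 40.0)
Tg = integral_transform(laplace_kernel, g, 0.0, 40.0)
Tcombo = integral_transform(laplace_kernel,
                            lambda t: 2.0 * f(t) - 3.0 * g(t), 0.0, 40.0)

s = 1.5
print(Tf(s), 1.0 / (s + 1.0))                    # transform of exp(-t)
print(Tcombo(s), 2.0 * Tf(s) - 3.0 * Tg(s))      # linearity in f
```

Swapping in a different kernel and interval gives a different transform with the same code, and the last line illustrates that any such transform is linear in $f$.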
There are several key questions that should be asked about any integral transform . For this post and possibly later ones we shall only look at the following 4 key questions.
- What functions are in , the domain of the operator ?
- For every function is there a unique function ?
- Given a function , does there exist an operator that recovers ? If so does the domain of this operator match exactly with ?
- What differential operator does this transform diagonalize?
For practical purposes the most important question is whether we can undo the transformation. With a bit of Functional Analysis we can answer this question for every integral transform. Since integration is linear it is easy to see that the transform is linear. That is, the operator has the following property,
The following lemma shows that if is linear and one-to-one, then exists and it is also a linear operator.
If a linear operator from into is one-to-one, then there exists an operator , called the inverse of , such that and , where and are identity operators on and . The operator is also linear. Often is denoted by .
It is important to note that although the linear operator is an integral transform the lemma does not state how is defined. In many cases is also an integral transform, however this is not always the case. It is important to keep in mind that Lemma 1 is a statement about the existence of . In many cases there are several different ways of inverting an integral transform.
Many integral transforms share similar properties. What I intend to do is a quick survey of the well-known integral transforms, looking at the similarities as motivation for a more general framework of integral transforms.
In an attempt to give a more general theory of integral transforms we shall attack this problem in three different ways: through self-adjoint DE’s, by determining the relationship between the kernel of a transform and the kernel of its inverse, and by taking the kernel of an integral transform to be a Hypergeometric function.
Overall, as I mentioned above I will be interested in solving differential equations by integral transform methods. This is more out of personal interest as it is not the only place these transforms arise. The reason that integral transform methods are well suited to this problem is that many integral transforms convert certain differential operators into operators that act by multiplication.
As a simple example we consider the ordinary differential equation (ODE)
with the initial conditions and . The Laplace transform of this ODE is
which in turn can be solved for the Laplace transform of the unknown function , after some work you find that
To find the solution to the differential equation we apply the inverse Laplace transform to .
For Partial Differential Equations (PDE’s) we often do not reduce the equation to an algebraic problem. Often we reduce it to the problem of solving an ODE. Consider the inhomogeneous equation of telegraphy in the variable,
By applying the Fourier transform in the variable it is transformed into
which is a much simpler problem to solve. In fact the solution to equation (8) can be shown to equal,
To find the solution to equation (7) we just apply the inverse Fourier transform to in the first variable.
Although I will mainly be interested in solving problems like the previous two examples, I will also look at solving certain types of integral equations.