Commit ad4abc08 authored by Gregory Ashton
Update readme and improve the documentation and examples

parent e65bca6f
# PyFstat
This is a Python package containing basic wrappers of the `lalpulsar` module
with capabilities to perform a variety of searches, primarily focusing on
semi-coherent glitch searches.
## Examples
We include a variety of example search scripts [here](examples); for each
example there is also a more descriptive write-up, listed below, containing
examples of the output. Before running any of the search examples, be sure
to have run the [script to generate fake data](examples/make_fake_data.py).
* [Making fake data with and without glitches](docs/make_fake_data.md)
* [Fully coherent MCMC search](docs/fully_coherent_search.md)
## Installation
The package can be installed system wide via
```
$ python setup.py install
```
or simply add this directory to your Python path. To check that the installation
was successful, run
```
$ python -c 'import pyfstat'
```
If no error message is output, then you have installed `pyfstat`.
### Ephemeris installation

The scripts require a path to ephemeris files in order to use the
`lalpulsar.ComputeFstat` module. This can either be specified when initialising
each search, or more simply by placing a file `~/pyfstat.conf` in your home
directory which looks like
```
earth_ephem = '/home/<USER>/lalsuite-install/share/lalpulsar/earth00-19-DE421.dat.gz'
sun_ephem = '/home/<USER>/lalsuite-install/share/lalpulsar/sun00-19-DE421.dat.gz'
```
Here, we use the default ephemeris files provided with `lalsuite`.
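As an illustration of the expected file format, the following sketch parses the contents of such a configuration file. The function name and the parsing approach are our own illustration, not part of `pyfstat`, and the paths used are hypothetical:

```python
import re

def parse_pyfstat_conf(text):
    """Extract the earth_ephem and sun_ephem paths from the contents of a
    ~/pyfstat.conf-style file (illustrative helper, not part of pyfstat)."""
    paths = {}
    for line in text.splitlines():
        # each line looks like: earth_ephem = '/path/to/file.dat.gz'
        match = re.match(r"\s*(earth_ephem|sun_ephem)\s*=\s*'([^']+)'", line)
        if match:
            paths[match.group(1)] = match.group(2)
    return paths

conf = """
earth_ephem = '/home/user/lalsuite-install/share/lalpulsar/earth00-19-DE421.dat.gz'
sun_ephem = '/home/user/lalsuite-install/share/lalpulsar/sun00-19-DE421.dat.gz'
"""
paths = parse_pyfstat_conf(conf)
```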
### Dependencies

* swig-enabled lalpulsar, a minimal configuration is given by
```
$ ./configure --prefix=${HOME}/lalsuite-install --disable-all-lal --enable-lalpulsar --enable-lalapps --enable-swig
```
* [emcee](http://dan.iel.fm/emcee/current/)[^1]
* [corner](https://pypi.python.org/pypi/corner/)[^1]
* [dill](https://pypi.python.org/pypi/dill)[^1]
* [tqdm](https://pypi.python.org/pypi/tqdm)[^1] (optional): if installed, this
provides a useful progress bar and estimate of the remaining run-time.

[^1]: Most easily installed using either `conda` or `pip`.
# Fully coherent search using MCMC
In this example, we will show the basics of setting up and running a
fully-coherent MCMC search. This is based on the example
[fully_coherent_search.py](../example/fully_coherent_search.py). We will run
the search on the `basic` data generated in the
[make_fake_data](make_fake_data.md) example.
First, we need to import the search tool; in this example we will use
`MCMCSearch`, but one could equally use `MCMCGlitchSearch` with `nglitch=0`:
```python
from pyfstat import MCMCSearch
```
Next, we define the exact parameters of the signal in the data, along with
the start and end times:
```python
F0 = 30.0
F1 = -1e-10
F2 = 0
# ...
duration = 100*86400
tend = tstart + duration
```
Now, we need to specify our prior. This is a dictionary containing keys for
each variable (in the `MCMCSearch` these are `F0`, `F1`, `F2`, `Alpha`, and
`Delta`). In this example, we choose a uniform box in `F0` and `F1`:
```python
theta_prior = {'F0': {'type': 'unif', 'lower': F0*(1-1e-6), 'upper': F0*(1+1e-6)},
               'F1': {'type': 'unif', 'lower': F1*(1+1e-2), 'upper': F1*(1-1e-2)},
               'F2': F2,
               'Alpha': Alpha,
               'Delta': Delta
               }
```
Each key and value of the `theta_prior` contains an instruction to the MCMC
search. If the value is a scalar, the MCMC search holds these fixed (as is the
... the binary parameters.
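To make the prior convention concrete, here is a minimal sketch of how such a dictionary can be interpreted, assuming scalars are held fixed and `'unif'` entries are sampled between their `lower` and `upper` bounds. This is our own illustration; `pyfstat`'s internal sampling machinery differs:

```python
import random

def draw_from_prior(theta_prior):
    """Draw one sample from a theta_prior-style dictionary (illustrative
    sketch only; not pyfstat's actual implementation)."""
    sample = {}
    for key, val in theta_prior.items():
        if isinstance(val, dict) and val.get('type') == 'unif':
            # uniform box: sample between the stated bounds
            sample[key] = random.uniform(val['lower'], val['upper'])
        else:
            # scalar values are held fixed during the search
            sample[key] = val
    return sample

F0 = 30.0
theta_prior = {'F0': {'type': 'unif', 'lower': F0*(1-1e-6), 'upper': F0*(1+1e-6)},
               'F2': 0}
sample = draw_from_prior(theta_prior)
```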
Next, we define the parameters of the MCMC search:
```python
ntemps = 4
log10temperature_min = -1
nwalkers = 100
nsteps = [1000, 1000]
```
These can be considered the *tuning parameters* of the search. A complete
discussion of these can be found [here](tuning_parameters.md).

Passing all this to the MCMC search, we also need to give it a label, a
directory to save the data, and provide `sftfilepath`, a string matching
the data to use in the search:
```python
mcmc = MCMCSearch(label='fully_coherent_search_using_MCMC', outdir='data',
                  sftfilepath='data/*basic*sft', theta_prior=theta_prior,
                  tref=tref, tstart=tstart, tend=tend, nsteps=nsteps,
                  nwalkers=nwalkers, ntemps=ntemps,
                  log10temperature_min=log10temperature_min)
```
To run the simulation, we call
```python
mcmc.run()
```
This produces two `.png` images. The first is the position of the *walkers*
during the simulation:
![](img/fully_coherent_using_MCMC_walkers.png)
This shows (in red) the position of the walkers during the burn-in stage. They
are initially diffuse (they start from positions randomly picked from the
prior), but eventually converge to a single stable solution. The black is the
production period from which posterior estimates are made. The bottom panel is
a histogram of `twoF`, split into production and burn-in. Note that, early on,
there are multiple modes corresponding to other peaks; by using parallel
tempering, we allow the walkers to explore all of these peaks and settle on the
strong central candidate.
To get posteriors, we call
```python
mcmc.plot_corner()
```
which produces a corner plot
![](img/fully_coherent_using_MCMC_corner.png)
illustrating the tightly constrained posteriors on `F0` and `F1` and their
covariance. Furthermore, one may wish to get a summary, which can be printed
to the terminal via
```python
mcmc.print_summary()
```
which gives the maximum twoF value and the median and standard-deviation; in
this case the output is
```
Summary:
theta0 index: 0
Max twoF: 1771.50622559 with parameters:
F0 = 2.999999874e+01
F1 = -9.999802960e-11
Median +/- std for production values
F0 = 2.999999873e+01 +/- 6.004803009e-07
F1 = -9.999801583e-11 +/- 9.359959909e-16
```
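The production-stage values quoted above are simple sample statistics. As a sketch of what `Median +/- std` means (our own illustration, not the `print_summary` implementation), one can compute them from a set of posterior samples:

```python
import statistics

def summarize(samples):
    """Return the median and (population) standard deviation of a list of
    posterior samples -- an illustrative helper, not part of pyfstat."""
    return statistics.median(samples), statistics.pstdev(samples)

# hypothetical F0 samples clustered near the injected value of 30 Hz
f0_samples = [30.0 - 1e-7, 30.0, 30.0 + 1e-7, 30.0 + 2e-7]
median, std = summarize(f0_samples)
```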
# Making fake data with and without glitches
Here, we describe the steps required to generate fake data which will be used
throughout the other examples. We will generate data based on the properties of
the Crab pulsar, first as a smooth CW signal, then as a CW signal which
contains one glitch, and finally as a signal with two glitches. This document
is based on the file [make_fake_data.py](../examples/make_fake_data.py).
## Smooth signal
In the following code segment, we import the `Writer` class used to generate
fake data, define the Crab parameters, and create an instance of the `Writer`
called `data`:
```python
from pyfstat import Writer

# Define parameters of the Crab pulsar
# ...
data = Writer(...)
```
We can now use the `data` object to create `.sft` files which contain a smooth
signal in Gaussian noise. In detail, the process consists of first calling
```python
data.make_cff()
```
which generates a file `data/basic.cff` (notice the directory and file name
```
...
transientTauDays=100.000
```
Finally, we generate the `.sft` files by calling
```python
data.run_makefakedata()
```
... start with a simple case in which the glitch occurs half way through the
observation span. We define the properties of this signal, create
another `Writer` instance called `glitch_data`, and then run `make_data()`:
```python
dtglitch = duration/2.0
delta_F0 = 0.4e-5
delta_F1 = 0
# ...
```
... but continuous signals.
Finally, the `Writer` class also provides a wrapper of `lalapps_PredictFstat`.
So calling
```python
>>> print data.predict_fstat()
1721.1
```
... sequentially and additively, as implemented in
`pyfstat.BaseSearchClass.calculate_thetas`. Here is an example with two
glitches, one a quarter of the way through and the other a fifth from the end:
```python
dtglitch = [duration/4.0, 4*duration/5.0]
delta_phi = [0, 0]
delta_F0 = [0.4e-5, 0.3e-6]
# ...
```
# Tuning parameters
This page needs to be completed with a description of the tuning parameters,
how they should be set, and the diagnostic tools which can be used in doing so.
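As a starting point, the parallel-tempering settings `ntemps` and `log10temperature_min` used in the examples define a ladder of temperatures. The sketch below assumes a log-spaced ladder, which is a common convention in parallel-tempering samplers; this is our assumption for illustration, not PyFstat's verified internals:

```python
ntemps = 4
log10temperature_min = -1

# Assumed convention: temperatures log-spaced from 10**0 = 1 down to
# 10**log10temperature_min (an illustrative sketch only).
temperatures = [10 ** (log10temperature_min * i / (ntemps - 1))
                for i in range(ntemps)]
```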
```python
# ... (earlier definitions elided)
tstart = 1000000000
duration = 100*86400
tend = tstart + duration

theta_prior = {'F0': {'type': 'unif', 'lower': F0*(1-1e-6), 'upper': F0*(1+1e-6)},
               'F1': {'type': 'unif', 'lower': F1*(1+1e-2), 'upper': F1*(1-1e-2)},
               'F2': F2,
               'Alpha': Alpha,
               'Delta': Delta
               }

ntemps = 4
log10temperature_min = -1
nwalkers = 100
nsteps = [1000, 1000]

mcmc = MCMCSearch(label='fully_coherent_search_using_MCMC', outdir='data',
                  sftfilepath='data/*basic*sft', theta_prior=theta_prior,
                  tref=tref, tstart=tstart, tend=tend, nsteps=nsteps,
                  nwalkers=nwalkers, ntemps=ntemps,
                  log10temperature_min=log10temperature_min)
mcmc.run()
mcmc.plot_corner(add_prior=True)
mcmc.print_summary()
```
```python
from pyfstat import Writer

# First, we generate data with a reasonably strong smooth signal

# Define parameters of the Crab pulsar as an example
F0 = 30.0
F1 = -1e-10
F2 = 0
# ...
glitch_data = Writer(...)  # (arguments elided)
glitch_data.make_data()

# The predicted twoF, given by lalapps_predictFstat can be accessed by
print data.predict_fstat()
```