diff --git a/README.md b/README.md index 7f6b1d1d78c7cddc37883cb918aa6a21401b5f80..4a82e683aae2f696417c5e5fd3e181ae952d738b 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,15 @@ # PyFstat This is a python package containing basic wrappers of the `lalpulsar` module -with capabilities to perform a variety of searches, primarily focussing on +with capabilities to perform a variety of searches, primarily focusing on semi-coherent glitch searches. ## Examples -All examples can be run from their source scripts in [examples](examples), or -for each example there is descriptive documentation: +We include a variety of example search scripts [here](examples), for each +example there is also a more descriptive write-up containing examples of the +output which we list below. Before running any of the search examples, be sure +to have run the [script to generate fake data](examples/make_fake_data.py). * [Making fake data with and without glitches](docs/make_fake_data.md) * [Fully coherent MCMC search](docs/fully_coherent_search.md) @@ -17,35 +19,41 @@ for each example there is descriptive documentation: The script can be installed system wide via ``` -python setup.py install +$ python setup.py install ``` -or simply add this directroy to your python path +or simply add this directory to your python path. To check that the installation +was successful, run +``` +$ python -c 'import pyfstat' +``` +if no error message is output, then you have installed `pyfstat`. ### Ephemeris installation The scripts require a path to ephemeris files in order to use the -`lalpulsar.ComputeFstat` module. This can either be specified when initialsing -each search, or more simply by playing a file `~/pyfstat.conf` in your home +`lalpulsar.ComputeFstat` module. This can either be specified when initialising +each search, or more simply by placing a file `~/pyfstat.conf` into your home directory which looks like ``` earth_ephem = '/home/<USER>/lalsuite-install/share/lalpulsar/earth00-19-DE421.dat.gz' sun_ephem = '/home/<USER>/lalsuite-install/share/lalpulsar/sun00-19-DE421.dat.gz' ``` - -where this uses the default ephemeris files provided with `lalsuite`. +here, we use the default ephemeris files provided with `lalsuite`. ### Dependencies * swig-enabled lalpulsar, a minimal configuration is given by ``` -./configure --prefix=${HOME}/lalsuite-install --disable-all-lal --enable-lalpulsar --enable-lalapps --enable-swig +$ ./configure --prefix=${HOME}/lalsuite-install --disable-all-lal --enable-lalpulsar --enable-lalapps --enable-swig ``` * [emcee](http://dan.iel.fm/emcee/current/)[^1] * [corner](https://pypi.python.org/pypi/corner/)[^1] * [dill](https://pypi.python.org/pypi/dill)[^1] +* [tqdm](https://pypi.python.org/pypi/tqdm)[^1] (optional), if installed, this + provides a useful progress bar and estimate of the remaining run-time. -[^1]: Most easily installed using either `conda` or `pip` +[^1]: Most easily installed using either `conda` or `pip`. diff --git a/docs/fully_coherent_search.md b/docs/fully_coherent_search.md deleted file mode 100644 index 461649bda30272ad86df5c801922f03a26a6f6a8..0000000000000000000000000000000000000000 --- a/docs/fully_coherent_search.md +++ /dev/null @@ -1,116 +0,0 @@ -# Fully coherent search using MCMC - -In this example, we will show the basics of setting up and running an MCMC -search for a fully-coherent search. This is based on the example -[fully_coherent_search.py](../example/fully_coherent_search.py). We will run -the search on the `basic` data generated in the -[make_fake_data](make_fake_data.md) example. - -We first need to import the search tool, in this example we will use the -`MCMCSearch`, but one could equally use `MCMCGlitchSearch` with `nglitch=0`. -We first import this, - -``` -from pyfstat import MCMCSearch -``` - -Next, we define some variables defining the center of our search, namely the -Crab parameters (with which the `basic` data was produced): - -``` -F0 = 30.0 -F1 = -1e-10 -F2 = 0 -Alpha = 5e-3 -Delta = 6e-2 -tref = 362750407.0 - -tstart = 1000000000 -duration = 100*86400 -tend = tstart = duration -``` - -Next, we specify our prior, this is done using a dictionary: - -``` -theta_prior = {'F0': {'type': 'norm', 'loc': F0, 'scale': abs(1e-6*F0)}, - 'F1': {'type': 'norm', 'loc': F1, 'scale': abs(1e-6*F1)}, - 'F2': F2, - 'Alpha': Alpha, - 'Delta': Delta} -``` -Each key and value of the `theta_prior` contains an instruction to the MCMC -search. If the value is a scalar, the MCMC search holds these fixed (as is the -case for `F2`, `Alpha`, and `Delta` here). If instead the value is a dictionary -describing a distribution, this is taken as the prior and the variable is -simulated in the MCMC search (as is the case for `F0` and `F1`). Note that -for `MCMCSearch`, `theta_prior` must contain at least all of the variables -given here (even if they are zero), and if `binary=True`, it must also contain -the binary parameters. - -Next, we define the parameters of the MCMC search: - -``` -ntemps = 1 -nwalkers = 100 -nsteps = [100, 500, 1000] -``` - -These can be considered the *tuning parameters* of the search. In this instance -we use a single temperature with 100 walkers. For the number of steps we use -a 2-step process. First we run the simulation starting from values picked -about the prior for `100` steps, afterwhich we take the maximum likelihood -chain and pick a new set of walkers scattered about this. This step removes -walkers which get lost is islands of low probability (it can be removed by -simply called `nsteps = [500, 1000]`). Finally, the simulation run for `500` -steps of burn-in then `1000` steps of production to estimate the posterior. - -Passing all this to the MCMC search, we also need to give it a label and -directory to save the data and provide `sftfilepath`, a string matching -the data to use in the search - -``` -mcmc = MCMCSearch('fully_coherent', 'data', sftfilepath='data/*basic*sft', - theta_prior=theta_prior, tref=tref, tstart=tstart, tend=tend, - nsteps=nsteps, nwalkers=nwalkers, ntemps=ntemps, - scatter_val=1e-10) -``` - -To run code, we call - -``` -mcmc.run() -``` - -This produces two `png` images. The first is the chains after the -initialisation step - -Here one can see there are multiple evenly spaced solutions (at least in -frequency) corresponding to different peaks -in the likelihood. Taking the largest peak, the walkers are reinitialised about -this point (scattered in a relative way by the `scatter_val`) and run again -producing - -Now once can see that the chains have converged onto the single central peak. - -To get posteriors, we call - -``` -mcmc.plot_corner() -``` -which produces a corner plot - -illustrating the tightly constrained posteriors on `F0` and `F1` and their -covariance. Furthermore, one may wish to get a summary which can be printed -to the terminal via - -``` -mcmc.print_summary() -``` -which gives the maximum twoF value, median and standard-deviation, in this case -this is -``` -Max twoF: 1756.44177246 -F0 = 2.999999974e+01 +/- 6.363964377e-08 -F1 = -9.999960025e-11 +/- 9.920455272e-17 -``` diff --git a/docs/fully_coherent_search_using_MCMC.md b/docs/fully_coherent_search_using_MCMC.md new file mode 100644 index 0000000000000000000000000000000000000000..236b628de9286fbb689d94649d03d84208896d57 --- /dev/null +++ b/docs/fully_coherent_search_using_MCMC.md @@ -0,0 +1,122 @@ +# Fully coherent search using MCMC + +In this example, we will show the basics of setting up and running a MCMC +search for a fully-coherent search. This is based on the example +[fully_coherent_search.py](../example/fully_coherent_search.py). We will run +the search on the `basic` data generated in the +[make_fake_data](make_fake_data.md) example. + +First, we need to import the search tool, in this example we will use the +`MCMCSearch`, but one could equally use `MCMCGlitchSearch` with `nglitch=0`. +To import this, + +```python +from pyfstat import MCMCSearch +``` + +Next, we define some variables defining the exact parameters of the signal +in the data, and the start and end times: + +```python +F0 = 30.0 +F1 = -1e-10 +F2 = 0 +Alpha = 5e-3 +Delta = 6e-2 +tref = 362750407.0 + +tstart = 1000000000 +duration = 100*86400 +tend = tstart = duration +``` + +Now, we need to specify out prior. This is a dictionary containing keys for +each variable (in the `MCMCSearch` these are `F0`, `F1`, `F2`, `Alpha`, and +`Delta`). In this example, we choose a uniform box in `F0` and `F1`: + +```python +theta_prior = {'F0': {'type': 'unif', 'lower': F0*(1-1e-6), 'upper': F0*(1+1e-6)}, + 'F1': {'type': 'unif', 'lower': F1*(1+1e-2), 'upper': F1*(1-1e-2)}, + 'F2': F2, + 'Alpha': Alpha, + 'Delta': Delta + } +``` +Each key and value of the `theta_prior` contains an instruction to the MCMC +search. If the value is a scalar, the MCMC search holds these fixed (as is the +case for `F2`, `Alpha`, and `Delta` here). If instead the value is a dictionary +describing a distribution, this is taken as the prior and the variable is +simulated in the MCMC search (as is the case for `F0` and `F1`). Note that +for `MCMCSearch`, `theta_prior` must contain at least all of the variables +given here (even if they are zero), and if `binary=True`, it must also contain +the binary parameters. + +Next, we define the parameters of the MCMC search: + +```python +ntemps = 4 +log10temperature_min = -1 +nwalkers = 100 +nsteps = [1000, 1000] +``` + +These can be considered the *tuning parameters* of the search. A complete +discussion of these can be found [here](tuning_parameters.md). + +Passing all this to the MCMC search, we also need to give it a label, a +directory to save the data, and provide `sftfilepath`, a string matching +the data to use in the search + +```python +mcmc = MCMCSearch(label='fully_coherent_search_using_MCMC', outdir='data', + sftfilepath='data/*basic*sft', theta_prior=theta_prior, + tref=tref, tstart=tstart, tend=tend, nsteps=nsteps, + nwalkers=nwalkers, ntemps=ntemps, + log10temperature_min=log10temperature_min) +``` + +To run the simulation, we call + +```python +mcmc.run() +``` + +This produces two `.png` images. The first is the position of the *walkers* +during the simulation: + +This shows (in red) the position of the walkers during the burn-in stage. They +are initially defuse (they start from positions randomly picked from the prior), +but eventually converge to a single stable solution. The black is the production +period from which posterior estimates are made. The bottom panel is a histogram +of `twoF`, split into production and burn-in. Note that, early on there are +multiple modes corresponding to other peaks, by using the parallel tempering, +we allow the walkers to explore all of these peaks and opt for the strong +central candidate. + +To get posteriors, we call + +```python +mcmc.plot_corner() +``` +which produces a corner plot + +illustrating the tightly constrained posteriors on `F0` and `F1` and their +covariance. Furthermore, one may wish to get a summary which can be printed +to the terminal via + +```python +mcmc.print_summary() +``` +which gives the maximum twoF value, median and standard-deviation, in this case +this is +``` +Summary: +theta0 index: 0 +Max twoF: 1771.50622559 with parameters: + F0 = 2.999999874e+01 + F1 = -9.999802960e-11 + +Median +/- std for production values + F0 = 2.999999873e+01 +/- 6.004803009e-07 + F1 = -9.999801583e-11 +/- 9.359959909e-16 +``` diff --git a/docs/img/fully_coherent_corner.png b/docs/img/fully_coherent_corner.png deleted file mode 100644 index c9a97c32f7515e7eccc3cc8d428eee548b7c68aa..0000000000000000000000000000000000000000 Binary files a/docs/img/fully_coherent_corner.png and /dev/null differ diff --git a/docs/img/fully_coherent_init_0_walkers.png b/docs/img/fully_coherent_init_0_walkers.png deleted file mode 100644 index b2780c0dcff011fa323b6d16d2ef9e81aa213773..0000000000000000000000000000000000000000 Binary files a/docs/img/fully_coherent_init_0_walkers.png and /dev/null differ diff --git a/docs/img/fully_coherent_search_using_MCMC_corner.png b/docs/img/fully_coherent_search_using_MCMC_corner.png new file mode 100644 index 0000000000000000000000000000000000000000..140195eb38f0921a4419eb78e976ee7b4447c9f9 Binary files /dev/null and b/docs/img/fully_coherent_search_using_MCMC_corner.png differ diff --git a/docs/img/fully_coherent_search_using_MCMC_walkers.png b/docs/img/fully_coherent_search_using_MCMC_walkers.png new file mode 100644 index 0000000000000000000000000000000000000000..af2d507c69f5c0880b58a19a470112dd54c28745 Binary files /dev/null and b/docs/img/fully_coherent_search_using_MCMC_walkers.png differ diff --git a/docs/img/fully_coherent_walkers.png b/docs/img/fully_coherent_walkers.png deleted file mode 100644 index 128f1fe57414b62d3c764391e6342766c12d4c93..0000000000000000000000000000000000000000 Binary files a/docs/img/fully_coherent_walkers.png and /dev/null differ diff --git a/docs/make_fake_data.md b/docs/make_fake_data.md index 53536009cfef0b753b7c9db7672759fbff8ab86e..c4bcf521d2fc094675f54f243f1ddd8c2f99aaf4 100644 --- a/docs/make_fake_data.md +++ b/docs/make_fake_data.md @@ -2,9 +2,9 @@ Here, we describe the steps required to generate fake data which will be used throughout the other examples. We will generate data based on the properties of -the Crab pulsar, first as a smooth CW signal and then as a CW signal which -contains a glitch. This document is based on the file -[make_fake_data.py](../examples/make_fake_data.py). +the Crab pulsar, first as a smooth CW signal, then as a CW signal which +contains one glitch, and finally as a signal with two glitches. This document +is based on the file [make_fake_data.py](../examples/make_fake_data.py). ## Smooth signal @@ -12,7 +12,7 @@ In the following code segment, we import the `Writer` class used to generate fake data, define the Crab parameters and create an instant of the `Writer` called `data` -``` +```python from pyfstat import Writer # Define parameters of the Crab pulsar @@ -40,7 +40,7 @@ data = Writer( We can now use the `data` object to create `.sft` files which contain a smooth signal in Gaussian noise. In detail, the process consists first in calling -``` +```python data.make_cff() ``` which generates a file `data/basic.cff` (notice the directory and file name @@ -67,7 +67,7 @@ transientTauDays=100.000 Finally, we generate the `.sft` files by calling -``` +```python data.run_makefakedata() ``` @@ -82,7 +82,7 @@ start with a simple case in which the glitch occurs half way through the observation span. We define the properties of this signal, create another `Writer` instance called `glitch_data`, and then run `make_data()` -``` +```python dtglitch = duration/2.0 delta_F0 = 0.4e-5 delta_F1 = 0 @@ -136,7 +136,7 @@ but continuous signals. Finally, the `Writer` class also provides a wrapper of `lalapps_PredictFstat`. So calling -``` +```python >>> print data.predict_fstat() 1721.1 ``` @@ -153,7 +153,7 @@ sequentially and additively as implemented `pyfstat.BaseSearchClass.calculate_thetas`. Here is an example with two glitches, one a quarter of the way through and the other a fifth from the end. -``` +```python dtglitch = [duration/4.0, 4*duration/5.0] delta_phi = [0, 0] delta_F0 = [0.4e-5, 0.3e-6] diff --git a/docs/tuning_parameters.md b/docs/tuning_parameters.md new file mode 100644 index 0000000000000000000000000000000000000000..ed03cfa3d98a81611c1893e1ffc2cf3680f64795 --- /dev/null +++ b/docs/tuning_parameters.md @@ -0,0 +1,4 @@ +# Tuning parameters + +This page needs to be completed with a description of the tuning parameters, +how they should be set, and the diagnostic tools which can be used in doing so. diff --git a/examples/fully_coherent_search.py b/examples/fully_coherent_search.py deleted file mode 100644 index 9b11b66b576023736585d9aa453ca7fe4f2f2970..0000000000000000000000000000000000000000 --- a/examples/fully_coherent_search.py +++ /dev/null @@ -1,31 +0,0 @@ -from pyfstat import MCMCSearch - -F0 = 30.0 -F1 = -1e-10 -F2 = 0 -Alpha = 5e-3 -Delta = 6e-2 -tref = 362750407.0 - -tstart = 1000000000 -duration = 100*86400 -tend = tstart + duration - -theta_prior = {'F0': {'type': 'norm', 'loc': F0, 'scale': abs(1e-6*F0)}, - 'F1': {'type': 'norm', 'loc': F1, 'scale': abs(1e-6*F1)}, - 'F2': F2, - 'Alpha': Alpha, - 'Delta': Delta - } - -ntemps = 1 -nwalkers = 100 -nsteps = [100, 500, 1000] - -mcmc = MCMCSearch('fully_coherent', 'data', sftfilepath='data/*basic*sft', - theta_prior=theta_prior, tref=tref, tstart=tstart, tend=tend, - nsteps=nsteps, nwalkers=nwalkers, ntemps=ntemps, - scatter_val=1e-10) -mcmc.run() -mcmc.plot_corner() -mcmc.print_summary() diff --git a/examples/fully_coherent_search_using_MCMC.py b/examples/fully_coherent_search_using_MCMC.py new file mode 100644 index 0000000000000000000000000000000000000000..737259109fe7cd044be67d6678bdbe9eedd673dd --- /dev/null +++ b/examples/fully_coherent_search_using_MCMC.py @@ -0,0 +1,33 @@ +from pyfstat import MCMCSearch + +F0 = 30.0 +F1 = -1e-10 +F2 = 0 +Alpha = 5e-3 +Delta = 6e-2 +tref = 362750407.0 + +tstart = 1000000000 +duration = 100*86400 +tend = tstart + duration + +theta_prior = {'F0': {'type': 'unif', 'lower': F0*(1-1e-6), 'upper': F0*(1+1e-6)}, + 'F1': {'type': 'unif', 'lower': F1*(1+1e-2), 'upper': F1*(1-1e-2)}, + 'F2': F2, + 'Alpha': Alpha, + 'Delta': Delta + } + +ntemps = 4 +log10temperature_min = -1 +nwalkers = 100 +nsteps = [1000, 1000] + +mcmc = MCMCSearch(label='fully_coherent_search_using_MCMC', outdir='data', + sftfilepath='data/*basic*sft', theta_prior=theta_prior, + tref=tref, tstart=tstart, tend=tend, nsteps=nsteps, + nwalkers=nwalkers, ntemps=ntemps, + log10temperature_min=log10temperature_min) +mcmc.run() +mcmc.plot_corner(add_prior=True) +mcmc.print_summary() diff --git a/examples/make_fake_data.py b/examples/make_fake_data.py index df0c37b048cbf4051ee32163c822d0353a9f1769..ede4cbe8006e256acdbe167f1e8cdece268be1a6 100644 --- a/examples/make_fake_data.py +++ b/examples/make_fake_data.py @@ -2,7 +2,7 @@ from pyfstat import Writer # First, we generate data with a reasonably strong smooth signal -# Define parameters of the Crab pulsar +# Define parameters of the Crab pulsar as an example F0 = 30.0 F1 = -1e-10 F2 = 0 @@ -37,7 +37,7 @@ glitch_data = Writer( glitch_data.make_data() -# The predicted twoF, given by lalapps_predictFstat can be accsessed by +# The predicted twoF, given by lalapps_predictFstat can be accessed by print data.predict_fstat()