Metadata-Version: 2.1
Name: cmdstancache
Version: 1.2.0
Summary: Smart cache for Stan models and runs
Home-page: https://github.com/JohannesBuchner/CmdStanCache
Author: Johannes Buchner
Author-email: johannes.buchner.acad@gmx.com
License: GPL
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >3.0, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
Provides-Extra: plot
License-File: LICENSE


CmdStanCache
=============

Quicker model iterations and enhanced productivity for Stan MCMC by

* caching model compilation in a smart way
* caching sampling results in a smart way

No waiting to resample the same model with the same data.

Install 
-------

First install `CmdStanPy <https://cmdstanpy.readthedocs.io/>`_ and
CmdStan, and make sure they work.

::

	$ pip install cmdstancache

Usage
-----
::

	model = """
	data {
	  int N;
	}
	parameters {
	  real<lower=-10.0, upper=10.0> x[N];
	}
	model {
	  for (i in 1:N-1) {
		 target += -2 * (100 * square(x[i+1] - square(x[i])) + square(1 - x[i]));
	  }
	}
	"""
	data = dict(N=2)

	import cmdstancache

	stan_variables, method_variables = cmdstancache.run_stan(
		model,
		data=data, 
		# any other sample() parameters go here
		seed=42
	)

**Now comes the trick**:

* If you run this code twice, the second run reads the stored result instead of sampling again.

* If you add or modify a code comment, the same result is returned without having to rerun.
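
Why a comment edit does not trigger a rerun can be illustrated with a small sketch. It is not cmdstancache's actual implementation: the helper names ``normalise_stan_code`` and ``code_hash`` are hypothetical, and the choice of SHA-1 is an assumption made only for illustration.

```python
import hashlib
import re


def normalise_stan_code(code):
    """Strip comments, indentation and blank lines (hypothetical helper)."""
    # Remove /* ... */ block comments, which may span several lines.
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)
    lines = []
    for line in code.splitlines():
        # Remove // line comments. Simplification: a "//" inside a
        # string literal would be cut off as well.
        line = re.sub(r"//.*$", "", line).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)


def code_hash(code):
    """Hash of the normalised code (SHA-1 is an assumption)."""
    return hashlib.sha1(normalise_stan_code(code).encode("utf-8")).hexdigest()


original = """
parameters {
  real x;
}
model {
  x ~ normal(0, 1);
}
"""

# Same model, but with comments added and different indentation.
commented = """
// a standard normal toy model
parameters {
    real x;  /* the only parameter */
}
model {
    x ~ normal(0, 1);
}
"""

# A genuine change to the model.
changed = original.replace("normal(0, 1)", "normal(0, 2)")

print(code_hash(original) == code_hash(commented))  # same cache entry
print(code_hash(original) == code_hash(changed))    # different cache entry
```

Because the hash is taken after normalisation, only changes that affect the model itself lead to a new cache entry.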

How it works
-------------

cmdstancache keeps a cache of code and data that has previously been used for MCMC sampling.
If it already has the results, it returns them from the cache.

Here are the details:

1. The code is normalised (stripped of comments and indents)
2. A hash of the normalised code is computed
3. The model code is stored in ~/.stan_cache/<codehash>.stan
4. The model is compiled, if it is not already there
5. The data are sorted by key, exported to JSON, and hashed
6. The data are stored in ~/.stan_cache/<datahash>.json
7. cmdstanpy MCMC is run with code=<codehash>.stan and data=<datahash>.json
8. fit.stan_variables() and fit.method_variables() are returned
9. joblib memoizes steps 7 and 8, avoiding resampling when the same data and code hash are seen.
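
Steps 5 and 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: ``data_hash``, ``cached_run`` and ``fake_sampler`` are hypothetical names, SHA-256 is an arbitrary choice, a plain in-memory dict stands in for the on-disk joblib cache, and including the sampling arguments in the cache key is an assumption.

```python
import hashlib
import json

_results = {}  # in-memory stand-in for the on-disk joblib cache


def data_hash(data):
    """Hash of the data dict, independent of key order."""
    # Sorting keys makes the JSON text, and therefore the hash,
    # independent of the order in which the dict was built.
    text = json.dumps(data, sort_keys=True)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cached_run(codehash, data, sampler, **kwargs):
    """Call sampler(codehash, data, **kwargs) only on a cache miss."""
    key = (codehash, data_hash(data), json.dumps(kwargs, sort_keys=True))
    if key not in _results:
        _results[key] = sampler(codehash, data, **kwargs)
    return _results[key]


calls = []


def fake_sampler(codehash, data, **kwargs):
    # Placeholder for the expensive MCMC run; records each real invocation.
    calls.append(codehash)
    return {"x": [0.0] * data["N"]}


first = cached_run("abc123", {"N": 2, "sigma": 1.0}, fake_sampler, seed=42)
# Same data with keys in a different order: served from the cache.
second = cached_run("abc123", {"sigma": 1.0, "N": 2}, fake_sampler, seed=42)
# Different data: the sampler has to run again.
third = cached_run("abc123", {"N": 3, "sigma": 1.0}, fake_sampler, seed=42)

print(len(calls))  # number of times the sampler actually ran
```

Keying the cache on hashes rather than on the raw inputs keeps the lookup cheap even when the data set is large.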


Plotting
--------

Make a quick corner plot of only the scalar model variables::

	cmdstancache.plot_corner(stan_variables)

If some chains are stuck and you want to exclude their samples from the plot::

	cleaned_variables = remove_stuck_chains(stan_variables, method_variables)
	plot = plot_corner(cleaned_variables)

Plotting is optional, so the ``corner`` dependency is only pulled in when installing with::

	$ pip install cmdstancache[plot]

Contributors
-------------

* @JohannesBuchner

Contributions are welcome.
