Last updated: 2022-05-07
Checks: 7 of 7 passed
Knit directory: stat34800/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20180411) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version f3c2dab. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Untracked files:
Untracked: analysis/currency_analysis.Rmd
Untracked: analysis/haar.Rmd
Untracked: analysis/stocks_analysis.Rmd
Note that any generated files, e.g. HTML, PNG, CSS, etc., are not included in this status report because it is OK for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/bayes_normal_means.Rmd) and HTML (docs/bayes_normal_means.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
| File | Version | Author | Date | Message |
|------|---------|--------|------|---------|
| Rmd | f3c2dab | Matthew Stephens | 2022-05-07 | workflowr::wflow_publish("analysis/bayes_normal_means.Rmd") |
| html | a39d47c | stephens999 | 2018-05-03 | Build site. |
| Rmd | 196d0e3 | stephens999 | 2018-05-03 | wflow_publish("analysis/bayes_normal_means.Rmd") |
In a previous homework you implemented Empirical Bayes (EB) shrinkage for the normal means problem with a normal prior. That is, we have data \(X=(X_1,\dots,X_n)\): \[X_j | \theta_j, s_j \sim N(\theta_j, s_j^2)\] and assume \[\theta_j | \mu,\sigma \sim N(\mu,\sigma^2), \quad j=1,\dots,n.\]
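For concreteness, here is a minimal R sketch of simulating data from this model; the particular values of \(n\), \(\mu\), \(\sigma\), and \(s_j\) are illustrative choices, not part of the assignment. The resulting x and s are reused in the sketches further below.

```r
set.seed(1)
n     <- 100
mu    <- 2                      # true prior mean (illustrative choice)
sigma <- 3                      # true prior sd (illustrative choice)
s     <- rep(1, n)              # known observation standard deviations s_j
theta <- rnorm(n, mu, sigma)    # latent means theta_j ~ N(mu, sigma^2)
x     <- rnorm(n, theta, s)     # observations X_j ~ N(theta_j, s_j^2)
```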
The EB approach involved two steps: first, estimate \(\mu\) and \(\sigma\) by maximizing the marginal likelihood \(p(X | \mu, \sigma)\); second, compute the posterior distribution \(p(\theta_j | X, \hat{\mu}, \hat{\sigma})\) for each \(\theta_j\) by plugging in those estimates.
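As a reminder, here is a rough sketch of what those two steps can look like in R (using the simulated x and s above; your own homework implementation may of course differ):

```r
# Step 1: estimate (mu, sigma) by maximizing the marginal log-likelihood,
# in which X_j | mu, sigma ~ N(mu, s_j^2 + sigma^2); optimize over log(sigma).
neg_loglik <- function(par, x, s) {
  -sum(dnorm(x, mean = par[1], sd = sqrt(s^2 + exp(2 * par[2])), log = TRUE))
}
fit <- optim(c(0, 0), neg_loglik, x = x, s = s)
mu_hat    <- fit$par[1]
sigma_hat <- exp(fit$par[2])

# Step 2: plug the estimates into the conditional posterior of each theta_j,
# which is normal by normal-normal conjugacy.
post_var  <- 1 / (1 / s^2 + 1 / sigma_hat^2)
post_mean <- post_var * (x / s^2 + mu_hat / sigma_hat^2)
```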
The EB approach can be criticized for ignoring uncertainty in the estimates of \(\mu\) and \(\sigma\). Here we will use MCMC to do a fully Bayesian analysis that takes account of this uncertainty.
To make this easier we will first re-parameterize to use \(\eta = \log(\sigma)\), so \(\eta\) can take any value on the real line.
We will use a uniform prior on \((\mu,\eta)\), \(p(\mu,\eta) \propto 1\), on the range \(\mu \in [-a,a]\) and \(\eta \in [-b,b]\). You can use \(a=10^6\) and \(b=10\). (Because \(\eta\) is on the log scale, \(b=10\) covers a wide range of possible standard deviations.) Thus the posterior distribution on \((\mu,\eta)\) is given by \[p(\mu,\eta | X) \propto p(X | \mu, \eta) I(|\mu|<a) I(|\eta|<b),\]
where \(I\) denotes an indicator function.
Modify your log-likelihood computation code from your previous homework to compute the log-likelihood for \((\mu,\eta)\) given data \(X\) (and standard deviations \(s\)).
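A minimal sketch of such a function, assuming the marginal likelihood \(X_j | \mu,\eta \sim N(\mu, s_j^2 + e^{2\eta})\):

```r
# log-likelihood of (mu, eta) given data x and standard deviations s,
# where eta = log(sigma), so the marginal variance of X_j is s_j^2 + exp(2*eta)
loglik <- function(mu, eta, x, s) {
  sum(dnorm(x, mean = mu, sd = sqrt(s^2 + exp(2 * eta)), log = TRUE))
}
```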
Use this to implement an MH algorithm to sample from \(\pi(\mu,\eta) \propto p(X | \mu,\eta)\). Note: in computing the MH acceptance probability you need to compute a ratio \(L_1/L_2\). For numerical stability you should always compute this ratio as \(\exp(l_1 - l_2)\), where \(l_i = \log(L_i)\), rather than computing \(L_1\) and \(L_2\) directly and then taking their ratio. (If both \(L_1\) and \(L_2\) are very small, they may be 0 to machine precision, which causes problems if you try to compute \(L_1/L_2\) directly.)
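One possible random-walk Metropolis-Hastings implementation, sketched on top of the loglik function above; the proposal standard deviations step_mu and step_eta are tuning choices you will likely need to adjust:

```r
mh_sample <- function(x, s, niter = 10000, mu0 = 0, eta0 = 0,
                      step_mu = 1, step_eta = 0.2, a = 1e6, b = 10) {
  mu <- eta <- lp <- numeric(niter)
  mu[1]  <- mu0
  eta[1] <- eta0
  lp[1]  <- loglik(mu[1], eta[1], x, s)
  for (t in 2:niter) {
    mu_new  <- mu[t - 1]  + rnorm(1, 0, step_mu)   # random-walk proposal
    eta_new <- eta[t - 1] + rnorm(1, 0, step_eta)
    accept <- FALSE
    if (abs(mu_new) < a && abs(eta_new) < b) {     # enforce the uniform prior's support
      lp_new <- loglik(mu_new, eta_new, x, s)
      # acceptance ratio on the log scale: exp(l_new - l_old), never L_new / L_old
      accept <- log(runif(1)) < lp_new - lp[t - 1]
    }
    if (accept) {
      mu[t] <- mu_new; eta[t] <- eta_new; lp[t] <- lp_new
    } else {
      mu[t] <- mu[t - 1]; eta[t] <- eta[t - 1]; lp[t] <- lp[t - 1]
    }
  }
  data.frame(mu = mu, eta = eta, logpost = lp)
}
```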
Apply your MH algorithm to simulated data where you know the answer. Run your MH algorithm at least three times from different initializations. For each run, plot how the value of \(\log \pi(\mu^t,\eta^t)\) changes with iteration \(t\). You should see that it starts from a low value (assuming you initialized to something that is not consistent with the data) and then gradually increases until it settles down to a “steady state” behavior. Use these plots to help decide how many iterations to run your algorithm to get reliable results (i.e., so that results from different runs look similar) and how many iterations to discard as “burn-in”. Compare your posterior distributions of \(\mu\) and \(\eta\) with the true values you simulated (the distributions should cover the true values unless you did something wrong or are unlucky!).
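For example, one way to run three chains from deliberately poor starting points and overlay their log-posterior traces (again using the simulated x and s from the sketch above):

```r
inits <- c(-5, 0, 5)
runs  <- lapply(inits, function(init)
  mh_sample(x, s, niter = 5000, mu0 = init, eta0 = init))
# overlay log pi(mu^t, eta^t) against iteration t for the three runs
plot(runs[[1]]$logpost, type = "l", xlab = "iteration t",
     ylab = "log posterior", ylim = range(sapply(runs, `[[`, "logpost")))
lines(runs[[2]]$logpost, col = 2)
lines(runs[[3]]$logpost, col = 3)
```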
Repeat part 3 for the “8 schools data” here (omitting the comparisons with the true values, which of course you do not know here).
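As a starting point, the 8 schools estimates and standard errors are commonly quoted as the values below (Rubin, 1981); verify them against the data linked above before relying on them:

```r
# 8 schools: estimated treatment effects and their standard errors
# (values as commonly quoted from Rubin, 1981; check against the linked data)
x_schools <- c(28, 8, -3, 7, -1, 1, 18, 12)
s_schools <- c(15, 10, 16, 11, 9, 11, 10, 18)
res_schools <- mh_sample(x_schools, s_schools, niter = 20000)
```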
Note that the posterior distribution on \(\theta_j\) is given by: \[p(\theta_j | X) = \int p(\theta_j | X, \mu, \eta)\, p(\mu,\eta | X)\, d\mu \, d\eta,\] which is the expectation of \(p(\theta_j | X, \mu, \eta)\) over the posterior \(p(\mu,\eta | X)\). Computing posterior distributions like this is sometimes referred to as “integrating out uncertainty in” \(\mu,\eta\). (You might find it useful to compare this with the EB approach of just plugging in the maximum likelihood estimates and computing \(p(\theta_j | X, \hat{\mu},\hat{\eta})\). Notice that the two will produce similar results if the posterior distribution \(p(\mu,\eta | X)\) is very concentrated around the MLE.) Given \(T\) samples \(\mu^1,\eta^1,\dots,\mu^T, \eta^T\) from the posterior distribution \(p(\mu,\eta | X)\), you can approximate this expectation by \[p(\theta_j | X) \approx (1/T)\sum_t p(\theta_j | X, \mu^t, \eta^t).\] So you can approximate the posterior mean by \[E(\theta_j | X) \approx (1/T)\sum_t E(\theta_j | X, \mu^t, \eta^t).\] Using the same idea, give an expression to approximate the posterior second moment \(E(\theta^2_j | X)\), and so approximate the posterior variance (and hence posterior standard deviation).
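A sketch of this computation, using the fact that \(p(\theta_j | X, \mu, \eta)\) is normal (normal-normal conjugacy), so its mean and variance are available in closed form for each sampled \((\mu^t, \eta^t)\); it assumes the res_schools, x_schools, and s_schools objects from the previous sketch:

```r
post_moments <- function(samples, x, s, burnin = 1000) {
  keep <- samples[-(1:burnin), ]                # discard burn-in iterations
  m1 <- m2 <- rep(0, length(x))
  for (t in seq_len(nrow(keep))) {
    sigma2 <- exp(2 * keep$eta[t])
    v <- 1 / (1 / s^2 + 1 / sigma2)             # Var(theta_j | X, mu^t, eta^t)
    m <- v * (x / s^2 + keep$mu[t] / sigma2)    # E(theta_j | X, mu^t, eta^t)
    m1 <- m1 + m                                # accumulate first moments
    m2 <- m2 + v + m^2                          # E(theta_j^2 | ...) = Var + mean^2
  }
  m1 <- m1 / nrow(keep)
  m2 <- m2 / nrow(keep)
  data.frame(post_mean = m1, post_sd = sqrt(m2 - m1^2))
}
post_moments(res_schools, x_schools, s_schools)
```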
Use the results from parts 4 and 5 to compute an approximate posterior mean and posterior standard deviation for \(\theta_j\) for each school in the 8 schools data. Compare and contrast your results with the EB results, and also with the discussion in the initial blog post here.
sessionInfo()
R version 4.1.0 Patched (2021-07-20 r80657)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.2
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] workflowr_1.7.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 bslib_0.3.1 compiler_4.1.0 pillar_1.7.0
[5] later_1.3.0 git2r_0.30.1 jquerylib_0.1.4 tools_4.1.0
[9] getPass_0.2-2 digest_0.6.29 jsonlite_1.8.0 evaluate_0.15
[13] tibble_3.1.7 lifecycle_1.0.1 pkgconfig_2.0.3 rlang_1.0.2
[17] cli_3.3.0 rstudioapi_0.13 yaml_2.3.5 xfun_0.30
[21] fastmap_1.1.0 httr_1.4.2 stringr_1.4.0 knitr_1.39
[25] sass_0.4.1 fs_1.5.2 vctrs_0.4.1 rprojroot_2.0.3
[29] glue_1.6.2 R6_2.5.1 processx_3.5.3 fansi_1.0.3
[33] rmarkdown_2.14 callr_3.7.0 magrittr_2.0.3 whisker_0.4
[37] ps_1.7.0 promises_1.2.0.1 htmltools_0.5.2 ellipsis_0.3.2
[41] httpuv_1.6.5 utf8_1.2.2 stringi_1.7.6 crayon_1.5.1