`re_formula = NA` or `NULL`: some inconsistency between modelbased and brms

Hello everyone! 
I think I might have found an inconsistency between _modelbased_'s (`estimate_means`) and _brms_'s `conditional_effects` when it comes to computing the EMMs with `re_formula`. In particular, this regards the difference between conditional and marginal effects.

**Brief intro**:
I found [this post by Andrew Heiss](https://www.andrewheiss.com/blog/2022/11/29/conditional-marginal-marginaleffects/) very clear when it comes to describing the marginal (`re_formula = NULL`, default in `modelbased`) vs conditional (`re_formula = NA`) effects.

Here's a general definition:
- **Conditional effect** = `re_formula = NA` > random offsets set to 0
- **Marginal effect** = `re_formula = NULL`: random offsets  are incorporated into the estimate

This definition seems to be shared elsewhere. In several blogs, they say that `re_formula = NULL` should increase the uncertainty in the predictions due to the inclusion of the random effects, whereas with `re_formula = NA`, the predictions are based only on fixed effects and the estimated error of those parameters. [Here](https://groups.google.com/g/brms-users/c/FI8vZLeFD4Y), Paul Buerkner is saying this. In the `brms` manual, it says the same thing: 

> `re_formula`: A formula containing group-level effects to be considered in the conditional predictions. If `NULL`, include all group-level effects; if `NA` (default), include no group-level effects

However, in _modelbased_'s EMMs, the Credible Intervals with NA are way larger than those with `NULL`. But shouldn't it be the other way around? If `NULL` includes all the uncertainty coming from the random part of the model, shouldn't the CIs be larger? I confess I'm a bit confused.

Another user reported [a similar doubt](https://discourse.mc-stan.org/t/should-i-report-the-results-with-random-effects-or-not-brms-re-formula-null-vs-na/16280/4), but the replies are still a bit confusing to me. I noticed everyone suggests using `NA` as default (`brms` too), but in _modelbased_ `NULL` seems to be the default, so I wondered why.

I'm attaching here three figures derived by three models (reproducible code below):

- scale(Reaction) ~ Days + (1+Days|Subject), data=sleepstudy
<img width="1815" height="1370" alt="Image" src="https://github.com/user-attachments/assets/592c1520-a906-4e04-a70a-7dd687e22a64" />

- scale(height) ~ Occasion + (1+Occasion|Subject), data = Oxboys
<img width="1815" height="1370" alt="Image" src="https://github.com/user-attachments/assets/0e0ff1c0-4e60-438e-b593-a96e2b17c359" />

What I noticed is that the estimates seem more similar with `NA` (still not identical, though), but they vary enormously with `NULL`. Compared to `NA`, _modelbased_'s `NULL` decreases the variability, whereas brms' conditional_effects increases it. _Brms_'s results seem more logical to me though; cause, again, it makes sense that if you add the variability coming from the RE, the uncertainty should be larger. However, to complicate things even more, the `NULL` estimates with brms give the impression that nothing is "significant", whereas in fact everything is!

- Finally, this one comes from a more complex model with one random intercept and three random slopes (1 + x + y + z | participant). Here the discrepancies seem even larger.
<img width="1830" height="1197" alt="Image" src="https://github.com/user-attachments/assets/6d345fa8-49c5-42ca-a389-425d44c04483" />


Here's a reproducible code for the first two figures. I hope it's helpful for you to get to the bottom of this XD
Also, I apologise if I made some mistakes in describing what I found. Maybe it's just me not getting it.

```
### Marginal / Conditional Effects in B-GLMMs (brms & modelbased) ###

library(pacman)
pacman::p_load(brms, modelbased, lme4)

#### Sleepstudy dataset (slope) ####
data("sleepstudy", package = "lme4")

m <- brm(scale(Reaction) ~ Days + (1+Days|Subject),
         family = gaussian(),
         data = sleepstudy,
         prior = c(
           set_prior("normal(0,1)", class = "Intercept"),
           set_prior("normal(0,1)", class = "b"),
           set_prior("exponential(1)", class = "sd"),
           set_prior("exponential(1)", class = "sigma")
         ),
         cores= 4,
         control = list(adapt_delta = 0.95),
         save_pars = save_pars(all = TRUE), 
         backend = "cmdstanr",
         seed = 1234)

## Marginal effect ##
emm_marginal <- estimate_means(m, by="Days", centrality="mean", ci=.95, ci_method="hdi", re_formula=NULL)

## Conditional effect ##
emm_conditional <- estimate_means(m, by="Days", centrality="mean", ci=.95, ci_method="hdi", re_formula=NA)

ce1 <- conditional_effects(m, effects = "Days", re_formula = NULL)
ce2 <- conditional_effects(m, effects = "Days", re_formula = NA)

p1 <- plot(ce1, plot = FALSE)[[1]]
p2 <- plot(ce2, plot = FALSE)[[1]]

# Merge all graphs
plot(emm_marginal)+labs(title="NULL | modelbased", y="Reaction")+
  plot(emm_conditional)+labs(title="NA | modelbased", y="Reaction")+
  p1 + labs(title="NULL | brms conditional_effects", y="Reaction")+
  p2 + labs(title="NA | brms conditional_effects", y="Reaction")&
  coord_cartesian(ylim=c(-2.5,3))&
  theme_bw()


#### Oxboys dataset (mean difference) ####
data("Oxboys", package = "nlme")
Oxboys <- subset(Oxboys, Occasion <= 4)
Oxboys$Occasion <- factor(Oxboys$Occasion, ordered = FALSE)

m3 <- brm(scale(height) ~ Occasion + (1+Occasion|Subject),
          family = gaussian(),
          data = Oxboys,
          prior = NULL,
          cores= 4,
          control = list(adapt_delta = 0.95),
          save_pars = save_pars(all = TRUE), 
          backend = "cmdstanr",
          seed = 1234)

## Marginal effect ##
emm_marginal3 <- estimate_means(m3, by="Occasion", centrality="mean", ci=.95, ci_method="hdi", re_formula=NULL)

## Conditional effect ##
emm_conditional3 <- estimate_means(m3, by="Occasion", centrality="mean", ci=.95, ci_method="hdi", re_formula=NA)

ce13 <- conditional_effects(m3, effects = "Occasion", re_formula = NULL)
ce23 <- conditional_effects(m3, effects = "Occasion", re_formula = NA)

p13 <- plot(ce13, plot = FALSE)[[1]]
p23 <- plot(ce23, plot = FALSE)[[1]]

# Merge all graphs
plot(emm_marginal3)+labs(title="NULL | modelbased", y="Height (z-score)")+
  plot(emm_conditional3)+labs(title="NA | modelbased", y="Height (z-score)")+
  p13 + labs(title="NULL | brms conditional_effects", y="Height (z-score)")+
  p23 + labs(title="NA | brms conditional_effects", y="Height (z-score)")&
  coord_cartesian(ylim=c(-3,3.5))&
  theme_bw()

```



 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`re_formula = NA` or `NULL`: some inconsistency between modelbased and brms #579

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

re_formula = NA or NULL: some inconsistency between modelbased and brms #579

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

`re_formula = NA` or `NULL`: some inconsistency between modelbased and brms #579