First systematic audit pass following the 0.1.0 pre-release. Fourteen
HIGH-severity and seven MED-severity bugs were fixed across the R and
Stata code, plus a new data = argument and a multi-script verification
harness. A 2026-05-01 Windows verification pass uncovered three further
HIGH-severity bugs (Stata weights not forwarded to bandwidth selection;
R rdhte_lincom silently applied multcomp's single-step multiplicity
adjustment; harness orchestrator regex/CSV-staging gaps) which are
included below.
A subsequent expert-review pass on the Stata code on 2026-05-01 PM
uncovered six further HIGH-severity issues, surfaced one previously
silent latent inference bug in the no-covs_hte path, and bumped the
Stata floor from 14 to 16 (frames). All are included in this version.
Stata 16+ now required (was Stata 14+). Driven by the migration
from preserve to frames for the in-program working sample.
data = argument added to rdhte() and rdbwhte(). When supplied,
y, x, covs.hte, covs.eff, weights, cluster, and subset
may be given as bare variable names referring to columns of data.
covs.hte additionally accepts a one-sided formula
(e.g. "~ z1 + z2") whose variables are looked up in data first.
Mirrors the rdrobust 4.0.0 pattern. Implemented via the internal
.rdhte_resolve_data() NSE helper in R/helpers.R.W in no-covs.hte + covs.eff branch. The internal
regression formula referenced W even though no heterogeneity variable
was supplied. rdhte() now uses Y ~ T*Xp + covs (and Xq for the
q-fit). Previously crashed at lm().covs.hte path. Three separate bugs
were fixed: a missing brace pair that left fml.txt undefined for
~-prefixed input; the formula's environment now uses
parent.frame() so caller-frame variables resolve; and the
data.frame(covs.hte = model.matrix(...)) step was rebuilt so the
downstream grep("^covs.hte", ...) finds the correct columns
(previously resulted in "subscript out of bounds" deeper in lm()).
The illustration's covs.hte = "w_left*w_strength" example was
silently affected by these and now produces correct output.is.null(h.r) & is.null(h.r) (twice the same
argument) is now is.null(h.l) & is.null(h.r).h.lev[l] in three locations across
rdhte() and rdbwhte(): the right-side effective-sample-size
count was using the left bandwidth whenever bwselect = "msetwo"
or per-group h.l/h.r were supplied. Now uses h.lev[l, 1] and
h.lev[l, 2].Nh.lev always 0 in continuous-covs.hte auto-bw branch because
h[1]/h[2] was NULL. Now uses h.lev[1,1]/h.lev[1,2].out$h shape inconsistency between branches (vector in
no-covs.hte, matrix in covs.hte). All branches now return a
consistent n.lev × 2 matrix.W.names. = W.names (trailing dot) so
summary.rdhte() could not find x$W.names for the continuous case.
Renamed to W.names.length(h.l) > 1 & length(h.r)
always evaluated TRUE (bare length(h.r) coerces to TRUE when
positive). Now length(h.l) > 1 | length(h.r) > 1 correctly errors
on mismatched lengths.do.call-unsafe NSE. deparse(substitute(covs.hte)) returned
multi-line output when covs.hte was passed as a value rather than a
symbol (e.g. via do.call), causing paste() recycling to inflate
W.names to length-126 and the names(tau.hat.se) <- W.names
assignment to error. The deparse output is now collapsed to one line
and length-capped (>100 chars falls back to the placeholder
"covs.hte").N.lev is now consistently c(N_l, N_r) across all branches; the
categorical-covs.hte branch previously stored table(W.covs),
which made summary.rdhte() show wrong "Number of obs" for any group
count other than two.subset = accepts both logical and integer indices (matches
rdrobust 4.0.0); previously only logical worked.rdbwhte() gained weights = and subset = arguments to match
rdhte()'s feature surface, plus the same string-formula covs.hte
handling.weights = is now forwarded to all internal rdbwselect() calls
(previously dropped silently in three locations).rdbwhte() no-covs.hte branch's covs.cont <- FALSE flipped to
TRUE so the print path matches rdhte().rdbwhte()'s continuous-covs.hte branch now returns Nh as a
1 x 2 matrix (consistent with the factor case) rather than a
length-2 vector.rdhte_lincom(): validate level is in [1, 100) so the common
fraction-vs-percentage mistake (e.g. level = 0.95) errors clearly
rather than silently producing a tiny CI.rdhte_lincom(): column renamed t_stat -> z_stat. The values
were already z-statistics (multcomp's glht uses a Gaussian
approximation by default); the column label is now correct.rdhte_lincom(): removed dead ses variable and a commented-out
se column.rdhte_lincom() p-values were silently multiplicity-adjusted.
summary.glht() defaults to test = adjusted("single-step") whenever
the user passes more than one hypothesis, so the p_value column in
the individual data frame was an FWER-adjusted p-value rather than
the per-hypothesis two-sided z-test the docstring promises (and that
Stata's lincom reports). For >1 hypothesis the adjusted values were
~2x the raw values; the single-hypothesis case was already correct.
Now passes test = adjusted(type = "none"). Point estimates and CI
bounds are unchanged.checkup_rdhte_basic.R was emitting r$N[1]/r$N[2] (full-sample
left/right counts, post-na.omit) for the N_l/N_r cross-language
diff fields, which mismatched Stata's e(tau_N) (effective N within
bandwidth). Now emits r$Nh[1,1]/r$Nh[1,2] so both sides compare
the same quantity.Nh shape inconsistency in the continuous-covs.hte branch.
At rdhte.R:509 the continuous (no W.lev) path returned Nh as a
length-2 vector while every other branch returned a 1 x 2 (or
n.lev x 2) matrix. Downstream code that indexed Nh[i, j]
uniformly -- including the Akhtari/Moreira/Trucco replication
script -- broke on the continuous path. Now matches the rest of the
branches with matrix(c(Nh.lev.l, Nh.lev.r), 1, 2). Values are
unchanged; positional access (Nh[1], Nh[2]) still works.rdhte y x, vce(hc1) errored at regress. Stata's regress
only accepts vce(robust), not vce(hc1). The vce string is now
translated to vce(robust) for the internal regress calls (display
still shows "HC1").rdbwhte ... vce(cluster c) produced a different bandwidth from
the auto-BW inside rdhte ... vce(cluster c) because rdbwhte was
forwarding the user's raw vce string (and forcing vce_select = "hc0"
internally) while rdhte built a proper vce_rdbw local. Both now use
the same vce_rdbw construction.rdbwhte ereturned e(cmd) = "rdhte"; now correctly "rdbwhte".rdhte.ado:712 no longer hard-code 1.96; now use
invnormal(1 - (1 - level/100)/2) so level() is honored.i. prefix on a 0/1 variable now warns when the variable is
constant (all 0 or all 1) and the heterogeneity analysis collapses to
an overall RD ATE. Mirrored across rdhte.ado and rdbwhte.ado.rdhte_lincom: lincom is now wrapped in capture so a malformed
hypothesis errors with a clear message instead of silently picking
up stale r() values from the previous estimation.rdhte_lincom accepts a new level() option (default 95) -- the R
version already had it; the Stata version was using the global
set level only.weights was not forwarded to bandwidth selection. rdhte.ado's
two internal rdbwselect calls (the joint-bandwidth path and the
per-level factor path) did not pass the user's weights() option,
so rdhte ..., weights(w) bwselect(...) silently chose a different
bandwidth from the R-side equivalent. Same shape of bug as the
vce_rdbw cluster fix in this release. Now forwards weights() to
every internal rdbwselect invocation.rdbwhte.ado was missing the weights() option entirely. The
R-side rdbwhte() already accepted weights =; the Stata syntax
line did not declare weights(string) at all, and the three internal
rdbwselect calls (joint-bandwidth, no-covs_hte, per-factor-level)
did not forward it either. Both gaps fixed.checkup_rdhte_stress.do had five if _rc == 0 { di "..."; local ++PASS } lines that triggered Stata's r(198) program error: code follows on the same line as open brace under batch mode. All five
expanded to multi-line. The same script's NA-handling probe
referenced e(N_h_l) + e(N_h_r) (neither saved by rdhte.ado); now
asserts only _rc == 0. Note: rdhte.ado's e(N) is the pre-NA-drop
input row count (line 158), not the post-drop count R's sum(r$N)
reports -- a documented LOW-severity asymmetry left for a follow-up.Eight new top-level scripts under the project root provide a verification gate analogous to rdrobust's:
checkup_rdhte_lm.R -- 40 R-only sanity checks. Compares each
rdhte result to a manual reconstruction from coef.full /
vcov.full.checkup_rdhte_basic.{R,do} -- 14 specs of cross-language baseline
parity (factor / continuous / cluster / weights / kernel / bwselect /
p-q / manual h / bw.joint).checkup_rdhte_extended.{R,do} -- 9 specs combining 3+ non-default
options each, with per-(label, level) emission.checkup_rdhte_bw.{R,do} -- 14 specs of standalone-rdbwhte parity,
including the cluster path that the H10/H11/H12 fixes target
directly.checkup_rdhte_lincom.{R,do} -- 7 hypotheses across 3 specs comparing
R's multcomp::glht to Stata's lincom.checkup_rdhte_vs_rdrobust.R -- 11 R-only checks confirming the
illustration's claim that rdhte reproduces rdrobust under matched
settings (h fixed, rho=1, vce="hc3"). Matches to ~1e-15 on tau.checkup_rdhte_stress.R -- 11 R-only edge cases.checkup_rdhte_stress.do -- 9 Stata-portable stress probes.checkup_rdhte_strformula.R -- 6 R-only checks for the
string-formula covs.hte path with caller-frame variables.run_rdhte_checks.py orchestrates all of the above end-to-end and
reports per-spec |R-Stata|. Marker on success: RUN_RDHTE_CHECKS_OK.
Two orchestrator-level bugs found and fixed during the 2026-05-01 Windows verification pass:
(\w+)=(-?[0-9.eE+\-]+), which did
not tolerate the whitespace that Stata's %15.10f display format
inserts between = and the number. Every Stata float field
(tau_cl=..., h_l=..., etc.) silently parsed as nothing -- only
the unpadded integer fields (N_l=240) got through. The orchestrator
was therefore comparing only N_l/N_r, which under the old
full-sample-vs-effective N semantic produced ~700-unit "diffs" that
masked the real silent-OK pattern. Fixed by adding \s* after =.C:\Temp\rdhte_stata_* scratch dir,
but the .do files reference _rdhte_data.csv by relative path.
run_STATA() now stages _rdhte_*.csv from the project root into
the scratch dir before invoking Stata.A second pass on the Stata code, severities HIGH / MEDIUM / LOW. All HIGH fixed; most MEDIUM applied; LOW partially done.
e(V) bug in the no-covs_hte path
(surfaced by H4). rdhte y x (no heterogeneity) had been
silently returning an e(V) matrix populated with .s --
the no-covs branch never wrote tau_V[1,1]. The
cap ereturn post b V masked the resulting
r(504) matrix has missing values, leaving downstream
test/lincom calls with broken inference. Now correct
(matrix tau_V[1,1] = se^2); the cap is gone.c(sysdir_plus)/r/rdrobust.ado with file open
and tokenized to extract the version. Replaced with
findfile rdrobust.ado (works for any adopath location:
PERSONAL, SITE, PLUS) plus a regex scan of the first
*!...vN.M.K line. Robust to future rdrobust file-layout
changes.vce() parser validation restored. vce(robust foo),
vce(hc1 var), vce(cluster a b), etc., previously had
their validation commented out (and silently mis-parsed).
Now error with r(198) and a clear message; the legal
2-token forms are exactly vce(cluster|hc2|hc3 var).rdhte.ado (tau_h, tau_hat, tau_bc,
tau_se, tau_V, tau_t, tau_pv, tau_N,
tau_h_l/h_r, tau_ci_l/r, CV, plus the b/V ereturn
pair) and 2 in rdbwhte.ado (tau_h, tau_N) are now
tempname'd. They no longer leak into the caller's
namespace and cannot clobber a user matrix of the same name.rdhte_countvars no longer double-counts. The if-elseif
chain wasn't mutually exclusive: any pure
factor#factor term (e.g. i.W#i.Z) incremented BOTH
n_interaction_factor AND n_interaction_continuous.
Restructured so each term routes to exactly one bucket.
Affected the downstream quad flag and the BW-routing
decision for higher-order factor interactions.rdhte_fvexpand declared rclass. Previously the
program had no class declaration and callers relied on
the inner fvexpand call's r() values surviving through
program return -- accidental, undocumented, and brittle to
future Stata. Now rclass, captures r(varlist) /
r(fvops) inside nobreak before any reset, and
explicitly returns them.preserve -> frames (M1). rdhte.ado and
rdbwhte.ado now do all work in an isolated temporary
frame: frame put * if touse, into(workframe); cwf workframe; ...; cwf orig_frame; frame drop workframe.
Avoids the O(N) preserve disk round-trip and only the
touse=1 subset is copied (typically a fraction of the
user's dataset). The user's data and frame state are
never mutated, even on error. Drives the Stata 16+ floor
bump (frames were introduced in Stata 16).e(N_h_l) and e(N_h_r) added (M2). Per-side
bandwidth-effective sample sizes (summed across factor
levels). Mirrors rdrobust's convention. The deeper
post-NA-drop semantic for e(N) across covs_hte / weights
/ cluster missingness is documented as a deferred
follow-up.qui count consolidation (M3). Three qui count
passes collapsed to two (left + right; total via
addition). One fewer O(N) pass per call.version 14.0 -> 16.0 (M8). Required by the frames
refactor. Bumped across all 5 ados; rdhte_countvars.ado
also gained the missing version directive (it had
none before).rdhte.ado (check-it-exists scaffolding for
value labels, dead tau_V covariance attempts, redundant
local k = i inside nested forvalues, etc.) and
rdbwhte.ado (orphan */, value-label scaffolding).e(outcomevar) removed (L2). Was advertised in the
help file but the 2026-04-27 harmonization audit had
already removed it; line 855 had silently reintroduced it.
Users get e(depvar) per Stata convention.rdhte.sthlp Stored results section refreshed (L5).
Removed e(outcomevar); added e(N_h_l) and e(N_h_r);
e(N) description now correctly says "sample size after
listwise deletion on (y, x)" instead of "original number
of observations".covs_hte(W) (no i. prefix) and W has
fewer than 10 unique values, rdhte now emits a
note: recommending i.W for subgroup-CATE estimation
rather than the (probably unintended) slope-coefficient
interpretation.length(subinstr(...)) hash-counting refactor):
arithmetic is correct, just opaque.rdhte.ado; should pull into a
subprogram.e(N) post-NA-drop for covs_hte / weights / cluster:
needs careful covs_hte string parsing to markout factor
variables.Initial release.