# Replication Guide

This guide walks through how to rebuild the figures, tables, and report
outputs in this package from the underlying data.

If you only want to **read** the outputs, skip this guide — the rendered
PDFs and HTML are stand-alone in `1-Presentation/` and the section
folders. This guide is for someone who wants to re-run the
analysis, change a parameter, or extend the dataset.

---

## What you need

| Tool | Version | Why |
|---|---|---|
| **R** | 4.3 or later | All analytical pipelines |
| **Quarto** | 1.4 or later | Renders the report (HTML) and the deck (reveal.js) |
| **TeX Live** (or MacTeX / TinyTeX) | 2023 or later | Compiles the Beamer slides (`.tex` → `.pdf`) |
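
A quick way to confirm the tools in the table are present is a small preflight script. The sketch below is illustrative only: the helper names `version_ge` and `check_tool` are made up for this guide, the version comparison relies on `sort -V` (available in GNU coreutils and recent BSD/macOS, but not guaranteed everywhere), and it assumes `R --version` prints the version number as the third word of its first line.

```bash
#!/usr/bin/env bash
# version_ge A B: succeeds when version A >= version B (uses `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# check_tool <name> <minimum> <installed-version-or-empty>
check_tool() {
  if [ -z "$3" ]; then
    echo "$1: not found on PATH"
  elif version_ge "$3" "$2"; then
    echo "$1: $3 (ok, need >= $2)"
  else
    echo "$1: $3 (too old, need >= $2)"
  fi
}

# Assumption: the first line of `R --version` looks like "R version 4.3.1 (...)".
r_ver=$(command -v R >/dev/null 2>&1 && R --version | head -n1 | awk '{print $3}' || true)
q_ver=$(command -v quarto >/dev/null 2>&1 && quarto --version || true)

check_tool R 4.3 "$r_ver"
check_tool Quarto 1.4 "$q_ver"

# TeX Live: only presence is checked here, not the release year.
command -v pdflatex >/dev/null 2>&1 || echo "pdflatex: not found on PATH"
```

The script only reports; it does not install or change anything.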

R packages used across the four sections:

- **Core**: `here`, `dplyr`, `tidyr`, `readr`, `readxl`, `stringr`, `purrr`
- **Plotting**: `ggplot2`, `scales`, `forcats`
- **Maps** (Section 1 engagement map): `sf`, `leaflet`, `rnaturalearth`,
  `rnaturalearthdata`, `htmlwidgets`
- **Frontier analysis** (Section 4 annex): `frontier`, `sfaR`
- **Data fetching** (only if extending the dataset): `wbstats`, `httr2`,
  `Rilostat`

Install them from an R session with `install.packages(c("here", "dplyr", ...))`,
listing the packages you need from the groups above.
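
To avoid typing the list by hand, the package names above can be assembled into one R expression from the shell. This is a minimal sketch, not part of the original repository: the variable names are illustrative, the CRAN mirror `https://cloud.r-project.org` is one possible choice, and the expression installs only packages that are not already present.

```bash
#!/usr/bin/env bash
# Packages named in the lists above (core, plotting, maps, frontier analysis).
PKGS="here dplyr tidyr readr readxl stringr purrr \
ggplot2 scales forcats \
sf leaflet rnaturalearth rnaturalearthdata htmlwidgets \
frontier sfaR"
# Add these only if you plan to extend the dataset:
# PKGS="$PKGS wbstats httr2 Rilostat"

# Turn the space-separated list into an R character vector: c("here", "dplyr", ...)
quoted=$(printf '"%s", ' $PKGS)
r_vector="c(${quoted%, })"

# R expression that installs whatever is still missing.
install_expr="pkgs <- $r_vector; to_install <- setdiff(pkgs, rownames(installed.packages())); if (length(to_install)) install.packages(to_install, repos = 'https://cloud.r-project.org')"

# Print the expression for review; uncomment the last line to run it.
printf '%s\n' "$install_expr"
# Rscript -e "$install_expr"
```

Printing first and running second is deliberate: `sf` in particular may need system libraries, so it helps to see what will be installed before starting.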

---

## Folder layout for re-running

The R scripts expect the **original repository structure**, where the
project root contains:

```
<project_root>/
├── data/raw/               ← input CSVs and XLSX
├── data/processed/         ← processed CSVs
├── segment_1_tax_advisory_history/
│   ├── src/R/              ← scripts
│   └── output/             ← figures, tables, RDS cache
├── segment_2_taxation_trends/
├── segment_3_inequality/
└── segment_4_tax_mix_future/
```

This share package is reorganised for reading. If you want to **run** the
code, recreate the original layout:

1. Make a new folder `eap_tax_strategy/`.
2. Inside it, create `data/raw/`, `data/processed/`, and the segment
   folders shown above.
3. Copy the files in `3-Data/` into `data/raw/` or `data/processed/`;
   the file header of each script shows which of the two folders it
   reads from.
4. Copy each section's `code/` subfolder into the matching segment's
   `src/R/`.
5. Start R (or call `Rscript`) from the new project root and run the
   pipelines described under "Re-running the four sections".
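
Steps 1 and 2 above can be scripted. The sketch below assumes every segment uses the same `src/R/` and `output/` layout shown for segment 1, and it also drops an empty `.here` marker at the root so that `here::here()` resolves correctly outside Git. The `ROOT` variable is introduced here for illustration.

```bash
#!/usr/bin/env bash
# Recreate the folder skeleton the R scripts expect (steps 1 and 2 above).
# ROOT defaults to ./eap_tax_strategy; override it to build elsewhere.
ROOT="${ROOT:-eap_tax_strategy}"

mkdir -p "$ROOT/data/raw" "$ROOT/data/processed"

for seg in segment_1_tax_advisory_history \
           segment_2_taxation_trends \
           segment_3_inequality \
           segment_4_tax_mix_future; do
  # Assumption: every segment mirrors the src/R + output layout of segment 1.
  mkdir -p "$ROOT/$seg/src/R" "$ROOT/$seg/output"
done

# Empty marker so here::here() finds this folder even without .git or .Rproj.
touch "$ROOT/.here"

# Steps 3 and 4 (copying data and code) stay manual: which files belong in
# data/raw/ versus data/processed/ depends on the script headers.
```

`mkdir -p` makes the script safe to re-run; existing folders are left untouched.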

Alternative: clone the original Git repository at
<https://github.com/panosni/eap_tax_strategy>. The structure there is
ready to run.

---

## Re-running the four sections

Each section has its own pipeline. Run them in order — Sections 2 and 4
depend on Section 1's outputs.

### Section 1 — Advisory History

```bash
Rscript segment_1_tax_advisory_history/src/R/00_run_all.R
```

This runs `01_load.R` → `02_clean.R` → `03_analyze.R` → `04_tables.R` →
`05_figures.R`, then `06_current_engagements.R` (FY2025 engagement map)
and `07_regional_comparison.R` (cross-region DPF comparator).

Outputs land in `segment_1_tax_advisory_history/output/{figures, tables, rds}/`.

### Section 2 — Revenue Trajectories

```bash
Rscript segment_2_taxation_trends/src/R/01_load.R
Rscript segment_2_taxation_trends/src/R/02_clean.R
Rscript segment_2_taxation_trends/src/R/03_analyze.R
Rscript segment_2_taxation_trends/src/R/06_figures.R
Rscript segment_2_taxation_trends/src/R/08_alignment_revisit.R
```

The numbered scripts under `04_*` (econometrics, event-study diagnostics,
synthetic control) are exploratory and not used in the headline outputs.

### Section 3 — Inequality

```bash
Rscript segment_3_inequality/src/R/01_load.R
Rscript segment_3_inequality/src/R/02_clean.R
Rscript segment_3_inequality/src/R/03_analyze.R
Rscript segment_3_inequality/src/R/04_tables.R
Rscript segment_3_inequality/src/R/05_figures.R
Rscript segment_3_inequality/src/R/06_post_tax.R
Rscript segment_3_inequality/src/R/07_figures_part2.R
```

Note the **Pacific Inequality Gap** caveat (`5-Notes/Pacific_Inequality_Gap.md`):
WID does not publish country-specific inequality data for 7 of the 17
EAP economies, so the country-level analysis covers only the remaining
10 (EAP-10).

### Section 4 — Structural Indicators

```bash
Rscript segment_4_tax_mix_future/src/R/01_load.R
Rscript segment_4_tax_mix_future/src/R/02_clean.R
Rscript segment_4_tax_mix_future/src/R/03_structural_readiness.R
Rscript segment_4_tax_mix_future/src/R/04_sfa.R           # exploratory annex
Rscript segment_4_tax_mix_future/src/R/05_scenarios.R     # exploratory annex
Rscript segment_4_tax_mix_future/src/R/06_priorities.R    # exploratory annex
Rscript segment_4_tax_mix_future/src/R/07_synthesis.R
Rscript segment_4_tax_mix_future/src/R/08_tables.R
Rscript segment_4_tax_mix_future/src/R/09_figures.R
```

`03_structural_readiness.R` produces the headline combined-score
composite. Scripts `04`–`06` are exploratory (frontier analysis, 2030
scenarios, country priorities) — kept for transparency but not promoted
to the main narrative.
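
The four pipelines above can be chained in a single wrapper that stops at the first failing script. This is a hedged sketch rather than a file from the repository: `run_segment` and `DRY_RUN` are names invented here, and the script lists are copied from the commands in this guide. By default it only prints what it would run.

```bash
#!/usr/bin/env bash
# DRY_RUN=1 (the default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# run_segment <segment_dir> <script>...
# Runs the scripts in order from <segment_dir>/src/R/ and returns non-zero
# as soon as one of them fails.
run_segment() {
  local seg="$1"; shift
  local script
  for script in "$@"; do
    echo "==> Rscript $seg/src/R/$script"
    if [ "$DRY_RUN" = "0" ]; then
      Rscript "$seg/src/R/$script" || return 1
    fi
  done
}

# Section 1 first: Sections 2 and 4 depend on its outputs.
run_segment segment_1_tax_advisory_history 00_run_all.R &&
run_segment segment_2_taxation_trends \
  01_load.R 02_clean.R 03_analyze.R 06_figures.R 08_alignment_revisit.R &&
run_segment segment_3_inequality \
  01_load.R 02_clean.R 03_analyze.R 04_tables.R 05_figures.R \
  06_post_tax.R 07_figures_part2.R &&
run_segment segment_4_tax_mix_future \
  01_load.R 02_clean.R 03_structural_readiness.R 04_sfa.R 05_scenarios.R \
  06_priorities.R 07_synthesis.R 08_tables.R 09_figures.R
```

Saved at the project root under a name of your choice (for example `run_all.sh`), it would be invoked as `DRY_RUN=0 bash run_all.sh` once the printed plan looks right.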

---

## Re-rendering the deck

### Deck — reveal.js HTML

```bash
quarto render presentations/eap_tax_strategy_deck.qmd --to revealjs
```

Output: `presentations/eap_tax_strategy_deck.html` (self-contained
slideshow, opens in any browser).

### Deck — Beamer PDF

Two flavours:

```bash
# With \pause animations (for live presenting)
pdflatex presentations/eap_tax_strategy_deck.tex

# No animations (for printing / sharing)
pdflatex presentations/eap_tax_strategy_deck_handout.tex
```

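The handout `.tex` is identical to the deck `.tex` except for one option:
`\documentclass[8pt,handout]{beamer}` collapses every `\pause` overlay
onto a single page. If slide numbers or navigation look wrong after the
first pass, run the same command again; Beamer reads them from the `.aux`
file written by the previous run.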
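
Because the two sources are meant to differ by a single line, it is easy to check that they have not drifted apart after an edit. The sketch below assumes the deck's class line reads exactly `\documentclass[8pt]{beamer}` (the guide only shows the handout variant), and `handout_in_sync` is a helper name made up for this example.

```bash
#!/usr/bin/env bash
# handout_in_sync <deck.tex> <handout.tex>
# Succeeds when adding the `handout` option to the deck's \documentclass line
# reproduces the handout file exactly; otherwise prints the diff and fails.
# Assumption: the deck's class line is exactly \documentclass[8pt]{beamer}.
handout_in_sync() {
  sed 's/^\\documentclass\[8pt]{beamer}/\\documentclass[8pt,handout]{beamer}/' "$1" \
    | diff - "$2"
}

deck=presentations/eap_tax_strategy_deck.tex
handout=presentations/eap_tax_strategy_deck_handout.tex

# Only run the check when both files are present.
if [ -f "$deck" ] && [ -f "$handout" ]; then
  handout_in_sync "$deck" "$handout" && echo "handout matches deck"
fi
```

Any output from `diff` points at lines that were changed in one file but not the other.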

---

## Datasets

| File | Description | Source |
|---|---|---|
| `eap_tax_master_dataset.csv` | 125 advisory recommendations, hand-coded. The canonical full record. | Compiled from public WB documents (DPOs, PFRs, PERs) |
| `eap_tax_2015_2025.csv` | 105-row processed subset, FY2015–FY2025. **The figures in this package use this subset.** | Filtered from the master |
| `DPAD_database_up_to_FY24.xlsx` | Cross-region comparator of tax-themed DPF prior actions across all six WB regions. | WB OPCS Development Policy Action Database |
| `wb_current_engagements_2026.csv` | FY2025 active engagement pipeline (P18). | Internal WB engagement pipeline |

The provenance YAML files for fetched external data live in the original
repo at `data/raw/external/provenance/`. They are not duplicated here.

---

## If something doesn't work

1. **Path errors**: the R scripts use `here::here()`, which locates the
   project root by searching upward from the working directory for a
   marker such as a `.here` file, a `.Rproj` file, or a `.git` directory.
   If you copied the code out of the original repo without those
   markers, create an empty `.here` file at your new project root, or
   start R from the project root so that the working-directory fallback
   points at the right folder.
2. **Missing packages** — if `library(X)` fails, install with
   `install.packages("X")`.
3. **Quarto won't render** — check `quarto --version` is 1.4+. Older
   versions don't support some of the YAML features used.
4. **Beamer compilation fails** — make sure the Metropolis theme is
   installed: `tlmgr install beamertheme-metropolis`.
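
For item 1, it can help to see which folder a marker search would land on from a given script directory. The function below is a rough shell analogue of that upward search, written for this guide; it is not the `here` package's implementation, and it checks only the three markers named above (the real package recognises more).

```bash
#!/usr/bin/env bash
# find_root [start_dir]
# Walks upward from start_dir (default: current directory) and prints the
# first folder containing a .here file, a .git directory, or a *.Rproj file.
# Returns non-zero if the filesystem root is reached without a match.
find_root() {
  local dir
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -e "$dir/.here" ] || [ -d "$dir/.git" ] \
       || ls "$dir"/*.Rproj >/dev/null 2>&1; then
      echo "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && return 1
    dir=$(dirname "$dir")
  done
}
```

For example, `find_root segment_1_tax_advisory_history/src/R` should print your project root; if it prints some other folder, or nothing at all, the marker file is missing or misplaced.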
