9-week build · P1 close-out · P3a canonical scaffold
P1 (bronze staging) closes out as the Phase B macro library finally lands on the per-BU contacts / accounts / opps / sales models. P3a canonical scaffold opened as PR #4083 — 7 DMO replacements (Individual, 3 ContactPoints, Account, Opportunity, Sales_Order) with empty bodies but full upstream ref() declarations. Splink IR (#161) gets its spec next week. Adrien still owns the review queue for 6 open PRs.
Receiving team unchanged: Mike Grabbe + Adrian LeDoux (ET Data Engineering). Rich still in role this week — Tuesday alignment call locked W6 Monday Boston in-person with Vivek + Grabbe.
What landed in W5
- 1. Canonical-model scaffold PR eftours/de-dbt#4083 opened (W5 D2). Full DAG-shape for all 7 DMOs replaced by Snowflake: 4 survivorship-resolved canonicals (int_canonical__individual + 3 ContactPoints) referencing the int_ir__unified_individual_membership placeholder; 3 entity-1:1 canonicals (__account / __opportunity / __sales_order) UNION ALL across BU stg with composite PK. 4 macros landed: source_priority_order (RT1, Rich's 8-source order), oldest_source_value (RT3s first-touch), bool_or_gr (RT2-OR for Global_Rewards flags), age_range_bucket (Rich's new 6-bucket taxonomy). Latent required_tests dict-format project bug fixed in the same PR.
- 2. 4-BU accounts batch PR #4099 (W5 D3). Academy 6-CTE pattern with school / agency / sales-office / campus scoping + RECORD_TYPE filter · HSEY single-source with correlated opps_count + Rich's 9-bucket school_type_normalized · Language 6-CTE 3-branch UNION (US destinations + Home schools / Agents + International destinations) · Student_Tours passthrough. CI 13/13 green after survivorship_ts qualification fix.
- 3. 8-BU opportunities PR #4106 + 8-BU sales_orders PR #4107 (W5 D3). Real ports across every BU. Sales_orders includes Language Juno / Poseidon revenue COALESCE, Academy 35-row Lost_Reason CASE (Rich 2026-04-14), CCAP 17-row unmatch_reason CASE, WJ TRAVEL_V port (no ORDERS_V; gap discovered + flagged in spec). Yml tests relaxed for 3 real-data anomalies (SA opps NULL on opportunity_id, ST opps 48 dupes on sales_op_id, WJ opps 95k dupes on composite-PK design).
- 4. 20 ContactPoint variant scaffolds PR #4110 (W5 D3). Phase A++ scaffold: a Python script extracts legacy DDL column headers and emits cast(null as TYPE) per column, locking the column contract so downstream IR + canonical ContactPoint models can ref() safely. Resolved a business_unit column conflict (legacy BUSINESS_UNIT column collides with Phase B literal; legacy renamed to business_unit_legacy).
- 5. 7 Phase B normalize macros applied end-to-end (W5 D4) → 7 RLT tickets closed. bu_attribution / bu_literal across opps + sales + accounts (RLT-3577 closed) · new stage_category_normalize seed (52 rows · 4 BUs) + macro (RLT-3989 closed) · new business_subunit_normalized with Rich's WJ / ET subunit derivation rules (RLT-3579 closed) · age_range_bucket macro with Rich's 6-bucket taxonomy applied to 7 BU contacts (RLT-3973 closed) · language_iso_normalize applied to Academy + HSEY contacts + HSEY language SF-ID seed populated from 140 IDs (RLT-3975 + RLT-4111 closed) · highest_attained_status_normalize macro + 32-row seed shipped (RLT-3578 closed, field application Phase D).
- 6. Cross-BU QA pass surfaced 4 fixes + RLT-3577 ST EFSA carve-out (W5 D4). Cross-BU QA pushback caught 4 latent bugs as fix-up commits: business_subunit_normalized EFSA carve-out restored on ST opps · HSEY LEAD age_range_bucket wired to compute_age() (was hardcoded NULL despite 983k populated DOBs) · HSEY Contact JOIN to Account for birthdate (Contact birthdate always NULL on HSEY) + <1900 edge-case filter + MMDD adjustment · new compute_age() macro extracted (NULL / future / <1900 / >110 / MMDD legacy-faithful AGE pattern) applied to all 7 BU contacts. ST contacts EFSA carve-out also shipped (was hardcoded 'ET' with TODO); DEV validation 125,896 EFSA / 5,663,269 ET / 5.79M total matches CONTACTS_V row count.
- 7. Language EU stg source gap → spec #321 (W5 D4). Cross-BU language audit triggered by the pushback "language is not populated for other products": stg_language__contacts currently reads JUNO_CONTACT_LATEST (33M, no LANGUAGE column) instead of legacy's UNION of JUNO_ACCOUNT_PA (32M / 99.9% filled) + JUNO_LEAD (26M). Same indirect gap on RLT-3973 (age computed on wrong source). Issue #321 + spec specs/build/321-language-eu-stg-source-restructure.md committed.
- 8. Spec 175a RLT-3972 reclassification (W5 D5). Business_Unit_Name__c reclassified RT4 (MODE) → RT3s (Oldest-Source) + renamed to Originating_Business_Unit_Name__c per RLT-3972. Spec changes: line 168 moved from RT4 to RT3s table · line 406 SQL updated · column inventory renamed · header history gains 2026-05-29 entry · "Main shifts" gains item #7. Jira RLT-3972 + GH #296 comments documenting the reclassification + 8-step gating chain to closure.
- 9. 2 sibling CCAP PRs + S3 bucket doc + Rich alignment call (W5 D2). PR #4090 = 3 Google_Analytics_* passthrough columns on stg_ccap__contacts per Rich's "derive in dbt" decision · PR #4091 = CCAP accounts Phase C real port (49 cols, host-family record type filter). Plus docs/integration/sfmc-s3-bucket-setup.md for Grabbe (271 lines, English, single-bucket two-prefix architecture, IAM templates, cost estimate). Tuesday Rich call locked in-person Mon 2026-06-01 1pm EDT at EF Boston with Vivek + Grabbe.
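The ContactPoint scaffold step (read legacy DDL column headers, emit one cast(null as TYPE) per column, rename colliding columns) can be pictured with a minimal Python sketch. This is not the script from PR #4110: the DDL text, the view name, the column list, and the helper names parse_columns and render_scaffold are all hypothetical, and only the business_unit → business_unit_legacy rename comes from this report.

```python
import re

# Hypothetical legacy DDL. The real view names, columns, and types are not
# shown in this report; they are invented here for illustration only.
LEGACY_DDL = """
create or replace view CONTACT_POINT_EMAIL_V (
    CONTACT_ID VARCHAR(18),
    EMAIL_ADDRESS VARCHAR(255),
    BUSINESS_UNIT VARCHAR(50),
    IS_PRIMARY BOOLEAN,
    CREATED_AT TIMESTAMP_NTZ(9)
);
"""

# Rename taken from the report: the legacy BUSINESS_UNIT column collides
# with the Phase B literal, so the legacy column gets a suffix.
RENAMES = {"business_unit": "business_unit_legacy"}

# Matches lines of the form "NAME TYPE" or "NAME TYPE(args)", with an
# optional trailing comma. Lines such as "create or replace view ..." or
# ");" do not match and are skipped.
COLUMN_RE = re.compile(
    r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s+([A-Za-z_]+(?:\([0-9, ]+\))?)\s*,?\s*$"
)


def parse_columns(ddl):
    """Return (column_name, sql_type) pairs, lower-cased, in DDL order."""
    columns = []
    for line in ddl.splitlines():
        match = COLUMN_RE.match(line)
        if match:
            columns.append((match.group(1).lower(), match.group(2).lower()))
    return columns


def render_scaffold(columns):
    """Emit a select of typed NULLs that fixes the column contract."""
    select_lines = []
    for name, sql_type in columns:
        out_name = RENAMES.get(name, name)
        select_lines.append(f"    cast(null as {sql_type}) as {out_name}")
    return "select\n" + ",\n".join(select_lines)


columns = parse_columns(LEGACY_DDL)
scaffold_sql = render_scaffold(columns)
print(scaffold_sql)
```

Because every column is a typed NULL, downstream models can ref() the scaffold and compile against stable names and types before the real port lands.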
Blocked-on-others
PRs #4059 (8-BU contacts), #4083 (canonical scaffold), #4099 (4-BU accounts), #4106 (8-BU opportunities), #4107 (8-BU sales_orders + 2 school_accounts), #4110 (20 CP variants). Most CI green with bot triage clean. Re-pings sent Friday on #4059 + #4083.
RLT-3972 reclassification (RT4 → RT3s + rename) needs Rich sign-off before #175a canonical body materializes the field. RLT-3991 5 HSEY closed-lost codes (CN / CY / EF / APP / INV → Other) pending Rich confirmation. RLT-3990 NBSP fix in seed pending Rich confirmation post-merge.
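To show why the RT4 → RT3s change matters for sign-off, here is a small Python sketch contrasting the two survivorship rules on one unified individual. The sample rows, the null handling, and the MODE tiebreak are assumptions made for illustration; the actual rules live in spec 175a and the oldest_source_value macro.

```python
from collections import Counter
from datetime import date

# Hypothetical source rows for ONE unified individual:
# (business unit name, source record created date).
rows = [
    ("Language", date(2019, 3, 1)),
    ("Student_Tours", date(2021, 6, 15)),
    ("Student_Tours", date(2022, 1, 10)),
]


def mode_value(rows):
    """RT4-style: most frequent non-null value.

    Assumed tiebreak: the value seen earliest wins.
    """
    values = [(v, d) for v, d in rows if v is not None]
    if not values:
        return None
    counts = Counter(v for v, _ in values)
    first_seen = {}
    for v, d in values:
        first_seen[v] = min(d, first_seen.get(v, d))
    return min(counts, key=lambda v: (-counts[v], first_seen[v]))


def oldest_source_value(rows):
    """RT3s-style: value carried by the earliest-created source record."""
    values = [(d, v) for v, d in rows if v is not None]
    return min(values)[1] if values else None


print("RT4 (MODE):", mode_value(rows))                    # Student_Tours
print("RT3s (oldest source):", oldest_source_value(rows))  # Language
```

On this sample the two rules disagree: MODE reports the most frequent BU, while oldest-source reports the BU that first acquired the individual, which is what the Originating_ prefix in the renamed field describes.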
Vivek call still gates the segments methodology decision (MC-primary vs Snowflake fallback). Sending-domain SF-side change is Vivek-owned (2-4 wk lead time). In-person scheduled W6 Mon 1pm EDT at EF Boston with Vivek + Grabbe + Rich.
$25 k NTE · spent vs remaining
NTE consumption
69.4% consumed end of W5. The 80% threshold ($20,000) will be crossed during W6 — per CLAUDE.md rule, re-confirmation with Rich + Mike + Adrian is triggered at that point. Pacing: Splink IR (#161) + canonical body materialization + parity QA fit within remaining headroom if no unplanned scope additions.
Scope reality vs $25 k NTE
| Bucket | SOW low (h) | Actual h | Actual $ | Status |
|---|---|---|---|---|
| P0 · M1 Foundation (W1) | 16 | 14.0 | $2,040 | Done |
| P1 · dbt #1 normalize + replication (W2-5) | 25 | 102.5 | $14,820 | Closing · 410% h · 593% $ |
| P2 · Splink IR (W6) | 37 | 0 | $0 | W6 kickoff |
| P3 · Canonical scaffold + CIs (W5-6) | 37 | 5.0 | $700 | Scaffold landed · bodies pending |
| P4 · SFMC + Marts (W6-7) | 56 | 0 | $0 | Upcoming |
| P5 · Cutover (W8) | 14 | 0 | $0 | Upcoming |
| P6 · Handoff + Hypercare (W9 + Jul) | 18 | 0 | $0 | Upcoming |
Trace: P1 absorbed the bulk of the over-run as the bronze-layer scope expanded materially (vanilla audit + 8 BU real ports + 20 CP variant scaffolds + Phase B macro library + cross-BU QA + #321 remediation). With ~$7.7 k headroom and P2 + P3 + P4 + P5 + P6 all ahead, the working assumption is Splink IR + canonical body + cutover fits if scope holds. Re-confirmation point at $20 k (~W6 mid-week).
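The budget figures above can be re-derived with a few lines of Python. The NTE, the 69.4% consumption, the 80% threshold, and the P1 hours and dollars come from this report; the $100/h SOW rate is an inference that happens to reproduce the 593% figure and is not stated anywhere in the report.

```python
NTE_USD = 25_000       # not-to-exceed budget (from the report)
CONSUMED_PCT = 69.4    # consumption at end of W5 (from the report)
THRESHOLD_PCT = 80     # re-confirmation trigger (from the report)

consumed_usd = round(NTE_USD * CONSUMED_PCT / 100)
threshold_usd = NTE_USD * THRESHOLD_PCT // 100
headroom_usd = NTE_USD - consumed_usd
to_threshold_usd = threshold_usd - consumed_usd

print(f"consumed      ${consumed_usd:,}")      # $17,350
print(f"headroom      ${headroom_usd:,}")      # $7,650, i.e. ~$7.7 k
print(f"threshold     ${threshold_usd:,}")     # $20,000
print(f"to threshold  ${to_threshold_usd:,}")  # $2,650

# P1 over-run ratios from the scope table.
P1_SOW_HOURS = 25
P1_ACTUAL_HOURS = 102.5
P1_ACTUAL_USD = 14_820
ASSUMED_SOW_RATE_USD = 100  # ASSUMPTION: inferred, not stated in the report

hours_ratio_pct = round(100 * P1_ACTUAL_HOURS / P1_SOW_HOURS)
dollars_ratio_pct = round(
    100 * P1_ACTUAL_USD / (P1_SOW_HOURS * ASSUMED_SOW_RATE_USD)
)
print(f"P1 hours  {hours_ratio_pct}%")    # 410%
print(f"P1 dollars {dollars_ratio_pct}%")  # 593%
```

With roughly $2,650 left before the $20 k mark, the re-confirmation point falling inside W6 follows directly from the numbers.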
W6 (1 Jun – 5 Jun) · Splink IR kickoff + Boston in-person
- Splink IR pipeline (#161) · Phase 1 normalize macros + scaffold + spec finalized · Phase 2 settings + predict + clusters · Phase 3 contacts_normalized real-source activation
- EF_SPLINK_WH provisioning (#160) · Snowpark-optimized warehouse for Splink workload
- Parity validation framework · Python harness for systematic ±0.1% checks across all stg ports
- #158 close-out · pre-staged parity queries ready to fire as PRs merge
- Adrien-queue draining · 6 PRs through review + merge → unblocks canonical bodies
- Boston in-person Monday · Rich canonical sync + Vivek SFMC strategy
- Adrien · drain the 6-PR review queue (re-pinged Friday on #4059 + #4083)
- Rich · ratify RLT-3972 reclassification · confirm RLT-3990 NBSP + RLT-3991 5 codes
- Vivek · attend Mon 1pm EDT Boston SFMC strategy session · kick off sending-domain
- Grabbe · S3 bucket setup per docs/integration/sfmc-s3-bucket-setup.md · sanity check first 4 weeks of replication credits
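For the parity validation framework in the list above, the core ±0.1% check could look like the following Python sketch. The names ParityResult and check_parity and the model names are hypothetical, and the row counts are hardcoded here, whereas a real harness would fetch them from Snowflake. The first example reuses the ST contacts counts reported in W5, assuming CONTACTS_V equals the stated EFSA + ET total; the second uses invented numbers.

```python
from dataclasses import dataclass

TOLERANCE = 0.001  # ±0.1% relative difference, per the W6 plan


@dataclass
class ParityResult:
    model: str
    legacy_rows: int
    dbt_rows: int
    rel_diff: float
    passed: bool


def check_parity(model, legacy_rows, dbt_rows, tolerance=TOLERANCE):
    """Compare a dbt port's row count with its legacy view's row count."""
    if legacy_rows == 0:
        # An empty legacy view only matches an empty port.
        rel_diff = 0.0 if dbt_rows == 0 else float("inf")
    else:
        rel_diff = abs(dbt_rows - legacy_rows) / legacy_rows
    return ParityResult(model, legacy_rows, dbt_rows, rel_diff, rel_diff <= tolerance)


results = [
    # ST contacts: 125,896 EFSA + 5,663,269 ET, reported as matching CONTACTS_V.
    check_parity("st_contacts (hypothetical name)", 5_789_165, 125_896 + 5_663_269),
    # Invented counts that miss the tolerance (0.15% off).
    check_parity("example_port (hypothetical)", 1_000_000, 998_500),
]

for r in results:
    status = "PASS" if r.passed else "FAIL"
    print(f"{status}  {r.model}: legacy={r.legacy_rows:,} dbt={r.dbt_rows:,} "
          f"diff={r.rel_diff:.4%}")
```

Row counts are only the cheapest parity signal; the same comparison shape would extend to per-column null rates or sums if the harness needs them.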
Where to dig deeper
- specs/build/321-language-eu-stg-source-restructure.md: Language EU stg source remediation
- specs/build/175a-canonical-survivorship.md: v1 RLT-3972 reclassification (commit c87e6be)
- docs/integration/sfmc-s3-bucket-setup.md: S3 setup for Grabbe (271 lines)
- docs/sessions/2026-05-26-rich-alignment-call.md: Tuesday Rich + Boston in-person planning
- eftours/de-dbt#4083: canonical scaffold (7 DMOs, 4 macros)
- eftours/de-dbt#4099: 4-BU accounts
- eftours/de-dbt#4106: 8-BU opportunities + Phase B macros
- eftours/de-dbt#4107: 8-BU sales_orders + 2 school_accounts + Phase B macros
- eftours/de-dbt#4110: 20 ContactPoint variants scaffold
- eftours/de-dbt#4090 · #4091: CCAP Google_Analytics passthrough + Accounts port
- eftours/de-dbt#4059: 7-BU contacts Phase C (Adrien re-pinged)