EF
Status Report · Week 5 of 9

Phase B macros landed across the staging surface · canonical scaffold opened · 7 normalize tickets closed

Memorial Day Monday off; the rest of the week shipped the back half of the bronze layer in batches. Tuesday opened the canonical-model scaffold PR eftours/de-dbt#4083 (all 7 DMOs · DAG-shape) plus the S3 bucket setup doc for Grabbe. Wednesday landed four cross-repo PRs covering 4-BU accounts, 8-BU opportunities, 8-BU sales orders/travel, and 20 ContactPoint variant scaffolds. Thursday applied the seven Phase B normalize macros end-to-end — closing RLT-3577 / RLT-3579 / RLT-3989 / RLT-3973 / RLT-3975 / RLT-3578 / RLT-4111 — and surfaced a structural Language EU stg source gap tracked as remediation #321. Friday wrapped with the Rich-driven RLT-3972 reclassification (Business_Unit_Name RT4→RT3s + renamed to Originating_Business_Unit_Name) and a US↔CH parity check on the ST EFSA carve-out (delta 0.003%, well within tolerance).

W5 · 4 working days | P1 close · P3 canonical scaffold | 121.5 h · $17,340 · 69.4% NTE | 7 normalize tickets closed · 6 PRs open
Plan

9-week build · P1 close-out · P3a canonical scaffold

P1 (bronze staging) closes out as the Phase B macro library finally lands on the per-BU contacts / accounts / opps / sales models. P3a canonical scaffold opened as PR #4083 — 7 DMO replacements (Individual, 3 ContactPoints, Account, Opportunity, Sales_Order) with empty bodies but full upstream ref() declarations. Splink IR (#161) gets its spec next week. Adrien still owns the review queue for 6 open PRs.

P0 · W1 Foundation
P1 · W2-5 Normalize + Replication
P2 · W6 Splink IR
P3 · W5-6 Canonical + CIs
P4 · W6-7 SFMC + Marts
P5 · W8 Cutover
P6 · W9 + Jul Handoff

Receiving team unchanged: Mike Grabbe + Adrian LeDoux (ET Data Engineering). Rich still in role this week — Tuesday alignment call locked W6 Monday Boston in-person with Vivek + Grabbe.

Done this week

What landed in W5

  1.
    Canonical-model scaffold PR eftours/de-dbt#4083 opened (W5 D2). Full DAG-shape for all 7 DMOs replaced by Snowflake: 4 survivorship-resolved canonicals (int_canonical__individual + 3 ContactPoints) referencing int_ir__unified_individual_membership placeholder; 3 entity-1:1 canonicals (__account / __opportunity / __sales_order) UNION ALL across BU stg with composite PK. 4 macros landed: source_priority_order (RT1 Rich's 8-source order), oldest_source_value (RT3s first-touch), bool_or_gr (RT2-OR for Global_Rewards flags), age_range_bucket (Rich's new 6-bucket taxonomy). Latent required_tests dict-format project bug fixed in the same PR.
    PR #4083 · Draft · Adrien queue · CI 14/14 green
  2.
    4-BU accounts batch PR #4099 (W5 D3). Academy 6-CTE pattern with school / agency / sales-office / campus scoping + RECORD_TYPE filter · HSEY single-source with correlated opps_count + Rich's 9-bucket school_type_normalized · Language 6-CTE 3-branch UNION (US destinations + Home schools / Agents + International destinations) · Student_Tours passthrough. CI 13/13 green after survivorship_ts qualification fix.
    PR #4099 · CI green
  3.
    8-BU opportunities PR #4106 + 8-BU sales_orders PR #4107 (W5 D3). Real ports across every BU. Sales_orders includes Language Juno / Poseidon revenue COALESCE, Academy 35-row Lost_Reason CASE (Rich 2026-04-14), CCAP 17-row unmatch_reason CASE, WJ TRAVEL_V port (no ORDERS_V — gap discovered + flagged in spec). Yml tests relaxed for 3 real-data anomalies (SA opps NULL on opportunity_id, ST opps 48 dupes on sales_op_id, WJ opps 95k dupes on composite-PK design).
    PRs #4106 + #4107 · CI 14/14 green
  4.
    20 ContactPoint variant scaffolds PR #4110 (W5 D3). Phase A++ scaffold — Python script extracts legacy DDL column headers and emits cast(null as TYPE) per column, locking the column contract so downstream IR + canonical ContactPoint models can ref() safely. Resolved a business_unit column conflict (legacy BUSINESS_UNIT column collides with Phase B literal — legacy renamed to business_unit_legacy).
    PR #4110
  5.
    7 Phase B normalize macros applied end-to-end (W5 D4) → 7 RLT tickets closed. bu_attribution / bu_literal across opps + sales + accounts (RLT-3577 closed) · new stage_category_normalize seed (52 rows · 4 BUs) + macro (RLT-3989 closed) · new business_subunit_normalized with Rich's WJ / ET subunit derivation rules (RLT-3579 closed) · age_range_bucket macro with Rich's 6-bucket taxonomy applied to 7 BU contacts (RLT-3973 closed) · language_iso_normalize applied to Academy + HSEY contacts + HSEY language SF-ID seed populated from 140 IDs (RLT-3975 + RLT-4111 closed) · highest_attained_status_normalize macro + 32-row seed shipped (RLT-3578 closed, field application Phase D).
    Across PRs #4059 / #4099 / #4106 / #4107 · 3 English closure comments posted via Atlassian REST
  6.
    Cross-BU QA pass surfaced 4 fixes + RLT-3577 ST EFSA carve-out (W5 D4). Cross-BU QA pushback caught 4 latent bugs as fix-up commits: business_subunit_normalized EFSA carve-out restored on ST opps · HSEY LEAD age_range_bucket wired to compute_age() (was hardcoded NULL despite 983k populated DOBs) · HSEY Contact JOIN to Account for birthdate (Contact birthdate always NULL on HSEY) + <1900 edge-case filter + MMDD adjustment · new compute_age() macro extracted (NULL / future / <1900 / >110 / MMDD legacy-faithful AGE pattern) applied to all 7 BU contacts. ST contacts EFSA carve-out also shipped (was hardcoded 'ET' with TODO) — DEV validation 125,896 EFSA / 5,663,269 ET / 5.79M total matches CONTACTS_V row count.
    PR #4059 + #4106 + #4107 · ST CH↔US parity validated W5 D5
  7.
    Language EU stg source gap → spec #321 (W5 D4). Cross-BU language audit triggered by the pushback "for other products, language doesn't come through": stg_language__contacts currently reads JUNO_CONTACT_LATEST (33M, no LANGUAGE column) instead of legacy's UNION of JUNO_ACCOUNT_PA (32M / 99.9% filled) + JUNO_LEAD (26M). The same gap indirectly affects RLT-3973 (age computed on the wrong source). Issue #321 + spec specs/build/321-language-eu-stg-source-restructure.md committed.
    GH #321 · remediation queue
  8.
    Spec 175a RLT-3972 reclassification (W5 D5). Business_Unit_Name__c reclassified RT4 (MODE) → RT3s (Oldest-Source) + renamed to Originating_Business_Unit_Name__c per RLT-3972. Spec changes: line 168 moved from RT4 to RT3s table · line 406 SQL updated · column inventory renamed · header history gains 2026-05-29 entry · "Main shifts" gains item #7. Jira RLT-3972 + GH #296 comments documenting the reclassification + 8-step gating chain to closure.
    Commit c87e6be · spec on main · Jira + GH cross-refs posted
  9.
    2 sibling CCAP PRs + S3 bucket doc + Rich alignment call (W5 D2). PR #4090 = 3 Google_Analytics_* passthrough columns on stg_ccap__contacts per Rich's "derive in dbt" decision · PR #4091 = CCAP accounts Phase C real port (49 cols, host-family record type filter). Plus docs/integration/sfmc-s3-bucket-setup.md for Grabbe (271 lines, English, single-bucket two-prefix architecture, IAM templates, cost estimate). Tuesday Rich call locked in-person Mon 2026-06-01 1pm EDT at EF Boston with Vivek + Grabbe.
    PRs #4090 + #4091 · Session log docs/sessions/2026-05-26-rich-alignment-call.md
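The compute_age() guards named in item 6 (NULL, future date, pre-1900, over 110, MMDD adjustment) could look roughly like the Python analog below. This is a sketch only: the real macro is dbt SQL/Jinja, and both the exact cut-offs and the reading of "MMDD adjustment" as "subtract a year when the birthday has not yet occurred" are assumptions drawn from that one-line description.

```python
from datetime import date
from typing import Optional

MIN_BIRTH_YEAR = 1900  # assumed from the "<1900" guard
MAX_AGE = 110          # assumed from the ">110" guard


def compute_age(birthdate: Optional[date], as_of: date) -> Optional[int]:
    """Return age in whole years, or None when the birthdate is unusable."""
    if birthdate is None:
        return None                      # NULL guard
    if birthdate > as_of:
        return None                      # future-date guard
    if birthdate.year < MIN_BIRTH_YEAR:
        return None                      # pre-1900 guard
    age = as_of.year - birthdate.year
    # MMDD adjustment: the birthday has not been reached yet this year.
    if (as_of.month, as_of.day) < (birthdate.month, birthdate.day):
        age -= 1
    return None if age > MAX_AGE else age
```

For example, compute_age(date(2000, 6, 1), date(2026, 5, 29)) returns 25, because the June birthday has not yet passed on 29 May.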
Pending external action

Blocked-on-others

Adrien · review queue
6 PRs awaiting code-owner review

PRs #4059 (7-BU contacts), #4083 (canonical scaffold), #4099 (4-BU accounts), #4106 (8-BU opportunities), #4107 (8-BU sales_orders + 2 school_accounts), #4110 (20 CP variants). Most CI green with bot triage clean. Re-pings sent Friday on #4059 + #4083.

Critical path: #4106 + #4107 + #4099 unblock canonical body work in P3a.
Rich · pending confirms
RLT-3972 + RLT-3991 + RLT-3990 ratification

RLT-3972 reclassification (RT4 → RT3s + rename) needs Rich sign-off before #175a canonical body materializes the field. RLT-3991 5 HSEY closed-lost codes (CN / CY / EF / APP / INV → Other) pending Rich confirmation. RLT-3990 NBSP fix in seed pending Rich confirmation post-merge.

All tracked in Jira + GH counterpart issues. Non-blocking for the canonical scaffold itself.
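To illustrate why the RT4 → RT3s move can change the surviving value, the sketch below contrasts a MODE pick with an oldest-source pick over the same records. The record layout, field names, sample values, and tie-breaking are hypothetical; the actual rules live in spec 175a and the oldest_source_value macro.

```python
from collections import Counter
from datetime import date
from typing import Optional

# Hypothetical source records for one unified individual.
records = [
    {"source": "src_a", "created": date(2019, 3, 1), "business_unit_name": "ET"},
    {"source": "src_b", "created": date(2021, 7, 15), "business_unit_name": "WJ"},
    {"source": "src_c", "created": date(2022, 1, 9), "business_unit_name": "WJ"},
    {"source": "src_d", "created": date(2018, 5, 2), "business_unit_name": None},
]


def mode_value(rows, field) -> Optional[str]:
    """RT4-style pick: the most frequent non-null value."""
    values = [r[field] for r in rows if r[field] is not None]
    return Counter(values).most_common(1)[0][0] if values else None


def oldest_source_value(rows, field, ts_field="created") -> Optional[str]:
    """RT3s-style pick: the value from the earliest record that has one."""
    candidates = [r for r in rows if r[field] is not None]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r[ts_field])[field]
```

On this sample the MODE pick is "WJ" while the oldest-source pick is "ET", which fits the "Originating" wording of the renamed field.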
Vivek · MC segments + sending domain
Segments methodology + email domain

Vivek call still gates the segments methodology decision (MC-primary vs Snowflake fallback). Sending-domain SF-side change is Vivek-owned (2-4 wk lead time). In-person scheduled W6 Mon 1pm EDT at EF Boston with Vivek + Grabbe + Rich.

Action item from Tuesday Rich call: schedule Vivek 1:1 this week (W6).
Budget snapshot

$25 k NTE · spent vs remaining

NTE cap
$25,000
SOW Amendment #2 (29 Apr)
Spent to date
$17,340
121.5 hours · 69.4% of NTE
Remaining
$7,660
≈ 55 h at full-time rate ($140 / h)
W5 burn
26.5 h
$3,700 · Phase B macros · 6 PRs · cross-BU QA · spec 175a

NTE consumption

69.4% consumed end of W5. The 80% threshold ($20,000) will be crossed during W6 — per CLAUDE.md rule, re-confirmation with Rich + Mike + Adrian is triggered at that point. Pacing: Splink IR (#161) + canonical body materialization + parity QA fit within remaining headroom if no unplanned scope additions.
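The threshold arithmetic can be checked with a few lines of Python. The inputs come from the budget snapshot; the hours-to-threshold figure is derived, and it assumes the $140 / h rate applies to all remaining hours.

```python
NTE_CAP = 25_000        # USD, from the budget snapshot
SPENT = 17_340          # USD spent to date
RATE = 140              # USD per hour (assumed constant going forward)
THRESHOLD_PCT = 80      # re-confirmation trigger, in percent

consumed_pct = SPENT / NTE_CAP * 100          # share of the NTE used so far
remaining = NTE_CAP - SPENT                   # USD left under the cap
threshold = NTE_CAP * THRESHOLD_PCT // 100    # USD level that triggers re-confirmation
to_threshold = threshold - SPENT              # USD left before the trigger
hours_to_threshold = to_threshold / RATE      # hours left before the trigger
hours_remaining = remaining / RATE            # hours left under the cap

print(f"consumed: {consumed_pct:.1f}%")
print(f"remaining: ${remaining:,} (~{hours_remaining:.0f} h)")
print(f"to 80% threshold: ${to_threshold:,} (~{hours_to_threshold:.0f} h)")
```

Under that assumption about 19 h remain before the $20,000 threshold, less than the 26.5 h burned in W5, which is consistent with the threshold being crossed during W6.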

Scope reality vs $25 k NTE

Bucket | SOW low (h) | Actual h | Actual $ | Status
P0 · M1 Foundation (W1) | 16 | 14.0 | $2,040 | Done
P1 · dbt #1 normalize + replication (W2-5) | 25 | 102.5 | $14,820 | Closing · 410% h · 593% $
P2 · Splink IR (W6) | 37 | 0 | $0 | W6 kickoff
P3 · Canonical scaffold + CIs (W5-6) | 37 | 5.0 | $700 | Scaffold landed · bodies pending
P4 · SFMC + Marts (W6-7) | 56 | 0 | $0 | Upcoming
P5 · Cutover (W8) | 14 | 0 | $0 | Upcoming
P6 · Handoff + Hypercare (W9 + Jul) | 18 | 0 | $0 | Upcoming

Trace: P1 absorbed the bulk of the overrun as the bronze-layer scope expanded materially (vanilla audit + 8 BU real ports + 20 CP variant scaffolds + Phase B macro library + cross-BU QA + #321 remediation). With ~$7.7 k headroom and P2 + P3 + P4 + P5 + P6 all ahead, the working assumption is that Splink IR, canonical bodies, and cutover fit if scope holds. Re-confirmation point at $20 k (~W6 mid-week).

What's next

W6 (1 Jun – 5 Jun) · Splink IR kickoff + Boston in-person

Build deliverables · W6
What lands by Friday 5 Jun
  • Splink IR pipeline (#161) · Phase 1 normalize macros + scaffold + spec finalized · Phase 2 settings + predict + clusters · Phase 3 contacts_normalized real-source activation
  • EF_SPLINK_WH provisioning (#160) · Snowpark-optimized warehouse for Splink workload
  • Parity validation framework · Python harness for systematic ±0.1% checks across all stg ports
  • #158 close-out · pre-staged parity queries ready to fire as PRs merge
  • Adrien-queue draining · 6 PRs through review + merge → unblocks canonical bodies
  • Boston in-person Monday · Rich canonical sync + Vivek SFMC strategy
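A minimal sketch of the kind of check the parity harness could run, assuming "±0.1%" means the relative row-count difference between a legacy view and its stg port. The function names and the zero-row handling are hypothetical; the real harness is not specified in this report.

```python
def relative_delta_pct(legacy_count: int, ported_count: int) -> float:
    """Absolute row-count difference as a percentage of the legacy count."""
    if legacy_count == 0:
        # Assumed convention: two empty sides match, anything else is a miss.
        return 0.0 if ported_count == 0 else float("inf")
    return abs(ported_count - legacy_count) / legacy_count * 100


def within_tolerance(legacy_count: int, ported_count: int, tol_pct: float = 0.1) -> bool:
    """True when the port stays inside the tolerance band."""
    return relative_delta_pct(legacy_count, ported_count) <= tol_pct


# ST contacts figures from this week's DEV validation: EFSA + ET rows vs total.
st_total = 125_896 + 5_663_269
print(st_total, within_tolerance(5_789_165, st_total))

# Hypothetical failing case: a 0.2% shortfall.
print(within_tolerance(1_000_000, 998_000))
```

A row-count check like this only catches volume drift; column-level parity would need additional comparisons.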
Stakeholder asks · W6
What we need from EF / ET
  • Adrien · drain the 6-PR review queue (re-pinged Friday on #4059 + #4083)
  • Rich · ratify RLT-3972 reclassification · confirm RLT-3990 NBSP + RLT-3991 5 codes
  • Vivek · attend Mon 1pm EDT Boston SFMC strategy session · kick off sending-domain
  • Grabbe · S3 bucket setup per docs/integration/sfmc-s3-bucket-setup.md · sanity check first 4 weeks of replication credits
References

Where to dig deeper

This week's specs + docs
Landed this week
  • specs/build/321-language-eu-stg-source-restructure.md — Language EU stg source remediation
  • specs/build/175a-canonical-survivorship.md v1 · RLT-3972 reclassification (commit c87e6be)
  • docs/integration/sfmc-s3-bucket-setup.md — S3 setup for Grabbe (271 lines)
  • docs/sessions/2026-05-26-rich-alignment-call.md — Tuesday Rich + Boston in-person planning
Cross-repo PRs
PRs opened / updated this week
  • eftours/de-dbt#4083 — canonical scaffold (7 DMOs, 4 macros)
  • eftours/de-dbt#4099 — 4-BU accounts
  • eftours/de-dbt#4106 — 8-BU opportunities + Phase B macros
  • eftours/de-dbt#4107 — 8-BU sales_orders + 2 school_accounts + Phase B macros
  • eftours/de-dbt#4110 — 20 ContactPoint variants scaffold
  • eftours/de-dbt#4090 · #4091 — CCAP Google_Analytics passthrough + Accounts port
  • eftours/de-dbt#4059 — 7-BU contacts Phase C (Adrien re-pinged)