9-week build · canonical bodies shipped + CIs end-to-end DAG
Canonical scaffolding (P3) closed Monday: 7 entity bodies (account / opportunity / sales_order / 3 contact_points / individual_unified + individual passthrough split) landed in PR #4239 with row-count parity all PASS ±0.2% vs DC. CI consolidation (#166) Phase A-C shipped Thursday on PR #4282: 17 of 27 CI columns active (62% of mapped 48 DC CIs), collapsing DC's Helper→Base→Rollup→Final 4-layer arch into single CTEs. Remaining 10 CIs unlock on RT1 stg enrichment (4) + G1/G2/G3 follow-on tickets (6). Cutover harness scaffolding (#170) closed Tuesday. NTE cap crossed this week ($28,635 vs $25k = 114.5%) — user explicit override 2026-06-08 authorises continued work through cutover.
Receiving team unchanged: Mike Grabbe + Adrian LeDoux (ET Data Engineering). Splink real-data unblock: Mike approved provisioning the SPLINK_RUNNER_SKU runner label via Rob Edwards (GitHub org owner) — secrets in place since 2026-06-10, awaiting Mike's runner label provisioning. Re-amendment conversation pending end of W7.
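The ±0.2% row-count tolerance quoted in the summary can be pictured as a small tri-state check. This is a minimal sketch, not the code in scripts/parity/: the function names, the GREEN/YELLOW/RED labels as applied here, and the 1.0% warning band are assumptions for illustration. Only the 0.2% pass tolerance comes from this report.

```python
from typing import Optional

GREEN_TOLERANCE_PCT = 0.2   # from the report: parity passes within ±0.2% of DC
YELLOW_TOLERANCE_PCT = 1.0  # hypothetical warning band, not taken from the report


def pct_diff(ours: int, reference: int) -> Optional[float]:
    """Signed % difference of our row count vs the DC reference count.

    Returns None when the reference is 0, so callers never divide by zero.
    """
    if reference == 0:
        return None
    return (ours - reference) * 100.0 / reference


def classify(ours: int, reference: int) -> str:
    """Map a pair of row counts to a tri-state parity label."""
    diff = pct_diff(ours, reference)
    if diff is None:
        # Both empty counts as parity; rows where DC has none is a failure.
        return "GREEN" if ours == 0 else "RED"
    if abs(diff) <= GREEN_TOLERANCE_PCT:
        return "GREEN"
    if abs(diff) <= YELLOW_TOLERANCE_PCT:
        return "YELLOW"
    return "RED"
```

Treating a zero reference count as its own case keeps the percentage undefined instead of infinite, which is one way to handle a div-by-zero guard of the kind listed among the review findings.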
What landed in W7
- 1. 7 canonical bodies landed in PR #4239 (W7 D1 Mon · 8.0h). Entity-1:1 canonical bodies via parallel subagents: int_canonical__account (52 cols · UNION ALL 7 BUs · WJ has no Account model · 447,773 rows = +0.12% vs DC) · int_canonical__opportunity (96 cols · 8 BU UNION ALL · 4,163,583 rows = +0.16%) · int_canonical__sales_order (101 cols · 8 BU UNION ALL · 6,181,811 rows = +0.06%). Then 3 ContactPoint canonicals (email / phone / address) as pre-IR passthrough matching DC's *_Data_Share__dlm views. Plus individual passthrough/unified split per user clarification: renamed prior int_canonical__individual → int_canonical__individual_unified (post-IR · 15.43M · +2.13% vs DC) + created new int_canonical__individual targeting Individual_Data_Share__dlm (pre-IR passthrough · 96 cols · 16.69M vs DC 16.66M = +0.17% AC7 PASS).
- 2. Cutover reconciliation harness PR #4259 + operational runbook (W7 D2 Tue · 13.25h). 1,074 LOC across scripts/parity/ (compare.py · snapshots.py · connections.py · run_parity.py · cutover_reconciliation.py · ir_reconciliation.sql · report.py) + 8 canonical YAML configs + tri-state report renderer. Pre-staged for AC4-AC10 reconciliation when secrets land. Closed JonatanFG/EF #172 (operational runbook · 798 lines · 11 sections + 2 appendices: architecture, monitoring/alerting, failure scenarios, escalation path, dbt #1+#2 lineage, Prefect docs). 9 augmentcode review findings resolved (row-count masking · INFORMATION_SCHEMA unquoted · composite-PK NULL sentinel · MCP_READER → writable role · try/finally cleanup · FQTN routing · div-by-zero · YELLOW/RED rendering). EOD Jira backlog cleanup: closed 7 of 8 RLT tickets sitting in Under Review (3812 · 3969 · 3970 · 3972 · 3973 · 3975 · 3976 · 3990) with English QA-closure comments + GH mirror links + transitions to Finalizada.
- 3. Splink runner RSA key-pair auth migration + 100k smoke test PASSED (W7 D3 Wed · 12.5h). Allan provisioned EF_SPLINK_RUNNER service account 2026-06-10 with 3 repo secrets (SNOWFLAKE_ACCOUNT_CH · SNOWFLAKE_CH_SPLINK_USERNAME · SNOWFLAKE_CH_SPLINK_PRIVATE_KEY) + RSA key-pair auth (no legacy password). Adapted connect_snowflake() in scripts/splink-runner/run_splink_ir.py to load PEM → PKCS8 DER via cryptography.hazmat. Initial smoke at limit=1000 dry_run=true failed grants; Allan fixed in real-time (GRANT ROLE EF_DBT_PROD_RW TO USER EF_SPLINK_RUNNER). Re-run PASSED in 2m 37s on ubuntu-latest free runner. 100k scale-up PASSED in 47s total · Splink compute trivial (predict() done in 0.15s · clustering done in 0.4s). Only blocker for full 16.3M Splink real: memory (SPLINK_RUNNER_SKU = ubuntu-22.04-large 16-core/64GB pay-per-min). Crafted Mike-facing cost summary correcting earlier r6i.4xlarge $50k/yr misunderstanding.
- 4. #166 CIs Phase A-C build + PR #4282 opened (W7 D3 Wed · ~6h). Re-scoped GH #166 from "single layer ~37 models" to 2 tables + 1 view (94% reduction · DC's Helpers→Base→Rollup→Final is a platform workaround that Snowflake+dbt obviates). Phase A scaffold: 3 model files (int_ci__individual_by_bu LONG · int_ci__individual_global per-individual · mart_marketing_individual_activation view) + YAML schemas. Phase B parallel CTE translation via 3 subagents (opp-side 5 chains · individual-side · so-side). Phase B/C application: applied 7 HIGH-confidence chains in int_ci__individual_by_bu (Active_Opp · Closed_Lost · New_Leads · Contact_Exists · Contact_Created_Date · Email_Unsub · TRH_Home) + 4 RT1-blocked structural-correct + 3 MEDIUM. Phase C global: 7 HIGH/MEDIUM cross-BU CIs in int_ci__individual_global + Global_Business_Rule orchestrator. 17 of 27 CI columns active today (62% of mapped 48 DC CIs).
- 5. 4 augmentcode review fixes on PR #4282 + QA materialisation + 3 spotcheck findings (W7 D3 Wed). (1) HIGH · is_new_lead gated on contact_exists = 1 to prevent cross-join individual_bu_grid over-firing. (2) MEDIUM · per-BU date column split (ET → tour_*, others → service_*) for latest_tour_* global cols. (3) MEDIUM · added ccap_traveled_returned_home to mart view. (4) MEDIUM · PEM normalize helper before load_pem_private_key(). QA spotcheck on dbt Cloud run 43045206 revealed 3 distinct problems: WJ JOIN broken (99% of WJ carries |<BUSINESS_CODE> suffix that bridge strips → 0 matches; fix via split_part(individual_id__c, '|', 1) ships 25ad08b8a, WJ now 933k) · Language ID type mismatch (accountid vs individual_id) · Active_Opportunities semantic gap (DC Helper SQL emits 1 when individual has NO opp in BU, our model ships displayName-intent · numbers diverge dramatically). Documented + posted to PR #4282 + Rich/Adrien.
- 6. 2 EF Confluence pages on Snowflake stack + 4 lessons (W7 D2 + ad-hoc throughout the week). Authored architecture pages for ET Data Engineering handoff: dbt #1 (US normalize layer · 57 views in EF_DATA_HUB.ANALYTICS · port-and-evolve approach) and dbt #2 (CH harmonisation / IR / CIs / segments). 4 memory-system lessons logged: (a) Adrien's "vanilla dbt structure" preference (W3 carry-over · validates 4-layer schema split decision) · (b) Splink RSA-key-pair auth migration path · (c) PR #4239 individual passthrough/unified split rationale · (d) GH org-owner permissions required for runner labels.
- 7. Tickets closed across the week (Jira + GH), total 9. Jira (W7 D2 EOD): RLT-3812 · RLT-3969 · RLT-3970 · RLT-3972 · RLT-3973 · RLT-3975 · RLT-3976 · RLT-3990 (7 of 8 transitioned to Finalizada with QA-closure comments + GH mirror links · the 8th left in Under Review because GH #175 still open / PR #4239 pre-merge). GitHub: #170 cutover harness Done (PR #4259) · #160 EF_SPLINK_WH grants Done. PR #4259 also unblocks #169 (cutover hour-of-cutover commands ready) and #171 (cutover runbook drafts under docs/runbooks/cutover-procedure.md).

Pending external action · Blocked-on-others
- Mike · Splink runner SKU label: SPLINK_RUNNER_SKU label via Rob Edwards (GitHub org owner). Mike committed to providing the 64GB runner label after Allan provisioned the EF_SPLINK_RUNNER service account + secrets 2026-06-10. 100k smoke test PASSED at 47s on free runner. Full 16.3M scale-up requires ubuntu-22.04-large 16-core/64GB pay-per-min; Rob Edwards (GH org owner) provisions the label. Asked 2026-06-12, awaiting response.
- Adrien · #4282 review: PR #4282 (#166 CIs) + PR #4239 (canonical bodies) approvals. Both PRs CI green except Cortex AI Review gate (requires eftours/data-engineering team approval). PR #4239 7 canonical bodies all AC7 PASS · PR #4282 17 of 27 CI columns active end-to-end DAG. Comment posted on PR #4282 cc'ing @mbgrabbe @aledoux Thursday EOD.
- EF · NTE re-amendment: $25k NTE crossed 114.5% · scope conversation needed. End of W7: cumulative ~$28,635 vs $25k NTE = 114.5%. User explicit override 2026-06-08 ("forget about the budget, we are not delivering what they asked for") authorises continued work. Phase 4 (SFMC + Marts) + Phase 5 (Cutover) + Phase 6 (Hypercare) all still ahead. Re-amendment conversation recommended ahead of cutover gates.
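The WJ join fix noted for PR #4282, split_part(individual_id__c, '|', 1), keeps only the text before the first pipe so that suffixed WJ IDs line up with the bridge keys. Below is a rough Python equivalent of that idea, offered as a sketch rather than the dbt model itself: the sample IDs, the "WJ" suffix value, and the helper names are hypothetical.

```python
from typing import Iterable, Set


def strip_business_code(individual_id: str) -> str:
    """Return the text before the first '|', or the whole ID if there is none.

    This mirrors taking part 1 of a '|'-delimited string.
    """
    return individual_id.partition("|")[0]


def count_matches(ids: Iterable[str], bridge_keys: Set[str], strip: bool) -> int:
    """Count IDs found in the bridge, optionally stripping the suffix first."""
    return sum(
        1
        for raw in ids
        if (strip_business_code(raw) if strip else raw) in bridge_keys
    )


# Hypothetical sample data: WJ IDs carry a '|<BUSINESS_CODE>' suffix,
# while the bridge holds bare IDs.
WJ_IDS = ["IND001|WJ", "IND002|WJ", "IND003|WJ"]
BRIDGE_KEYS = {"IND001", "IND002", "IND004"}

matches_before_fix = count_matches(WJ_IDS, BRIDGE_KEYS, strip=False)
matches_after_fix = count_matches(WJ_IDS, BRIDGE_KEYS, strip=True)
```

On this toy data the unstripped comparison finds no matches and the stripped one finds two, which is the same failure shape the spotcheck reported at a much larger scale.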
Budget snapshot · $25 k NTE · spent vs remaining
- NTE cap: $25,000 · SOW Amendment #2 (29 Apr) · CROSSED W7
- Spent to date: $28,635 · ~204 hours · ~114.5% of NTE
- Over cap by: $3,635 · User override 2026-06-08 authorises continued work
- W7 burn: ~34.75 h · ~$4,865 · canonical bodies + #166 CIs + Splink RSA + 9 tickets closed
NTE consumption
NTE CAP CROSSED · 114.5%. Per CLAUDE.md rule, the 95% threshold was crossed end of W6 and 100% crossed mid-W7. User explicit override 2026-06-08 ("forget about the budget, we are not delivering what they asked for") authorises continued work through cutover. Realistic landing: P4 (CI parity finalisation + SFMC) + P5 (cutover) + P6 (hypercare) all still ahead. Re-amendment conversation recommended before P5 cutover gates.
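The cap figures above follow from simple arithmetic on the spend and the NTE. A minimal sketch, assuming the 95% and 100% thresholds behave as plain cut-offs; the level names OK / WARN / CROSSED and the function name are illustrative, not taken from CLAUDE.md:

```python
from typing import NamedTuple


class NteStatus(NamedTuple):
    pct_of_cap: float   # spend as a percentage of the cap, rounded to 0.1
    over_cap_usd: int   # dollars above the cap, 0 if still under
    level: str          # hypothetical label: OK, WARN or CROSSED


def nte_status(spent_usd: int, cap_usd: int) -> NteStatus:
    """Summarise spend against a not-to-exceed cap."""
    pct = spent_usd * 100 / cap_usd
    if pct >= 100:
        level = "CROSSED"
    elif pct >= 95:
        level = "WARN"
    else:
        level = "OK"
    return NteStatus(round(pct, 1), max(spent_usd - cap_usd, 0), level)


# Figures from this report: $28,635 spent against a $25,000 cap.
current = nte_status(28_635, 25_000)
# current.pct_of_cap == 114.5, current.over_cap_usd == 3635
```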
Scope reality vs $25 k NTE
| Bucket | SOW low (h) | Actual h | Actual $ | Status |
|---|---|---|---|---|
| P0 · M1 Foundation (W1) | 16 | 14.0 | $2,040 | Done |
| P1 · dbt #1 normalize + replication (W2-6) | 25 | 112.0 | $16,180 | Done · #158 closed 54/54 parity |
| P2 · IR Phase 1 SQL ER (W6) | 37 | 18.0 | $2,600 | AC7 PASS · PR #4178 |
| P3 · Canonical bodies (W6-7) | 37 | ~25.0 | ~$3,500 | Done · PR #4239 7 bodies AC7 PASS |
| P3b · CIs consolidation (W7) | - | ~10.0 | ~$1,400 | 17 of 27 cols active · PR #4282 |
| P4 · SFMC + Marts (W7-8) | 56 | ~5.0 | ~$700 | Activation mart scaffolded |
| P5 · Cutover (W8) | 14 | ~5.0 | ~$700 | Cutover harness done (#170 / PR #4259) |
| P6 · Handoff + Hypercare (W9 + Jul) | 18 | ~3.5 | ~$490 | Runbook scaffolded (#172 / 798 lines) |

Trace: NTE cap crossed mid-W7. All 6 SOW workstreams now have some W7 burn. P3b (CIs consolidation) is new work that was originally out-of-SOW; it absorbed ~$1,400 of the overrun. Options going into W8: (a) Amendment #3 for cutover + hypercare funding · (b) descope P6 hypercare to one cycle · (c) request EF-side ownership of cutover with Ohana acting in advisory-only mode.
What's next · W8 (15 Jun – 19 Jun) · CI parity sprint + cutover prep
Build deliverables · W8 · What lands
- CI parity sprint · close the gap from 17 to 27 active CI columns: RT1 stg enrichment unlocks 4 CIs · G1 Sentiment carrier (3-blocker · CAXed_we_cancelled) · G2 SFMC ingest (Engaged_Last_60D) · G3 parent-of-child bridge (TRH_Adult)
- QA window 17-19 Jun · per Rich's snapshot-first decision · cutover validation queries for marketing consumers
- Splink real on Mike's runner · once SKU label provisioned, run full 16.3M Splink with DC match rule · target DC's exact 9.30% consolidation
- NTE re-amendment conversation · with Mike + Adrien + Rich early W8 given the cap crossed
Stakeholder asks · W8 · What we need from EF / ET
- Mike · Rob Edwards approval for SPLINK_RUNNER_SKU runner label · S3 bucket provisioning per docs/integration/sfmc-s3-bucket-setup.md
- Adrien + ky-shi · Cortex AI gate approvals on PR #4239 + PR #4282
- Rich · QA Wed-Fri window participation · ratify CI parity findings for activation consumers
- EF · NTE · Amendment #3 conversation · scope decision for P5 cutover + P6 hypercare
References · Where to dig deeper
This week's specs + docs · Landed this week
- specs/build/166-cis-consolidated.md: re-scoped #166 from 37 models to 2 tables + 1 view (94% reduction)
- docs/sf-analysis/calculated-insights/inventory.md: 48 DC CIs inventory + dbt mapping
- docs/sf-analysis/calculated-insights/opportunity-chains-translation.md: 5 chain CTE translation
- docs/sf-analysis/calculated-insights/individual-side-translation.md: individual-side CIs translation
- docs/sf-analysis/calculated-insights/sales-order-side-translation.md: SO-side CIs translation
- docs/runbooks/operational-runbook.md: 798 lines (#172) · 11 sections + 2 appendices
- docs/runbooks/cutover-procedure.md: pre-staged for #169
Cross-repo PRs · PRs opened / merged this week
- eftours/de-dbt#4239: 7 canonical bodies (account / opp / SO / 3 CPs / individual_unified + individual passthrough split) · all AC7 PASS · CI green, awaits Cortex gate
- eftours/de-dbt#4259: cutover reconciliation harness (1,074 LOC) · closes #170
- eftours/de-dbt#4282: #166 CIs Phase A-C (17 of 27 cols active · 1,373 LOC) · CI green, awaits Cortex gate
- JonatanFG/EF#172: operational runbook (798 lines) · DONE
- JonatanFG/EF#170: cutover harness · DONE
- JonatanFG/EF#160: EF_SPLINK_WH grants · DONE

- 7 RLT tickets closed in batch (RLT-3812 · 3969 · 3970 · 3972 · 3973 · 3975 · 3976 · 3990) · 2 GH issues Done
- Splink runner: RSA key-pair auth migrated · 100k smoke PASSED 47s · awaiting Mike's runner SKU label
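The RSA key-pair migration mentions a PEM normalize helper that runs before load_pem_private_key(). The report does not say what that helper does, so the following is only a guess at the kind of cleanup involved: it assumes the key arrives from a repo secret with literal "\n" escapes, Windows line endings, or stray whitespace. The function name and behaviour are hypothetical.

```python
def normalize_pem(raw: str) -> bytes:
    """Clean up a PEM string that was mangled while stored as a secret.

    Assumed failure modes (not confirmed by the report): literal backslash-n
    escapes instead of newlines, CRLF line endings, padding around lines.
    """
    text = raw.replace("\\r", "").replace("\\n", "\n").replace("\r", "")
    lines = [line.strip() for line in text.split("\n") if line.strip()]
    if (
        not lines
        or not lines[0].startswith("-----BEGIN ")
        or not lines[-1].startswith("-----END ")
    ):
        raise ValueError("value does not look like a PEM block")
    # The resulting bytes are what would be handed to the `cryptography`
    # package's load_pem_private_key() before re-serialising as PKCS8 DER.
    return ("\n".join(lines) + "\n").encode("ascii")
```

Failing loudly on a missing BEGIN/END marker keeps a malformed secret from surfacing later as a less readable deserialisation error.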