# Architecture Refactoring Plan - TDD Approach

## Overview

This refactoring addresses the architectural split identified in the QA and Architecture reviews. The API currently bypasses Clean Architecture v2.0; we need to unify the codebase while following TDD principles.
## Priorities (User-Defined)

✅ IN SCOPE:

- Architectural unification (API → KpiService)
- Test suite implementation (TDD approach)
- Proper error handling

🚫 OUT OF SCOPE (de-prioritized):

- Dashboard development
- Authentication/authorization
- Structured logging
- Monitoring/observability
## Phase 1: Testing Infrastructure Setup

### 1.1 Install Testing Dependencies

```shell
source venv/bin/activate
pip install pytest pytest-cov pytest-mock pytest-asyncio
```
1.2 Create Test Directory Structure
tests/
├── __init__.py
├── conftest.py # Shared fixtures
├── fixtures/
│ ├── __init__.py
│ ├── sample_issues.json # Sample JIRA issues
│ ├── sample_team_data.json # Sample team data
│ └── sample_metrics.json # Sample calculated metrics
├── test_repositories/
│ ├── __init__.py
│ ├── test_historical_storage.py
│ └── test_issue_repository.py
├── test_api/
│ ├── __init__.py
│ ├── test_dependencies.py
│ ├── test_teams_router.py
│ ├── test_metrics_router.py
│ └── test_error_handling.py
└── test_integration/
├── __init__.py
└── test_api_to_kpiservice.py
### 1.3 Create Shared Test Fixtures

Goal: Reusable test data across all tests

Files:

- `tests/conftest.py` - pytest fixtures
- `tests/fixtures/sample_issues.json` - mock JIRA data
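A minimal sketch of what `tests/conftest.py` could look like (fixture names follow the directory structure above; the JSON shapes are whatever the real sample files contain):

```python
# tests/conftest.py - shared fixtures (sketch)
import json
from pathlib import Path

import pytest

FIXTURES_DIR = Path(__file__).parent / "fixtures"

def load_fixture(name: str):
    """Load a JSON fixture file from tests/fixtures/."""
    return json.loads((FIXTURES_DIR / name).read_text())

@pytest.fixture
def sample_issues():
    return load_fixture("sample_issues.json")

@pytest.fixture
def sample_team_data():
    return load_fixture("sample_team_data.json")

@pytest.fixture
def sample_metrics():
    return load_fixture("sample_metrics.json")
```

Any test in the suite can then declare `sample_issues` as a parameter and pytest injects the parsed JSON.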
## Phase 2: Dependency Injection (TDD)

### 2.1 Write Tests FIRST

Test File: `tests/test_api/test_dependencies.py`

Tests to Write:

```python
def test_get_kpi_service_returns_instance()
def test_get_kpi_service_returns_singleton()  # Same instance on multiple calls
def test_get_historical_storage_returns_instance()
def test_get_historical_storage_returns_singleton()
```
### 2.2 Implement Dependency Injection

Implementation File: `src/api/dependencies.py`

Requirements:

- `get_kpi_service()` → returns a KpiService instance
- `get_historical_storage()` → returns a HistoricalStorage instance
- Both should be singletons (cached via functools' `@lru_cache()`, so FastAPI reuses the same instance)
- Testable (allow mock injection for tests)
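A sketch of what `src/api/dependencies.py` could look like. The two classes are stand-ins here so the example is self-contained; the real file imports them from `src.kpi_service` and the repositories package, and their constructors may take arguments:

```python
# src/api/dependencies.py (sketch)
from functools import lru_cache

# Stand-in declarations for illustration only; in the real file these
# come from src.kpi_service and src.repositories.historical_storage.
class KpiService: ...
class HistoricalStorage: ...

@lru_cache()
def get_kpi_service() -> KpiService:
    # lru_cache on a zero-argument function means every call
    # returns the same cached instance (a process-wide singleton).
    return KpiService()

@lru_cache()
def get_historical_storage() -> HistoricalStorage:
    return HistoricalStorage()
```

For testability: FastAPI's `app.dependency_overrides[get_kpi_service] = lambda: mock_service` swaps in a mock per test, and `get_kpi_service.cache_clear()` resets the singleton between tests.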
### 2.3 Verify Tests Pass

```shell
pytest tests/test_api/test_dependencies.py -v
```
## Phase 3: Refactor HistoricalStorage (TDD)

### 3.1 Write Tests FIRST

Test File: `tests/test_repositories/test_historical_storage.py`

Critical Tests:

```python
def test_add_snapshot_creates_file_atomically()  # Issue #1 from QA
def test_load_history_handles_corrupted_json()   # Issue #2 from QA
def test_add_snapshot_validates_metrics()        # Issue #3 from QA
def test_concurrent_snapshot_creation()          # Issue #4 from QA
def test_retention_policy_removes_old_snapshots()
def test_snapshot_interval_enforcement()
```
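For example, the corrupted-JSON test could take this shape. The loader logic is inlined as a stand-in so the sketch runs on its own; the real test would exercise `HistoricalStorage.load_history` directly:

```python
import json
import tempfile
from pathlib import Path

# Stand-in mirroring the planned load_history behavior (see Phase 6).
def load_history(path: Path) -> list:
    if not path.exists():
        return []
    try:
        return json.loads(path.read_text()).get("snapshots", [])
    except json.JSONDecodeError:
        return []  # corrupted file: degrade gracefully, don't crash

def test_load_history_handles_corrupted_json():
    with tempfile.TemporaryDirectory() as d:
        bad = Path(d) / "UAS_history.json"
        bad.write_text("{not valid json")
        assert load_history(bad) == []                  # corrupted -> []
        assert load_history(Path(d) / "missing.json") == []  # missing -> []
```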
### 3.2 Fix Critical Issues

Implementation: `src/repositories/historical_storage.py`

Fixes:

1. Atomic writes - write to a temp file, then rename
2. JSON parse error handling - try/except with logging
3. Metrics validation - validate required fields before storing
4. Race condition - file locking or an atomic check
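One possible shape for the file-locking option in fix #4 (POSIX-only advisory lock via `fcntl`; the function name and the `{"snapshots": [...]}` layout are illustrative):

```python
import fcntl
import json

def locked_update(path: str, snapshot: dict) -> None:
    """Append a snapshot under an exclusive advisory lock (POSIX only)."""
    # 'a+' creates the file if missing without truncating existing data.
    with open(path, "a+") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)  # blocks until lock acquired
        f.seek(0)
        raw = f.read()
        history = json.loads(raw) if raw else {"snapshots": []}
        history["snapshots"].append(snapshot)
        f.seek(0)
        f.truncate()
        json.dump(history, f, indent=2)
        # Lock is released when the file is closed.
```

Advisory locks only protect against other cooperating processes that also take the lock; the atomic-rename pattern in Phase 6 is still needed for crash safety.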
### 3.3 Integration with Repository Pattern

Goal: HistoricalStorage should use IssueRepository, not JiraKpiStorage

Test File: `tests/test_repositories/test_historical_storage_integration.py`

Tests:

```python
def test_historical_storage_uses_issue_repository()
def test_snapshot_creation_via_kpi_service()
```

Implementation: Refactor `historical_storage.py:268` (migration function)
## Phase 4: Refactor API Routers (TDD)

### 4.1 Write Tests FIRST

Test Files:

- `tests/test_api/test_teams_router.py`
- `tests/test_api/test_metrics_router.py`

Critical Tests:

```python
# Teams router
def test_list_teams_returns_all_teams()
def test_get_team_detail_returns_team_info()
def test_get_team_detail_nonexistent_returns_404()

# Metrics router
def test_get_team_metrics_uses_kpi_service()  # Verify dependency injection
def test_get_team_metrics_invalid_team_returns_400()
def test_get_team_metrics_handles_calculation_errors()
def test_timeseries_validates_date_range()
```
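The dependency-injection test can be written against the endpoint function directly with a mocked service, without spinning up the app. The endpoint body below is a simplified stand-in (the real one carries the `Depends` default and more logic):

```python
import asyncio
from unittest.mock import MagicMock

# Simplified stand-in for the refactored endpoint.
async def get_team_metrics(team_id: str, service):
    return service.calculate_team_metrics(team_id)

def test_get_team_metrics_uses_kpi_service():
    service = MagicMock()
    service.calculate_team_metrics.return_value = {"velocity": 12}
    result = asyncio.run(get_team_metrics("UAS", service))
    # Verify the endpoint delegated to KpiService rather than reading files itself.
    service.calculate_team_metrics.assert_called_once_with("UAS")
    assert result == {"velocity": 12}
```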
### 4.2 Refactor Routers to Use KpiService

Files to Refactor:

- `src/api/routers/teams.py`
- `src/api/routers/metrics.py`
- `src/api/routers/charts.py`
- `src/api/routers/dashboards.py`

Pattern:

```python
# BEFORE (current - BAD): module-level storage, legacy import
from jira_kpi_sync import TEAMS, JiraKpiStorage

storage = JiraKpiStorage()

@router.get("/teams/{team_id}/metrics")
async def get_team_metrics(team_id: str):
    team_data = storage.load_team_data(team_id)
    # ...


# AFTER (refactored - GOOD): service injected per request
from fastapi import Depends
from src.api.dependencies import get_kpi_service
from src.kpi_service import KpiService

@router.get("/teams/{team_id}/metrics")
async def get_team_metrics(
    team_id: str,
    service: KpiService = Depends(get_kpi_service),
):
    metrics = service.calculate_team_metrics(team_id)
    # ...
```
### 4.3 Verify All Tests Pass

```shell
pytest tests/test_api/ -v --cov=src/api --cov-report=term
```
## Phase 5: Error Handling (TDD)

### 5.1 Write Tests FIRST

Test File: `tests/test_api/test_error_handling.py`

Tests:

```python
def test_team_not_found_returns_404()
def test_invalid_date_format_returns_400()
def test_calculation_error_returns_500_with_generic_message()
def test_validation_error_returns_400_with_details()
def test_file_not_found_returns_404_not_500()
```
### 5.2 Implement Error Handling

Files:

- `src/api/main.py` - global exception handlers
- All router files - `HTTPException` usage

Requirements:

- 400 for validation errors
- 404 for not found
- 500 for unexpected errors (with a generic message to the user)
- Never expose stack traces or internal paths
- Log errors server-side (print statements for now, structured logging later)
Example:

```python
# src/api/main.py
from fastapi import Request
from fastapi.responses import JSONResponse

@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    # Log error server-side (print for now)
    print(f"ERROR: {exc}", flush=True)
    # Return a generic error to the client
    return JSONResponse(
        status_code=500,
        content={
            "error": "internal_server_error",
            "message": "An unexpected error occurred"
        }
    )
```
### 5.3 Verify Error Handling Tests Pass

```shell
pytest tests/test_api/test_error_handling.py -v
```
## Phase 6: Data Integrity Fixes (TDD)

### 6.1 Write Tests FIRST

Test File: `tests/test_repositories/test_data_integrity.py`

Critical Tests:

```python
def test_atomic_write_prevents_corruption_on_crash()
def test_json_parse_error_returns_empty_list_not_crash()
def test_metrics_validation_rejects_invalid_data()
def test_concurrent_writes_dont_corrupt_file()
```
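A self-contained sketch of the concurrency test, using `os.replace` as the atomic primitive (helper and test names are illustrative; the real test would drive `HistoricalStorage.add_snapshot`):

```python
import json
import os
import tempfile
import threading
from pathlib import Path

def atomic_write(path: Path, payload: dict) -> None:
    # Write to a temp file in the same directory, then atomically replace.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    os.replace(tmp, path)  # atomic on POSIX (and Windows)

def test_concurrent_writes_leave_valid_json():
    with tempfile.TemporaryDirectory() as d:
        target = Path(d) / "history.json"
        threads = [
            threading.Thread(target=atomic_write, args=(target, {"writer": i}))
            for i in range(8)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        # Whichever thread won, the file must be complete, valid JSON.
        data = json.loads(target.read_text())
        assert "writer" in data
```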
### 6.2 Implement Fixes

File: `src/repositories/historical_storage.py`

Atomic Write Pattern:

```python
import json
import os
import tempfile
from pathlib import Path

def add_snapshot(self, ...):
    # ... prepare snapshot data
    path = self.get_history_path(team_key)  # final destination
    # Write to a temp file in the same directory first
    temp_fd, temp_path = tempfile.mkstemp(
        dir=self.storage_dir,
        prefix=f"{team_key}_history_",
        suffix=".tmp"
    )
    try:
        with os.fdopen(temp_fd, 'w') as f:
            json.dump(history_data, f, indent=2)
        # Atomic rename (POSIX guarantee)
        Path(temp_path).replace(path)
        print(f"✓ Saved snapshot for {team_key}")
        return True
    except Exception as e:
        # Clean up temp file on error
        try:
            os.unlink(temp_path)
        except OSError:
            pass
        print(f"✗ Error saving snapshot: {e}")
        return False
```
JSON Parse Error Handling:

```python
import json
from typing import Dict, List

def load_history(self, team_key: str) -> List[Dict]:
    path = self.get_history_path(team_key)
    if not path.exists():
        return []
    try:
        with open(path, 'r') as f:
            data = json.load(f)
        return data.get('snapshots', [])
    except json.JSONDecodeError as e:
        print(f"⚠️ Warning: Corrupted history file for {team_key}: {e}")
        # TODO: Attempt recovery or notify admin
        return []
    except Exception as e:
        print(f"✗ Error loading history for {team_key}: {e}")
        return []
```
### 6.3 Verify Data Integrity Tests Pass

```shell
pytest tests/test_repositories/test_data_integrity.py -v
```
## Phase 7: Integration Testing

### 7.1 Write Integration Tests

Test File: `tests/test_integration/test_api_to_kpiservice.py`

Tests:

```python
def test_api_endpoint_uses_kpi_service_end_to_end()
def test_api_metrics_calculation_matches_kpi_service_directly()
def test_historical_snapshot_creation_integrates_with_kpi_service()
```

### 7.2 Run Full Test Suite

```shell
# All tests
pytest

# With coverage report
pytest --cov=src --cov-report=html --cov-report=term
```

Coverage goals:

- `src/repositories/`: >90%
- `src/api/`: >80%
- `src/kpi_service.py`: >80%
## Phase 8: Verify & Document

### 8.1 Verification Checklist

- [ ] All tests pass (`pytest`)
- [ ] Coverage meets targets (>80% for refactored code)
- [ ] No direct imports of `jira_kpi_sync` in API code
- [ ] All API routers use dependency injection
- [ ] Error handling tests pass
- [ ] Data integrity issues fixed
- [ ] Integration tests pass
### 8.2 Update Documentation

Files to Update:

- `CLAUDE.md` - document refactoring decisions
- `API_AND_DASHBOARD_GUIDE.md` - update with dependency injection
- `REFACTORING_SUMMARY.md` - create summary of changes

### 8.3 Test Manually

```shell
# Start API server
source venv/bin/activate
python -m uvicorn src.api.main:app --reload

# Test endpoints
curl http://localhost:8000/api/teams
curl http://localhost:8000/api/teams/UAS/metrics
curl http://localhost:8000/api/organization/summary
```
## Success Criteria

### Architectural

- ✅ API uses Clean Architecture v2.0 (KpiService via dependency injection)
- ✅ No direct imports of legacy code (`jira_kpi_sync`) in the new API
- ✅ HistoricalStorage integrated with the repository pattern
- ✅ Single code path for all KPI calculations

### Quality

- ✅ Test coverage >80% for the API layer
- ✅ Test coverage >90% for the repository layer
- ✅ All critical data integrity issues fixed
- ✅ Proper error handling (400/404/500 with appropriate messages)

### Functionality

- ✅ All existing API endpoints work correctly
- ✅ Historical snapshots created properly
- ✅ KPI calculations match previous results (regression-free)
## Timeline Estimate

- Phase 1: Testing infrastructure - 2 hours
- Phase 2: Dependency injection - 3 hours
- Phase 3: HistoricalStorage refactor - 5 hours
- Phase 4: API routers refactor - 8 hours
- Phase 5: Error handling - 3 hours
- Phase 6: Data integrity - 4 hours
- Phase 7: Integration testing - 3 hours
- Phase 8: Verification & docs - 2 hours

Total: ~30 hours (~4 days)
## Notes

- Following strict TDD: write the test first, then the implementation
- Each phase has clear pass/fail criteria (tests)
- Can be done incrementally (commit after each phase)
- Backward compatible (existing CLI tools still work)
- Dashboard de-prioritized completely
- Auth/logging de-prioritized (future work)

## Next Steps After Refactoring

Once refactoring is complete and all tests pass:

1. Production deployment (internal first)
2. Add authentication (when needed)
3. Add structured logging (for observability)
4. Resume dashboard development (on a solid foundation)
5. Add new features (GitHub adapter, new metrics, etc.)