Release v0.3.0
Features:
- Smart Overrides Phase 1: create overrides directly from Run Checks via the "Apply override for future runs?" follow-up dialog after Mark as Success (scope + duration choices, audit-logged).
- Cove workstation offline handling: skip schedule-based missed-runs for Cove workstations (always on) and add an optional colorbar-based offline-detection toggle in Settings -> Integrations -> Cove (cove_offline_detection_enabled, cove_workstation_warning_days, cove_workstation_error_days). Synthetic offline runs use a stable external_id so they escalate in place and clear once activity resumes.
- Settings -> Maintenance: Generate test run card for exercising the Smart Override flow.
- Restored Mark as Success button in the Run Checks modal footer.

Changes:
- Run Checks Cove same-day suppression: hide repeat Cove runs after the first complete success run on the same local day.
- Inbox excludes mail messages linked to archived jobs.
- Run Checks / Search overview now applies Customer.active filter.
- In-app documentation refreshed across getting-started, users, mail-import, integrations (Cove), settings, backup-review, customers-jobs and autotask sections.

Tooling:
- Adopted the shared docker-build-and-push script. Modes are now t / r; release version is read from docs/changelog.md; the script no longer performs git operations. Removed obsolete version.txt and .last-branch.

Renames:
- docs/technical-notes-codex.md -> docs/TECHNICAL.md
- docs/changelog-claude.md -> docs/changelog-develop.md

Migrations:
- migrate_cove_offline_detection (3 columns on system_settings).
parent 3cb608cb6b
commit f21d6f4fca
@ -1 +0,0 @@
v20260402-01
@ -1288,7 +1288,7 @@ Static Files:
- static/images/documentation/* # NEW: Screenshots folder

Docs:
- changelog-claude.md # UPDATE: Document feature addition
- changelog-develop.md # UPDATE: Document feature addition
```

---

@ -927,7 +927,7 @@ containers/backupchecks/src/templates/main/
└── settings.html # Settings (add Reporting section)

docs/
└── changelog-claude.md # Changelog entry
└── changelog-develop.md # Changelog entry
```

---

TODO-smart-overrides.md (normal file, 400 lines)
@ -0,0 +1,400 @@
# Smart Overrides: Technical Design

> Backupchecks learns from operator actions to reduce repetitive work.
> Phased approach: immediate UI improvement → pattern recognition → cross-job knowledge base.

---

## Phase 1: "Do you want this in the future as well?"

### Problem

The current "Mark as Success" button in Run Checks creates an override with a time window of ±1 minute around the specific run. If the same error occurs tomorrow, the operator has to mark it manually again.

### Solution

After "Mark as Success", the UI shows a follow-up dialog in which the operator chooses the scope and duration. The backend accepts these extra parameters and creates a broader override.

### UI change: follow-up dialog

After a successful `mark-success-override` API call, a modal (or an inline panel in the existing modal) appears with:

**Scope choice (radio buttons):**

- "This run only" ← current behavior, default
- "This job, same error message" → object-level override on `job_id` + `match_error_contains`
- "All jobs with this software/type and the same error message" → global override on `backup_software` + `backup_type` + `match_error_contains`

**Duration choice (radio buttons):**

- "One-time" ← current behavior, ±1 minute window
- "1 week"
- "1 month"
- "Permanent (until manually disabled)"

**Optional comment** (textarea, pre-filled with the error text as a reference)

### API change

`POST /api/run-checks/mark-success-override` accepts optional extra fields:

```python
{
    "run_id": 123,
    "scope": "job" | "global" | "run",  # default: "run" (current behavior)
    "duration": "once" | "1w" | "1m" | "permanent",  # default: "once"
    "comment": "VSS snapshot timeout, known issue"
}
```

The backend logic in `api_run_checks_mark_success_override()` changes as follows:

- `scope="run"` → current behavior (±1 min window)
- `scope="job"` → `Override(level="object", job_id=job.id, match_error_contains=..., start_at=now, end_at=now+duration)`
- `scope="global"` → `Override(level="global", backup_software=..., backup_type=..., match_error_contains=..., start_at=now, end_at=now+duration)`

With `duration="permanent"`, `end_at` is set to `None`.
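
The mapping from a `duration` value to an override window could be sketched roughly as follows. This is only an illustration: the helper name `resolve_override_window`, the `DURATION_DELTAS` table, and the choice to treat "1m" as 30 days are assumptions, not part of the existing codebase.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumed mapping; "1m" is approximated as 30 days in this sketch.
DURATION_DELTAS = {
    "once": timedelta(minutes=1),
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
}


def resolve_override_window(
    duration: str, now: datetime, run_at: Optional[datetime] = None
) -> Tuple[datetime, Optional[datetime]]:
    """Return (start_at, end_at) for a new override (hypothetical helper)."""
    if duration == "permanent":
        # Permanent overrides have no end date.
        return now, None
    if duration not in DURATION_DELTAS:
        raise ValueError(f"unknown duration: {duration!r}")
    delta = DURATION_DELTAS[duration]
    if duration == "once":
        # Current behavior: a +/- 1 minute window around the run itself.
        anchor = run_at or now
        return anchor - delta, anchor + delta
    return now, now + delta
```

Rejecting unknown values explicitly keeps a typo in the request body from silently creating a permanent override.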

### Error text extraction

The follow-up dialog must show the error text that will be used as `match_error_contains`. This text is already available in the run-detail modal. The logic:

1. Fetch the `run_object_links` for the run (existing query in `_apply_overrides_to_run`)
2. Filter on objects with a non-success status
3. Take the `error_message` of the first problematic object
4. Fall back to `run.remark`

This text is shown in the dialog so the operator can see *what* is being learned and adjust it if needed.
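
The four extraction steps can be sketched as a small pure function. The function name `extract_error_text`, the dict-shaped rows, and the `SUCCESS_STATUSES` set are assumptions made for illustration; the real query and row shape live in `_apply_overrides_to_run`. This sketch also skips problematic objects that carry no message, which goes slightly beyond step 3.

```python
from typing import Iterable, Mapping, Optional

# Assumption: statuses that count as "success" for this purpose.
SUCCESS_STATUSES = {"success"}


def extract_error_text(
    objects: Iterable[Mapping[str, Optional[str]]],
    run_remark: Optional[str],
) -> str:
    """Pick the error text to propose as match_error_contains (hypothetical helper)."""
    for obj in objects:
        status = (obj.get("status") or "").strip().lower()
        message = (obj.get("error_message") or "").strip()
        # First non-success object that actually carries a message wins.
        if status not in SUCCESS_STATUSES and message:
            return message
    # Fallback: the remark stored on the run itself.
    return (run_remark or "").strip()
```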

### Database changes

None. The existing `Override` model already supports all required fields (`level`, `match_error_contains`, `match_error_mode`, `start_at`, `end_at`, `comment`).

### New audit logging

When a broader override is created: log to `AuditLog` with `event_type="override_from_review"` and details about scope, duration, and the original run.

---

## Phase 2: Pattern recognition on reviews

### Problem

Phase 1 still requires a deliberate choice by the operator. If the same error is reviewed 5× in a row without a ticket or remark, the system should have flagged it earlier.

### Solution

A background process analyzes review events and detects repeated patterns. Once a threshold is reached, a suggestion appears in the UI.

### New model: `OverrideSuggestion`

```python
class OverrideSuggestion(db.Model):
    __tablename__ = "override_suggestions"

    id = db.Column(db.Integer, primary_key=True)

    # Fingerprint of the pattern (hash of normalized fields)
    pattern_fingerprint = db.Column(db.String(64), nullable=False, index=True)

    # Human-readable description of the pattern
    pattern_description = db.Column(db.Text, nullable=False)

    # Scope of the suggestion
    suggested_level = db.Column(db.String(20), nullable=False)  # global | object
    suggested_backup_software = db.Column(db.String(255), nullable=True)
    suggested_backup_type = db.Column(db.String(255), nullable=True)
    suggested_job_id = db.Column(db.Integer, db.ForeignKey("jobs.id"), nullable=True)
    suggested_match_status = db.Column(db.String(32), nullable=True)
    suggested_match_error = db.Column(db.String(255), nullable=True)

    # Evidence
    match_count = db.Column(db.Integer, nullable=False, default=0)
    sample_run_ids = db.Column(db.Text, nullable=True)  # JSON list
    first_seen_at = db.Column(db.DateTime, nullable=False)
    last_seen_at = db.Column(db.DateTime, nullable=False)

    # Status
    status = db.Column(db.String(20), nullable=False, default="pending")
    # pending | accepted | dismissed | expired
    resolved_by_user_id = db.Column(db.Integer, db.ForeignKey("users.id"), nullable=True)
    resolved_at = db.Column(db.DateTime, nullable=True)
    created_override_id = db.Column(db.Integer, db.ForeignKey("overrides.id"), nullable=True)

    created_at = db.Column(db.DateTime, nullable=False)
    updated_at = db.Column(db.DateTime, nullable=False)
```

### Pattern detection logic

**Trigger:** After every `mark-reviewed` action (or as a periodic task, e.g. every hour).

**Algorithm:**

```
1. Query all JobRuns that were reviewed in the last 30 days
   WITHOUT a linked ticket
   AND WITHOUT an active override

2. Group by:
   - (backup_software, backup_type, status, normalized_error) → global candidate
   - (job_id, status, normalized_error) → object candidate

3. For each group with count >= THRESHOLD (configurable, default 3):
   - Compute fingerprint: sha256(level + scope fields + normalized_error)
   - Check whether an OverrideSuggestion with this fingerprint already exists
     → Yes: update match_count and last_seen_at
     → No: create a new suggestion

4. Suggestions older than 90 days without new matches: status → expired
```
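
Steps 2 and 3 of the algorithm might look roughly like this in plain Python. It is a sketch under assumptions: runs are represented as dicts with a precomputed `normalized_error`, the helper names `fingerprint` and `find_candidates` are hypothetical, and `status` is treated as one of the scope fields that feed the hash.

```python
import hashlib
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple


def fingerprint(level: str, scope: Tuple[str, ...], normalized_error: str) -> str:
    """sha256 over level + scope fields + normalized error (64 hex chars)."""
    raw = "|".join((level, *scope, normalized_error))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def find_candidates(runs: Iterable[Mapping], threshold: int = 3) -> Dict[str, int]:
    """Group reviewed runs and return {fingerprint: match_count} for groups at or above the threshold."""
    counts: Counter = Counter()
    for run in runs:
        err = run["normalized_error"]
        # Global candidate: same software/type/status/error across jobs.
        counts[("global", (run["backup_software"], run["backup_type"], run["status"]), err)] += 1
        # Object candidate: same job/status/error.
        counts[("object", (str(run["job_id"]), run["status"]), err)] += 1
    return {
        fingerprint(level, scope, err): n
        for (level, scope, err), n in counts.items()
        if n >= threshold
    }
```

The returned counts would then drive the "update or create" decision for `OverrideSuggestion` rows; that database step is left out here.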

**Error normalization** (for grouping):

```python
import re


def normalize_error_for_grouping(error: str) -> str:
    """Strip variable parts from error messages so that the same error
    with different details is recognized as a single pattern."""
    s = (error or "").strip()
    # Replace timestamps (various formats)
    s = re.sub(r'\d{4}[-/]\d{2}[-/]\d{2}[T ]\d{2}:\d{2}(:\d{2})?', '<TIMESTAMP>', s)
    # Replace GUIDs
    s = re.sub(r'[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}', '<GUID>', s)
    # Replace IP addresses
    s = re.sub(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b', '<IP>', s)
    # Replace paths (Windows and Unix)
    s = re.sub(r'[A-Z]:\\[\w\\.-]+', '<PATH>', s)
    s = re.sub(r'/[\w/.-]+', '<PATH>', s)
    # Replace large numbers (e.g. bytes, sizes)
    s = re.sub(r'\b\d{4,}\b', '<NUM>', s)
    # Normalize whitespace
    s = re.sub(r'\s+', ' ', s).strip()
    return s
```
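
To see what the normalization does in practice, the same rules can be applied to two made-up error messages that differ only in their variable parts. The sample messages and the compact `normalize` helper below are illustrative assumptions; the substitution rules are copied from the function above, in the same order.

```python
import re

# Same substitutions as normalize_error_for_grouping, in the same order.
_RULES = [
    (r'\d{4}[-/]\d{2}[-/]\d{2}[T ]\d{2}:\d{2}(:\d{2})?', '<TIMESTAMP>'),
    (r'[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}', '<GUID>'),
    (r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b', '<IP>'),
    (r'[A-Z]:\\[\w\\.-]+', '<PATH>'),
    (r'/[\w/.-]+', '<PATH>'),
    (r'\b\d{4,}\b', '<NUM>'),
]


def normalize(error: str) -> str:
    s = (error or "").strip()
    for pattern, token in _RULES:
        s = re.sub(pattern, token, s)
    return re.sub(r'\s+', ' ', s).strip()


# Two hypothetical messages that differ only in timestamp, IP and path.
a = normalize("2026-04-01 02:15:07 Failed to reach 10.0.0.12: snapshot of D:\\Data\\vm1.vhdx timed out")
b = normalize("2026-04-02 02:14:55 Failed to reach 10.0.0.37: snapshot of E:\\Backup\\vm9.vhdx timed out")
print(a)       # <TIMESTAMP> Failed to reach <IP>: snapshot of <PATH> timed out
print(a == b)  # True
```

One side effect worth knowing about: the Unix path rule also fires on ordinary slashes, so `normalize("read/write error")` yields `read<PATH> error`. That is harmless for grouping as long as it happens consistently, but it makes the normalized text harder to read.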

### UI: showing suggestions

**Option A: Dashboard badge.** A badge next to "Override Suggestions" in the sidebar or on the dashboard (like inbox_count). The operator clicks through to a list.

**Option B: Inline in Run Checks.** When an operator opens a run that matches an existing pattern, show a banner: *"This pattern has already been reviewed manually 5×. Do you want to create an override?"* with the buttons "Yes, create override" and "Ignore".

Recommendation: **both**. The dashboard badge gives an overview; the inline banner makes the suggestion actionable at the moment it is relevant.

**Suggestions page** (`/override-suggestions`):

- List of pending suggestions, sorted by match_count (highest first)
- Per suggestion: pattern description, number of matches, sample runs (clickable), proposed override config
- Actions: "Accept" (creates an Override with the proposed config), "Ignore" (dismissed), "Adjust" (opens the override creation form with pre-filled fields)

### Configuration

New fields on `SystemSettings`:

```python
# Override Suggestions
suggestion_enabled = db.Column(db.Boolean, nullable=False, default=True)
suggestion_threshold = db.Column(db.Integer, nullable=False, default=3)
suggestion_lookback_days = db.Column(db.Integer, nullable=False, default=30)
suggestion_expiry_days = db.Column(db.Integer, nullable=False, default=90)
```

---

## Phase 3: Error fingerprint knowledge base

### Problem

Phase 2 works per installation. If customer A has the same Veeam VSS error as customer B, the operator has to handle it separately for each.

### Solution

Build a knowledge base of normalized error fingerprints with their classification. New runs are matched against the knowledge base automatically.

### New model: `ErrorPattern`

```python
class ErrorPattern(db.Model):
    __tablename__ = "error_patterns"

    id = db.Column(db.Integer, primary_key=True)

    # Normalized fingerprint (output of normalize_error_for_grouping + sha256)
    fingerprint = db.Column(db.String(64), unique=True, nullable=False, index=True)

    # Human-readable example of the original error
    example_error = db.Column(db.Text, nullable=False)

    # Normalized version (for display)
    normalized_error = db.Column(db.Text, nullable=False)

    # Classification
    classification = db.Column(db.String(20), nullable=False, default="unknown")
    # benign | actionable | critical | unknown

    # Automatic action
    auto_action = db.Column(db.String(20), nullable=False, default="none")
    # none | treat_as_success | needs_review | escalate

    # Scope restriction (optional: only for specific software/type)
    scope_backup_software = db.Column(db.String(255), nullable=True)
    scope_backup_type = db.Column(db.String(255), nullable=True)

    # Origin
    learned_from = db.Column(db.String(32), nullable=False, default="manual")
    # manual | override | suggestion
    source_override_id = db.Column(db.Integer, nullable=True)

    # Statistics
    times_matched = db.Column(db.Integer, nullable=False, default=0)
    times_confirmed = db.Column(db.Integer, nullable=False, default=0)  # times the operator approved
    times_rejected = db.Column(db.Integer, nullable=False, default=0)  # times the operator did NOT approve
    confidence = db.Column(db.Float, nullable=False, default=0.0)

    # Management
    active = db.Column(db.Boolean, nullable=False, default=True)
    created_by = db.Column(db.String(255), nullable=True)
    notes = db.Column(db.Text, nullable=True)

    created_at = db.Column(db.DateTime, nullable=False)
    updated_at = db.Column(db.DateTime, nullable=False)
```

### Learning automatically from overrides

When an Override is created (via the Phase 1 dialog, a Phase 2 suggestion, or manually):

```python
import hashlib
from datetime import datetime


def learn_from_override(override: Override):
    """Extract an ErrorPattern from a new override."""
    if not override.match_error_contains:
        return  # No error text to learn from

    normalized = normalize_error_for_grouping(override.match_error_contains)
    fp = hashlib.sha256(normalized.encode()).hexdigest()[:16]

    existing = ErrorPattern.query.filter_by(fingerprint=fp).first()
    if existing:
        existing.times_confirmed += 1
        existing.confidence = min(1.0, existing.times_confirmed / (existing.times_confirmed + existing.times_rejected + 1))
        existing.updated_at = datetime.utcnow()
        return

    pattern = ErrorPattern(
        fingerprint=fp,
        example_error=override.match_error_contains,
        normalized_error=normalized,
        classification="benign",
        auto_action="treat_as_success" if override.treat_as_success else "none",
        scope_backup_software=override.backup_software,
        scope_backup_type=override.backup_type,
        learned_from="override",
        source_override_id=override.id,
        times_matched=0,
        times_confirmed=1,
        confidence=0.5,  # starting value at the first confirmation
        active=True,
        created_by=override.created_by,
        created_at=datetime.utcnow(),
        updated_at=datetime.utcnow(),
    )
    db.session.add(pattern)
```

### Integration into run interpretation

The fingerprint check becomes an extra layer in `_apply_overrides_to_run()`, after the existing override evaluation:

```python
# After the existing override checks (around lines 502-516 in routes_shared.py):

# Phase 3: check the error pattern knowledge base
if not override_applied:
    error_text = _get_run_error_text(run, run_object_rows)
    if error_text:
        normalized = normalize_error_for_grouping(error_text)
        fp = hashlib.sha256(normalized.encode()).hexdigest()[:16]
        pattern = ErrorPattern.query.filter_by(
            fingerprint=fp, active=True
        ).first()

        if pattern and pattern.confidence >= CONFIDENCE_THRESHOLD:
            # Check scope restriction
            if pattern.scope_backup_software and pattern.scope_backup_software.lower() != (job.backup_software or "").lower():
                pattern = None
            if pattern and pattern.scope_backup_type and pattern.scope_backup_type.lower() != (job.backup_type or "").lower():
                pattern = None

            if pattern:
                pattern.times_matched += 1
                if pattern.auto_action == "treat_as_success":
                    return "Success (auto)", True, "pattern", pattern.id, f"ErrorPattern id={pattern.id}"
```
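
The two scope checks in the snippet above could be pulled out into a small pure function, which makes the rule ("an empty scope matches everything, otherwise compare case-insensitively") easy to test in isolation. The name `pattern_in_scope` is hypothetical and not part of the existing code.

```python
from typing import Optional


def pattern_in_scope(
    scope_software: Optional[str],
    scope_type: Optional[str],
    job_software: Optional[str],
    job_type: Optional[str],
) -> bool:
    """Return True when a pattern's optional scope restriction allows this job."""

    def matches(scope: Optional[str], actual: Optional[str]) -> bool:
        # No restriction set: always in scope. Otherwise case-insensitive equality.
        return not scope or scope.lower() == (actual or "").lower()

    return matches(scope_software, job_software) and matches(scope_type, job_type)
```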

**`CONFIDENCE_THRESHOLD`** (configurable, default 0.7): prevents patterns with little evidence from being applied automatically. A pattern starts at 0.5 and rises with each confirmation.
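
A quick way to check how the threshold interacts with the confidence formula used in this design, `confirmed / (confirmed + rejected + 1)`, is to compute how many confirmations a pattern needs before it crosses a given threshold. The helper names below are illustrative only.

```python
def confidence(confirmed: int, rejected: int) -> float:
    """Confidence formula as used in this design."""
    return confirmed / (confirmed + rejected + 1)


def confirmations_needed(threshold: float, rejected: int = 0) -> int:
    """Smallest number of confirmations at which confidence >= threshold."""
    if not 0 <= threshold < 1:
        # The formula never reaches 1.0, so such a threshold is unreachable.
        raise ValueError("threshold must be in [0, 1)")
    confirmed = 1
    while confidence(confirmed, rejected) < threshold:
        confirmed += 1
    return confirmed


print(confidence(1, 0))           # 0.5
print(confirmations_needed(0.7))  # 3
```

With the default of 0.7 and no rejections, a freshly learned pattern (one confirmation, confidence 0.5) stays below the threshold until its third confirmation (3 / 4 = 0.75).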

### Feedback loop: updating confidence

When an operator reviews a run that an ErrorPattern marked as success:

- **Reviewed without correction** → `pattern.times_confirmed += 1`
- **Operator creates a ticket after all** → `pattern.times_rejected += 1`
- Recalculate: `confidence = confirmed / (confirmed + rejected + 1)`
- When confidence < 0.3: automatically set `active = False`
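
The feedback rules could be sketched as a single update function. The `PatternStats` dataclass is a stand-in for the relevant `ErrorPattern` columns and `record_feedback` is a hypothetical name; persistence is left out.

```python
from dataclasses import dataclass

DEACTIVATE_BELOW = 0.3  # deactivation cut-off from the rules above


@dataclass
class PatternStats:
    """Stand-in for the ErrorPattern statistics columns."""
    times_confirmed: int = 1
    times_rejected: int = 0
    confidence: float = 0.5
    active: bool = True


def record_feedback(pattern: PatternStats, confirmed: bool) -> PatternStats:
    """Apply one review outcome and recompute confidence."""
    if confirmed:
        pattern.times_confirmed += 1
    else:
        pattern.times_rejected += 1
    pattern.confidence = pattern.times_confirmed / (
        pattern.times_confirmed + pattern.times_rejected + 1
    )
    if pattern.confidence < DEACTIVATE_BELOW:
        # Too many rejections: stop applying this pattern automatically.
        pattern.active = False
    return pattern
```

For example, a pattern with two confirmations stays active through three rejections (2 / 6 ≈ 0.33) and is deactivated on the fourth (2 / 7 ≈ 0.29).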

### UI: knowledge base page

**`/error-patterns`** (admin/operator):

- Table with all patterns, sortable by confidence, times_matched, classification
- Per pattern: normalized error, example, classification, auto_action, confidence bar, match statistics
- Inline edit for classification and auto_action
- Bulk actions: activate/deactivate
- Filter by software/type

In the run-detail modal: if an ErrorPattern was applied, show a small label "Auto-classified: benign (confidence 85%)" with a link to the pattern.

---

## Implementation order and dependencies

```
Phase 1 (immediate value, ~2-3 days of work)
├── Backend: add scope + duration parameters to mark-success-override
├── Frontend: follow-up dialog in run_checks.html modal
├── Audit logging for broader overrides
└── No new models or migrations needed

Phase 2 (pattern recognition, ~1-2 weeks of work)
├── Migration: override_suggestions table
├── Migration: SystemSettings fields for configuration
├── Backend: pattern detection logic + normalize_error_for_grouping()
├── Backend: trigger after mark-reviewed OR periodic task
├── Frontend: suggestions page (/override-suggestions)
├── Frontend: dashboard badge + inline banner in Run Checks
└── Depends on: Phase 1 (optional, but yields better data)

Phase 3 (knowledge base, ~2-3 weeks of work)
├── Migration: error_patterns table
├── Backend: learn_from_override() hook
├── Backend: fingerprint check in _apply_overrides_to_run()
├── Backend: confidence feedback loop on review actions
├── Frontend: /error-patterns management page
├── Frontend: "Auto-classified" label in run-detail modal
└── Depends on: Phase 2 (for normalize_error_for_grouping())
```

---

## Risks and mitigation

**False positives (pattern matches incorrectly)**
Mitigation: confidence threshold, scope restrictions, and the ability to deactivate patterns. Phase 3 patterns with `auto_action=treat_as_success` are clearly marked as "Auto" in the UI so operators can recognize and correct them.

**Error normalization too aggressive**
Mitigation: start conservatively (only strip timestamps, GUIDs, and IPs). Extend the normalization gradually based on experience. The knowledge base page shows both the normalized and the original text so operators can judge whether the normalization is correct.

**Performance with large datasets**
Mitigation: `pattern_fingerprint` and `fingerprint` are indexed. The pattern detection query in Phase 2 runs periodically (not in real time). The fingerprint lookup in Phase 3 is a simple indexed lookup per run.

**Operator trust**
Mitigation: the system suggests and labels, but the operator always has the final say. "Success (auto)" is visually distinguishable from "Success" and "Success (override)". Every automatic action is traceable to its source pattern.

@ -3,79 +3,54 @@ set -euo pipefail

# ============================================================================
# build-and-push.sh
# Location: repo root (e.g. /docker/develop/backup-monitoring)
#
# Purpose:
# - Automatic version bump:
# 1 = patch, 2 = minor, 3 = major, t = test
# - Test builds: only update :dev (no commit/tag)
# - Release builds: update version.txt, commit, tag, push (to the current branch)
# - Build & push Docker images for each service under ./compose/*
# - Preflight checks: Docker daemon up, logged in to registry, valid names/tags
# - Summary: show all images + tags built and pushed
# - Branch visibility:
# - Shows currently checked out branch (authoritative)
# - Reads .last-branch for info (if present) when BRANCH is not set
# - Writes the current branch back to .last-branch at the end
# - Build & push Docker images for each service under ./containers/*
# - Two modes:
# t (test) = only push :dev
# r (release) = push :<version>, :dev, :latest
# version is read from the top of changelog.md
#
# No git operations: committing and tagging is done manually.
#
# Usage:
# BRANCH=<branch> ./build-and-push.sh [bump] # BRANCH is optional; informative only
# ./build-and-push.sh [bump]
# If [bump] is omitted, you will be prompted (default = t).
# ./build-and-push.sh [mode]
# - mode = t -> test build, push :dev only
# - mode = r -> release build, version taken from changelog.md
# - omitted -> prompt (default: t)
#
# Tagging rules:
# - Release build (1/2/3): push :<version>, :dev, :latest
# - Test build (t): push only :dev (no :latest, no version tag)
# Requirements:
# - docs/changelog.md (relative to repo root), with the most recent release
# at the top as:
# ## vX.Y.Z — YYYY-MM-DD
# (the version is parsed from the first such line)
# - One Dockerfile per service under ./containers/<service>/Dockerfile
# ============================================================================

DOCKER_REGISTRY="gitea.oskamp.info"
DOCKER_NAMESPACE="ivooskamp"

VERSION_FILE="version.txt"
START_VERSION="v0.1.0"
COMPOSE_DIR="containers"
LAST_BRANCH_FILE=".last-branch" # stored in repo root
CHANGELOG_FILE="docs/changelog.md"
CONTAINERS_DIR="containers"

# --- Input: prompt if missing ------------------------------------------------
BUMP="${1:-}"
if [[ -z "${BUMP}" ]]; then
echo "Select bump type: [1] patch, [2] minor, [3] major, [t] test (default: t)"
read -r BUMP
BUMP="${BUMP:-t}"
MODE="${1:-}"
if [[ -z "${MODE}" ]]; then
echo "Select build type: [t] test build (push :dev only), [r] release build (default: t)"
read -r MODE
MODE="${MODE:-t}"
fi

if [[ "$BUMP" != "1" && "$BUMP" != "2" && "$BUMP" != "3" && "$BUMP" != "t" ]]; then
echo "[ERROR] Unknown bump type '$BUMP' (use 1, 2, 3, or t)."
case "$MODE" in
t|test) MODE="t" ;;
r|release) MODE="r" ;;
*)
echo "[ERROR] Unknown mode '$MODE' (use 't' for test or 'r' for release)."
exit 1
fi
;;
esac

# --- Helpers -----------------------------------------------------------------
read_version() {
if [[ -f "$VERSION_FILE" ]]; then
tr -d ' \t\n\r' < "$VERSION_FILE"
else
echo "$START_VERSION"
fi
}

write_version() {
echo "$1" > "$VERSION_FILE"
}

bump_version() {
local cur="$1"
local kind="$2"
local core="${cur#v}"
IFS='.' read -r MA MI PA <<< "$core"
case "$kind" in
1) PA=$((PA + 1));;
2) MI=$((MI + 1)); PA=0;;
3) MA=$((MA + 1)); MI=0; PA=0;;
*) echo "[ERROR] Unknown bump kind"; exit 1;;
esac
echo "v${MA}.${MI}.${PA}"
}

check_docker_ready() {
if ! docker info >/dev/null 2>&1; then
echo "[ERROR] Docker daemon not reachable. Is Docker running and do you have permission to use it?"
@ -117,14 +92,35 @@ validate_tag() {
fi
}

# --- Preflight ---------------------------------------------------------------
if [[ ! -d ".git" ]]; then
echo "[ERROR] Not a git repository (.git missing)."
# Parse the first "## vX.Y.Z ..." heading from changelog.md.
# Accepts: ## v1.0.3 — 2026-04-24
# ## v1.0.3 - 2026-04-24
# ## v1.0.3
read_version_from_changelog() {
if [[ ! -f "$CHANGELOG_FILE" ]]; then
echo "[ERROR] $CHANGELOG_FILE not found in $(pwd)." >&2
exit 1
fi
fi
local line
# Match lines starting with "## v<digits>.<digits>.<digits>"
line="$(grep -m1 -E '^##[[:space:]]+v[0-9]+\.[0-9]+\.[0-9]+' "$CHANGELOG_FILE" || true)"
if [[ -z "$line" ]]; then
echo "[ERROR] No release heading found in $CHANGELOG_FILE (expected e.g. '## v1.0.3 — 2026-04-24' near the top)." >&2
exit 1
fi
# Extract the vX.Y.Z token
local version
version="$(echo "$line" | grep -oE 'v[0-9]+\.[0-9]+\.[0-9]+' | head -n1)"
if [[ -z "$version" ]]; then
echo "[ERROR] Could not parse version from line: $line" >&2
exit 1
fi
echo "$version"
}

if [[ ! -d "$COMPOSE_DIR" ]]; then
echo "[ERROR] '$COMPOSE_DIR' directory missing. Expected ./compose/<service>/ with a Dockerfile."
# --- Preflight ---------------------------------------------------------------
if [[ ! -d "$CONTAINERS_DIR" ]]; then
echo "[ERROR] '$CONTAINERS_DIR' directory missing. Expected ./${CONTAINERS_DIR}/<service>/ with a Dockerfile."
exit 1
fi

@ -132,91 +128,44 @@ check_docker_ready
ensure_registry_login
validate_repo_component "$DOCKER_NAMESPACE"

# Detect currently checked out branch (authoritative for this script)
DETECTED_BRANCH="$(git branch --show-current 2>/dev/null || true)"
if [[ -z "$DETECTED_BRANCH" ]]; then
DETECTED_BRANCH="$(git symbolic-ref --quiet --short HEAD 2>/dev/null || true)"
fi
if [[ -z "$DETECTED_BRANCH" ]]; then
# Try to derive from upstream
UPSTREAM_REF_DERIVED="$(git rev-parse --abbrev-ref --symbolic-full-name @{u} 2>/dev/null || true)"
if [[ -n "$UPSTREAM_REF_DERIVED" ]]; then
DETECTED_BRANCH="${UPSTREAM_REF_DERIVED#origin/}"
fi
fi
if [[ -z "$DETECTED_BRANCH" ]]; then
DETECTED_BRANCH="main"
fi

# Optional signals: BRANCH env and .last-branch (informational only)
ENV_BRANCH="${BRANCH:-}"
LAST_BRANCH_FILE_PATH="$(pwd)/$LAST_BRANCH_FILE"
LAST_BRANCH_VALUE=""
if [[ -z "$ENV_BRANCH" && -f "$LAST_BRANCH_FILE_PATH" ]]; then
LAST_BRANCH_VALUE="$(tr -d ' \t\n\r' < "$LAST_BRANCH_FILE_PATH")"
fi

UPSTREAM_REF="$(git rev-parse --abbrev-ref --symbolic-full-name @{u} 2>/dev/null || echo "origin/$DETECTED_BRANCH")"
HEAD_SHA="$(git rev-parse --short HEAD 2>/dev/null || echo "unknown")"

echo "[INFO] Repo: $(pwd)"
echo "[INFO] Current branch: $DETECTED_BRANCH"
echo "[INFO] Upstream: $UPSTREAM_REF"
echo "[INFO] HEAD (sha): $HEAD_SHA"

if [[ -n "$ENV_BRANCH" && "$ENV_BRANCH" != "$DETECTED_BRANCH" ]]; then
echo "[WARNING] BRANCH='$ENV_BRANCH' differs from checked out branch '$DETECTED_BRANCH'."
echo "[WARNING] This script does not switch branches; continuing on '$DETECTED_BRANCH'."
fi

if [[ -n "$LAST_BRANCH_VALUE" && "$LAST_BRANCH_VALUE" != "$DETECTED_BRANCH" && -z "$ENV_BRANCH" ]]; then
echo "[INFO] .last-branch suggests '$LAST_BRANCH_VALUE', but current checkout is '$DETECTED_BRANCH'."
echo "[INFO] If you intended to build '$LAST_BRANCH_VALUE', switch branches first (use update-and-build.sh)."
fi

# --- Versioning --------------------------------------------------------------
CURRENT_VERSION="$(read_version)"
NEW_VERSION="$CURRENT_VERSION"
DO_TAG_AND_BUMP=true

if [[ "$BUMP" == "t" ]]; then
echo "[INFO] Test build: keeping version $CURRENT_VERSION; will only update :dev."
DO_TAG_AND_BUMP=false
# Informational: show branch and HEAD if this happens to be a git repo.
BRANCH_INFO=""
HEAD_INFO=""
if [[ -d ".git" ]]; then
BRANCH_INFO="$(git branch --show-current 2>/dev/null || echo unknown)"
HEAD_INFO="$(git rev-parse --short HEAD 2>/dev/null || echo unknown)"
echo "[INFO] Repo: $(pwd)"
echo "[INFO] Current branch: $BRANCH_INFO"
echo "[INFO] HEAD (sha): $HEAD_INFO"
else
NEW_VERSION="$(bump_version "$CURRENT_VERSION" "$BUMP")"
echo "[INFO] New version: $NEW_VERSION"
echo "[INFO] Repo: $(pwd) (not a git checkout)"
fi

if $DO_TAG_AND_BUMP; then
validate_tag "$NEW_VERSION"
# --- Determine version (release only) ----------------------------------------
VERSION=""
if [[ "$MODE" == "r" ]]; then
VERSION="$(read_version_from_changelog)"
echo "[INFO] Release version (from $CHANGELOG_FILE): $VERSION"
validate_tag "$VERSION"
validate_tag "latest"

# Ask for confirmation so you never accidentally re-push an old version or a wrong one.
read -r -p "Proceed building & pushing as ${VERSION}? [y/N] " CONFIRM
CONFIRM="${CONFIRM:-N}"
if [[ ! "$CONFIRM" =~ ^[Yy]$ ]]; then
echo "[INFO] Aborted by user."
exit 0
fi
else
echo "[INFO] Test build: only :dev will be pushed."
fi
validate_tag "dev"

# --- Version update + VCS ops (release builds only) --------------------------
if $DO_TAG_AND_BUMP; then
echo "[INFO] Writing $NEW_VERSION to $VERSION_FILE"
write_version "$NEW_VERSION"

echo "[INFO] Git add + commit (branch: $DETECTED_BRANCH)"
git add "$VERSION_FILE"
git commit -m "Release $NEW_VERSION on branch $DETECTED_BRANCH (bump type $BUMP)"

echo "[INFO] Git tag $NEW_VERSION"
git tag -a "$NEW_VERSION" -m "Release $NEW_VERSION"

echo "[INFO] Git push + tags"
git push origin "$DETECTED_BRANCH"
git push --tags
else
echo "[INFO] Skipping commit/tagging (test build)."
fi

# --- Build & push per service ------------------------------------------------
shopt -s nullglob
services=( "$COMPOSE_DIR"/* )
services=( "$CONTAINERS_DIR"/* )
if [[ ${#services[@]} -eq 0 ]]; then
echo "[ERROR] No services found under $COMPOSE_DIR"
echo "[ERROR] No services found under $CONTAINERS_DIR"
exit 1
fi

@ -236,21 +185,21 @@ for svc_path in "${services[@]}"; do
|
||||
|
||||
IMAGE_BASE="${DOCKER_REGISTRY}/${DOCKER_NAMESPACE}/${svc}"
|
||||
|
||||
if $DO_TAG_AND_BUMP; then
|
||||
if [[ "$MODE" == "r" ]]; then
|
||||
echo "============================================================"
|
||||
echo "[INFO] Building ${svc} -> tags: ${NEW_VERSION}, dev, latest"
|
||||
echo "[INFO] Building ${svc} -> tags: ${VERSION}, dev, latest"
|
||||
echo "============================================================"
|
||||
docker build \
|
||||
-t "${IMAGE_BASE}:${NEW_VERSION}" \
|
||||
-t "${IMAGE_BASE}:${VERSION}" \
|
||||
-t "${IMAGE_BASE}:dev" \
|
||||
-t "${IMAGE_BASE}:latest" \
|
||||
"$svc_path"
|
||||
|
||||
docker push "${IMAGE_BASE}:${NEW_VERSION}"
|
||||
docker push "${IMAGE_BASE}:${VERSION}"
|
||||
docker push "${IMAGE_BASE}:dev"
|
||||
docker push "${IMAGE_BASE}:latest"
|
||||
|
||||
BUILT_IMAGES+=("${IMAGE_BASE}:${NEW_VERSION}" "${IMAGE_BASE}:dev" "${IMAGE_BASE}:latest")
|
||||
BUILT_IMAGES+=("${IMAGE_BASE}:${VERSION}" "${IMAGE_BASE}:dev" "${IMAGE_BASE}:latest")
|
||||
else
|
||||
echo "============================================================"
|
||||
echo "[INFO] Test build ${svc} -> tag: dev"
|
||||
@ -261,21 +210,27 @@ for svc_path in "${services[@]}"; do
|
||||
fi
|
||||
done
|
||||
|
||||
# --- Persist current branch to .last-branch ----------------------------------
|
||||
# (This helps script 1 to preselect next time, and is informative if you run script 2 standalone)
|
||||
echo "$DETECTED_BRANCH" > "$LAST_BRANCH_FILE_PATH"
|
||||
|
||||
# --- Summary -----------------------------------------------------------------
|
||||
echo ""
|
||||
echo "============================================================"
|
||||
echo "[SUMMARY] Build & push complete (branch: $DETECTED_BRANCH)"
|
||||
if $DO_TAG_AND_BUMP; then
|
||||
echo "[INFO] Release version: $NEW_VERSION"
|
||||
if [[ "$MODE" == "r" ]]; then
|
||||
echo "[SUMMARY] Release build & push complete: $VERSION"
|
||||
else
|
||||
echo "[INFO] Test build (no version bump)"
|
||||
echo "[SUMMARY] Test build & push complete (:dev only)"
|
||||
fi
|
||||
if [[ -n "$BRANCH_INFO" ]]; then
|
||||
echo "[INFO] Branch: $BRANCH_INFO HEAD: $HEAD_INFO"
|
||||
fi
|
||||
echo "[INFO] Images pushed:"
|
||||
for img in "${BUILT_IMAGES[@]}"; do
|
||||
echo " - $img"
|
||||
done
|
||||
echo "============================================================"
|
||||
echo ""
|
||||
echo "[REMINDER] No git operations were performed. If this was a release,"
|
||||
echo " commit and tag manually, e.g.:"
|
||||
if [[ "$MODE" == "r" ]]; then
|
||||
echo " git add -A && git commit -m \"Release ${VERSION}\""
|
||||
echo " git tag -a ${VERSION} -m \"Release ${VERSION}\""
|
||||
echo " git push && git push --tags"
|
||||
fi
|
||||
|
||||
@ -541,6 +541,23 @@ def run_cove_import(settings, include_reasons: bool = False):
|
||||
break
|
||||
start += page_size
|
||||
|
||||
# Cove workstation offline detection (colorbar-based) — toggle in settings.
|
||||
# Runs once per import cycle so the synthetic offline runs stay in sync with
|
||||
# the freshly upserted colorbar data above.
|
||||
try:
|
||||
offline_changes = _apply_offline_detection_for_workstations(settings)
|
||||
if offline_changes:
|
||||
logger.info(
|
||||
"Cove offline detection: updated %d workstation job(s)",
|
||||
offline_changes,
|
||||
)
|
||||
except Exception as exc:
|
||||
logger.warning("Cove offline detection failed: %s", exc)
|
||||
try:
|
||||
db.session.rollback()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Update last import timestamp
|
||||
settings.cove_last_import_at = datetime.utcnow()
|
||||
try:
|
||||
@ -553,6 +570,165 @@ def run_cove_import(settings, include_reasons: bool = False):
|
||||
return total, created, skipped, errors
|
||||
|
||||
|
def _parse_colorbar_codes(colorbar: str) -> list[int | None]:
    """Parse a colorbar string into a list of status codes (oldest first).

    Non-numeric entries become None so the caller can ignore them when counting
    streaks.
    """
    stripped = (colorbar or "").strip()
    if not stripped:
        return []
    if " " in stripped or "," in stripped:
        raw = re.split(r"[,\s]+", stripped)
    else:
        raw = list(stripped)
    out: list[int | None] = []
    for c in raw:
        try:
            out.append(int(str(c).strip()))
        except (ValueError, TypeError):
            out.append(None)
    return out
||||
|
||||
|
def _trailing_inactive_streak(codes: list[int | None]) -> int:
    """Count consecutive trailing days where the colorbar code == 0 (no backup).

    None entries (unparseable codes) are skipped. Counting stops at the first
    non-zero code (a real status, success or not).
    """
    streak = 0
    for code in reversed(codes):
        if code is None:
            # Unparseable entry: ignore it instead of ending the streak,
            # as both docstrings describe.
            continue
        if code == 0:
            streak += 1
        else:
            break
    return streak
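The colorbar helpers feed a simple threshold decision. The following is a minimal, self-contained sketch of that pipeline, not the commit's code: `parse_codes`, `trailing_zero_streak` and `classify_offline` are hypothetical names, the sketch assumes that code `0` means "no backup that day", and it skips unparseable entries as the docstrings describe. The defaults of 7 and 14 days mirror the fallback values used for the settings.

```python
import re
from typing import List, Optional


def parse_codes(colorbar: str) -> List[Optional[int]]:
    """Simplified stand-in for the colorbar parser (oldest day first)."""
    stripped = (colorbar or "").strip()
    if not stripped:
        return []
    # Delimited input ("5, 0, 0") or one character per day ("500").
    if re.search(r"[,\s]", stripped):
        raw = re.split(r"[,\s]+", stripped)
    else:
        raw = list(stripped)
    codes: List[Optional[int]] = []
    for item in raw:
        try:
            codes.append(int(item))
        except ValueError:
            codes.append(None)  # unparseable day
    return codes


def trailing_zero_streak(codes: List[Optional[int]]) -> int:
    """Count trailing days with code 0, skipping None entries."""
    streak = 0
    for code in reversed(codes):
        if code is None:
            continue
        if code != 0:
            break
        streak += 1
    return streak


def classify_offline(colorbar: str, warn_days: int = 7, err_days: int = 14) -> Optional[str]:
    """Map the inactivity streak to "Error", "Warning", or None (no alert)."""
    streak = trailing_zero_streak(parse_codes(colorbar))
    if streak >= err_days:
        return "Error"
    if streak >= warn_days:
        return "Warning"
    return None
```

Checking the error threshold first means a long streak never reports as a warning, even if the warning threshold is configured higher than the error threshold.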
||||
|
||||
|
def _apply_offline_detection_for_workstations(settings) -> int:
    """For every linked Cove workstation job, ensure a synthetic offline JobRun
    exists when the colorbar shows extended inactivity. Returns the number of
    job runs created/updated/removed.

    No-op when settings.cove_offline_detection_enabled is False.
    """
    if not getattr(settings, "cove_offline_detection_enabled", False):
        return 0

    from .models import CoveAccount, Job, JobRun

    warn_days = max(1, int(getattr(settings, "cove_workstation_warning_days", 7) or 7))
    err_days = max(1, int(getattr(settings, "cove_workstation_error_days", 14) or 14))

    # Linked Cove workstation jobs only.
    rows = (
        db.session.query(CoveAccount, Job)
        .join(Job, Job.id == CoveAccount.job_id)
        .filter(CoveAccount.job_id.isnot(None))
        .filter(Job.archived.is_(False))
        .filter(Job.backup_software == "Cove Data Protection")
        .all()
    )

    changes = 0
    for cove_acc, job in rows:
        if (job.backup_type or "").strip().lower() != "workstation":
            continue
        try:
            if _ensure_cove_offline_run(cove_acc, job, warn_days, err_days):
                changes += 1
        except Exception as exc:
            logger.warning(
                "Cove offline detection: error on account %s: %s",
                getattr(cove_acc, "account_id", "?"),
                exc,
            )
            try:
                db.session.rollback()
            except Exception:
                pass

    if changes:
        try:
            db.session.commit()
        except Exception:
            db.session.rollback()
    return changes
||||
|
||||
|
def _ensure_cove_offline_run(cove_acc, job, warn_days: int, err_days: int) -> bool:
    """Create / update / remove the synthetic offline JobRun for one job.

    Uses a stable external_id `cove-offline-{account_id}` so the same row is
    reused across import cycles. Reviewed runs are left untouched so previously
    acknowledged alerts do not reappear or mutate.

    Returns True when a row was inserted, updated, or deleted.
    """
    from .models import JobRun

    codes = _parse_colorbar_codes(getattr(cove_acc, "colorbar_28d", None) or "")
    streak = _trailing_inactive_streak(codes)

    if streak >= err_days:
        target_status = "Error"
    elif streak >= warn_days:
        target_status = "Warning"
    else:
        target_status = None

    external_id = f"cove-offline-{cove_acc.account_id}"
    existing = (
        JobRun.query.filter_by(job_id=job.id, external_id=external_id).first()
    )

    if target_status is None:
        # Condition cleared — drop unreviewed offline run; leave reviewed history alone.
        if existing and existing.reviewed_at is None:
            db.session.delete(existing)
            return True
        return False

    remark = (
        f"Cove workstation inactive for {streak} day(s) — "
        f"no backup activity in colorbar. "
        f"Account: {cove_acc.account_name or cove_acc.account_id} | "
        f"Computer: {cove_acc.computer_name or '-'} | "
        f"Customer: {cove_acc.customer_name or '-'}"
    )
    now_utc = datetime.utcnow()

    if existing:
        # Do not mutate a run the user has already reviewed.
        if existing.reviewed_at is not None:
            return False
        changed = False
        if existing.status != target_status:
            existing.status = target_status
            changed = True
        if existing.remark != remark:
            existing.remark = remark
            changed = True
        if changed:
            existing.run_at = now_utc
        return changed

    run = JobRun(
        job_id=job.id,
        mail_message_id=None,
        run_at=now_utc,
        status=target_status,
        remark=remark,
        missed=False,
        override_applied=False,
        source_type="cove_api",
        external_id=external_id,
    )
    db.session.add(run)
    return True
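The create / update / remove logic for the synthetic offline run can be read as a small decision table. The sketch below restates it as a pure function so the branches are easy to check in isolation; `ExistingRun` and `decide_action` are hypothetical names introduced only for illustration, and no database access is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExistingRun:
    """Hypothetical stand-in for the synthetic offline JobRun row."""
    status: str
    remark: str
    reviewed: bool = False


def decide_action(existing: Optional[ExistingRun],
                  target_status: Optional[str],
                  remark: str) -> str:
    """Return "create", "update", "delete", or "noop" for one workstation."""
    if target_status is None:
        # Inactivity cleared: only unreviewed rows are removed.
        if existing is not None and not existing.reviewed:
            return "delete"
        return "noop"
    if existing is None:
        return "create"
    if existing.reviewed:
        # Acknowledged alerts are never mutated.
        return "noop"
    if existing.status != target_status or existing.remark != remark:
        return "update"
    return "noop"
```

Because the row is looked up by a stable identifier, a Warning that later crosses the error threshold becomes an "update" of the same row rather than a second alert.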
||||
|
||||
|
||||
def _process_account(account: dict) -> str:
|
||||
"""Upsert a Cove account into the staging table and create a JobRun if linked.
|
||||
|
||||
|
||||
@ -44,6 +44,10 @@ def inbox():
|
||||
# Use location column if available; otherwise just return all
|
||||
if hasattr(MailMessage, "location"):
|
||||
query = query.filter(MailMessage.location == "inbox")
|
||||
# Hide messages that belong to archived jobs; keep unlinked messages (job_id IS NULL).
|
||||
query = query.outerjoin(Job, Job.id == MailMessage.job_id).filter(
|
||||
(MailMessage.job_id.is_(None)) | (Job.archived.is_(False))
|
||||
)
|
||||
if q:
|
||||
for pat in _patterns(q):
|
||||
query = query.filter(
|
||||
|
||||
@ -1094,6 +1094,16 @@ def _ensure_missed_runs_for_job(job: Job, start_from: date, end_inclusive: date)
|
||||
if getattr(job, "backup_type", "").lower() in ("cloud connect backup", "cloud connect agent"):
|
||||
return 0
|
||||
|
||||
# Cove workstations are typically PCs that are powered off outside business hours.
|
||||
# Schedule-based missed-run detection produces noise for "device was off" days,
|
||||
# so skip entirely. Real Cove statuses (Failed/Warning/Not started) still surface
|
||||
# via the regular import flow. Servers and Microsoft 365 remain unaffected.
|
||||
if (
|
||||
(getattr(job, "backup_software", "") or "") == "Cove Data Protection"
|
||||
and (getattr(job, "backup_type", "") or "").lower() == "workstation"
|
||||
):
|
||||
return 0
|
||||
|
||||
tz = _get_ui_timezone()
|
||||
resolved_schedule = _get_effective_schedule_for_job(job)
|
||||
schedule_map = resolved_schedule.get("weekly_map") or {i: [] for i in range(7)}
|
||||
@ -1439,6 +1449,7 @@ def run_checks_page():
|
||||
.select_from(Job)
|
||||
.outerjoin(Customer, Customer.id == Job.customer_id)
|
||||
.filter(Job.archived.is_(False))
|
||||
.filter(db.or_(Customer.id.is_(None), Customer.active.is_(True)))
|
||||
)
|
||||
if q:
|
||||
for pat in _patterns(q):
|
||||
@ -2969,49 +2980,31 @@ def api_run_checks_unmark_reviewed():
|
||||
return jsonify({"status": "ok", "updated": updated, "skipped": skipped})
|
||||
|
||||
|
||||
@main_bp.post("/api/run-checks/mark-success-override")
|
||||
@login_required
|
||||
@roles_required("admin", "operator")
|
||||
def api_run_checks_mark_success_override():
|
||||
"""Create a time-bounded override so the selected run is treated as Success (override)."""
|
||||
data = request.get_json(silent=True) or {}
|
||||
def _get_run_error_text(run, obj_rows):
|
||||
"""Extract the most relevant error text from a run for override matching."""
|
||||
# First try: error messages from problem objects
|
||||
for rr in obj_rows or []:
|
||||
err = (rr.get("error_message") or "").strip()
|
||||
if err:
|
||||
return err[:255]
|
||||
# Second try: run remark
|
||||
remark = (getattr(run, "remark", None) or "").strip()
|
||||
if remark:
|
||||
return remark[:255]
|
||||
# Last resort: legacy objects
|
||||
try:
|
||||
run_id = int(data.get("run_id") or 0)
|
||||
objs = list(run.objects) if hasattr(run, "objects") else []
|
||||
except Exception:
|
||||
run_id = 0
|
||||
objs = []
|
||||
for obj in objs or []:
|
||||
em = (getattr(obj, "error_message", None) or "").strip()
|
||||
if em:
|
||||
return em[:255]
|
||||
return ""
|
||||
|
||||
if run_id <= 0:
|
||||
return jsonify({"status": "error", "message": "Invalid run_id."}), 400
|
||||
|
||||
run = JobRun.query.get_or_404(run_id)
|
||||
job = Job.query.get_or_404(run.job_id)
|
||||
|
||||
# Do not allow overriding a missed placeholder run.
|
||||
if bool(getattr(run, "missed", False)):
|
||||
return jsonify({"status": "error", "message": "Missed runs cannot be marked as success."}), 400
|
||||
|
||||
# If it is already a success or already overridden, do nothing.
|
||||
if bool(getattr(run, "override_applied", False)):
|
||||
return jsonify({"status": "ok", "message": "Already overridden."})
|
||||
|
||||
if _status_is_success(getattr(run, "status", None)):
|
||||
return jsonify({"status": "ok", "message": "Already successful."})
|
||||
|
||||
# Build a tight validity window around this run.
|
||||
run_ts = getattr(run, "run_at", None) or getattr(run, "created_at", None) or datetime.utcnow()
|
||||
start_at = run_ts - timedelta(minutes=1)
|
||||
end_at = run_ts + timedelta(minutes=1)
|
||||
|
||||
comment = (data.get("comment") or "").strip()
|
||||
if not comment:
|
||||
# Keep it short and consistent; Operators will typically include a ticket number separately.
|
||||
comment = "Marked as success from Run Checks"
|
||||
comment = comment[:2000]
|
||||
|
||||
created_any = False
|
||||
|
||||
# Prefer object-level overrides (scoped to this job) to avoid impacting other jobs.
|
||||
obj_rows = []
|
||||
def _fetch_problem_objects(run_id):
|
||||
"""Fetch run object rows and filter to problem objects only."""
|
||||
try:
|
||||
obj_rows = (
|
||||
db.session.execute(
|
||||
@ -3027,15 +3020,17 @@ def api_run_checks_mark_success_override():
|
||||
ORDER BY co.object_name ASC
|
||||
"""
|
||||
),
|
||||
{"run_id": run.id},
|
||||
{"run_id": run_id},
|
||||
)
|
||||
.mappings()
|
||||
.all()
|
||||
)
|
||||
except Exception:
|
||||
obj_rows = []
|
||||
return obj_rows
|
||||
|
||||
def _obj_is_problem(status: str | None) -> bool:
|
||||
|
||||
def _obj_is_problem(status):
|
||||
s = (status or "").strip().lower()
|
||||
if not s:
|
||||
return False
|
||||
@ -3045,12 +3040,89 @@ def api_run_checks_mark_success_override():
|
||||
return False
|
||||
return True
|
||||
|
||||
|
def _duration_to_end_at(duration, start_at):
    """Convert a duration string to an end_at datetime.

    Returns None for "permanent" and also for "once" or any unrecognized value;
    the caller is responsible for applying the ±1 minute window for "once".
    """
    if duration == "1w":
        return start_at + timedelta(weeks=1)
    elif duration == "1m":
        return start_at + timedelta(days=30)
    elif duration == "permanent":
        return None
    # "once" or default: ±1 minute window (handled by caller)
    return None
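Since `_duration_to_end_at` returns None for both "permanent" and "once", a caller has to tell the two apart itself. One way to make that explicit is to return the full validity window. This is a minimal sketch under stated assumptions, not the commit's code: `override_window` is a hypothetical helper, and it assumes that "once" (or any unknown value) should resolve to the ±1 minute window around the run timestamp while the other durations start at the current time.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def override_window(duration: str,
                    run_ts: datetime,
                    now: datetime) -> Tuple[datetime, Optional[datetime]]:
    """Return (start_at, end_at); end_at is None only for a permanent override."""
    if duration == "1w":
        return now, now + timedelta(weeks=1)
    if duration == "1m":
        # "1 month" is approximated as 30 days, as in the helper above.
        return now, now + timedelta(days=30)
    if duration == "permanent":
        return now, None
    # "once" or anything unrecognized: tight window around this one run.
    return run_ts - timedelta(minutes=1), run_ts + timedelta(minutes=1)
```

With this shape, an open-ended `end_at` can only come from an explicit "permanent" choice, never from a missing or default duration.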
||||
|
||||
|
||||
@main_bp.post("/api/run-checks/mark-success-override")
|
||||
@login_required
|
||||
@roles_required("admin", "operator")
|
||||
def api_run_checks_mark_success_override():
|
||||
"""Create a time-bounded override so the selected run is treated as Success (override).
|
||||
|
||||
Supports two flows:
|
||||
- scope="run" (default): creates a tight ±1 minute override for this specific run.
|
||||
Returns error_text and run_info so the frontend can offer a broader override.
|
||||
- scope="job": creates an object-level override on job_id + error text match.
|
||||
- scope="global": creates a global override on backup_software + backup_type + error text match.
|
||||
|
||||
Duration options: "once" (±1 min), "1w", "1m", "permanent".
|
||||
"""
|
||||
data = request.get_json(silent=True) or {}
|
||||
try:
|
||||
run_id = int(data.get("run_id") or 0)
|
||||
except Exception:
|
||||
run_id = 0
|
||||
|
||||
if run_id <= 0:
|
||||
return jsonify({"status": "error", "message": "Invalid run_id."}), 400
|
||||
|
||||
run = JobRun.query.get_or_404(run_id)
|
||||
job = Job.query.get_or_404(run.job_id)
|
||||
|
||||
scope = (data.get("scope") or "run").strip().lower()
|
||||
if scope not in ("run", "job", "global"):
|
||||
scope = "run"
|
||||
duration = (data.get("duration") or "once").strip().lower()
|
||||
if duration not in ("once", "1w", "1m", "permanent"):
|
||||
duration = "once"
|
||||
|
||||
# Do not allow overriding a missed placeholder run.
|
||||
if bool(getattr(run, "missed", False)):
|
||||
return jsonify({"status": "error", "message": "Missed runs cannot be marked as success."}), 400
|
||||
|
||||
# For scope=run: if already overridden or successful, do nothing.
|
||||
# For broader scopes (job/global): skip this check — the run-level override
|
||||
# was already applied, and this call creates an additional broader override.
|
||||
if scope == "run":
|
||||
if bool(getattr(run, "override_applied", False)):
|
||||
return jsonify({"status": "ok", "message": "Already overridden."})
|
||||
if _status_is_success(getattr(run, "status", None)):
|
||||
return jsonify({"status": "ok", "message": "Already successful."})
|
||||
|
||||
comment = (data.get("comment") or "").strip()
|
||||
if not comment:
|
||||
comment = "Marked as success from Run Checks"
|
||||
comment = comment[:2000]
|
||||
|
||||
obj_rows = _fetch_problem_objects(run.id)
|
||||
error_text = (data.get("error_text") or "").strip()
|
||||
if not error_text:
|
||||
error_text = _get_run_error_text(run, [rr for rr in obj_rows if _obj_is_problem(rr.get("status"))])
|
||||
|
||||
now = datetime.utcnow()
|
||||
run_ts = getattr(run, "run_at", None) or getattr(run, "created_at", None) or now
|
||||
|
||||
if scope == "run":
|
||||
# Original behaviour: tight ±1 minute window per problem object.
|
||||
start_at = run_ts - timedelta(minutes=1)
|
||||
end_at = run_ts + timedelta(minutes=1)
|
||||
created_any = False
|
||||
|
||||
for rr in obj_rows or []:
|
||||
obj_name = (rr.get("object_name") or "").strip()
|
||||
obj_status = (rr.get("status") or "").strip()
|
||||
if (not obj_name) or (not _obj_is_problem(obj_status)):
|
||||
continue
|
||||
|
||||
err = (rr.get("error_message") or "").strip()
|
||||
ov = Override(
|
||||
level="object",
|
||||
@ -3069,28 +3141,15 @@ def api_run_checks_mark_success_override():
|
||||
db.session.add(ov)
|
||||
created_any = True
|
||||
|
||||
# If we couldn't build a safe object-scoped override, fall back to a very tight global override.
|
||||
if not created_any:
|
||||
match_error_contains = (getattr(run, "remark", None) or "").strip()
|
||||
if not match_error_contains:
|
||||
# As a last resort, try to match any error message from legacy objects.
|
||||
try:
|
||||
objs = list(run.objects) if hasattr(run, "objects") else []
|
||||
except Exception:
|
||||
objs = []
|
||||
for obj in objs or []:
|
||||
em = (getattr(obj, "error_message", None) or "").strip()
|
||||
if em:
|
||||
match_error_contains = em
|
||||
break
|
||||
|
||||
match_err = error_text[:255] if error_text else None
|
||||
ov = Override(
|
||||
level="global",
|
||||
backup_software=job.backup_software or None,
|
||||
backup_type=job.backup_type or None,
|
||||
match_status=(getattr(run, "status", None) or None),
|
||||
match_error_contains=(match_error_contains[:255] if match_error_contains else None),
|
||||
match_error_mode=("contains" if match_error_contains else None),
|
||||
match_error_contains=match_err,
|
||||
match_error_mode=("contains" if match_err else None),
|
||||
treat_as_success=True,
|
||||
active=True,
|
||||
comment=comment,
|
||||
@ -3099,16 +3158,109 @@ def api_run_checks_mark_success_override():
|
||||
end_at=end_at,
|
||||
)
|
||||
db.session.add(ov)
|
||||
created_any = True
|
||||
|
||||
db.session.commit()
|
||||
|
||||
# Recompute flags so the overview and modal reflect the override immediately.
|
||||
try:
|
||||
from .routes_shared import _recompute_override_flags_for_runs
|
||||
|
||||
_recompute_override_flags_for_runs(job_ids=[job.id], start_at=start_at, end_at=end_at, only_unreviewed=False)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return jsonify({"status": "ok", "message": "Override created."})
|
||||
# Return run info so the frontend can offer a broader override via the follow-up dialog.
|
||||
return jsonify({
|
||||
"status": "ok",
|
||||
"message": "Override created.",
|
||||
"run_info": {
|
||||
"run_id": run.id,
|
||||
"job_id": job.id,
|
||||
"backup_software": job.backup_software or "",
|
||||
"backup_type": job.backup_type or "",
|
||||
"job_name": job.job_name or "",
|
||||
"customer_name": job.customer.name if job.customer else "",
|
||||
"error_text": error_text,
|
||||
},
|
||||
})
|
||||
|
||||
elif scope == "job":
|
||||
# Object-level override scoped to this job, matching on error text.
|
||||
start_at = now
|
||||
end_at = _duration_to_end_at(duration, start_at)
|
||||
match_err = error_text[:255] if error_text else None
|
||||
ov = Override(
|
||||
level="object",
|
||||
job_id=job.id,
|
||||
match_error_contains=match_err,
|
||||
match_error_mode=("contains" if match_err else None),
|
||||
treat_as_success=True,
|
||||
active=True,
|
||||
comment=comment,
|
||||
created_by=current_user.username,
|
||||
start_at=start_at,
|
||||
end_at=end_at,
|
||||
)
|
||||
db.session.add(ov)
|
||||
db.session.commit()
|
||||
|
||||
# Audit log for broader override
|
||||
try:
|
||||
from .routes_shared import _log_admin_event
|
||||
duration_label = {"1w": "1 week", "1m": "1 month", "permanent": "permanent"}.get(duration, duration)
|
||||
_log_admin_event(
|
||||
"override_from_review",
|
||||
f"Broader override created from Run Checks review (scope=job, duration={duration_label})",
|
||||
details=f"run_id={run.id}, job_id={job.id}, job_name={job.job_name or ''}, error_match={match_err or 'none'}, override_id={ov.id}",
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Recompute for this job across the override window.
|
||||
try:
|
||||
from .routes_shared import _recompute_override_flags_for_runs
|
||||
_recompute_override_flags_for_runs(job_ids=[job.id], start_at=start_at, end_at=end_at or (now + timedelta(days=365)), only_unreviewed=False)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return jsonify({"status": "ok", "message": "Job-level override created."})
|
||||
|
||||
elif scope == "global":
|
||||
# Global override matching on backup_software + backup_type + error text.
|
||||
start_at = now
|
||||
end_at = _duration_to_end_at(duration, start_at)
|
||||
match_err = error_text[:255] if error_text else None
|
||||
ov = Override(
|
||||
level="global",
|
||||
backup_software=job.backup_software or None,
|
||||
backup_type=job.backup_type or None,
|
||||
match_error_contains=match_err,
|
||||
match_error_mode=("contains" if match_err else None),
|
||||
treat_as_success=True,
|
||||
active=True,
|
||||
comment=comment,
|
||||
created_by=current_user.username,
|
||||
start_at=start_at,
|
||||
end_at=end_at,
|
||||
)
|
||||
db.session.add(ov)
|
||||
db.session.commit()
|
||||
|
||||
# Audit log for broader override
|
||||
try:
|
||||
from .routes_shared import _log_admin_event
|
||||
duration_label = {"1w": "1 week", "1m": "1 month", "permanent": "permanent"}.get(duration, duration)
|
||||
_log_admin_event(
|
||||
"override_from_review",
|
||||
f"Broader override created from Run Checks review (scope=global, duration={duration_label})",
|
||||
details=f"run_id={run.id}, software={job.backup_software or ''}, type={job.backup_type or ''}, error_match={match_err or 'none'}, override_id={ov.id}",
|
||||
)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
# Recompute broadly — all jobs could be affected by a global override.
|
||||
try:
|
||||
from .routes_shared import _recompute_override_flags_for_runs
|
||||
_recompute_override_flags_for_runs(job_ids=None, start_at=start_at, end_at=end_at or (now + timedelta(days=365)), only_unreviewed=False)
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return jsonify({"status": "ok", "message": "Global override created."})
|
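The endpoint normalizes `run_id`, `scope`, `duration` and `comment` before acting on them. The sketch below restates that normalization as a standalone function so the defaults are easy to test. `normalize_override_request` is a hypothetical name; only the allowed values, the fallbacks and the 2000-character comment limit are taken from the handler.

```python
from typing import Any, Dict

VALID_SCOPES = ("run", "job", "global")
VALID_DURATIONS = ("once", "1w", "1m", "permanent")
DEFAULT_COMMENT = "Marked as success from Run Checks"


def normalize_override_request(data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the endpoint's defaults to a raw JSON payload."""
    try:
        run_id = int(data.get("run_id") or 0)
    except (TypeError, ValueError):
        run_id = 0  # rejected later as "Invalid run_id."

    scope = str(data.get("scope") or "run").strip().lower()
    if scope not in VALID_SCOPES:
        scope = "run"

    duration = str(data.get("duration") or "once").strip().lower()
    if duration not in VALID_DURATIONS:
        duration = "once"

    comment = str(data.get("comment") or "").strip() or DEFAULT_COMMENT

    return {
        "run_id": run_id,
        "scope": scope,
        "duration": duration,
        "comment": comment[:2000],
        "error_text": str(data.get("error_text") or "").strip(),
    }
```

A follow-up request from the dialog would then carry, for example, `{"run_id": 42, "scope": "job", "duration": "1w"}`, while a bare `{"run_id": 42}` falls back to the run-scoped, one-off override.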
||||
|
||||
@ -476,6 +476,7 @@ def _build_run_checks_results(patterns: list[str], page: int) -> dict:
|
||||
.join(agg, agg.c.job_id == Job.id)
|
||||
.outerjoin(Customer, Customer.id == Job.customer_id)
|
||||
.filter(Job.archived.is_(False))
|
||||
.filter(db.or_(Customer.id.is_(None), Customer.active.is_(True)))
|
||||
)
|
||||
|
||||
match_expr = _contains_all_terms(
|
||||
|
||||
@ -331,7 +331,7 @@ Backup failed. Please check the logs for details.""",
|
||||
|
||||
if status_type not in email_sets:
|
||||
flash("Invalid status type.", "danger")
|
||||
return redirect(url_for("main.settings", section="maintenance"))
|
||||
return redirect(url_for("main.settings", section="testing"))
|
||||
|
||||
emails = email_sets[status_type]
|
||||
created_count = 0
|
||||
@ -365,7 +365,195 @@ Backup failed. Please check the logs for details.""",
|
||||
print(f"[settings-test] Failed to generate test emails: {exc}")
|
||||
flash("Failed to generate test emails.", "danger")
|
||||
|
||||
return redirect(url_for("main.settings", section="maintenance"))
|
||||
return redirect(url_for("main.settings", section="testing"))
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Test run generator — creates JobRuns with run_object_links for override testing
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
_TEST_JOB_NAME = "Test Backup Job"
|
||||
_TEST_CUSTOMER_NAME = "000 Test"
|
||||
_TEST_BACKUP_SOFTWARE = "Veeam"
|
||||
_TEST_BACKUP_TYPE = "Backup"
|
||||
|
||||
_TEST_ERROR_SCENARIOS = {
|
||||
"vss": "Cannot create snapshot: VSS error 0x800423f4",
|
||||
"connection": "Cannot connect to host: connection timed out after 30s",
|
||||
"diskspace": "Low free space on target datastore DS01 (2.1 GB remaining)",
|
||||
"license": "License expired: Veeam Backup & Replication license has expired",
|
||||
"network": "Network transfer timeout: failed to transfer data within 3600s",
|
||||
"permission": "Access denied: insufficient permissions on target repository",
|
||||
}
|
||||
|
||||
|
||||
def _get_or_create_test_job():
|
||||
"""Return the test job (and its customer), creating both if needed."""
|
||||
customer = Customer.query.filter_by(name=_TEST_CUSTOMER_NAME).first()
|
||||
if not customer:
|
||||
customer = Customer(name=_TEST_CUSTOMER_NAME, active=True)
|
||||
db.session.add(customer)
|
||||
db.session.flush()
|
||||
|
||||
job = Job.query.filter_by(job_name=_TEST_JOB_NAME, customer_id=customer.id).first()
|
||||
if not job:
|
||||
job = Job(
|
||||
customer_id=customer.id,
|
||||
backup_software=_TEST_BACKUP_SOFTWARE,
|
||||
backup_type=_TEST_BACKUP_TYPE,
|
||||
job_name=_TEST_JOB_NAME,
|
||||
auto_approve=True,
|
||||
active=True,
|
||||
)
|
||||
db.session.add(job)
|
||||
db.session.flush()
|
||||
|
||||
return job, customer
|
||||
|
||||
|
||||
def _get_or_create_customer_object(customer_id, object_name):
|
||||
"""Return a customer_object row id, creating it if needed."""
|
||||
row = db.session.execute(
|
||||
text("SELECT id FROM customer_objects WHERE customer_id = :cid AND object_name = :name"),
|
||||
{"cid": customer_id, "name": object_name},
|
||||
).fetchone()
|
||||
if row:
|
||||
return row[0]
|
||||
db.session.execute(
|
||||
text("INSERT INTO customer_objects (customer_id, object_name, first_seen_at, last_seen_at) VALUES (:cid, :name, NOW(), NOW())"),
|
||||
{"cid": customer_id, "name": object_name},
|
||||
)
|
||||
db.session.flush()
|
||||
row = db.session.execute(
|
||||
text("SELECT id FROM customer_objects WHERE customer_id = :cid AND object_name = :name"),
|
||||
{"cid": customer_id, "name": object_name},
|
||||
).fetchone()
|
||||
return row[0]
|
||||
|
||||
|
||||
@main_bp.route("/settings/test-run/generate", methods=["POST"])
|
||||
@login_required
|
||||
@roles_required("admin")
|
||||
def settings_generate_test_run():
|
||||
"""Generate a single test JobRun with 3 objects for override testing."""
|
||||
try:
|
||||
status_type = (request.form.get("status") or "failed").strip().lower()
|
||||
if status_type not in ("success", "warning", "failed"):
|
||||
status_type = "failed"
|
||||
|
||||
error_scenario = (request.form.get("error_scenario") or "vss").strip().lower()
|
||||
custom_error = (request.form.get("custom_error") or "").strip()
|
||||
|
||||
if error_scenario == "custom" and custom_error:
|
||||
error_message = custom_error[:500]
|
||||
else:
|
||||
error_message = _TEST_ERROR_SCENARIOS.get(error_scenario, _TEST_ERROR_SCENARIOS["vss"])
|
||||
|
||||
job, customer = _get_or_create_test_job()
|
||||
now = datetime.utcnow()
|
||||
|
||||
# Determine run-level status
|
||||
run_status_map = {"success": "Success", "warning": "Warning", "failed": "Failed"}
|
||||
run_status = run_status_map[status_type]
|
||||
|
||||
run = JobRun(
|
||||
job_id=job.id,
|
||||
status=run_status,
|
||||
remark=error_message if status_type != "success" else "All backups completed successfully",
|
||||
run_at=now,
|
||||
source_type="test",
|
||||
)
|
||||
db.session.add(run)
|
||||
db.session.flush()
|
||||
|
||||
# Create 3 objects: VM-APP01, VM-DB01, VM-WEB01
|
||||
object_defs = [
|
||||
("VM-APP01", "Success", None),
|
||||
("VM-DB01", "Success", None),
|
||||
("VM-WEB01", "Success", None),
|
||||
]
|
||||
|
||||
if status_type == "warning":
|
||||
# First object gets the warning
|
||||
object_defs[0] = ("VM-APP01", "Warning", error_message)
|
||||
elif status_type == "failed":
|
||||
# First object gets the failure
|
||||
object_defs[0] = ("VM-APP01", "Failed", error_message)
|
||||
|
||||
for obj_name, obj_status, obj_error in object_defs:
|
||||
co_id = _get_or_create_customer_object(customer.id, obj_name)
|
||||
db.session.execute(
|
||||
text("""
|
||||
INSERT INTO run_object_links (run_id, customer_object_id, status, error_message, observed_at)
|
||||
VALUES (:run_id, :co_id, :status, :error, :observed_at)
|
||||
"""),
|
||||
{"run_id": run.id, "co_id": co_id, "status": obj_status, "error": obj_error, "observed_at": now},
|
||||
)
|
||||
|
||||
db.session.commit()
|
||||
|
||||
flash(f"Generated {run_status} test run (id={run.id}) with error: {error_message[:80] if status_type != 'success' else 'none'}", "success")
|
||||
|
||||
_log_admin_event(
|
||||
event_type="maintenance_generate_test_run",
|
||||
message=f"Generated {run_status} test run",
|
||||
details=json.dumps({"run_id": run.id, "status": run_status, "error_scenario": error_scenario}),
|
||||
)
|
||||
|
||||
except Exception as exc:
|
||||
db.session.rollback()
|
||||
print(f"[settings-test] Failed to generate test run: {exc}")
|
||||
flash("Failed to generate test run.", "danger")
|
||||
|
||||
return redirect(url_for("main.settings", section="testing"))
|
||||
|
||||
|
||||
@main_bp.route("/settings/test-run/cleanup", methods=["POST"])
|
||||
@login_required
|
||||
@roles_required("admin")
|
||||
def settings_cleanup_test_runs():
|
||||
"""Delete all test runs and associated objects for the test job."""
|
||||
try:
|
||||
job = Job.query.filter_by(job_name=_TEST_JOB_NAME).first()
|
||||
if not job:
|
||||
flash("No test data found.", "info")
|
||||
return redirect(url_for("main.settings", section="testing"))
|
||||
|
||||
# Count before deleting
|
||||
run_count = JobRun.query.filter_by(job_id=job.id).count()
|
||||
|
||||
# Delete run_object_links explicitly rather than relying on a cascade,
|
||||
# then delete the job runs themselves.
|
||||
db.session.execute(
|
||||
text("DELETE FROM run_object_links WHERE run_id IN (SELECT id FROM job_runs WHERE job_id = :jid)"),
|
||||
{"jid": job.id},
|
||||
)
|
||||
db.session.execute(
|
||||
text("DELETE FROM job_runs WHERE job_id = :jid"),
|
||||
{"jid": job.id},
|
||||
)
|
||||
# Delete customer_objects for the test customer
|
||||
customer = Customer.query.filter_by(name=_TEST_CUSTOMER_NAME).first()
|
||||
if customer:
|
||||
db.session.execute(
|
||||
text("DELETE FROM customer_objects WHERE customer_id = :cid"),
|
||||
{"cid": customer.id},
|
||||
)
|
||||
|
||||
db.session.commit()
|
||||
flash(f"Deleted {run_count} test run(s) and associated objects.", "success")
|
||||
|
||||
_log_admin_event(
|
||||
event_type="maintenance_cleanup_test_runs",
|
||||
message=f"Cleaned up {run_count} test runs",
|
||||
)
|
||||
|
||||
except Exception as exc:
|
||||
db.session.rollback()
|
||||
print(f"[settings-test] Failed to cleanup test runs: {exc}")
|
||||
flash("Failed to cleanup test runs.", "danger")
|
||||
|
||||
return redirect(url_for("main.settings", section="testing"))
|
||||
|
||||
|
||||
@main_bp.route("/settings/objects/backfill", methods=["POST"])
|
||||
@ -936,6 +1124,30 @@ def settings():
|
||||
except (ValueError, TypeError):
|
||||
pass
|
||||
|
||||
# Cove workstation offline detection (colorbar-based).
|
||||
# The enable flag rides on the Cove form group, so update it whenever the
|
||||
# form was submitted from this section.
|
||||
if cove_form_touched:
|
||||
settings.cove_offline_detection_enabled = bool(
|
||||
request.form.get("cove_offline_detection_enabled")
|
||||
)
|
||||
|
||||
if "cove_workstation_warning_days" in request.form:
|
||||
try:
|
||||
warn_days = int(request.form.get("cove_workstation_warning_days") or 7)
|
||||
warn_days = max(1, min(warn_days, 28))
|
||||
settings.cove_workstation_warning_days = warn_days
|
||||
except (ValueError, TypeError):
|
||||
pass
|
||||
|
||||
if "cove_workstation_error_days" in request.form:
|
||||
try:
|
||||
err_days = int(request.form.get("cove_workstation_error_days") or 14)
|
||||
err_days = max(1, min(err_days, 28))
|
||||
settings.cove_workstation_error_days = err_days
|
||||
except (ValueError, TypeError):
|
||||
pass
|
||||
|
||||
# Microsoft Entra SSO
|
||||
if entra_form_touched:
|
||||
settings.entra_sso_enabled = bool(request.form.get("entra_sso_enabled"))
|
||||
|
||||
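The two threshold fields in the hunk above share the same parse-and-clamp pattern. A minimal sketch of that pattern as one helper follows; the name `parse_clamped_days` is hypothetical and does not appear in this commit. It returns `None` when the value cannot be parsed, which corresponds to the `pass` branches above that leave the stored setting unchanged.

```python
def parse_clamped_days(raw, default, low=1, high=28):
    """Parse a form value as int and clamp it to [low, high].

    Hypothetical helper, not part of the commit. An empty or missing
    value falls back to `default`; an unparseable value yields None so
    the caller can keep the existing setting.
    """
    try:
        value = int(raw or default)
    except (ValueError, TypeError):
        return None
    return max(low, min(value, high))
```

With such a helper, each field could be handled as `days = parse_clamped_days(request.form.get(name), 7)` followed by an assignment only when `days is not None`.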
@ -1280,6 +1280,37 @@ def migrate_cove_integration() -> None:
|
||||
print(f"[migrations] Failed to migrate Cove integration columns: {exc}")
|
||||
|
||||
|
||||
def migrate_cove_offline_detection() -> None:
|
||||
"""Add Cove workstation offline-detection settings to system_settings.
|
||||
|
||||
Adds:
|
||||
- cove_offline_detection_enabled (BOOLEAN NOT NULL DEFAULT FALSE)
|
||||
- cove_workstation_warning_days (INTEGER NOT NULL DEFAULT 7)
|
||||
- cove_workstation_error_days (INTEGER NOT NULL DEFAULT 14)
|
||||
"""
|
||||
try:
|
||||
engine = db.get_engine()
|
||||
except Exception as exc:
|
||||
print(f"[migrations] Could not get engine for Cove offline detection migration: {exc}")
|
||||
return
|
||||
|
||||
columns = [
|
||||
("cove_offline_detection_enabled", "BOOLEAN NOT NULL DEFAULT FALSE"),
|
||||
("cove_workstation_warning_days", "INTEGER NOT NULL DEFAULT 7"),
|
||||
("cove_workstation_error_days", "INTEGER NOT NULL DEFAULT 14"),
|
||||
]
|
||||
|
||||
try:
|
||||
with engine.begin() as conn:
|
||||
for column, ddl in columns:
|
||||
if _column_exists_on_conn(conn, "system_settings", column):
|
||||
continue
|
||||
conn.execute(text(f'ALTER TABLE "system_settings" ADD COLUMN {column} {ddl}'))
|
||||
print("[migrations] migrate_cove_offline_detection completed.")
|
||||
except Exception as exc:
|
||||
print(f"[migrations] Failed to migrate Cove offline detection columns: {exc}")
|
||||
|
||||
|
||||
def migrate_entra_sso_settings() -> None:
|
||||
"""Add Microsoft Entra SSO columns to system_settings if missing."""
|
||||
try:
|
||||
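The migration above is idempotent: it checks each column before issuing `ALTER TABLE`, so it can run on every startup. The sketch below shows the same check-then-add pattern in a self-contained form. It is an illustration under assumptions, not the project's code: it uses the standard-library `sqlite3` module instead of SQLAlchemy/PostgreSQL, the helpers `column_exists` and `add_missing_columns` are hypothetical, and the boolean default is written as `0` for SQLite portability where the commit uses `FALSE`.

```python
import sqlite3

# Same three columns as the migration; DEFAULT 0 stands in for DEFAULT FALSE.
COLUMNS = [
    ("cove_offline_detection_enabled", "BOOLEAN NOT NULL DEFAULT 0"),
    ("cove_workstation_warning_days", "INTEGER NOT NULL DEFAULT 7"),
    ("cove_workstation_error_days", "INTEGER NOT NULL DEFAULT 14"),
]


def column_exists(conn, table, column):
    """Return True if `column` is already present on `table` (SQLite only)."""
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def add_missing_columns(conn, table, columns):
    """Add each missing column; return the names that were actually added."""
    added = []
    for column, ddl in columns:
        if column_exists(conn, table, column):
            continue  # already migrated, nothing to do
        conn.execute(f'ALTER TABLE "{table}" ADD COLUMN {column} {ddl}')
        added.append(column)
    return added
```

Running `add_missing_columns` twice against the same connection adds the columns on the first call and nothing on the second, which is the property the startup migration relies on.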
@ -1484,6 +1515,7 @@ def run_migrations() -> None:
|
||||
migrate_rename_admin_logs_to_audit_logs()
|
||||
migrate_cove_integration()
|
||||
migrate_cove_accounts_table()
|
||||
migrate_cove_offline_detection()
|
||||
migrate_cloud_connect_accounts_table()
|
||||
migrate_cc_accounts_repo_unique_key()
|
||||
migrate_cc_remove_synthetic_missed_runs()
|
||||
|
||||
@ -136,6 +136,13 @@ class SystemSettings(db.Model):
|
||||
cove_partner_id = db.Column(db.Integer, nullable=True) # stored after successful login
|
||||
cove_last_import_at = db.Column(db.DateTime, nullable=True)
|
||||
|
||||
# Cove workstation offline detection (colorbar-based).
|
||||
# When enabled, Cove workstation jobs receive a synthetic warning/error
|
||||
# JobRun if their 28-day colorbar shows extended inactivity.
|
||||
cove_offline_detection_enabled = db.Column(db.Boolean, nullable=False, default=False)
|
||||
cove_workstation_warning_days = db.Column(db.Integer, nullable=False, default=7)
|
||||
cove_workstation_error_days = db.Column(db.Integer, nullable=False, default=14)
|
||||
|
||||
# Microsoft Entra SSO settings
|
||||
entra_sso_enabled = db.Column(db.Boolean, nullable=False, default=False)
|
||||
entra_tenant_id = db.Column(db.String(128), nullable=True)
|
||||
|
||||
@ -27,7 +27,7 @@
|
||||
<li>In the Autotask section, enable integration.</li>
|
||||
<li>Select environment (<code>production</code> or <code>sandbox</code>).</li>
|
||||
<li>Fill in username, password, and tracking identifier.</li>
|
||||
<li>Optional: set <strong>Autotask Base URL</strong> for links in notes/details.</li>
|
||||
<li>Optional: set <strong>Backupchecks Base URL</strong> for deep-links from ticket notes/details back into Backupchecks.</li>
|
||||
</ol>
|
||||
|
||||
<h2>Step 2: Configure Ticket Defaults</h2>
|
||||
|
||||
@ -301,6 +301,25 @@
|
||||
<li>Wait for more runs to be imported, then schedule inference will update automatically</li>
|
||||
</ul>
|
||||
|
||||
<h3>Cove Workstations Don't Show "Missed"</h3>
|
||||
|
||||
<ul>
|
||||
<li>
|
||||
Cove Data Protection workstation jobs are intentionally <strong>excluded</strong> from
|
||||
schedule-based missed-run detection. PCs are commonly powered off outside business hours,
|
||||
which previously produced false-positive Missed alerts; those alerts are no longer generated for workstation jobs.
|
||||
</li>
|
||||
<li>
|
||||
Cove <em>Server</em> and <em>Microsoft 365</em> jobs continue to use the regular
|
||||
schedule-based missed-run logic.
|
||||
</li>
|
||||
<li>
|
||||
For workstation inactivity, enable
|
||||
<a href="{{ url_for('documentation.page', section='integrations', page='cove-data-protection') }}">colorbar-based offline detection</a>
|
||||
in Settings → Integrations → Cove instead.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<h3>Job Shows "Missed" but Actually Ran</h3>
|
||||
|
||||
<ul>
|
||||
|
||||
@ -159,7 +159,28 @@
|
||||
When you create an override without a start date, it is applied retroactively to existing unreviewed runs. This means jobs that match the override will immediately show the "Treat as success" status in Daily Jobs, even if they ran before the override was created.
|
||||
</div>
|
||||
|
||||
<h2>Creating an Override</h2>
|
||||
<h2>Creating Overrides Directly From Run Checks</h2>
|
||||
|
||||
<p>The fastest way to create an override is from the run you are reviewing. After clicking
|
||||
<strong>Mark as Success</strong> in the Run Checks modal, a follow-up dialog
|
||||
<em>"Apply override for future runs?"</em> appears with:</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>Scope:</strong>
|
||||
<ul>
|
||||
<li><em>Only this run</em> — no override is created.</li>
|
||||
<li><em>This job, same error message</em> — creates a job-scoped override.</li>
|
||||
<li><em>All jobs with same software/type and error</em> — creates a global override.</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li><strong>Duration:</strong> 1 week, 1 month, or permanent (until manually disabled).</li>
|
||||
<li>The error text from the run's problem objects is pre-filled, ready to be trimmed.</li>
|
||||
</ul>
|
||||
|
||||
<p>Broader overrides created this way are audit-logged with their scope, duration, and source
|
||||
run. This path covers most day-to-day use cases without ever opening the Overrides page.</p>
|
||||
|
||||
<h2>Creating an Override (Overrides Page)</h2>
|
||||
|
||||
<p>To create a new override:</p>
|
||||
|
||||
|
||||
@ -132,9 +132,34 @@
|
||||
Backup objects are only shown if the parser extracted them from the email. Not all backup software emails include object-level details. If no objects are shown, the parser didn't detect individual items in the email.
|
||||
</div>
|
||||
|
||||
<h3>Cove / Cloud Connect Summary Panels</h3>
|
||||
|
||||
<p>For runs imported via the Cove API or Veeam Cloud Connect (no source email), the modal hides
|
||||
the email section and shows a structured summary panel instead:</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>Cove summary:</strong> account name, computer, customer, active datasources, last
|
||||
session timestamp and status. Per-datasource objects appear in the Backup Objects table.</li>
|
||||
<li><strong>Cloud Connect summary:</strong> tenant, repository, used / quota / free storage and
|
||||
a link to the source provider report.</li>
|
||||
</ul>
|
||||
|
||||
<p>See <a href="{{ url_for('documentation.page', section='integrations', page='cove-data-protection') }}">Cove Data Protection</a>
|
||||
and <a href="{{ url_for('documentation.page', section='integrations', page='veeam-cloud-connect') }}">Veeam Cloud Connect</a>
|
||||
for the underlying data flow.</p>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 Cove same-day suppression:</strong><br>
|
||||
For Cove jobs, once the first <em>complete success</em> run on a given local day is recorded
|
||||
(status <code>Success</code> with all object statuses <code>Success</code>), all newer Cove
|
||||
runs on that same day are hidden from Run Checks — regardless of status. This prevents
|
||||
duplicate review of the same day's backup activity.
|
||||
</div>
|
||||
|
||||
<h3>Email Content</h3>
|
||||
|
||||
<p>The original email body from the backup software is displayed in an embedded iframe:</p>
|
||||
<p>For email-imported runs, the original email body from the backup software is displayed in
|
||||
an embedded iframe:</p>
|
||||
|
||||
<ul>
|
||||
<li>HTML emails are rendered with their original formatting</li>
|
||||
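The Cove same-day suppression rule described in the callout above can be sketched as a small filter. This is an illustration only: the `Run` dataclass and both function names are hypothetical, the input is assumed to be the runs of a single job ordered oldest to newest, and the "at least one persisted object" condition is taken from the Cove integration page rather than from this callout.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Run:
    """Hypothetical stand-in for a Cove JobRun of one job."""
    run_id: int
    local_day: date
    status: str
    object_statuses: List[str] = field(default_factory=list)


def is_complete_success(run: Run) -> bool:
    """Success status, at least one object, and every object Success."""
    return (
        run.status == "Success"
        and len(run.object_statuses) > 0
        and all(s == "Success" for s in run.object_statuses)
    )


def visible_runs(runs: List[Run]) -> List[Run]:
    """Return the runs that stay visible; input is ordered oldest to newest."""
    completed_days = set()
    visible = []
    for run in runs:
        if run.local_day in completed_days:
            continue  # a complete success already exists for this day
        visible.append(run)
        if is_complete_success(run):
            completed_days.add(run.local_day)
    return visible
```

Runs recorded before the first complete success of the day remain visible, and later runs on that day are dropped whatever their status.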
@ -187,6 +212,36 @@
|
||||
<li>All runs for this job are immediately marked as reviewed and the job disappears from the Run Checks page</li>
|
||||
</ol>
|
||||
|
||||
<h3>Mark as Success (with optional override)</h3>
|
||||
|
||||
<p>For warning or failed runs that should be treated as successful, use <strong>Mark as Success</strong>:</p>
|
||||
|
||||
<ol>
|
||||
<li>Open the run details and click <strong>Mark as Success</strong>.</li>
|
||||
<li>The run is recorded as success-by-override and removed from the unreviewed list.</li>
|
||||
<li>A follow-up dialog asks <em>"Apply override for future runs?"</em> with three scope
|
||||
options, two of which persist beyond this single run:
|
||||
<ul>
|
||||
<li><strong>Only this run</strong> (default — no future override is created).</li>
|
||||
<li><strong>This job, same error message</strong> — creates a job-scoped override that
|
||||
treats future occurrences with the same error text as success.</li>
|
||||
<li><strong>All jobs with same software/type and error</strong> — creates a global override
|
||||
across the same backup software/type combination.</li>
|
||||
</ul>
|
||||
</li>
|
||||
<li>Choose a <strong>duration</strong>: 1 week, 1 month, or permanent (until manually disabled).</li>
|
||||
<li>The error text is pre-filled from the run's problem objects so you can review and trim it
|
||||
before saving.</li>
|
||||
</ol>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 When to use which scope:</strong><br>
|
||||
Use "Only this run" for one-off events, "This job" for issues specific to a single environment
|
||||
(e.g. a particular VM that always warns), and "All jobs" for known cosmetic warnings from a
|
||||
backup product that recur across many customers. Broader overrides are audit-logged with
|
||||
scope, duration, and source run details.
|
||||
</div>
|
||||
|
||||
<h3>Mark as Reviewed (Bulk)</h3>
|
||||
|
||||
<p>For efficiency, especially with successful backups:</p>
|
||||
|
||||
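The scope and duration choices in the Mark as Success steps above map naturally onto an "is an override created?" check and an optional end date. The sketch below is a guess at that mapping, not the commit's server-side code: the values `run`/`job`/`global` and `1w`/`1m`/`permanent` are the radio values from the dialog template, while the function names and the choice of 30 days for "1 month" are assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

# Radio values from the dialog; the 30-day month is an assumption.
DURATION_DELTAS = {
    "1w": timedelta(weeks=1),
    "1m": timedelta(days=30),
    "permanent": None,
}

BROADER_SCOPES = ("job", "global")
ALL_SCOPES = ("run",) + BROADER_SCOPES


def creates_override(scope: str) -> bool:
    """Only the job and global scopes create an override for future runs."""
    if scope not in ALL_SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    return scope in BROADER_SCOPES


def override_end(duration: str, start: datetime) -> Optional[datetime]:
    """Return the end timestamp, or None for a permanent override."""
    if duration not in DURATION_DELTAS:
        raise ValueError(f"unknown duration: {duration!r}")
    delta = DURATION_DELTAS[duration]
    return None if delta is None else start + delta
```

Rejecting unknown values explicitly keeps a tampered form submission from silently creating a permanent override.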
@ -140,6 +140,23 @@
|
||||
New jobs without learned schedules will not appear on Daily Jobs until a pattern is established.
|
||||
</div>
|
||||
|
||||
<h2>Cove Workstations: Missed-Run Detection Disabled</h2>
|
||||
|
||||
<p>
|
||||
Schedule-based missed-run detection (synthetic <code>Missed</code> rows for past slots
|
||||
without a real run) is <strong>not</strong> applied to Cove Data Protection workstation jobs.
|
||||
Workstation devices are routinely powered off outside business hours, which produced
|
||||
false-positive Missed alerts. Cove <em>Server</em> and <em>Microsoft 365</em> jobs continue
|
||||
to use the regular missed-run logic.
|
||||
</p>
|
||||
|
||||
<p>
|
||||
For workstation inactivity, enable colorbar-based offline detection in
|
||||
<strong>Settings → Integrations → Cove</strong>. See
|
||||
<a href="{{ url_for('documentation.page', section='integrations', page='cove-data-protection') }}">Cove Data Protection</a>
|
||||
for details.
|
||||
</p>
|
||||
|
||||
<h2>Schedule Accuracy</h2>
|
||||
|
||||
<p>Schedule learning is based on pattern recognition and may not be 100% accurate in all cases:</p>
|
||||
|
||||
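The workstation exclusion described above amounts to a guard in front of the missed-run generator. A minimal sketch follows, assuming an exact match on the `backup_software` and `backup_type` values quoted in the Cove integration page; the function name is hypothetical and the real check may normalise case or whitespace differently.

```python
def skips_missed_run_detection(backup_software, backup_type):
    """Return True for jobs exempt from schedule-based missed-run detection.

    Hypothetical guard: only Cove Data Protection workstation jobs are
    exempt; Cove Server and Microsoft 365 jobs keep the regular logic.
    """
    return (
        backup_software == "Cove Data Protection"
        and backup_type == "Workstation"
    )
```

A caller would test this before creating synthetic `Missed` rows for a job and skip generation when it returns `True`.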
@ -8,15 +8,11 @@
|
||||
verify that backups are running successfully across their customer infrastructure.
|
||||
</p>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>📝 Coming Soon:</strong>
|
||||
This page is under construction. Screenshots and additional content will be added in future updates.
|
||||
</div>
|
||||
|
||||
<h2>Key Features</h2>
|
||||
|
||||
<ul>
|
||||
<li><strong>Automated Mail Parsing:</strong> Import backup reports via email and automatically parse results</li>
|
||||
<li><strong>API Integrations:</strong> Direct API import for Cove Data Protection (N-able) and Veeam Cloud Connect — no email needed</li>
|
||||
<li><strong>Review Workflow:</strong> Review all backup jobs daily and mark them as reviewed (goal: clear the Run Checks queue)</li>
|
||||
<li><strong>Customer Management:</strong> Organize backups by customer and manage multiple backup jobs per customer</li>
|
||||
<li><strong>Autotask Integration:</strong> Manually create tickets in Autotask PSA for failed backups requiring follow-up</li>
|
||||
@ -24,11 +20,6 @@
|
||||
<li><strong>Role-Based Access:</strong> Admin, Operator, Reporter, and Viewer roles</li>
|
||||
</ul>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 Note:</strong>
|
||||
Screenshots will be added in a future update to illustrate the dashboard and key features.
|
||||
</div>
|
||||
|
||||
<h2>How It Works</h2>
|
||||
|
||||
<p>BackupChecks follows a simple workflow:</p>
|
||||
@ -60,40 +51,75 @@
|
||||
|
||||
<h2>Supported Backup Software</h2>
|
||||
|
||||
<p>BackupChecks supports parsing backup reports from:</p>
|
||||
<p>BackupChecks ships with the following parsers and API integrations:</p>
|
||||
|
||||
<table>
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Software</th>
|
||||
<th>Support Level</th>
|
||||
<th>Source</th>
|
||||
<th>Type</th>
|
||||
<th>Notes</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Veeam Backup & Replication</td>
|
||||
<td>Full</td>
|
||||
<td>Email notifications with detailed job status</td>
|
||||
<td>Email parser</td>
|
||||
<td>Detailed per-object job results</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Acronis Cyber Protect</td>
|
||||
<td>Full</td>
|
||||
<td>Backup completion reports</td>
|
||||
<td>Veeam Cloud Connect</td>
|
||||
<td>Email parser</td>
|
||||
<td>Daily provider report; per-tenant inbox flow</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Synology Active Backup</td>
|
||||
<td>Full</td>
|
||||
<td>Cove Data Protection (N-able)</td>
|
||||
<td>API integration</td>
|
||||
<td>JSON-RPC API; account-based inbox flow + 28-day colorbar history</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NAKIVO Backup & Replication</td>
|
||||
<td>Email parser</td>
|
||||
<td>Job completion reports</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Synology Active Backup / Hyper Backup</td>
|
||||
<td>Email parser</td>
|
||||
<td>Task result notifications</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>QNAP Hybrid Backup Sync</td>
|
||||
<td>Full</td>
|
||||
<td>Email parser</td>
|
||||
<td>Job completion emails</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Custom/Other</td>
|
||||
<td>Configurable</td>
|
||||
<td>Syncovery</td>
|
||||
<td>Email parser</td>
|
||||
<td>Profile result notifications</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Boxafe (Microsoft 365 backup)</td>
|
||||
<td>Email parser</td>
|
||||
<td>Tenant-level result reports</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>R-Drive Image</td>
|
||||
<td>Email parser</td>
|
||||
<td>Workstation/server image backup reports</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>3CX</td>
|
||||
<td>Email parser</td>
|
||||
<td>3CX phone system backup notifications</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>NTFS Auditing</td>
|
||||
<td>Email parser</td>
|
||||
<td>File-system audit reports</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Custom / Other</td>
|
||||
<td>Manual</td>
|
||||
<td>Manual job configuration for non-standard formats</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
@ -112,6 +138,8 @@
|
||||
<li><strong>Backend:</strong> Flask (Python) application with PostgreSQL database</li>
|
||||
<li><strong>Frontend:</strong> Bootstrap 5 for responsive UI</li>
|
||||
<li><strong>Mail Import:</strong> Microsoft Graph API integration for automated email retrieval</li>
|
||||
<li><strong>Cove Data Protection:</strong> JSON-RPC API integration with background polling</li>
|
||||
<li><strong>Veeam Cloud Connect:</strong> Email-based per-tenant import with separate inbox flow</li>
|
||||
<li><strong>Autotask API:</strong> Optional integration for manual ticket creation</li>
|
||||
<li><strong>Reporting:</strong> Built-in report generation with scheduling</li>
|
||||
</ul>
|
||||
|
||||
@ -43,11 +43,63 @@
|
||||
|
||||
<ul>
|
||||
<li>Cove runs are stored as API runs (<code>source_type = cove_api</code>).</li>
|
||||
<li>Cove run details are visible in Job Detail and Run Checks with a Cove summary panel.</li>
|
||||
<li>Cove run details are visible in Job Detail and Run Checks with a Cove summary panel (no mail section is shown for Cove runs).</li>
|
||||
<li>No mail message is required for Cove runs.</li>
|
||||
<li>Historical backfill can create runs from Cove 28-day colorbar data.</li>
|
||||
<li>
|
||||
<strong>Same-day duplicate suppression:</strong> once the first <em>complete success</em>
|
||||
run for a Cove job/day is recorded (status <code>Success</code> with at least one persisted
|
||||
object and all object statuses equal to <code>Success</code>), all newer Cove runs on that
|
||||
same local day are hidden in Run Checks — regardless of whether they are
|
||||
<code>Success</code>, <code>Warning</code>, or <code>Failed/Error</code>. Sort order in the modal
|
||||
stays newest → oldest.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<h2>Workstation Offline Handling</h2>
|
||||
|
||||
<p>
|
||||
Cove workstation devices are commonly powered off outside business hours, which used to
|
||||
produce false-positive missed-run alerts. Two mechanisms reduce this noise:
|
||||
</p>
|
||||
|
||||
<h3>Schedule-based missed runs are skipped (always on)</h3>
|
||||
<ul>
|
||||
<li>
|
||||
For jobs with <code>backup_software = "Cove Data Protection"</code> and
|
||||
<code>backup_type = "Workstation"</code>, the missed-run generator is disabled.
|
||||
No synthetic <code>Missed</code> rows are created when a workstation is off during a
|
||||
scheduled slot.
|
||||
</li>
|
||||
<li>
|
||||
Cove <em>Server</em> and <em>Microsoft 365</em> jobs keep the regular schedule-based
|
||||
missed-run logic.
|
||||
</li>
|
||||
<li>
|
||||
Real Cove statuses (<code>Failed</code>, <code>Warning</code>, <code>Not started</code>)
|
||||
continue to surface via the normal import flow, so genuine problems still get reported.
|
||||
</li>
|
||||
</ul>
|
||||
|
||||
<h3>Colorbar-based offline detection (optional toggle)</h3>
|
||||
<p>
|
||||
Configure under <strong>Settings → Integrations → Cove → Workstation offline detection</strong>.
|
||||
</p>
|
||||
<ul>
|
||||
<li><strong>Enable colorbar-based offline detection</strong> — off by default.</li>
|
||||
<li><strong>Warning after N inactive days</strong> — default 7 (range 1–28).</li>
|
||||
<li><strong>Error after N inactive days</strong> — default 14 (range 1–28).</li>
|
||||
</ul>
|
||||
<p>
|
||||
When enabled, every Cove import cycle checks each linked workstation's 28-day colorbar
|
||||
(<code>D09F08</code>) and counts the consecutive trailing days with status <code>0</code>
|
||||
(no backup activity). If the streak crosses the warning or error threshold, a single synthetic
|
||||
<code>JobRun</code> is upserted for that account so the alert appears in Run Checks. The same
|
||||
row is reused across cycles, so it escalates Warning → Error in place and is removed
|
||||
automatically once activity resumes. Runs that have already been reviewed in Run Checks are
|
||||
never mutated or removed, so acknowledged alerts do not reappear.
|
||||
</p>
|
||||
|
||||
<h2>Troubleshooting</h2>
|
||||
|
||||
<ul>
|
||||
|
||||
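The colorbar-based offline detection described in this page counts trailing inactive days and compares the streak with the two thresholds. The sketch below illustrates that calculation under stated assumptions: the colorbar is modelled as a string with one status character per day, oldest first; "crosses the threshold" is read as `>=`; and both function names are hypothetical. How the real import parses `D09F08` is not shown in this commit.

```python
from typing import Optional


def trailing_inactive_days(colorbar: str) -> int:
    """Count consecutive '0' entries at the end (most recent days)."""
    streak = 0
    for day_status in reversed(colorbar):
        if day_status != "0":
            break
        streak += 1
    return streak


def offline_severity(
    colorbar: str, warning_days: int = 7, error_days: int = 14
) -> Optional[str]:
    """Map the inactivity streak to 'Warning', 'Error', or None."""
    streak = trailing_inactive_days(colorbar)
    if streak >= error_days:
        return "Error"
    if streak >= warning_days:
        return "Warning"
    return None
```

Checking the error threshold first means a long streak escalates to `Error` even when both thresholds are exceeded, matching the Warning to Error escalation described above. A `None` result corresponds to the case where the synthetic run is cleared because activity has resumed.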
@ -40,6 +40,25 @@
|
||||
Once you approve an email, it immediately disappears from the inbox and the job appears in operational views (Jobs, Daily Jobs, Run Checks). The inbox only contains emails awaiting approval or deletion.
|
||||
</div>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 Archived jobs are filtered out:</strong><br>
|
||||
Mail messages whose linked job has been <em>archived</em> are hidden from the inbox.
|
||||
Messages without a linked job remain visible. To process emails for an archived job, unarchive
|
||||
the job first in <strong>Jobs</strong>.
|
||||
</div>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 Parallel inboxes for API integrations:</strong><br>
|
||||
This Inbox handles email-imported reports. API-based integrations have their own staging
|
||||
pages with the same approve-and-link flow:
|
||||
<ul>
|
||||
<li><strong>Cove Accounts</strong> — Cove Data Protection accounts awaiting link to a job
|
||||
(see <a href="{{ url_for('documentation.page', section='integrations', page='cove-data-protection') }}">Cove Data Protection</a>).</li>
|
||||
<li><strong>Cloud Connect Accounts</strong> — Veeam Cloud Connect tenants awaiting link
|
||||
(see <a href="{{ url_for('documentation.page', section='integrations', page='veeam-cloud-connect') }}">Veeam Cloud Connect</a>).</li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
<h3>Table Columns</h3>
|
||||
|
||||
<table>
|
||||
|
||||
@ -38,11 +38,24 @@
|
||||
<li><strong>Delete orphaned:</strong> removes orphaned jobs and related run/email data.</li>
|
||||
</ul>
|
||||
|
||||
<h2>Generate Test Run</h2>
|
||||
|
||||
<p>Creates a single <code>JobRun</code> with three persisted run objects, attached to a fixed
|
||||
test job (<code>__test-override-job__</code>) and customer that are auto-created on first use.
|
||||
Designed for exercising the Smart Override flow in Run Checks without needing a live backup.</p>
|
||||
|
||||
<ul>
|
||||
<li>Choose a <strong>status</strong>: Success, Warning, or Failed.</li>
|
||||
<li>Choose an <strong>error scenario</strong>: VSS, connection timeout, disk space, license,
|
||||
network, permissions, or a custom free-text message.</li>
|
||||
<li><strong>Delete all test runs</strong> removes all generated test data (runs and objects).</li>
|
||||
</ul>
|
||||
|
||||
<h2>Generate Test Emails</h2>
|
||||
|
||||
<ul>
|
||||
<li>Generates one Veeam test email per selected status: Success, Warning, or Error.</li>
|
||||
<li>Useful for parser testing and maintenance validation.</li>
|
||||
<li>Generates one Veeam test email in the inbox per selected status: Success, Warning, or Error.</li>
|
||||
<li>Useful for parser testing and inbox-flow validation.</li>
|
||||
</ul>
|
||||
|
||||
<h2>Jobs Maintenance</h2>
|
||||
|
||||
@ -7,14 +7,9 @@
|
||||
Learn how to log in to BackupChecks and manage your authentication session.
|
||||
</p>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>📝 Coming Soon:</strong><br>
|
||||
This page is under construction. Screenshots will be added in a future update.
|
||||
</div>
|
||||
|
||||
<h2>Logging In</h2>
|
||||
|
||||
<p>BackupChecks uses a traditional username and password authentication system.</p>
|
||||
<p>BackupChecks supports local username/password authentication and, optionally, Microsoft Entra (Azure AD) Single Sign-On.</p>
|
||||
|
||||
<h3>Login Steps</h3>
|
||||
|
||||
@ -23,15 +18,17 @@
|
||||
<li>You will be automatically redirected to the login page if not authenticated</li>
|
||||
<li>Enter your <strong>username</strong> in the username field</li>
|
||||
<li>Enter your <strong>password</strong> in the password field</li>
|
||||
<li>Complete the <strong>captcha</strong> by solving the simple math problem (e.g., "3 + 5 = ?")</li>
|
||||
<li>If the login captcha is enabled, solve the simple math problem (e.g., "3 + 5 = ?")</li>
|
||||
<li>Click the <strong>Login</strong> button</li>
|
||||
</ol>
|
||||
|
||||
<p>If your credentials are correct and the captcha is solved, you will be redirected to the dashboard.</p>
|
||||
<p>If your credentials are correct (and the captcha, if shown, is solved), you will be redirected to the dashboard.</p>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>📝 Future Change:</strong><br>
|
||||
The captcha requirement is planned to become optional via a system setting. Since BackupChecks is typically deployed in restricted local environments, the captcha may be disabled to streamline the login process.
|
||||
<strong>💡 Captcha is configurable:</strong><br>
|
||||
The login captcha is controlled by the <code>login_captcha_enabled</code> setting in
|
||||
<strong>Settings → General</strong>. It is enabled by default; administrators can disable it
|
||||
for restricted internal deployments where the extra step is unnecessary.
|
||||
</div>
|
||||
|
||||
<h3>First Login</h3>
|
||||
@ -50,17 +47,24 @@
|
||||
Bookmark the BackupChecks URL for quick access. The system will remember your last active role between sessions.
|
||||
</div>
|
||||
|
||||
<h2>Authentication Method</h2>
|
||||
<h2>Authentication Methods</h2>
|
||||
|
||||
<p>BackupChecks uses local database authentication:</p>
|
||||
<p>BackupChecks supports the following authentication methods:</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>Username/Password:</strong> Credentials are stored securely in the database</li>
|
||||
<li><strong>Local Username/Password:</strong> Credentials are stored hashed in the database</li>
|
||||
<li><strong>Microsoft Entra SSO (OAuth/OIDC):</strong> Optional Single Sign-On via Microsoft Entra (Azure AD), configurable in <strong>Settings → Integrations → Entra SSO</strong>. See <a href="{{ url_for('documentation.page', section='settings', page='entra-sso') }}">Entra SSO setup</a>.</li>
|
||||
<li><strong>Session-Based:</strong> After login, a secure session is created and stored in a cookie</li>
|
||||
<li><strong>No SSO/OAuth:</strong> External authentication providers are not currently supported</li>
|
||||
<li><strong>No Two-Factor Authentication:</strong> 2FA is not currently implemented</li>
|
||||
</ul>
|
||||
|
||||
<div class="doc-callout doc-callout-warning">
|
||||
<strong>⚠️ Entra SSO is implemented but not yet validated in production:</strong><br>
|
||||
The integration is wired up end-to-end (login flow, token exchange, optional auto-provisioning,
|
||||
group filtering) but has not been tested against a live tenant in our deployment. Treat it as a
|
||||
preview feature until verified. Local username/password remains the recommended login method.
|
||||
</div>
|
||||
|
||||
<div class="doc-callout doc-callout-info">
|
||||
<strong>💡 Note:</strong><br>
|
||||
User accounts are created and managed by administrators via Settings → User Management.
|
||||
|
||||
@ -440,6 +440,7 @@
|
||||
<div class="modal-footer">
|
||||
<a id="rcm_eml_btn" class="btn btn-outline-primary" href="#" style="display:none;" rel="nofollow">Download EML</a>
|
||||
<a id="rcm_job_btn" class="btn btn-outline-secondary" href="#">Open job page</a>
|
||||
<button type="button" class="btn btn-warning btn-sm" id="rcm_mark_success_override" disabled>Mark as Success</button>
|
||||
<button type="button" class="btn btn-primary" id="rcm_mark_all_reviewed" disabled>Mark as Reviewed</button>
|
||||
<button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Close</button>
|
||||
</div>
|
||||
@ -447,6 +448,77 @@
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{# Smart Override follow-up dialog (Phase 1) #}
|
||||
<div class="modal fade" id="smartOverrideModal" tabindex="-1" aria-labelledby="smartOverrideModalLabel" aria-hidden="true">
|
||||
<div class="modal-dialog modal-dialog-centered">
|
||||
<div class="modal-content">
|
||||
<div class="modal-header">
|
||||
<h5 class="modal-title" id="smartOverrideModalLabel">Apply override for future runs?</h5>
|
||||
<button type="button" class="btn-close" data-bs-dismiss="modal" aria-label="Close"></button>
|
||||
</div>
|
||||
<div class="modal-body">
|
||||
<p class="text-muted small mb-3">This run has been marked as success. Would you like to automatically handle similar future occurrences?</p>
|
||||
|
||||
<div class="mb-3">
|
||||
<label class="form-label fw-semibold">Scope</label>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_scope" id="so_scope_run" value="run" checked>
|
||||
<label class="form-check-label" for="so_scope_run">Only this run <span class="text-muted">(already done)</span></label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_scope" id="so_scope_job" value="job">
|
||||
<label class="form-check-label" for="so_scope_job">This job, same error message</label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_scope" id="so_scope_global" value="global">
|
||||
<label class="form-check-label" for="so_scope_global">All jobs with same software/type and error</label>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="so_duration_group" class="mb-3" style="display:none;">
|
||||
<label class="form-label fw-semibold">Duration</label>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_duration" id="so_dur_1w" value="1w" checked>
|
||||
<label class="form-check-label" for="so_dur_1w">1 week</label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_duration" id="so_dur_1m" value="1m">
|
||||
<label class="form-check-label" for="so_dur_1m">1 month</label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="so_duration" id="so_dur_perm" value="permanent">
|
||||
<label class="form-check-label" for="so_dur_perm">Permanent <span class="text-muted">(until manually disabled)</span></label>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div id="so_error_group" class="mb-3" style="display:none;">
|
||||
<label class="form-label fw-semibold">Error text to match</label>
|
||||
<textarea class="form-control form-control-sm" id="so_error_text" rows="2" readonly></textarea>
|
||||
<div class="form-text">Runs with this error text will be automatically treated as success.</div>
|
||||
</div>
|
||||
|
||||
<div id="so_scope_info" class="mb-3" style="display:none;">
|
||||
<div class="small text-muted">
|
||||
<span id="so_info_software"></span>
|
||||
<span id="so_info_job"></span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="mb-2">
|
||||
<label class="form-label fw-semibold">Comment <span class="text-muted fw-normal">(optional)</span></label>
|
||||
<input type="text" class="form-control form-control-sm" id="so_comment" placeholder="e.g. Known issue, VSS timeout">
|
||||
</div>
|
||||
|
||||
<div id="so_status" class="small text-danger mt-2" style="display:none;"></div>
|
||||
</div>
|
||||
<div class="modal-footer">
|
||||
<button type="button" class="btn btn-secondary" data-bs-dismiss="modal">No thanks</button>
|
||||
<button type="button" class="btn btn-primary" id="so_apply_btn" disabled>Apply broader override</button>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
||||
|
||||
<div class="modal fade" id="autotaskLinkModal" tabindex="-1" aria-labelledby="autotaskLinkModalLabel" aria-hidden="true">
|
||||
@ -968,6 +1040,97 @@ table.addEventListener('change', function (e) {
|
||||
});
|
||||
}
|
||||
|
||||
// --- Smart Override follow-up dialog logic ---
|
||||
var soModal = document.getElementById('smartOverrideModal');
|
||||
var soApplyBtn = document.getElementById('so_apply_btn');
|
||||
var soDurationGroup = document.getElementById('so_duration_group');
|
||||
var soErrorGroup = document.getElementById('so_error_group');
|
||||
var soScopeInfo = document.getElementById('so_scope_info');
|
||||
var soErrorText = document.getElementById('so_error_text');
|
||||
var soComment = document.getElementById('so_comment');
|
||||
var soStatus = document.getElementById('so_status');
|
||||
var soInfoSoftware = document.getElementById('so_info_software');
|
||||
var soInfoJob = document.getElementById('so_info_job');
|
||||
var _soRunInfo = null;
|
||||
|
||||
// Show/hide duration and error fields based on scope selection
|
||||
var soScopeRadios = document.querySelectorAll('input[name="so_scope"]');
|
||||
soScopeRadios.forEach(function (radio) {
|
||||
radio.addEventListener('change', function () {
|
||||
var v = this.value;
|
||||
var isBroader = (v === 'job' || v === 'global');
|
||||
soDurationGroup.style.display = isBroader ? '' : 'none';
|
||||
soErrorGroup.style.display = isBroader ? '' : 'none';
|
||||
soScopeInfo.style.display = isBroader ? '' : 'none';
|
||||
soApplyBtn.disabled = !isBroader;
|
||||
if (soStatus) soStatus.style.display = 'none';
|
||||
|
||||
if (_soRunInfo) {
|
||||
if (v === 'job') {
|
||||
soInfoJob.textContent = 'Job: ' + (_soRunInfo.job_name || '') + ' (' + (_soRunInfo.customer_name || '') + ')';
|
||||
soInfoSoftware.textContent = '';
|
||||
} else if (v === 'global') {
|
||||
soInfoSoftware.textContent = 'Software: ' + (_soRunInfo.backup_software || '') + ' / ' + (_soRunInfo.backup_type || '');
|
||||
soInfoJob.textContent = '';
|
||||
}
|
||||
}
|
||||
});
|
||||
});
|
||||
|
||||
function _showSmartOverrideDialog(runInfo) {
|
||||
_soRunInfo = runInfo;
|
||||
// Reset state
|
||||
document.getElementById('so_scope_run').checked = true;
|
||||
document.getElementById('so_dur_1w').checked = true;
|
||||
soDurationGroup.style.display = 'none';
|
||||
soErrorGroup.style.display = 'none';
|
||||
soScopeInfo.style.display = 'none';
|
||||
soApplyBtn.disabled = true;
|
||||
if (soComment) soComment.value = '';
|
||||
if (soStatus) soStatus.style.display = 'none';
|
||||
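if (soErrorText) {
// Show the fallback as placeholder text so it is never submitted as the match string.
soErrorText.value = runInfo.error_text || '';
soErrorText.placeholder = '(no error text available)';
}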
|
||||
|
||||
_modalShow(soModal);
|
||||
}
|
||||
|
||||
if (soApplyBtn) {
|
||||
soApplyBtn.addEventListener('click', function () {
|
||||
if (!_soRunInfo) return;
|
||||
var scope = (document.querySelector('input[name="so_scope"]:checked') || {}).value || 'run';
|
||||
if (scope === 'run') return; // Nothing to do
|
||||
|
||||
var duration = (document.querySelector('input[name="so_duration"]:checked') || {}).value || '1w';
|
||||
var comment = (soComment ? soComment.value : '').trim();
|
||||
var errorText = (soErrorText ? soErrorText.value : '').trim();
|
||||
|
||||
soApplyBtn.disabled = true;
|
||||
if (soStatus) soStatus.style.display = 'none';
|
||||
|
||||
apiJson('/api/run-checks/mark-success-override', {
|
||||
method: 'POST',
|
||||
body: JSON.stringify({
|
||||
run_id: _soRunInfo.run_id,
|
||||
scope: scope,
|
||||
duration: duration,
|
||||
comment: comment || undefined,
|
||||
error_text: errorText || undefined
|
||||
})
|
||||
})
|
||||
.then(function (j) {
|
||||
if (!j || j.status !== 'ok') throw new Error((j && j.message) || 'Failed');
|
||||
_modalHide(soModal);
|
||||
window.location.reload();
|
||||
})
|
||||
.catch(function (e) {
|
||||
if (soStatus) {
|
||||
soStatus.textContent = (e && e.message) ? e.message : 'Failed to create override.';
|
||||
soStatus.style.display = '';
|
||||
}
|
||||
soApplyBtn.disabled = false;
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
if (btnMarkSuccessOverride) {
|
||||
btnMarkSuccessOverride.addEventListener('click', function () {
|
||||
if (!currentRunId) return;
|
||||
@ -978,7 +1141,14 @@ table.addEventListener('change', function (e) {
|
||||
})
|
||||
.then(function (j) {
|
||||
if (!j || j.status !== 'ok') throw new Error((j && j.message) || 'Failed');
|
||||
// If run_info is returned, offer the smart override dialog
|
||||
if (j.run_info) {
|
||||
// Hide main modal first, then show follow-up
|
||||
_modalHide(document.getElementById('runChecksModal'));
|
||||
_showSmartOverrideDialog(j.run_info);
|
||||
} else {
|
||||
window.location.reload();
|
||||
}
|
||||
})
|
||||
.catch(function (e) {
|
||||
alert((e && e.message) ? e.message : 'Failed to mark as success (override).');
|
||||
@ -987,6 +1157,17 @@ table.addEventListener('change', function (e) {
|
||||
});
|
||||
}
|
||||
|
||||
// When smart override dialog is dismissed without action, reload to reflect the initial override.
|
||||
// Use a permanent listener (not .one()) since this dialog can be opened multiple times.
|
||||
if (soModal) {
|
||||
soModal.addEventListener('hidden.bs.modal', function () {
|
||||
if (_soRunInfo) {
|
||||
_soRunInfo = null;
|
||||
window.location.reload();
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
function renderAlerts(payload) {
|
||||
var box = document.getElementById('rcm_alerts');
|
||||
if (!box) return;
|
||||
|
||||
@ -26,6 +26,9 @@
|
||||
<li class="nav-item">
|
||||
<a class="nav-link {% if section == 'maintenance' %}active{% endif %}" href="{{ url_for('main.settings', section='maintenance') }}">Maintenance</a>
|
||||
</li>
|
||||
<li class="nav-item">
|
||||
<a class="nav-link {% if section == 'testing' %}active{% endif %}" href="{{ url_for('main.settings', section='testing') }}">Testing</a>
|
||||
</li>
|
||||
<li class="nav-item">
|
||||
<a class="nav-link {% if section == 'news' %}active{% endif %}" href="{{ url_for('main.settings', section='news') }}">News</a>
|
||||
</li>
|
||||
@ -562,6 +565,36 @@
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<hr class="my-3" />
|
||||
<h6 class="mb-2">Workstation offline detection</h6>
|
||||
<div class="form-text mb-2">
|
||||
Cove workstations are often powered off outside business hours, which would
otherwise cause false-positive missed-run alerts, so schedule-based missed-run
detection is always disabled for them. With this option enabled, a separate
check uses the 28-day Cove colorbar to flag a job only when the workstation
has been inactive for the configured number of consecutive days.
|
||||
</div>
|
||||
<div class="row g-3">
|
||||
<div class="col-md-4">
|
||||
<div class="form-check form-switch mt-2">
|
||||
<input class="form-check-input" type="checkbox" id="cove_offline_detection_enabled" name="cove_offline_detection_enabled" {% if settings.cove_offline_detection_enabled %}checked{% endif %} />
|
||||
<label class="form-check-label" for="cove_offline_detection_enabled">Enable colorbar-based offline detection</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col-md-4">
|
||||
<label for="cove_workstation_warning_days" class="form-label">Warning after N inactive days</label>
|
||||
<input type="number" class="form-control" id="cove_workstation_warning_days" name="cove_workstation_warning_days"
|
||||
value="{{ settings.cove_workstation_warning_days or 7 }}" min="1" max="28" />
|
||||
<div class="form-text">Consecutive days without a successful backup before a Warning is raised.</div>
|
||||
</div>
|
||||
<div class="col-md-4">
|
||||
<label for="cove_workstation_error_days" class="form-label">Error after N inactive days</label>
|
||||
<input type="number" class="form-control" id="cove_workstation_error_days" name="cove_workstation_error_days"
|
||||
value="{{ settings.cove_workstation_error_days or 14 }}" min="1" max="28" />
|
||||
<div class="form-text">Consecutive days without a successful backup before an Error is raised.</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="d-flex justify-content-between align-items-center mt-3">
|
||||
<div id="cove-test-result" class="small"></div>
|
||||
<div class="d-flex gap-2">
|
||||
@ -750,26 +783,10 @@
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="col-12 col-lg-6">
|
||||
<div class="card h-100 border-info">
|
||||
<div class="card-header bg-info text-white">Generate test emails</div>
|
||||
<div class="card-body">
|
||||
<p class="mb-3">Generate Veeam test emails in the inbox for testing parsers and maintenance operations. Each button creates 1 Veeam Backup Job email with the specified status.</p>
|
||||
<div class="d-flex flex-column gap-2">
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='success') }}">
|
||||
<button type="submit" class="btn btn-success w-100">Generate success email (1)</button>
|
||||
</form>
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='warning') }}">
|
||||
<button type="submit" class="btn btn-warning w-100">Generate warning email (1)</button>
|
||||
</form>
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='error') }}">
|
||||
<button type="submit" class="btn btn-danger w-100">Generate error email (1)</button>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
</div>
|
||||
|
||||
<div class="row g-4 mb-4">
|
||||
<div class="col-12 col-lg-6">
|
||||
<div class="card h-100 border-danger">
|
||||
<div class="card-header bg-danger text-white">Jobs maintenance</div>
|
||||
@ -805,6 +822,74 @@
|
||||
</div>
|
||||
|
||||
|
||||
{% endif %}
|
||||
|
||||
{% if section == 'testing' %}
|
||||
|
||||
<div class="card mb-4 border-primary">
|
||||
<div class="card-header bg-primary text-white">Generate test run</div>
|
||||
<div class="card-body">
|
||||
<p class="text-muted mb-3">Generate a single test run with objects for testing the Smart Override flow in Run Checks. Each click creates one run for a fixed test job (<code>__test-override-job__</code>).</p>
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_run') }}">
|
||||
<div class="row g-3 align-items-end">
|
||||
<div class="col-md-3">
|
||||
<label class="form-label fw-semibold">Status</label>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="status" id="tr_success" value="success">
|
||||
<label class="form-check-label" for="tr_success">Success</label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="status" id="tr_warning" value="warning">
|
||||
<label class="form-check-label" for="tr_warning">Warning</label>
|
||||
</div>
|
||||
<div class="form-check">
|
||||
<input class="form-check-input" type="radio" name="status" id="tr_failed" value="failed" checked>
|
||||
<label class="form-check-label" for="tr_failed">Failed</label>
|
||||
</div>
|
||||
</div>
|
||||
<div class="col-md-5" id="tr_error_group">
|
||||
<label class="form-label fw-semibold">Error scenario</label>
|
||||
<select class="form-select form-select-sm" name="error_scenario" id="tr_error_scenario">
|
||||
<option value="vss">VSS snapshot error (0x800423f4)</option>
|
||||
<option value="connection">Connection timeout</option>
|
||||
<option value="diskspace">Low disk space on datastore</option>
|
||||
<option value="license">License expired</option>
|
||||
<option value="network">Network transfer timeout</option>
|
||||
<option value="permission">Access denied / permissions</option>
|
||||
<option value="custom">Custom error message...</option>
|
||||
</select>
|
||||
<input type="text" class="form-control form-control-sm mt-2" name="custom_error" id="tr_custom_error" placeholder="Enter custom error message" style="display:none;">
|
||||
</div>
|
||||
<div class="col-md-4">
|
||||
<button type="submit" class="btn btn-primary">Generate test run</button>
|
||||
</div>
|
||||
</div>
|
||||
</form>
|
||||
<hr class="my-3">
|
||||
<form method="post" action="{{ url_for('main.settings_cleanup_test_runs') }}" onsubmit="return confirm('Delete all test runs?');" class="d-inline">
|
||||
<button type="submit" class="btn btn-outline-danger btn-sm">Delete all test runs</button>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card mb-4 border-info">
|
||||
<div class="card-header bg-info text-white">Generate test emails</div>
|
||||
<div class="card-body">
|
||||
<p class="mb-3">Generate Veeam test emails in the inbox for testing parsers and inbox flow. Each button creates 1 Veeam Backup Job email with the specified status.</p>
|
||||
<div class="d-flex flex-column gap-2" style="max-width: 300px;">
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='success') }}">
|
||||
<button type="submit" class="btn btn-success w-100">Generate success email</button>
|
||||
</form>
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='warning') }}">
|
||||
<button type="submit" class="btn btn-warning w-100">Generate warning email</button>
|
||||
</form>
|
||||
<form method="post" action="{{ url_for('main.settings_generate_test_emails', status_type='error') }}">
|
||||
<button type="submit" class="btn btn-danger w-100">Generate error email</button>
|
||||
</form>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{% endif %}
|
||||
|
||||
{% if section == 'general' %}
|
||||
@ -1083,4 +1168,31 @@
|
||||
|
||||
{% endif %}
|
||||
|
||||
<script>
|
||||
(function () {
|
||||
var statusRadios = document.querySelectorAll('input[name="status"]');
|
||||
var errorGroup = document.getElementById('tr_error_group');
|
||||
var errorScenario = document.getElementById('tr_error_scenario');
|
||||
var customError = document.getElementById('tr_custom_error');
|
||||
|
||||
if (!statusRadios.length || !errorGroup) return;
|
||||
|
||||
function updateErrorVisibility() {
|
||||
var selected = document.querySelector('input[name="status"]:checked');
|
||||
var isSuccess = selected && selected.value === 'success';
|
||||
errorGroup.style.display = isSuccess ? 'none' : '';
|
||||
}
|
||||
|
||||
statusRadios.forEach(function (r) { r.addEventListener('change', updateErrorVisibility); });
|
||||
updateErrorVisibility();
|
||||
|
||||
if (errorScenario && customError) {
|
||||
errorScenario.addEventListener('change', function () {
|
||||
customError.style.display = this.value === 'custom' ? '' : 'none';
|
||||
if (this.value === 'custom') customError.focus();
|
||||
});
|
||||
}
|
||||
})();
|
||||
</script>
|
||||
|
||||
{% endblock %}
|
||||
|
||||
@ -1,6 +1,6 @@
# Technical Notes (Internal)

Last updated: 2026-03-23
Last updated: 2026-04-16

## Purpose
Internal technical snapshot of the `backupchecks` repository for faster onboarding, troubleshooting, and change impact analysis.
@ -337,9 +337,63 @@ Cove run rows in the job detail history table are clickable even without a mail
- Once the first complete success exists on that day, all newer Cove runs for the same day are hidden in Run Checks (overview aggregation + details modal), regardless of status (`Success`, `Warning`, `Failed/Error`).
- Sort order in the modal remains unchanged (`newest -> oldest`).
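
The same-day suppression rule can be sketched as below. This is a minimal illustration, not the code in `routes_run_checks.py`: the `CoveRun` class, the `visible_cove_runs` function, and the assumption that `run_at` is already in local time are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CoveRun:
    run_id: int
    run_at: datetime        # assumed to be converted to local time already
    status: str
    complete_success: bool  # run is Success and every object is Success


def visible_cove_runs(runs):
    """Drop runs that are newer than the first complete success of their day."""
    first_success_at = {}
    for run in sorted(runs, key=lambda r: r.run_at):
        day = run.run_at.date()
        if run.complete_success and day not in first_success_at:
            first_success_at[day] = run.run_at

    kept = []
    for run in runs:
        cutoff = first_success_at.get(run.run_at.date())
        if cutoff is None or run.run_at <= cutoff:
            kept.append(run)
    # The modal lists runs newest -> oldest.
    return sorted(kept, key=lambda r: r.run_at, reverse=True)


runs = [
    CoveRun(1, datetime(2026, 4, 13, 8, 0), "Failed", False),
    CoveRun(2, datetime(2026, 4, 13, 10, 0), "Success", True),
    CoveRun(3, datetime(2026, 4, 13, 12, 0), "Warning", False),
    CoveRun(4, datetime(2026, 4, 14, 9, 0), "Failed", False),
]
visible_ids = [r.run_id for r in visible_cove_runs(runs)]
```

Runs recorded before the first complete success stay visible; only the later repeats on that day are hidden.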

### Workstation Offline Handling (added 2026-05-01)
Two layered behaviours reduce false-positive alerts for PCs that are routinely powered off:

1. **Schedule-based missed-run skip (always on)** — `_ensure_missed_runs_for_job` in `routes_run_checks.py` early-returns for Cove + Workstation jobs. Servers and Microsoft 365 keep the existing missed-run generation. Real Cove statuses (Failed/Warning/Not started) continue to surface through the normal import path.
2. **Colorbar-based offline detection (toggle in Settings → Integrations → Cove)** — when `cove_offline_detection_enabled = True`, `cove_importer._apply_offline_detection_for_workstations(settings)` runs once per import cycle. For every linked Cove workstation job it parses `cove_acc.colorbar_28d`, counts the trailing streak of `0` codes (no backup that day), and:
   - `streak >= cove_workstation_error_days` → upsert synthetic `JobRun` with `status = "Error"`.
   - `streak >= cove_workstation_warning_days` (and below error) → `status = "Warning"`.
   - Otherwise → drop any existing unreviewed offline run for the job.

The synthetic run uses a **stable external_id** `cove-offline-{account_id}` so it escalates in place (Warning → Error) and is removed when activity resumes. Reviewed runs (`reviewed_at IS NOT NULL`) are never mutated or deleted, so previously acknowledged alerts do not reappear.
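
The streak counting and threshold check described above might look roughly like this. It is a sketch under stated assumptions: the colorbar is treated as a plain string with one character per day, oldest day first, which may not match the real `colorbar_28d` format, and both function names are hypothetical.

```python
def trailing_offline_streak(colorbar: str) -> int:
    """Count consecutive '0' codes (no backup that day) at the newest end."""
    streak = 0
    for code in reversed(colorbar):  # assumption: the last character is the newest day
        if code != "0":
            break
        streak += 1
    return streak


def classify_offline(colorbar: str, warning_days: int = 7, error_days: int = 14):
    """Return 'Error', 'Warning', or None when the workstation counts as active."""
    streak = trailing_offline_streak(colorbar)
    if streak >= error_days:
        return "Error"
    if streak >= warning_days:
        return "Warning"
    return None


week_offline = classify_offline("5" * 21 + "0" * 7)
fortnight_offline = classify_offline("5" * 14 + "0" * 14)
active_again = classify_offline("0" * 27 + "5")
```

The error threshold is checked first so that a streak long enough for both levels is reported as an Error. The defaults of 7 and 14 mirror the defaults of the settings form.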

### Migrations
- `migrate_cove_integration()` — adds 8 columns to `system_settings`, `cove_account_id` to `jobs`, `source_type` + `external_id` to `job_runs`, dedup index on `job_runs.external_id`
- `migrate_cove_accounts_table()` — creates `cove_accounts` table with indexes
- `migrate_cove_offline_detection()` — adds `cove_offline_detection_enabled`, `cove_workstation_warning_days`, `cove_workstation_error_days` to `system_settings`

---

## Override System

### Overview
Overrides allow operators to mark specific backup runs (or patterns of runs) as success, suppressing repeated alerts for known non-critical issues. The system operates at two levels: **object-level** (scoped to a specific job) and **global** (scoped to backup software/type).

### Override Model (`overrides` table)
- `level`: `"global"` or `"object"`
- `backup_software`, `backup_type`: scope for global overrides (nullable)
- `job_id`, `object_name`: scope for object overrides (nullable)
- `match_status`, `match_error_contains`, `match_error_mode`: matching criteria (`contains`/`exact`/`starts_with`/`ends_with`)
- `treat_as_success`, `active`: behavior flags
- `start_at`, `end_at`: validity window (`end_at=NULL` for permanent overrides)
- `comment`, `created_by`, `updated_by`: audit metadata

### Override Evaluation (`_apply_overrides_to_run` in `routes_shared.py`)
- Called per run to determine display status
- Object-level overrides take precedence over global overrides
- Uses wildcard matching (fnmatch) for `object_name`, configurable text matching for errors
- Returns 5-tuple: `(display_status, override_applied, override_level, override_id, override_reason)`
- `_recompute_override_flags_for_runs()` batch-updates `JobRun.override_applied` flag after override changes
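
The precedence and matching rules above can be illustrated with a simplified sketch. It is not `_apply_overrides_to_run`: overrides and runs are plain dicts here, `match_status`, `active`, and the validity window are ignored, a `None` software/type is assumed to mean "any", and the helper names are hypothetical. `fnmatchcase` is used so the wildcard match behaves the same on every platform.

```python
from fnmatch import fnmatchcase


def error_matches(error_text, pattern, mode="contains"):
    """Apply one of the four text-matching modes (case-sensitive here)."""
    if mode == "exact":
        return error_text == pattern
    if mode == "starts_with":
        return error_text.startswith(pattern)
    if mode == "ends_with":
        return error_text.endswith(pattern)
    return pattern in error_text  # "contains"


def override_applies(ov, run):
    if ov["level"] == "object":
        if ov["job_id"] != run["job_id"]:
            return False
        name_pattern = ov.get("object_name")
        if name_pattern and not fnmatchcase(run["object_name"], name_pattern):
            return False
    else:  # "global": assumption that a None scope field matches anything
        if ov.get("backup_software") not in (None, run["backup_software"]):
            return False
        if ov.get("backup_type") not in (None, run["backup_type"]):
            return False
    return error_matches(run["error_text"], ov["match_error_contains"], ov["match_error_mode"])


def pick_override(overrides, run):
    """Object-level overrides win over global ones."""
    for level in ("object", "global"):
        for ov in overrides:
            if ov["level"] == level and override_applies(ov, run):
                return ov
    return None


overrides = [
    {"id": 1, "level": "global", "backup_software": "Veeam", "backup_type": None,
     "match_error_contains": "VSS", "match_error_mode": "contains"},
    {"id": 2, "level": "object", "job_id": 7, "object_name": "VM-*",
     "match_error_contains": "VSS snapshot", "match_error_mode": "starts_with"},
]
run = {"job_id": 7, "object_name": "VM-SQL01", "backup_software": "Veeam",
       "backup_type": "Backup Job", "error_text": "VSS snapshot error (0x800423f4)"}
chosen = pick_override(overrides, run)
```

Both overrides match this run, and the object-level one (`id` 2) is chosen because the object level is scanned first.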

### Smart Overrides — Phase 1 (2026-04-16)
"Mark as Success" in Run Checks now offers a follow-up dialog to create broader overrides for recurring issues.

**API** (`POST /api/run-checks/mark-success-override`):
- `scope` parameter: `"run"` (default, ±1 min window), `"job"` (object-level on job_id + error match), `"global"` (software + type + error match)
- `duration` parameter: `"once"` (±1 min), `"1w"`, `"1m"`, `"permanent"` (end_at=NULL)
- `error_text` parameter: pre-filled from problem objects, can be adjusted by operator
- `comment` parameter: optional operator note
- For `scope=run`: returns `run_info` object with error_text and job metadata so frontend can populate the follow-up dialog
- For `scope=job/global`: creates the broader override and audit-logs it with `event_type=override_from_review`
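
One way the `duration` values could map onto an override's validity window is sketched below. The function name `override_window` is hypothetical (the repository's own helper is `_duration_to_end_at()`, whose body is not shown here), and treating `"1m"` as 30 days and starting longer windows at the run time are assumptions.

```python
from datetime import datetime, timedelta

# Assumption: "1m" is treated as 30 days; the real helper may use calendar months.
_DURATIONS = {"1w": timedelta(weeks=1), "1m": timedelta(days=30)}


def override_window(duration, run_at):
    """Return (start_at, end_at) for an override; end_at is None when permanent."""
    if duration == "once":
        # A +/- 1 minute window around the run itself.
        return run_at - timedelta(minutes=1), run_at + timedelta(minutes=1)
    if duration == "permanent":
        return run_at, None
    if duration not in _DURATIONS:
        raise ValueError("unknown duration: %r" % (duration,))
    return run_at, run_at + _DURATIONS[duration]


run_at = datetime(2026, 4, 16, 9, 30)
once_window = override_window("once", run_at)
week_window = override_window("1w", run_at)
permanent_window = override_window("permanent", run_at)
```

Rejecting unknown values with a `ValueError` keeps a mistyped duration from silently becoming a permanent override.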

**Frontend flow** (`run_checks.html`):
1. Operator clicks "Mark as Success" → scope=run override created → follow-up modal appears
2. Modal shows scope choice (run/job/global), duration choice (1w/1m/permanent), error text preview
3. If operator chooses a broader scope → second API call creates the additional override
4. If operator dismisses → page reloads to reflect the initial run-level override

**No database migrations required** — all fields already exist on the Override model.

---

@ -735,6 +789,20 @@ File: `build-and-push.sh`

## Recent Changes

### 2026-04-16
- **Smart Overrides Phase 1** (`main/routes_run_checks.py`, `run_checks.html`):
  - Extended `mark-success-override` API with `scope` (run/job/global) and `duration` (once/1w/1m/permanent) parameters.
  - Added follow-up dialog modal in Run Checks for creating broader overrides after "Mark as Success".
  - Broader overrides (job/global) are audit-logged with `event_type=override_from_review`.
  - Restored "Mark as Success" button in Run Checks modal footer.
  - Extracted helper functions: `_get_run_error_text()`, `_fetch_problem_objects()`, `_obj_is_problem()`, `_duration_to_end_at()`.
- **Test run generator** (`main/routes_settings.py`, `settings.html`):
  - `POST /settings/test-run/generate`: creates a single `JobRun` with 3 objects in `run_object_links` for a fixed test job (`__test-override-job__` under `__Test Customer__`). Operator chooses status (success/warning/failed) and error scenario (VSS, connection timeout, disk space, license, network, permissions, or custom text).
  - `POST /settings/test-run/cleanup`: deletes all runs and `customer_objects` for the test job.
  - New card on Settings → Maintenance with status radio buttons, error scenario dropdown, and cleanup button.
- **Bug fix: inactive customer filter** (`main/routes_run_checks.py`, `main/routes_search.py`):
  - Run Checks overview query was missing `Customer.active` filter — jobs for inactive customers were still shown. Added to both the overview aggregation query and the Search page Run Checks section.

### 2026-04-13
- **Run Checks Cove daily suppression** (`main/routes_run_checks.py`):
  - Added Cove-specific filtering to suppress repeated same-day runs after the first complete success run.
@ -1,6 +1,61 @@
# Changelog - Claude Code
# Changelog - Develop

This file documents all changes made to this project via Claude Code.
This file is the long-form append-only development log. It captures every change in detail and is summarised into `docs/changelog.md` at release time. Release markers (`## YYYY-MM-DD — Released as vX.Y.Z`) indicate which entries are already published.

## 2026-05-01 — Released as v0.3.0

## [2026-05-01]

### Documentation
- In-app documentation refresh after several months without updates:
  - **`integrations/cove-data-protection.html`**: added "Workstation Offline Handling" section covering the schedule-missed exclusion (always-on) and the new colorbar-based offline detection toggle with warning/error day thresholds. Same-day duplicate-suppression behaviour now documented.
  - **`backup-review/daily-jobs.html`** and **`customers-jobs/job-schedules.html`**: noted that Cove workstation jobs are excluded from schedule-based missed-run detection.
  - **`backup-review/run-checks-modal.html`**: documented the Cove and Cloud Connect summary panels shown for API-imported runs (mail section is hidden), Cove same-day suppression, and the Smart Overrides Phase 1 "Apply override for future runs?" follow-up dialog after Mark as Success (scope + duration choices).
  - **`backup-review/overrides.html`**: added "Creating Overrides Directly From Run Checks" section.
  - **`mail-import/inbox-management.html`**: noted that messages linked to archived jobs are now hidden from the inbox; cross-linked Cove Accounts and Cloud Connect Accounts staging pages.
  - **`getting-started/what-is-backupchecks.html`**: removed stale "Coming Soon" callouts; expanded supported-software table to reflect actual parsers (Veeam, Veeam Cloud Connect, Cove API, NAKIVO, Synology, QNAP, Syncovery, BoxAfe, R-Drive, 3CX, NTFS Auditing) and architecture overview now lists Cove + Cloud Connect components.
  - **`users/login-authentication.html`**: removed "No SSO/OAuth" claim; added Microsoft Entra SSO as an authentication method with an explicit "implemented but not yet validated in production" caveat. Captcha section corrected to reflect the existing `login_captcha_enabled` setting.
  - **`settings/maintenance.html`**: added "Generate Test Run" subsection (test job auto-creation, status + scenario choices, "Delete all test runs" action).
  - **`autotask/setup-configuration.html`**: aligned terminology with the Settings page (`Backupchecks Base URL`).

### Changed
- Cove workstation jobs no longer receive schedule-based "Missed" runs. Workstations are commonly powered off outside business hours; the synthetic missed-run generator (`_ensure_missed_runs_for_job` in `routes_run_checks.py`) now early-returns for jobs where `backup_software == "Cove Data Protection"` and `backup_type == "Workstation"`. Servers and Microsoft 365 Cove jobs remain unaffected. Real Cove statuses (Failed / Warning / Not started) still surface via the normal import flow.

### Added
- Cove workstation **offline detection** based on the 28-day colorbar (`D09F08`), configurable in Settings → Integrations → Cove:
  - **Enable colorbar-based offline detection** (off by default).
  - **Warning after N inactive days** (default 7, range 1–28).
  - **Error after N inactive days** (default 14, range 1–28).
- When enabled, `cove_importer.run_cove_import` runs `_apply_offline_detection_for_workstations` once per cycle. For each linked Cove workstation job it counts the consecutive trailing colorbar days with status `0` (no backup). When the streak crosses the warning/error threshold, a synthetic `JobRun` is upserted with stable `external_id = "cove-offline-{account_id}"` (so the same row escalates Warning → Error and is removed once activity resumes). Reviewed runs are never mutated, so acknowledged offline alerts do not reappear.
- New `system_settings` columns: `cove_offline_detection_enabled` (bool), `cove_workstation_warning_days` (int), `cove_workstation_error_days` (int) — added by `migrate_cove_offline_detection()`.
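
The "escalate in place" behaviour of the stable `external_id` can be shown with an in-memory stand-in for the `job_runs` table. This is only a sketch: the dict store, the `apply_offline_state` name, and the choice to leave a reviewed row completely untouched are assumptions, not the importer's actual code.

```python
def apply_offline_state(runs_by_external_id, account_id, status):
    """Upsert, escalate, or remove the synthetic offline run for one account.

    status is "Warning", "Error", or None when the workstation is active again.
    """
    external_id = "cove-offline-%s" % account_id
    existing = runs_by_external_id.get(external_id)

    # Reviewed runs are never mutated or deleted.
    if existing is not None and existing["reviewed_at"] is not None:
        return existing

    if status is None:
        runs_by_external_id.pop(external_id, None)
        return None

    if existing is None:
        existing = {"external_id": external_id, "status": status, "reviewed_at": None}
        runs_by_external_id[external_id] = existing
    else:
        existing["status"] = status  # same row escalates Warning -> Error
    return existing


store = {}
apply_offline_state(store, 42, "Warning")
apply_offline_state(store, 42, "Error")
rows_after_escalation = len(store)
status_after_escalation = store["cove-offline-42"]["status"]
apply_offline_state(store, 42, None)
rows_after_recovery = len(store)
```

Because the key never changes for a given account, repeated import cycles update one row instead of piling up a new alert per day.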

## [2026-04-24]

### Fixed
- Inbox no longer lists mail messages linked to archived jobs. The inbox query now excludes messages whose `job_id` points to a job with `archived=True` (messages without a linked job remain visible).

## [2026-04-16]

### Added
- Smart Overrides Phase 1: "Apply override for future runs?" follow-up dialog after Mark as Success in Run Checks:
  - After marking a run as success, a follow-up dialog offers to create a broader override for future occurrences of the same error.
  - **Scope options**: "Only this run" (default, existing behavior), "This job, same error message" (object-level override), "All jobs with same software/type and error" (global override).
  - **Duration options**: 1 week, 1 month, or permanent (until manually disabled).
  - Error text is pre-filled from the run's problem objects and shown in the dialog for operator review.
  - Broader overrides (job/global scope) are audit-logged with `event_type=override_from_review` including scope, duration, and source run details.
  - No database migrations required — uses existing Override model fields (`level`, `match_error_contains`, `start_at`, `end_at`).
- Restored "Mark as Success" button in Run Checks modal footer (was previously removed from HTML but still referenced in JavaScript).

### Added
- Settings → Maintenance: new "Generate test run (override testing)" card:
  - Generates a single JobRun with 3 objects in `run_object_links` for a fixed test job (`__test-override-job__`).
  - Operator chooses status (Success/Warning/Failed) and error scenario (VSS, connection timeout, disk space, license, network, permissions, or custom).
  - Test job and customer are auto-created on first use.
  - "Delete all test runs" button cleans up all generated test data.

### Fixed
- Run Checks overview was not filtering out jobs belonging to inactive customers; only the missed-run sweep query had the `Customer.active` filter. Added `Customer.active` filter to the overview aggregation query.
- Search page Run Checks section had the same missing `Customer.active` filter; now consistent with other views.

## [2026-04-13]

@ -1,3 +1,33 @@
## v0.3.0

This release bundles all changes made since `v0.2.5`. Highlights: Smart Overrides Phase 1 (create overrides directly from Run Checks), Cove workstation offline handling (no more false-positive Missed alerts for powered-off PCs, plus an optional colorbar-based offline-detection toggle), a test-run generator in Settings → Maintenance, and a full in-app documentation refresh.

### Added
- **Smart Overrides Phase 1** — after marking a Run Checks run as Success, a follow-up dialog offers to create a broader override for future occurrences:
  - Scope: only this run, this job + same error, or all jobs with same software/type + error.
  - Duration: 1 week, 1 month, or permanent.
  - Error text is pre-filled from the run's problem objects; broader overrides are audit-logged with scope, duration, and source run.
  - No database migration required (uses existing `Override` model).
- **Cove workstation offline detection** (Settings → Integrations → Cove) — optional colorbar-based check that flags Cove workstations as Warning / Error after a configurable number of consecutive inactive days. Disabled by default; thresholds default to 7 / 14 days. Synthetic offline runs use a stable `external_id` per account so they escalate in place and clear automatically once activity resumes; reviewed runs are never mutated.
  - New `system_settings` columns: `cove_offline_detection_enabled`, `cove_workstation_warning_days`, `cove_workstation_error_days` (`migrate_cove_offline_detection`).
- **Settings → Maintenance: Generate test run** card — creates a single `JobRun` with three persisted objects on a fixed test job (`__test-override-job__`) so operators can exercise the Smart Override flow without a real backup. "Delete all test runs" button cleans up.
- Run Checks "Mark as Success" button restored in the modal footer (the JS hook existed but the button was missing).

### Changed
- **Cove workstation jobs are excluded from schedule-based missed-run detection.** Workstations are routinely powered off outside business hours, which produced false-positive Missed alerts. Cove Server and Microsoft 365 jobs are unaffected; real Cove statuses (Failed / Warning / Not started) still surface via the normal import flow.
- In-app documentation refreshed across `getting-started`, `users`, `mail-import`, `integrations` (Cove), `settings`, `backup-review`, `customers-jobs`, and `autotask` to match current behaviour. Includes Smart Overrides documentation, Cove/Cloud Connect summary panels in the Run Checks modal, archived-jobs filter on the Inbox, captcha & Entra SSO authentication options (SSO marked as implemented but not yet validated in production), expanded supported-software list, and Cove offline-handling sections.

### Fixed
- **Run Checks Cove same-day suppression** — once the first complete success run for a Cove job/day is recorded (status Success with all object statuses Success), all newer Cove runs on that same local day are hidden from Run Checks (overview aggregation + modal details), regardless of status. Sort order in the modal stays newest → oldest.
- **Inbox excludes mail messages from archived jobs** — messages whose `job_id` points to an archived job are filtered out (messages without a linked job remain visible).
- **Run Checks / Search overview filters out inactive customers** — both queries now apply the `Customer.active` filter that was previously only on the missed-run sweep.

### Build / Tooling
- Adopted the shared `docker-build-and-push` script from `/docker/develop/shared-integrations/tooling/`. The new contract:
  - Modes are `t` (test → push `:dev`) and `r` (release → push `:<version>`, `:dev`, `:latest`); bump types `1`/`2`/`3` are no longer used.
  - Release version is read from the first `## vX.Y.Z` heading in this file; the script no longer maintains `version.txt`.
  - The script performs no git operations — commit, tag, and push are run manually after the registry has accepted the images.
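
Reading the release version from the first `## vX.Y.Z` heading could work along these lines. The shared script itself is not shown in this commit, so this Python sketch only illustrates the rule; the function name `read_release_version` and the exact heading pattern are assumptions.

```python
import re

# Assumption: a release heading is exactly "## v<major>.<minor>.<patch>" on its own line.
_RELEASE_HEADING = re.compile(r"^## (v\d+\.\d+\.\d+)[ \t]*$", re.MULTILINE)


def read_release_version(changelog_text: str) -> str:
    """Return the version from the first release heading in the changelog."""
    match = _RELEASE_HEADING.search(changelog_text)
    if match is None:
        raise ValueError("no '## vX.Y.Z' heading found in changelog")
    return match.group(1)


sample = "## v0.3.0\n\nThis release bundles all changes.\n\n## v0.2.5\n\nOlder notes.\n"
current_version = read_release_version(sample)
```

Since the newest release sits at the top of the file, the first match is the version to tag. Failing loudly when no heading is found avoids pushing an image with an empty tag.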

## v0.2.5

This release bundles all changes made since `v0.2.4`, including schedule management improvements, Autotask/remark synchronization, Run Checks stability updates, and a full documentation refresh.

@ -1 +0,0 @@
v0.2.5