MAT-001

No matter selected

Matter Readiness

Track ingestion, metadata quality, extraction confidence, and source-backed outputs.

Complete extraction profile

Idle - no active ingestion

Start a fast scan or full ingestion when you are ready.

0%
0 files remaining
Background work can yield to active tasks
Current file activity: No active file stages
GPU: Intel B580 active
Model: Qwen2.5 14B
Ollama: checking…
CPU: 8/16 cores - 18%
RAM: 9.6 / 24 GB
VRAM: 6.8 / 12 GB
Storage: Browser only
Documents 0

0 ready for retrieval

Indexed passages 0

Page and paragraph anchored

Key facts 0

Dates, names, places, issues

Needs review 0

Low-confidence metadata

Ingestion Queue

Recent Source Findings

Ingest Client Files

Add a NAS folder or local file batch, then run OCR, metadata normalization, embedding, and extraction.

Drop files or choose a batch

PDF, DOCX, TXT, email exports, image scans, and mixed disclosure folders.

NAS source

Navigate and add folders from your network share

No NAS folders added — click Browse NAS to add one.

Pipeline Steps

Idle

Matter Work Queue

Vision OCR

Runs a two-tier vision pass on low-quality scans and image files. GLM-OCR (0.9B) handles typed text; the selected Qwen model handles photographs, handwriting, and complex layouts. Files are flagged automatically during ingestion when word count or confidence is below threshold.

Document Register

Review dates, authors, document types, relevance, and extraction confidence before relying on answers.

Custom Tags

Build the matter tag vocabulary you want to reuse while reviewing documents.

Date | Document | Author | Type | Description | Tags | Privilege | Affidavit | Relevance | Status | Review | Open | Download

Case Theory and Issues

Maintain the living theory of the case so relevance, privilege review, summaries, and chronologies can be reassessed as the pleadings evolve.

Working notes

Matter summary

Live issues

Relevance criteria

Privilege posture

Extraction Profiles

Each profile has its own prompt and stores its own extraction results separately. The active profile drives the chronology and findings displays.

Extraction Prompt

AI Draft Assistance

Requires approval

Suggested Issue Updates

From ingestion

Reassessment History

Theory v1

Matter Scratchpad

Persistent working notes — inject selected sections as context into any query. Auto-synced to NAS as scratchpad.md.

Chronology Builder

Source-backed dates and facts with page and paragraph references for review and export.

Ask the Matter

Answers must cite the document, page, paragraph, and confidence for each important proposition.
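The citation requirement above can be pictured as a small data structure plus a check. This is only a sketch: the class names, field names, and the `uncited` helper are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    document: str      # document name or register ID (hypothetical field)
    page: int          # 1-based page number
    paragraph: int     # 1-based paragraph number on that page
    confidence: float  # assumed to be on a 0.0 to 1.0 scale


@dataclass(frozen=True)
class Proposition:
    text: str
    citations: tuple[Citation, ...] = ()


def uncited(propositions: list[Proposition]) -> list[str]:
    """Return the text of every proposition that has no usable citation."""
    missing = []
    for prop in propositions:
        ok = any(
            c.document
            and c.page >= 1
            and c.paragraph >= 1
            and 0.0 <= c.confidence <= 1.0
            for c in prop.citations
        )
        if not ok:
            missing.append(prop.text)
    return missing
```

A reviewer-facing view could call `uncited(...)` on a drafted answer and flag anything it returns before the answer is relied on.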

Legal Research

Search CanLII, pull metadata and citator data, and add full decision text to any matter — all without leaving this window.

Step 1 — Search CanLII

Not checked

The CanLII API is a metadata-only service — keyword search requires the website. Search opens CanLII in a separate tab. Find your case there, then copy its URL and paste it into Step 2 below.

Open CanLII ↗

Step 2 — Look up a case by URL

Paste any CanLII URL

Paste the URL of any CanLII decision to retrieve its metadata, keywords, and citator links. Every case — including cases that cite it and cases it cites — has its own Add to Matter button to pull the full decision text into your Qdrant index.
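As a rough illustration of the first thing a URL lookup might do, the sketch below splits a pasted decision URL into its path components. It assumes the common path shape `/<lang>/<jurisdiction>/<court>/doc/<year>/<case>/<file>.html`; the function name and the returned keys are hypothetical, and how these components map onto the identifiers sent to the CanLII API is not shown here.

```python
from urllib.parse import urlparse


def parse_canlii_url(url: str) -> dict[str, str]:
    """Split a CanLII decision URL into its path components.

    Assumes the path shape
    /<lang>/<jurisdiction>/<court>/doc/<year>/<case>/<file>.html
    and raises ValueError for anything else.
    """
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host not in ("canlii.org", "www.canlii.org"):
        raise ValueError(f"not a CanLII URL: {url!r}")
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 6 or parts[3] != "doc":
        raise ValueError(f"unrecognized CanLII decision path: {parsed.path!r}")
    lang, jurisdiction, court, _, year, case_id = parts[:6]
    return {
        "lang": lang,
        "jurisdiction": jurisdiction,
        "court": court,
        "year": year,
        "case_id": case_id,
    }
```

Rejecting unrecognized URLs early keeps malformed input from reaching the rate-limited metadata calls.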

Future Modules

Pin planned integrations here so the matter system can grow without losing the core ingestion, retrieval, and review workflow.

Local AI Settings

Swap models and define the local services the backend will use on your Ubuntu VM.

Matter manager

Matter list will appear when the server state loads.

Conversation model

Checking installed local models...

Deep Reasoning automatically uses at least this many tokens of context. Increase it if reasoning is still being cut off; decrease it if you are running out of RAM. Qwen3 14B supports up to 131072 tokens.

Controls how the model's attention cache is stored in VRAM during generation. f16 (default) is full precision and uses the most memory. q8_0 cuts cache memory roughly in half with no meaningful quality difference — recommended if you are hitting VRAM limits or running long contexts. q4_0 halves it again but may slightly degrade coherence on very long answers. Requires Ollama 0.5 or newer; on older versions this setting is silently ignored.
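To make the memory trade-off concrete, here is a back-of-the-envelope estimate of cache size per cache type. The layer count, KV head count, and head dimension are illustrative assumptions for a 14B-class model with grouped-query attention; they are not read from the application or from Ollama. The bytes-per-element figures reflect the usual block layout of these formats (32 values plus a 2-byte scale) and should be treated as approximate.

```python
# Approximate bytes per cached element for each cache type.
# q8_0 and q4_0 store blocks of 32 values plus a 2-byte scale.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}


def kv_cache_bytes(context_tokens, cache_type="f16",
                   layers=48, kv_heads=8, head_dim=128):
    """Estimate KV cache size: keys and values for every layer and token.

    layers, kv_heads, and head_dim are assumed values, not measured ones.
    """
    per_token_elements = 2 * layers * kv_heads * head_dim  # 2 = keys + values
    return int(context_tokens * per_token_elements * BYTES_PER_ELEMENT[cache_type])


for cache_type in BYTES_PER_ELEMENT:
    gib = kv_cache_bytes(32768, cache_type) / 1024**3
    print(f"{cache_type}: {gib:.2f} GiB at 32768 tokens")
```

Under these assumptions the estimate scales linearly with context length, so doubling the context window doubles the cache at any precision.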

Retrieval store

Embedding models are cached on the VM. Building a new index preserves older indexes so you can switch back without losing the earlier ingestion.

Parent window size is fixed at 650 words (~850 tokens) with no overlap. Children are embedded; the matched child's parent window is returned to the LLM as context. Use Re-index clean after changing these settings.
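The parent and child relationship described above can be sketched as follows. The 650-word parent size and the absence of overlap come from the setting description; the child size of 130 words and both function names are hypothetical, since the actual child size is not stated here.

```python
def build_windows(text, parent_words=650, child_words=130):
    """Split text into non-overlapping parent windows, each cut into children.

    Returns (parents, children), where each child is (parent_index, child_text).
    child_words=130 is an assumed value for illustration only.
    """
    words = text.split()
    parents, children = [], []
    for start in range(0, len(words), parent_words):
        window = words[start:start + parent_words]
        parent_index = len(parents)
        parents.append(" ".join(window))
        for c in range(0, len(window), child_words):
            child_text = " ".join(window[c:c + child_words])
            children.append((parent_index, child_text))
    return parents, children


def context_for_match(parents, children, matched_child):
    """Return the parent window that contains the matched child."""
    parent_index, _ = children[matched_child]
    return parents[parent_index]
```

Only the child texts would be embedded and searched; `context_for_match` shows why a hit on a small child still hands the LLM the full surrounding window.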

OCR and vision

Runtime: Tesseract, 1 thread per document. Worker status will appear when the VM reports telemetry.

Vision OCR tuning

Phase 2 runs on documents that Phase 1 (GLM-OCR) could not improve adequately — typically handwritten notes, degraded scans, tables, and mixed-language pages. Tesseract+ is the CPU fallback with layout-aware preprocessing; the VLM options use a vision language model to read the page image directly and produce much better results on difficult scans, but require a GPU with enough free VRAM to load an 8B model alongside the main LLM.

When enabled, every ingested PDF or image is checked against the low-yield threshold. Files that fall below it are automatically queued for vision OCR in the background. The pipeline pauses whenever you submit a query so the GPU stays responsive. Disable this if you want to control which files get re-processed manually.

A document is flagged as a candidate for vision OCR if its passage count divided by page count falls below this number. The default of 2 means a 10-page PDF with fewer than 20 indexed passages will be re-processed. Increase this to be more aggressive (re-process more files); decrease it to only catch near-blank pages.
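The threshold rule above amounts to a single ratio check. The sketch below uses a hypothetical function name, and its handling of a zero page count is an assumption rather than documented behaviour.

```python
def is_low_yield(passage_count: int, page_count: int, threshold: float = 2.0) -> bool:
    """True when indexed passages per page fall below the threshold."""
    if page_count <= 0:
        # Assumption: a file with no pages is never queued for vision OCR.
        return False
    return passage_count / page_count < threshold
```

With the default of 2, `is_low_yield(19, 10)` flags the file while `is_low_yield(20, 10)` does not, matching the 10-page example in the description.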

Phase 1 (GLM-OCR) runs a fast vision pass on each page. Its result is accepted, and the slower Phase 2 is skipped, only if it produces at least this many times as many words as the original extraction. A value of 1.5 means GLM-OCR must produce at least 50% more words. Set lower (e.g. 1.1) to accept smaller gains and skip Phase 2 more often; set higher (e.g. 2.0) to demand clear improvement before trusting the GLM-OCR result.

A secondary acceptance criterion for Phase 1. Even if GLM only marginally improves on the original, if it produces a dense enough result (default: 80 words per page) it is assumed the page is well-covered and Phase 2 is skipped. Increase this if you are finding that GLM passes low-quality results; decrease it for shorter documents like cover pages or indexes.

Pipeline tuning

Runtime tuning will appear when the VM reports telemetry.

OCR worker pool

Worker status will appear when helper machines check in.

Backups

Backup status will appear here.

CanLII research

CanLII metadata calls will be rate-limited and cached locally.
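One way to picture rate limiting combined with local caching is a thin wrapper around whatever function performs the metadata request. Everything here is a sketch: the class name, the `fetch` callable, and the one-second default interval are hypothetical, and an in-memory dictionary stands in for whatever local cache the backend uses.

```python
import time


class CachedRateLimitedClient:
    """Wrap a fetch function with a cache and a minimum delay between calls."""

    def __init__(self, fetch, min_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._fetch = fetch              # hypothetical callable: key -> metadata
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._cache = {}
        self._last_call = None

    def get(self, key):
        if key in self._cache:           # cache hit: no request, no delay
            return self._cache[key]
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)        # space out consecutive requests
        self._last_call = self._clock()
        self._cache[key] = self._fetch(key)
        return self._cache[key]
```

Repeated lookups of the same case are served from the cache, so only new cases count against the rate limit.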

Setting guide

Recommended defaults included

Select an information button to see what the setting changes, recommended defaults, and what trade-offs to expect.