This document describes how Aria scans a local music library, stores metadata, and turns raw file tags into the fields used by the UI.
Aria treats the library as two layers:
- raw tags: what is read directly from the media file
- mapped fields: the normalized database-facing fields that the UI uses for browsing, sorting, grouping, and playback context
That split is intentional. Raw tags must remain available even when Aria’s normalized view changes.
When Aria scans the library, it does this for each configured root:
- Recursively discover supported audio files
- Read tag data and audio properties from each file
- Preserve raw tags
- Apply user-configurable field mappings
- Apply catalog fallback rules for the `catalog` field when needed
- Resolve album art
- Persist the resulting snapshot to SQLite
The scan result is stored in the LibrarySnapshot, which includes:
- library roots
- scan status and file counts
- field mappings
- catalog rules
- tag inventory
- scanned tracks
The scanner currently includes files with these extensions:
- `flac`
- `mp3`
- `m4a`
- `aac`
- `mp4`
- `ogg`
- `opus`
- `wav`
- `aiff`
- `aif`
Discovery is recursive and follows links.
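A minimal sketch of this discovery step follows. It is written in Python purely for illustration and is not Aria's code; the function name `discover_audio_files` is hypothetical, and case-insensitive extension matching is an assumption.

```python
import os
from pathlib import Path

# Extensions taken from the list above; case-insensitive matching is assumed.
SUPPORTED_EXTENSIONS = {
    "flac", "mp3", "m4a", "aac", "mp4",
    "ogg", "opus", "wav", "aiff", "aif",
}


def discover_audio_files(root):
    """Recursively collect supported audio files under root, following links."""
    found = []
    # followlinks=True makes os.walk descend into linked directories.
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=True):
        for name in filenames:
            extension = Path(name).suffix.lstrip(".").lower()
            if extension in SUPPORTED_EXTENSIONS:
                found.append(Path(dirpath) / name)
    return found
```

A walker that follows links can loop forever on a cyclic link, so a real scanner would typically also track visited directories. This sketch does not.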
Each scanned track stores:
- `id`: currently the full file path
- `path`
- `file_name`
- `album_art_path`
- `audio`: format, duration, sample rate, bit depth, channels
- `raw_tags`: `tag -> [values]`
- `mapped_fields`: `field -> [values]`
Aria reads tags with Lofty, then supplements that with format-specific raw-tag recovery where needed.
The base pass:
- iterates all Lofty tags on the file
- resolves each tag key to a normalized uppercase tag name
- converts tag values to strings
- splits multi-value text when Aria sees common separators
- deduplicates values while preserving order
Aria currently splits text values on:
- `;`
- `/`
- `|`
- the NUL character
This is intentionally conservative. The goal is to preserve multi-value credits without over-splitting ordinary text.
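The split-and-deduplicate behavior can be sketched as below. This is an illustrative Python approximation, not the actual implementation. The name `split_values` is hypothetical, and trimming whitespace and dropping empty parts are assumptions.

```python
import re

# The separators named above: ';', '/', '|', and the NUL character.
_SEPARATORS = re.compile(r"[;/|\x00]")


def split_values(raw_values):
    """Split each string on the separators, then dedupe while keeping order."""
    seen = set()
    result = []
    for raw in raw_values:
        for part in _SEPARATORS.split(raw):
            value = part.strip()  # assumption: surrounding whitespace is dropped
            if value and value not in seen:
                seen.add(value)
                result.append(value)
    return result
```

For example, `split_values(["Anne-Sophie Mutter; Berliner Philharmoniker", "Berliner Philharmoniker"])` returns `["Anne-Sophie Mutter", "Berliner Philharmoniker"]`. The hyphen is not a separator, so hyphenated names stay intact.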
Some formats, especially Vorbis-comment based formats, can contain useful custom keys that do not survive a purely generic tag abstraction.
To preserve those, Aria explicitly merges raw Vorbis-comment data for:
- FLAC
- Ogg Vorbis
- Opus
- Speex
That is why tags like ENSEMBLE can still appear in raw_tags even if the generic tag path would otherwise lose them.
The scan also builds a tag inventory. For each observed raw tag, Aria stores:
- tag name
- number of tracks where it occurred
- up to three example values
This inventory is useful for field-mapping and diagnostics, even though the main Settings pane no longer shows it inline.
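As a rough illustration, the inventory could be accumulated like this. The function and key names (`build_tag_inventory`, `track_count`, `examples`) are hypothetical. Keeping the first distinct values seen as examples is an assumption.

```python
def build_tag_inventory(tracks_raw_tags, max_examples=3):
    """Build tag -> {track_count, examples} from per-track raw_tags dicts."""
    inventory = {}
    for raw_tags in tracks_raw_tags:
        for tag, values in raw_tags.items():
            entry = inventory.setdefault(tag, {"track_count": 0, "examples": []})
            entry["track_count"] += 1  # one increment per track, not per value
            for value in values:
                if len(entry["examples"]) >= max_examples:
                    break
                if value not in entry["examples"]:
                    entry["examples"].append(value)
    return inventory
```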
Field mappings define how Aria builds normalized fields from raw tags.
Each mapping has:
- `key`: internal field name
- `label`: UI label
- `tag_priorities`: a priority-ordered list of raw tags
For a given field:
- Aria checks source tags in order
- the first non-empty source tag wins
- all values from that winning tag are kept
- duplicate values are removed while preserving order
Aria does not merge across multiple source tags for a single field. Priority is strict.
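The strict-priority rule can be sketched in a few lines. This is an illustration of the rule as described, not Aria's code; `resolve_field` is a hypothetical name, and treating whitespace-only values as empty is an assumption.

```python
def resolve_field(raw_tags, tag_priorities):
    """Return the values of the first non-empty source tag, deduplicated."""
    for tag in tag_priorities:
        values = [v for v in raw_tags.get(tag, []) if v.strip()]
        if values:
            # dict.fromkeys drops duplicates while keeping first-seen order.
            return list(dict.fromkeys(values))
    return []  # fields may be empty
```

With the default `ensemble` priorities, a track that has no `ENSEMBLE` tag but has both `ORCHESTRA` and `ALBUMARTIST` resolves to the `ORCHESTRA` values only. The `ALBUMARTIST` values are never merged in.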
The default field list is:
| Field key | Default source tags |
|---|---|
| `album` | `ALBUM` |
| `title` | `TITLE` |
| `catalog` | `CATALOGNUMBER`, `CATALOG` |
| `composer` | `COMPOSER` |
| `genre` | `GENRE` |
| `conductor` | `CONDUCTOR` |
| `ensemble` | `ENSEMBLE`, `ORCHESTRA`, `ALBUMARTIST` |
| `soloist` | `PERFORMER`, `ARTIST`, `ALBUMARTIST` |
| `year` | `DATE`, `YEAR` |
| `disk_number` | `DISCNUMBER` |
| `track_number` | `TRACKNUMBER` |
Users can edit these in Settings -> Database fields.
Fields may be empty. They may also contain multiple values.
The catalog field is special.
Aria first tries to resolve catalog from the configured field mapping, which defaults to:
- `CATALOGNUMBER`
- `CATALOG`
If that succeeds, catalog fallback parsing is not used.
If catalog is still empty after normal field mapping, Aria runs user-configurable catalog rules.
Each catalog rule now has:
- `label`: the catalog abbreviation Aria should search for, such as `BWV`, `WAB`, `K`, or `Op`
- `composers`: optional composer hints
- `enabled`
Users can edit these in Settings -> Catalog rules.
Aria ships with built-in rules for common classical catalogs, including examples such as:
- `BWV`
- `WAB`
- `K`
- `KV`
- `D`
- `RV`
- `HWV`
- `TWV`
- `BuxWV`
- `Hob.`
- `S.`
- `WoO`
- `Op`
All catalog rules use the same source-tag priority.
Aria checks these tags in order:
- `TITLE`
- `WORK`
- `ALBUM`
It:
- tries `TITLE` first for all catalog labels
- only falls back to `WORK` if `TITLE` produced no catalog matches
- only falls back to `ALBUM` if neither `TITLE` nor `WORK` produced matches
This avoids leaking album-level range catalogs into track-level results when the track title already contains the specific catalog number.
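The fallback order across source tags can be sketched as follows. This is a hedged illustration: `catalog_from_source_tags` is a hypothetical name, and `extract` stands in for whatever parser turns a text value into catalog matches.

```python
SOURCE_TAGS = ("TITLE", "WORK", "ALBUM")


def catalog_from_source_tags(raw_tags, extract):
    """Return (tag, matches) for the first source tag that yields any matches.

    extract is a hypothetical callable: text -> list of catalog strings.
    """
    for tag in SOURCE_TAGS:
        matches = []
        for value in raw_tags.get(tag, []):
            matches.extend(extract(value))
        if matches:
            return tag, matches  # later tags are never consulted
    return None, []
```

The early return is what keeps an album-level range in `ALBUM` from being considered once `TITLE` has already produced a match.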
If a rule has composer hints, Aria only applies that catalog label when one of these raw tags matches those hints:
- `COMPOSER`
- `WORKCOMPOSER`
- `COMPOSERSORT`
This is how labels like WAB remain Bruckner-specific.
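One plausible way to express the composer-hint gate is shown below. The document does not say how hints are compared, so the case-insensitive substring match here is an assumption, as are the name `rule_applies` and the reading of the tag names as `COMPOSER`, `WORKCOMPOSER`, and `COMPOSERSORT`.

```python
HINT_TAGS = ("COMPOSER", "WORKCOMPOSER", "COMPOSERSORT")


def rule_applies(rule_composers, raw_tags):
    """Decide whether a catalog rule may be applied to a track."""
    if not rule_composers:
        return True  # a rule without hints is not restricted by composer
    tag_values = [
        value.casefold()
        for tag in HINT_TAGS
        for value in raw_tags.get(tag, [])
    ]
    # Assumption: a hint matches when it appears anywhere in a tag value.
    return any(
        hint.casefold() in value
        for hint in rule_composers
        for value in tag_values
    )
```

Under these assumptions, a `WAB` rule with the hint `Bruckner` applies to a track tagged `COMPOSER=Anton Bruckner` and is skipped for a track tagged `COMPOSER=Johann Sebastian Bach`.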
Aria does not store or edit per-rule regex patterns anymore.
Instead, all labels share the same extraction logic:
- split colon-separated title/work/album text into segments
- search segments from right to left
- look for the configured label plus a catalog number
- keep the first segment that yields matches
The shared parser also supports sectioned forms like Hob. XVI:52 and Hob. IIIb:2, where the label is followed by a Roman-numeral section, an optional a or b, and then the item number.
The only per-rule difference is the catalog label itself, plus optional composer hints. The built-in Op rule is the catch-all fallback.
When a title contains multiple colon-separated segments, Aria searches the segments from right to left and keeps the first segment that yields catalog matches.
Example:
Das Wohltemperierte Klavier: Book 1, BWV 846-869: Präludium Es-Dur, BWV 852
Aria prefers BWV 852 from the final segment instead of the collection-level range earlier in the string.
Aria ignores catalog matches that are immediately followed by a dash and more digits, such as:
BWV 846-869
That prevents the start of a catalog range from being treated as a single-track catalog number.
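The segment search and the range guard can be approximated with the sketch below. It is not Aria's parser: the function names and the regular expression are illustrative assumptions, and the sectioned `Hob.` forms are deliberately left out because their colon would need special handling.

```python
import re


def find_catalog_numbers(segment, label):
    """Find 'label number' occurrences in one segment, skipping range starts."""
    pattern = re.compile(
        r"(?<![A-Za-z])"           # label must not be the tail of a longer word
        + re.escape(label)
        + r"\.?\s*(\d+)"           # optional dot, optional space, the number
        + r"(?!\d)"                # take the whole number, never a prefix of it
        + r"(?![-\u2013]\d)"       # reject 'BWV 846-869' style range starts
    )
    return [f"{label} {number}" for number in pattern.findall(segment)]


def extract_catalog(text, label):
    """Search colon-separated segments right to left; keep the first with matches."""
    segments = [part.strip() for part in text.split(":")]
    for segment in reversed(segments):
        matches = find_catalog_numbers(segment, label)
        if matches:
            return matches
    return []
```

Applied to the example title above with the label `BWV`, `extract_catalog` returns `["BWV 852"]`. The middle segment yields nothing because `846` is immediately followed by `-869`.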
Album art is resolved in this order:
- Embedded FLAC cover art
- Sidecar image files
For FLAC files, Aria:
- reads embedded pictures
- prefers `CoverFront`, then other picture types in a stable priority order
- writes the extracted image into a local app-data cache
- reuses the cached image on later scans if it already exists and is non-empty
If no embedded FLAC art is available, Aria looks for sidecar files named:
- `cover.jpg`
- `folder.jpg`
- `front.jpg`
- `cover.png`
- `folder.png`
- `front.png`
It searches:
- the track directory
- the parent directory as a fallback when the track is inside a disc-like folder such as `Disc 1` or `CD1`
Zero-byte sidecar files are ignored.
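The sidecar lookup can be sketched as below. The exact definition of a "disc-like" folder is not given here, so the `DISC_FOLDER` pattern is an assumption, and `find_sidecar_art` is a hypothetical name.

```python
import re
from pathlib import Path

SIDECAR_NAMES = (
    "cover.jpg", "folder.jpg", "front.jpg",
    "cover.png", "folder.png", "front.png",
)

# Assumption: folder names such as "Disc 1", "CD1", or "disk 2" count as disc-like.
DISC_FOLDER = re.compile(r"^(disc|disk|cd)\s*\d+$", re.IGNORECASE)


def find_sidecar_art(track_path):
    """Return the first non-empty sidecar image for a track, or None."""
    track_dir = Path(track_path).parent
    directories = [track_dir]
    if DISC_FOLDER.match(track_dir.name):
        directories.append(track_dir.parent)  # fallback for multi-disc layouts
    for directory in directories:
        for name in SIDECAR_NAMES:
            image = directory / name
            # Zero-byte files are treated as missing.
            if image.is_file() and image.stat().st_size > 0:
                return image
    return None
```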
Embedded art extraction is currently FLAC-first. Other formats mainly rely on sidecar images for now.
Aria persists the library state into SQLite tables that include:
- `library_state`
- `library_roots`
- `field_mappings`
- `catalog_rules`
- `tag_inventory`
- `scanned_tracks`
Important details:
- `raw_tags` are stored as JSON
- `mapped_fields` are stored as JSON
- audio properties are stored as JSON
- library settings and playback state are also persisted elsewhere in the same database
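To illustrate the JSON-in-SQLite layout, here is a small sketch using Python's standard `sqlite3` and `json` modules. Only the table name `scanned_tracks` comes from this document. The column names and helper functions are hypothetical and may not match Aria's actual schema.

```python
import json
import sqlite3


def open_library_db(path=":memory:"):
    """Open a database and create a hypothetical scanned_tracks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scanned_tracks (
            id            TEXT PRIMARY KEY,
            path          TEXT NOT NULL,
            audio         TEXT NOT NULL,  -- JSON object
            raw_tags      TEXT NOT NULL,  -- JSON: tag -> [values]
            mapped_fields TEXT NOT NULL   -- JSON: field -> [values]
        )
        """
    )
    return conn


def save_track(conn, track):
    """Insert or replace one track, serializing the nested parts as JSON."""
    conn.execute(
        "INSERT OR REPLACE INTO scanned_tracks VALUES (?, ?, ?, ?, ?)",
        (
            track["id"],
            track["path"],
            json.dumps(track["audio"]),
            json.dumps(track["raw_tags"]),
            json.dumps(track["mapped_fields"]),
        ),
    )
    conn.commit()


def load_raw_tags(conn, track_id):
    """Read back the stored raw tags for a track, or None if it is unknown."""
    row = conn.execute(
        "SELECT raw_tags FROM scanned_tracks WHERE id = ?", (track_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Because the raw tags are stored alongside the mapped fields, mapped fields can be recomputed from the database without reading the audio files again.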
On Windows, the default database path is:
%LOCALAPPDATA%\Aria\aria.sqlite3
- Adding a new library directory from Settings starts a scan automatically
- Re-saving field mappings remaps existing scanned tracks from stored raw tags
- Re-saving catalog rules also remaps existing scanned tracks from stored raw tags
- A full rescan is still needed when the source files themselves have changed or when album-art lookup behavior needs to be refreshed
The Tracks tab has a Show all tags action. That dialog reads raw tags directly from the selected file on demand, rather than showing only the normalized database fields.
This is useful when debugging:
- missing fields
- unexpected catalog results
- multi-value role mapping
- rare custom tags
If a field looks wrong, inspect in this order:
- raw file tags
- current field mapping
- current catalog rules, if the field is `catalog`
- whether the track needs a rescan
If scan behavior changes, update this document so it stays aligned with the code.