Implement proper Laplace smoothing in Bayes classifier #80

Merged: cardmagic merged 1 commit into master from fix/laplace-smoothing on Dec 27, 2025

Conversation

@cardmagic (Owner)

Summary

  • Replace magic number 0.1 with proper add-one (Laplace) smoothing
  • Formula: P(word|category) = (count + 1) / (total + vocab_size)
  • Smoothing now scales correctly with vocabulary size
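The formula above can be illustrated with a small sketch. The helper name `laplace_probability` and the toy counts are assumptions made for this example and are not taken from the classifier's code. The point it shows is that, with add-one smoothing, the estimates over the whole vocabulary sum to 1, including the words never seen in the category.

```ruby
# Hypothetical helper, not the library's API: add-one (Laplace) estimate
# of P(word | category) = (count + 1) / (total + vocab_size).
def laplace_probability(count, total, vocab_size)
  (count + 1).to_f / (total + vocab_size)
end

# Toy category: 10 word occurrences across 3 distinct words, in a
# vocabulary of 5 words (so 2 words were never seen in this category).
counts = { "good" => 6, "great" => 3, "fine" => 1 }
vocabulary = %w[good great fine bad awful]
total = counts.values.sum

probabilities = vocabulary.to_h do |word|
  [word, laplace_probability(counts.fetch(word, 0), total, vocabulary.size)]
end

probabilities.each { |word, prob| puts format("%-6s %.4f", word, prob) }
puts format("sum    %.4f", probabilities.values.sum)
```

A fixed pseudo-count such as 0.1 divided by `total` gives no such guarantee, because the denominator ignores how many distinct words could occur.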

Before

s = category_words.key?(word) ? category_words[word] : 0.1  # Magic number
score += Math.log(s / total)

After

count = category_words[word] || 0
score += Math.log((count + 1) / smoothed_total)  # Laplace smoothing
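To make the before/after difference concrete, here is a minimal, self-contained sketch of both scoring loops. The method names `score_before` and `score_after`, their parameters, and the toy counts are hypothetical stand-ins; the PR snippets do not show how `total` or `smoothed_total` are defined, so this sketch assumes `smoothed_total = total + vocab_size` converted to a Float.

```ruby
# Hypothetical stand-in for the old scoring: unseen words get a fixed
# pseudo-count of 0.1, and the denominator ignores vocabulary size.
def score_before(words, category_words, total)
  words.sum(0.0) do |word|
    s = category_words.key?(word) ? category_words[word] : 0.1
    Math.log(s / total.to_f)
  end
end

# Hypothetical stand-in for the new scoring: every word, seen or unseen,
# gets count + 1 over total + vocab_size.
def score_after(words, category_words, total, vocab_size)
  # Assumption: Float conversion, so the division below is not integer division.
  smoothed_total = (total + vocab_size).to_f
  words.sum(0.0) do |word|
    count = category_words[word] || 0
    Math.log((count + 1) / smoothed_total)
  end
end

spam_words = { "free" => 4, "money" => 2 } # toy counts, total = 6
puts score_before(%w[free unknown], spam_words, 6)
puts score_after(%w[free unknown], spam_words, 6, 4)
```

The Float conversion matters in Ruby: with two Integer operands, `(count + 1) / smoothed_total` would truncate to 0 and `Math.log(0)` returns `-Infinity`.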

Test Plan

  • 5 new Laplace smoothing tests added
  • All 90 tests pass
  • Bayes now correctly classifies edge cases it previously got wrong

Fixes #64

Replace magic number 0.1 with proper add-one (Laplace) smoothing:
P(word|category) = (count + 1) / (total + vocab_size)

This ensures smoothing scales correctly with vocabulary size
and applies consistently to both seen and unseen words.

Fixes #64
cardmagic force-pushed the fix/laplace-smoothing branch from da3148e to 9b8b236 on December 27, 2025 09:59
cardmagic merged commit e7a33cb into master on Dec 27, 2025
5 checks passed
cardmagic deleted the fix/laplace-smoothing branch on December 27, 2025 10:00