extract Scorer #85

@sam-writer

KenLMScorer is fantastic and very useful. However, it isn't core to replaCy, so it should be extracted into a separately installable custom pipeline component. We'd still expect most people to use it: think of what en_core_web_sm is for spaCy, a separate installation that nonetheless shows up in all the docs.

After extraction, I think setting up our current pipeline should look like this:

import en_core_web_sm
from replacy import ReplaceMatcher
from replacy.components import MaxCountFilter
from replacy_kenlm_scorer import KenLMScorer  # the extracted, separately installed package
from spacy.util import filter_spans


replaCy = ReplaceMatcher(en_core_web_sm.load(), etc...)
# span_filter runs first, then the scorer, then max_count_filter
replaCy.add_pipe("span_filter", filter_spans, first=True)
replaCy.add_pipe("scorer", KenLMScorer(model_or_path), after="span_filter")
replaCy.add_pipe("max_count_filter", MaxCountFilter(defaults...), after="scorer")
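
To make the intended ordering semantics concrete, here is a minimal sketch of a named pipeline with `first` and `after` placement. Everything in it is a hypothetical stand-in: `MiniPipeline`, `dedupe`, `length_scorer`, and `TopN` are illustrative names, not replaCy's implementation, and the length-based scorer only stands in for a real language-model scorer. It assumes each component is a callable that takes the previous component's output and returns its own.

```python
from typing import Any, Callable, Dict, List, Optional


class MiniPipeline:
    """Hypothetical ordered pipeline; NOT replaCy's actual implementation."""

    def __init__(self) -> None:
        self.pipe_names: List[str] = []
        self._components: Dict[str, Callable[[Any], Any]] = {}

    def add_pipe(
        self,
        name: str,
        component: Callable[[Any], Any],
        first: bool = False,
        after: Optional[str] = None,
    ) -> None:
        if name in self._components:
            raise ValueError(f"pipe {name!r} already exists")
        if first and after is not None:
            raise ValueError("pass only one of first/after")
        if first:
            index = 0
        elif after is not None:
            # list.index raises ValueError if `after` names an unknown pipe
            index = self.pipe_names.index(after) + 1
        else:
            index = len(self.pipe_names)  # default: append at the end
        self.pipe_names.insert(index, name)
        self._components[name] = component

    def __call__(self, data: Any) -> Any:
        # Feed each component's output into the next, in pipe order.
        for name in self.pipe_names:
            data = self._components[name](data)
        return data


def dedupe(items: List[str]) -> List[str]:
    """Drop repeated items while preserving order."""
    seen = set()
    out = []
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


def length_scorer(items: List[str]) -> List[tuple]:
    """Toy scorer: the score is the string length (stand-in for a real model)."""
    return [(text, len(text)) for text in items]


class TopN:
    """Keep the n highest-scoring (text, score) pairs."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __call__(self, scored: List[tuple]) -> List[tuple]:
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[: self.n]


pipeline = MiniPipeline()
pipeline.add_pipe("scorer", length_scorer)
pipeline.add_pipe("dedupe", dedupe, first=True)
pipeline.add_pipe("top_n", TopN(2), after="scorer")
result = pipeline(["a", "bbb", "bbb", "cc"])
```

Because components are addressed by name, an externally installed scorer could be slotted in with `after=...` without the core package importing it, which is the point of the extraction.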

Labels: enhancement