Package org.apache.lucene.analysis.core
Class StopFilterFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.en.AbstractWordsFileFilterFactory
org.apache.lucene.analysis.core.StopFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
Factory for StopFilter.
<fieldType name="text_stop" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" format="wordset"/>
  </analyzer>
</fieldType>
All attributes are optional:
- ignoreCase defaults to false
- words should be the name of a stopwords file to parse; if not specified, the factory will use EnglishAnalyzer.ENGLISH_STOP_WORDS_SET
- format defines how the words file will be parsed, and defaults to wordset. If words is not specified, then format must not be specified.
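The attribute rules above can be sketched in plain Java. This is only an illustration of the documented defaults and of the constraint between words and format; it is not the factory's actual code. The class name StopArgsSketch, the method resolve, and the choice of IllegalArgumentException are assumptions made for this example.

```java
import java.util.Map;

/** Hypothetical helper, not part of Lucene: mirrors the documented attribute rules. */
public class StopArgsSketch {
    public final boolean ignoreCase;
    public final String words;   // null means "fall back to the built-in English stop set"
    public final String format;  // null when no words file is given

    private StopArgsSketch(boolean ignoreCase, String words, String format) {
        this.ignoreCase = ignoreCase;
        this.words = words;
        this.format = format;
    }

    public static StopArgsSketch resolve(Map<String, String> args) {
        // ignoreCase defaults to false when absent
        boolean ignoreCase = Boolean.parseBoolean(args.getOrDefault("ignoreCase", "false"));
        String words = args.get("words");
        String format = args.get("format");

        // format is only legal together with words
        if (words == null && format != null) {
            throw new IllegalArgumentException("'format' must not be specified without 'words'");
        }
        // format defaults to wordset once a words file is named
        if (words != null && format == null) {
            format = "wordset";
        }
        if (format != null && !format.equals("wordset") && !format.equals("snowball")) {
            throw new IllegalArgumentException("Unknown 'format': " + format);
        }
        return new StopArgsSketch(ignoreCase, words, format);
    }
}
```

Under these assumptions, resolving only words="stopwords.txt" yields format wordset with ignoreCase false, while resolving only format="snowball" is rejected.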
The valid values for the format option are:
- wordset - This is the default format, which supports one word per line (including any intra-word whitespace) and allows whole line comments beginning with the "#" character. Blank lines are ignored. See WordlistLoader.getLines for details.
- snowball - This format allows for multiple words specified on each line, and trailing comments may be specified using the vertical line ("|"). Blank lines are ignored. See WordlistLoader.getSnowballWordSet for details.
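To make the difference between the two formats concrete, here is a minimal stdlib-only sketch of how each one could be read. It follows the descriptions above and is not the WordlistLoader implementation. The class StopwordFormatsSketch and its method names are hypothetical, and trimming surrounding whitespace is an assumption of this sketch.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of the two stopword file formats; not Lucene code. */
public class StopwordFormatsSketch {

    /** wordset: one entry per line, whole-line '#' comments, blank lines skipped. */
    public static Set<String> parseWordset(List<String> lines) {
        Set<String> out = new LinkedHashSet<>();
        for (String line : lines) {
            String word = line.trim(); // assumption: surrounding whitespace is dropped
            if (word.isEmpty() || word.startsWith("#")) {
                continue;
            }
            out.add(word); // intra-word whitespace is kept, e.g. "new york"
        }
        return out;
    }

    /** snowball: several words per line, trailing '|' comments, blank lines skipped. */
    public static Set<String> parseSnowball(List<String> lines) {
        Set<String> out = new LinkedHashSet<>();
        for (String line : lines) {
            int bar = line.indexOf('|');
            String content = (bar >= 0 ? line.substring(0, bar) : line).trim();
            if (content.isEmpty()) {
                continue;
            }
            for (String word : content.split("\\s+")) {
                out.add(word);
            }
        }
        return out;
    }
}
```

In this sketch the line "new york" is a single entry under wordset but two entries under snowball, which is the practical reason to match format to the file you actually have.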
- Since:
3.1
-
Field Summary
Fields
Fields inherited from class org.apache.lucene.analysis.en.AbstractWordsFileFilterFactory
FORMAT_SNOWBALL, FORMAT_WORDSET
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
Constructor Summary
Constructors
- StopFilterFactory(): Default ctor for compatibility with SPI
- StopFilterFactory(Map<String, String> args): Creates a new StopFilterFactory
-
Method Summary
Methods
- TokenStream create(TokenStream input): Transform the specified input TokenStream
- protected CharArraySet createDefaultWords(): Default word set implementation.
- CharArraySet getStopWords()
Methods inherited from class org.apache.lucene.analysis.en.AbstractWordsFileFilterFactory
getFormat, getWordFiles, getWords, inform, isIgnoreCase
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
Field Details
-
NAME
SPI name
-
Constructor Details
-
StopFilterFactory
public StopFilterFactory(Map<String, String> args)
Creates a new StopFilterFactory
-
StopFilterFactory
public StopFilterFactory()
Default ctor for compatibility with SPI
-
-
Method Details
-
getStopWords
public CharArraySet getStopWords()
-
createDefaultWords
protected CharArraySet createDefaultWords()
Description copied from class: AbstractWordsFileFilterFactory
Default word set implementation.
- Specified by:
createDefaultWords in class AbstractWordsFileFilterFactory
-
create
public TokenStream create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
create in class TokenFilterFactory
-