Class CapitalizationFilterFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.miscellaneous.CapitalizationFilterFactory
Factory for CapitalizationFilter.
The factory takes parameters:
- "onlyFirstWord" - should only the first word be capitalized, or all of the words?
- "keep" - a keep word list. Words that should be kept as they are, separated by whitespace.
- "keepIgnoreCase" - true or false. If true, the keep list will be considered case-insensitive.
- "forceFirstLetter" - force the first letter to be capitalized even if the word is in the keep list.
- "okPrefix" - do not change word capitalization if a word begins with something in this list. For example, if "McK" is on the okPrefix list, the word "McKinley" should not be changed to "Mckinley".
- "minWordLength" - how long a word needs to be to get capitalization applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
- "maxWordCount" - if the token contains more than maxWordCount words, the capitalization is assumed to be correct.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true"
            keep="java solr lucene" keepIgnoreCase="false"
            okPrefix="McK McD McA"/>
  </analyzer>
</fieldType>
Since: solr 1.3
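The parameter rules above can be illustrated with a small, dependency-free sketch. This is not Lucene's implementation: the class `CapitalizationSketch` and its methods are hypothetical, the keep list is compared case-sensitively (as with keepIgnoreCase="false"), and two behaviors are assumptions rather than statements from this page: words after the first are lowercased when onlyFirstWord is set, and forceFirstLetter applies only to the first word of a token.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical, simplified model of the documented parameters; not Lucene code.
class CapitalizationSketch {
    final boolean onlyFirstWord;
    final Set<String> keep;          // compared case-sensitively (keepIgnoreCase=false)
    final boolean forceFirstLetter;
    final List<String> okPrefix;
    final int minWordLength;
    final int maxWordCount;

    CapitalizationSketch(boolean onlyFirstWord, Set<String> keep, boolean forceFirstLetter,
                         List<String> okPrefix, int minWordLength, int maxWordCount) {
        this.onlyFirstWord = onlyFirstWord;
        this.keep = keep;
        this.forceFirstLetter = forceFirstLetter;
        this.okPrefix = okPrefix;
        this.minWordLength = minWordLength;
        this.maxWordCount = maxWordCount;
    }

    // Applies the rules to one token, which may contain several space-separated words.
    String apply(String token) {
        String[] words = token.split(" ", -1);
        if (words.length > maxWordCount) {
            return token; // too many words: capitalization is assumed to be correct
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(processWord(words[i], i));
        }
        return out.toString();
    }

    private String processWord(String word, int index) {
        if (word.isEmpty()) {
            return word;
        }
        if (onlyFirstWord && index > 0) {
            return word.toLowerCase(Locale.ROOT); // assumption: later words are lowercased
        }
        if (keep.contains(word)) {
            // assumption: forceFirstLetter only affects the first word of the token
            return (index == 0 && forceFirstLetter) ? upperFirst(word) : word;
        }
        if (word.length() < minWordLength) {
            return word; // too short to be capitalized
        }
        for (String prefix : okPrefix) {
            if (word.startsWith(prefix)) {
                return word; // e.g. "McKinley" is left alone when "McK" is listed
            }
        }
        return upperFirst(word.toLowerCase(Locale.ROOT));
    }

    private static String upperFirst(String word) {
        return Character.toUpperCase(word.charAt(0)) + word.substring(1);
    }

    public static void main(String[] args) {
        CapitalizationSketch sketch = new CapitalizationSketch(
                false, Set.of("java", "solr", "lucene"), true,
                List.of("McK", "McD", "McA"), 3, 10);
        System.out.println(sketch.apply("and"));      // And
        System.out.println(sketch.apply("or"));       // or
        System.out.println(sketch.apply("McKinley")); // McKinley
    }
}
```

The three printed values mirror the examples given in the parameter descriptions: with a minimum word length of 3, "and" is capitalized while "or" is not, and the "McK" prefix protects "McKinley".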
Field Summary
Fields
Modifier and Type                     Field               Description
static final String                   FORCE_FIRST_LETTER
(package private) final boolean       forceFirstLetter
(package private) CharArraySet        keep
static final String                   KEEP
static final String                   KEEP_IGNORE_CASE
static final String                   MAX_TOKEN_LENGTH
static final String                   MAX_WORD_COUNT
(package private) final int           maxTokenLength
(package private) final int           maxWordCount
static final String                   MIN_WORD_LENGTH
(package private) final int           minWordLength
static final String                   NAME                SPI name
static final String                   OK_PREFIX
(package private) Collection<char[]>  okPrefix
static final String                   ONLY_FIRST_WORD
(package private) final boolean       onlyFirstWord
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
Constructor                                           Description
CapitalizationFilterFactory()                         Default ctor for compatibility with SPI
CapitalizationFilterFactory(Map<String,String> args)  Creates a new CapitalizationFilterFactory
Method Summary
Modifier and Type     Method                     Description
CapitalizationFilter  create(TokenStream input)  Transform the specified input TokenStream
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details

NAME
public static final String NAME
SPI name

KEEP
public static final String KEEP

KEEP_IGNORE_CASE
public static final String KEEP_IGNORE_CASE

OK_PREFIX
public static final String OK_PREFIX

MIN_WORD_LENGTH
public static final String MIN_WORD_LENGTH

MAX_WORD_COUNT
public static final String MAX_WORD_COUNT

MAX_TOKEN_LENGTH
public static final String MAX_TOKEN_LENGTH

ONLY_FIRST_WORD
public static final String ONLY_FIRST_WORD

FORCE_FIRST_LETTER
public static final String FORCE_FIRST_LETTER

keep
CharArraySet keep

okPrefix
Collection<char[]> okPrefix

minWordLength
final int minWordLength

maxWordCount
final int maxWordCount

maxTokenLength
final int maxTokenLength

onlyFirstWord
final boolean onlyFirstWord

forceFirstLetter
final boolean forceFirstLetter

Constructor Details

CapitalizationFilterFactory
public CapitalizationFilterFactory(Map<String,String> args)
Creates a new CapitalizationFilterFactory

CapitalizationFilterFactory
public CapitalizationFilterFactory()
Default ctor for compatibility with SPI

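Lucene analysis factories are typically configured from a map of string arguments whose keys match the attribute names used in the XML example earlier on this page. The sketch below illustrates that pattern in plain Java under stated assumptions: `FactoryArgsSketch` and its helper methods are hypothetical (they only imitate the inherited `getBoolean` and `getInt` helpers by name), the default values are assumed for illustration, and rejecting leftover keys is an assumption about how such factories commonly validate their input.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of map-driven factory configuration; not Lucene code.
class FactoryArgsSketch {
    final boolean onlyFirstWord;
    final int minWordLength;
    final int maxWordCount;

    FactoryArgsSketch(Map<String, String> args) {
        // Work on a copy so that recognized keys can be consumed one by one.
        Map<String, String> remaining = new HashMap<>(args);
        // Default values here are assumptions made for this sketch.
        onlyFirstWord = getBoolean(remaining, "onlyFirstWord", true);
        minWordLength = getInt(remaining, "minWordLength", 0);
        maxWordCount = getInt(remaining, "maxWordCount", Integer.MAX_VALUE);
        if (!remaining.isEmpty()) {
            // Anything left over was not recognized by this factory.
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }

    // Removes the named key and parses it as a boolean, or returns the default.
    static boolean getBoolean(Map<String, String> args, String name, boolean defaultValue) {
        String value = args.remove(name);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // Removes the named key and parses it as an int, or returns the default.
    static int getInt(Map<String, String> args, String name, int defaultValue) {
        String value = args.remove(name);
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    public static void main(String[] argv) {
        Map<String, String> args = new HashMap<>();
        args.put("onlyFirstWord", "false");
        args.put("minWordLength", "3");
        FactoryArgsSketch factory = new FactoryArgsSketch(args);
        System.out.println(factory.onlyFirstWord + " " + factory.minWordLength);
    }
}
```

Consuming keys as they are read makes misspelled attribute names fail fast at construction time instead of being silently ignored.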
Method Details

create
public CapitalizationFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
Specified by: create in class TokenFilterFactory