fields¶
Field type definitions for sayt2.
Seven field types covering all search/store/sort use cases. Each type is a
pydantic BaseModel with validation, serialization, and discriminated-union
support for polymorphic deserialization from config files.
Field types carry no dependency on tantivy — the mapping from field
definitions to tantivy schema objects lives in dataset.py.
- class sayt2.fields.BaseField(*, type: str, name: Annotated[str, MinLen(min_length=1)], stored: bool = True)[source]¶
Common base for all field types.
Every subclass must override type with a T.Literal["..."] so that pydantic’s discriminated union can reconstruct the correct class from a plain dict.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
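To illustrate why each subclass pins type to a Literal, here is a minimal sketch of a pydantic (v2) discriminated union. KeywordStandIn, NgramStandIn, and AnyField are hypothetical stand-ins defined only for this example; they are not the sayt2 classes, and they copy just a few of the parameters documented on this page.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter, ValidationError


# Hypothetical stand-in models; NOT the real sayt2 field classes.
class KeywordStandIn(BaseModel):
    type: Literal["keyword"] = "keyword"  # literal tag used as the discriminator
    name: str = Field(min_length=1)


class NgramStandIn(BaseModel):
    type: Literal["ngram"] = "ngram"
    name: str = Field(min_length=1)
    min_gram: int = Field(default=2, ge=1)
    max_gram: int = Field(default=6, ge=1)


# The union tells pydantic to pick the class whose `type` literal matches.
AnyField = Annotated[
    Union[KeywordStandIn, NgramStandIn],
    Field(discriminator="type"),
]

adapter = TypeAdapter(AnyField)

# A plain dict is validated into the matching class.
field = adapter.validate_python({"type": "ngram", "name": "title"})

# An unknown tag is rejected instead of being guessed.
try:
    adapter.validate_python({"type": "unknown", "name": "title"})
    unknown_rejected = False
except ValidationError:
    unknown_rejected = True
```

Without a distinct Literal on each class, pydantic would have no unambiguous tag to dispatch on, which is presumably why the base class requires the override.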
- class sayt2.fields.StoredField(*, type: Literal['stored'] = 'stored', name: Annotated[str, MinLen(min_length=1)], stored: bool = True)[source]¶
Store-only field. Not indexed, not searchable, not sortable.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class sayt2.fields.KeywordField(*, type: Literal['keyword'] = 'keyword', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, boost: Annotated[float, Gt(gt=0)] = 1.0)[source]¶
Exact-match field (id, tag, enum). Uses the raw tokenizer under the hood, so the entire field value is treated as one token.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class sayt2.fields.TextField(*, type: Literal['text'] = 'text', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, tokenizer: Literal['default', 'en_stem'] = 'default', boost: Annotated[float, Gt(gt=0)] = 1.0)[source]¶
Full-text search field. Uses the default (Unicode-aware word boundary) or en_stem (English stemming) tokenizer.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
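The difference between the raw tokenizer of KeywordField and the word-boundary tokenizer of TextField can be sketched in plain Python. This is only an approximation for illustration: raw_tokenize and word_tokenize are hypothetical helpers, the lowercasing step is an assumption, and tantivy's actual tokenizers may differ in detail. Stemming (en_stem) is not modelled here.

```python
import re


def raw_tokenize(value: str) -> list[str]:
    """Keyword-style: the whole value is a single token, unchanged."""
    return [value]


def word_tokenize(value: str) -> list[str]:
    """Text-style approximation: split on word boundaries.

    Lowercasing is an assumption made for this sketch.
    """
    return [word.lower() for word in re.findall(r"\w+", value)]


keyword_tokens = raw_tokenize("New York")
text_tokens = word_tokenize("New York")
```

With the keyword-style tokenizer only an exact match on the full value can hit, whereas the text-style tokenizer lets a query match on either word.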
- class sayt2.fields.NgramField(*, type: Literal['ngram'] = 'ngram', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, min_gram: Annotated[int, Ge(ge=1)] = 2, max_gram: Annotated[int, Ge(ge=1)] = 6, prefix_only: bool = False, lowercase: bool = True, boost: Annotated[float, Gt(gt=0)] = 1.0)[source]¶
Search-as-you-type field. Builds an ngram inverted index so that any substring whose length falls in the range [min_gram, max_gram] is a valid query token.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
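As a rough illustration of what an ngram index contains, the sketch below enumerates the grams for one value. The function ngrams is a hypothetical helper that mirrors the parameter names of NgramField; how tantivy orders or deduplicates grams, and the check that max_gram is not smaller than min_gram, are assumptions of this sketch.

```python
def ngrams(
    text: str,
    min_gram: int = 2,
    max_gram: int = 6,
    prefix_only: bool = False,
    lowercase: bool = True,
) -> list[str]:
    """Enumerate substrings with length in [min_gram, max_gram]."""
    if min_gram < 1 or max_gram < min_gram:
        raise ValueError("require 1 <= min_gram <= max_gram")
    if lowercase:
        text = text.lower()
    # With prefix_only, grams may only start at the first character.
    starts = [0] if prefix_only else range(len(text))
    grams = []
    for start in starts:
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # longer grams from this start would run off the end
            grams.append(text[start:end])
    return grams


all_grams = ngrams("Tea", min_gram=2, max_gram=3)
prefix_grams = ngrams("Tea", min_gram=2, max_gram=3, prefix_only=True)
```

Here all_grams is ["te", "tea", "ea"] and prefix_grams is ["te", "tea"], so a query such as "ea" would match only when prefix_only is off.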
- class sayt2.fields.NumericField(*, type: Literal['numeric'] = 'numeric', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, kind: Literal['i64', 'u64', 'f64'] = 'i64', indexed: bool = False, fast: bool = True)[source]¶
Numeric field. Defaults to sort-only (indexed=False, fast=True), which is the typical use case for rating/year columns.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class sayt2.fields.DatetimeField(*, type: Literal['datetime'] = 'datetime', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, indexed: bool = True, fast: bool = True)[source]¶
Datetime field backed by tantivy’s date type.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class sayt2.fields.BooleanField(*, type: Literal['boolean'] = 'boolean', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, indexed: bool = True)[source]¶
Boolean field.
- model_config: ClassVar[ConfigDict] = {}¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- sayt2.fields.T_Field¶
Discriminated union of all field types. Use with TypeAdapter for polymorphic deserialization:

    from pydantic import TypeAdapter
    from sayt2.fields import T_Field

    adapter = TypeAdapter(T_Field)
    field = adapter.validate_python({"type": "ngram", "name": "title"})

alias of
Annotated[StoredField | KeywordField | TextField | NgramField | NumericField | DatetimeField | BooleanField, FieldInfo(annotation=NoneType, required=True, discriminator='type')]
- sayt2.fields.fields_schema_hash(fields: list[Annotated[StoredField | KeywordField | TextField | NgramField | NumericField | DatetimeField | BooleanField, FieldInfo(annotation=NoneType, required=True, discriminator='type')]]) -> str[source]¶
Deterministic hash of a list of field definitions.
Used as part of cache keys so that changing the schema automatically invalidates stale caches.
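One way such a deterministic hash could be computed is sketched below. This is not the sayt2 implementation: schema_hash_sketch is a hypothetical function, and the choice of canonical JSON plus SHA-256 is an assumption made for illustration. It takes plain dicts (for example, what model_dump() would return for each field) so that it runs with the standard library alone.

```python
import hashlib
import json


def schema_hash_sketch(field_dicts: list[dict]) -> str:
    """Hash a list of field definitions given as plain dicts."""
    # sort_keys makes the serialization independent of dict key order;
    # list order is kept, so reordering fields changes the hash.
    payload = json.dumps(field_dicts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


schema_a = [{"type": "ngram", "name": "title", "min_gram": 2}]
schema_b = [{"name": "title", "min_gram": 2, "type": "ngram"}]  # same, reordered keys
schema_c = [{"type": "ngram", "name": "title", "min_gram": 3}]  # changed value

hash_a = schema_hash_sketch(schema_a)
hash_b = schema_hash_sketch(schema_b)
hash_c = schema_hash_sketch(schema_c)
```

Because hash_a equals hash_b but differs from hash_c, a cache keyed on the hash is reused for an unchanged schema and missed as soon as any field setting changes.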