fields

Field type definitions for sayt2.

This module defines seven field types that together cover the search, store, and sort use cases. Each type is a pydantic BaseModel with validation, serialization, and discriminated-union support, so field definitions can be deserialized polymorphically from config files.

Field types carry no dependency on tantivy — the mapping from field definitions to tantivy schema objects lives in dataset.py.

class sayt2.fields.BaseField(*, type: str, name: Annotated[str, MinLen(min_length=1)], stored: bool = True)

Common base for all field types.

Every subclass must override type with a T.Literal["..."] so that pydantic’s discriminated union can reconstruct the correct class from a plain dict.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
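The role of the type discriminator can be sketched without pydantic. The example below is a minimal stdlib-only illustration of tag-based dispatch, not pydantic's implementation: KeywordSpec, NgramSpec, REGISTRY, and build_field are hypothetical stand-ins introduced for this example and are not part of sayt2.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class KeywordSpec:
    """Hypothetical stand-in for a keyword field definition."""
    name: str
    type: str = "keyword"


@dataclass(frozen=True)
class NgramSpec:
    """Hypothetical stand-in for an ngram field definition."""
    name: str
    min_gram: int = 2
    max_gram: int = 6
    type: str = "ngram"


# Each discriminator value maps to exactly one class.
REGISTRY = {"keyword": KeywordSpec, "ngram": NgramSpec}


def build_field(raw: dict[str, Any]):
    """Select the class from raw["type"], then construct it from the dict."""
    tag = raw.get("type")
    if tag not in REGISTRY:
        raise ValueError(f"unknown field type: {tag!r}")
    return REGISTRY[tag](**raw)


field = build_field({"type": "ngram", "name": "title"})
print(type(field).__name__)
```

Because each class pins type to a single literal value, the tag alone is enough to choose the class, which is why every subclass needs its own distinct value.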

class sayt2.fields.StoredField(*, type: Literal['stored'] = 'stored', name: Annotated[str, MinLen(min_length=1)], stored: bool = True)

Store-only field. Not indexed, not searchable, not sortable.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sayt2.fields.KeywordField(*, type: Literal['keyword'] = 'keyword', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, boost: Annotated[float, Gt(gt=0)] = 1.0)

Exact-match field (id, tag, enum). Uses the raw tokenizer under the hood — the entire field value is treated as one token.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
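To illustrate what "one token" means for a keyword field, the sketch below contrasts raw tokenization with a rough word-splitting tokenizer. Both functions are hypothetical simplifications written for this example; they approximate the behavior described here and are not tantivy's tokenizers.

```python
import re


def raw_tokenize(value: str) -> list[str]:
    """Raw tokenization: the whole value is a single, unmodified token."""
    return [value]


def word_tokenize(value: str) -> list[str]:
    """Rough word tokenization: lowercased runs of word characters."""
    return [match.lower() for match in re.findall(r"\w+", value)]


tag = "machine-learning"
print(raw_tokenize(tag))   # ['machine-learning']
print(word_tokenize(tag))  # ['machine', 'learning']
```

With raw tokenization, a query must supply the complete value to match, which suits ids, tags, and enum values.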

class sayt2.fields.TextField(*, type: Literal['text'] = 'text', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, tokenizer: Literal['default', 'en_stem'] = 'default', boost: Annotated[float, Gt(gt=0)] = 1.0)

Full-text search field. Uses the default (Unicode-aware word boundary) or en_stem (English stemming) tokenizer.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sayt2.fields.NgramField(*, type: Literal['ngram'] = 'ngram', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, min_gram: Annotated[int, Ge(ge=1)] = 2, max_gram: Annotated[int, Ge(ge=1)] = 6, prefix_only: bool = False, lowercase: bool = True, boost: Annotated[float, Gt(gt=0)] = 1.0)

Search-as-you-type field. Builds an ngram inverted index so that any substring whose length is between min_gram and max_gram (inclusive) is a valid query token.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
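The tokens an ngram field would index can be sketched in a few lines. The ngrams function below is a hypothetical helper written for this example, not the tokenizer sayt2 uses. It assumes that prefix_only restricts ngrams to those starting at the first character and that lowercase folds case before slicing.

```python
def ngrams(
    text: str,
    min_gram: int = 2,
    max_gram: int = 6,
    *,
    prefix_only: bool = False,
    lowercase: bool = True,
) -> list[str]:
    """Return every substring whose length is in [min_gram, max_gram]."""
    if lowercase:
        text = text.lower()
    # Assumption: prefix_only keeps only ngrams anchored at position 0.
    starts = [0] if prefix_only else range(len(text))
    grams = []
    for start in starts:
        for size in range(min_gram, max_gram + 1):
            end = start + size
            if end > len(text):
                break  # longer ngrams from this start would run past the end
            grams.append(text[start:end])
    return grams


print(ngrams("Data", 2, 3))                    # ['da', 'dat', 'at', 'ata', 'ta']
print(ngrams("Data", 2, 3, prefix_only=True))  # ['da', 'dat']
```

A larger max_gram yields more tokens per value, and therefore a larger index, in exchange for matching longer query fragments directly.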

class sayt2.fields.NumericField(*, type: Literal['numeric'] = 'numeric', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, kind: Literal['i64', 'u64', 'f64'] = 'i64', indexed: bool = False, fast: bool = True)

Numeric field. Defaults to sort-only (indexed=False, fast=True) which is the typical use case for rating/year columns.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sayt2.fields.DatetimeField(*, type: Literal['datetime'] = 'datetime', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, indexed: bool = True, fast: bool = True)

Datetime field backed by tantivy’s date type.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

class sayt2.fields.BooleanField(*, type: Literal['boolean'] = 'boolean', name: Annotated[str, MinLen(min_length=1)], stored: bool = True, indexed: bool = True)

Boolean field.

model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

sayt2.fields.T_Field

Discriminated union of all field types. Use with TypeAdapter for polymorphic deserialization:

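from pydantic import TypeAdapter
from sayt2.fields import T_Field

adapter = TypeAdapter(T_Field)
# The "type" key selects the concrete class, so this returns an NgramField.
field = adapter.validate_python({"type": "ngram", "name": "title"})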

alias of Annotated[StoredField | KeywordField | TextField | NgramField | NumericField | DatetimeField | BooleanField, FieldInfo(annotation=NoneType, required=True, discriminator='type')]
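As a rough illustration of the kind of config payload this union is meant to deserialize, the sketch below parses a hypothetical JSON list of field definitions using only the standard library. The field names (id, title, description, year, url) and the KNOWN_TYPES pre-check are invented for this example; the keys mirror the constructor parameters documented above.

```python
import json

# Hypothetical config payload: one dict per field, tagged by "type".
CONFIG_JSON = """
[
  {"type": "keyword", "name": "id"},
  {"type": "ngram", "name": "title", "min_gram": 2, "max_gram": 6},
  {"type": "text", "name": "description", "tokenizer": "en_stem"},
  {"type": "numeric", "name": "year", "kind": "i64", "indexed": false, "fast": true},
  {"type": "stored", "name": "url"}
]
"""

# The seven discriminator values listed in the class signatures.
KNOWN_TYPES = {"stored", "keyword", "text", "ngram", "numeric", "datetime", "boolean"}

raw_fields = json.loads(CONFIG_JSON)

# Cheap pre-check before handing each dict to TypeAdapter(T_Field).
unknown = [f["type"] for f in raw_fields if f["type"] not in KNOWN_TYPES]
print(unknown)  # []
```

Each dict in raw_fields has the shape accepted by adapter.validate_python, so a loader could validate them one by one or validate the whole list with a TypeAdapter over list[T_Field].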

sayt2.fields.fields_schema_hash(fields: list[Annotated[StoredField | KeywordField | TextField | NgramField | NumericField | DatetimeField | BooleanField, FieldInfo(annotation=NoneType, required=True, discriminator='type')]]) -> str

Deterministic hash of a list of field definitions.

Used as part of cache keys so that changing the schema automatically invalidates stale caches.
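One plausible way to obtain such a hash is to serialize the definitions canonically and digest the result. The sketch below is an assumption about the general technique, not the actual implementation of fields_schema_hash: schema_hash is a hypothetical helper that works on plain dicts, and the choice of JSON with sorted keys and SHA-256 is illustrative.

```python
import hashlib
import json


def schema_hash(field_dicts: list[dict]) -> str:
    """Hash field definitions via canonical JSON (sorted keys, fixed separators)."""
    canonical = json.dumps(field_dicts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = [{"type": "ngram", "name": "title", "min_gram": 2}]
b = [{"min_gram": 2, "name": "title", "type": "ngram"}]  # same content, new key order
c = [{"type": "ngram", "name": "title", "min_gram": 3}]  # changed parameter

print(schema_hash(a) == schema_hash(b))  # True
print(schema_hash(a) == schema_hash(c))  # False
```

Sorting keys makes the digest independent of dict ordering, while any change to a parameter value, or to the order of fields in the list, produces a different digest and therefore a different cache key.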