idx

package
v0.0.0-...-221d52c Latest
Warning

This package is not in the latest version of its module.

Published: Apr 16, 2025 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Overview

Package idx implements reading .idx files.

The .idx file contains a list of dictionary entry titles and the associated offset and size of the main article content in the .dict file.

Each .idx file entry (word) comes in three parts:

  1. The title: a UTF-8 string terminated by a null byte ('\0').
  2. The offset: a 32- or 64-bit integer giving the offset of the entry in the .dict file, in network (big-endian) byte order.
  3. The size: a 32-bit integer giving the size of the entry in the .dict file, in network (big-endian) byte order.
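The three-part layout above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the package's implementation: the `entry` type and the `parseEntry` helper are hypothetical names introduced here, and the sample bytes are made up.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// entry is a hypothetical stand-in for one decoded .idx record.
type entry struct {
	Title  string
	Offset uint64
	Size   uint32
}

// parseEntry decodes one entry from the front of buf and returns it together
// with the number of bytes consumed. offsetBits must be 32 or 64.
func parseEntry(buf []byte, offsetBits int) (entry, int, error) {
	if offsetBits != 32 && offsetBits != 64 {
		return entry{}, 0, errors.New("offsetBits must be 32 or 64")
	}
	// Part 1: the title runs up to the first null byte.
	end := bytes.IndexByte(buf, 0)
	if end < 0 {
		return entry{}, 0, errors.New("title is not null-terminated")
	}
	e := entry{Title: string(buf[:end])}
	pos := end + 1
	offLen := offsetBits / 8
	if len(buf) < pos+offLen+4 {
		return entry{}, 0, errors.New("truncated entry")
	}
	// Part 2: the offset, big-endian, 4 or 8 bytes wide.
	if offsetBits == 64 {
		e.Offset = binary.BigEndian.Uint64(buf[pos:])
	} else {
		e.Offset = uint64(binary.BigEndian.Uint32(buf[pos:]))
	}
	pos += offLen
	// Part 3: the size, big-endian, always 4 bytes wide.
	e.Size = binary.BigEndian.Uint32(buf[pos:])
	return e, pos + 4, nil
}

func main() {
	// "cat", null byte, offset 256, size 42 (illustrative values).
	raw := []byte("cat\x00\x00\x00\x01\x00\x00\x00\x00\x2a")
	e, n, err := parseEntry(raw, 32)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q offset=%d size=%d consumed=%d\n", e.Title, e.Offset, e.Size, n)
}
```

Returning the consumed byte count lets a caller step through a buffer entry by entry; a streaming reader would track the same position implicitly.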

Index

Constants

This section is empty.

Variables

var (
	// ErrPrefix indicates that the query must not start with a glob wildcard.
	ErrPrefix = errors.New("search query must not start with wildcard")

	// ErrGlob indicates an error with a glob search query.
	ErrGlob = errors.New("invalid glob query")
)
var DefaultOptions = &Options{
	Folder: func() transform.Transformer {
		return transform.Nop
	},
	ScannerOptions: &ScannerOptions{
		OffsetBits: 32,
	},
}

DefaultOptions is the default set of options for an Idx.

var DefaultScannerOptions = &ScannerOptions{
	OffsetBits: 32,
}

DefaultScannerOptions is the default set of options for a Scanner.

var ErrInvalidIdxOffset = errors.New("invalid idxoffsetbits")

ErrInvalidIdxOffset indicates that OffsetBits has an invalid value; valid values are 32 and 64.

Functions

func Open

func Open(ifoPath string) (*os.File, error)

Open opens the .idx file given the path to the .ifo file.
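As a rough illustration of how an .idx path might be derived from an .ifo path, the sketch below swaps the file extension. It assumes the .idx file sits beside the .ifo file under the same base name; `idxPathFromIfo` is a hypothetical helper, and Open itself may resolve the path differently (for example, by also handling compressed variants).

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// idxPathFromIfo is a hypothetical helper: it replaces the extension of
// ifoPath with ".idx", assuming both files share a directory and base name.
func idxPathFromIfo(ifoPath string) string {
	return strings.TrimSuffix(ifoPath, filepath.Ext(ifoPath)) + ".idx"
}

func main() {
	fmt.Println(idxPathFromIfo("dicts/example.ifo")) // dicts/example.idx
}
```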

Types

type Idx

type Idx struct {
	// contains filtered or unexported fields
}

Idx is a very basic implementation of an in-memory search index. Implementers of dictionary apps or tools may wish to use Scanner to read the .idx file and build their own, more robust search index.

func New

func New(r io.ReadCloser, options *Options) (*Idx, error)

New returns a new in-memory index.

func NewFromIfoPath

func NewFromIfoPath(ifoPath string, options *Options) (*Idx, error)

NewFromIfoPath returns a new in-memory index read from the .idx file, given the path to the .ifo file.

func NewWithSyn

func NewWithSyn(idxReader, synReader io.ReadCloser, options *Options) (*Idx, error)

NewWithSyn returns a new in-memory index with synonyms merged in.

func (*Idx) Search

func (idx *Idx) Search(query string) ([]*Word, error)

Search performs a query of the index and returns matching words. The query supports glob patterns whose pattern syntax is:

pattern:
    { term }

term:
    `*`         matches any sequence of non-separator characters
    `**`        matches any sequence of characters
    `?`         matches any single non-separator character
    `[` [ `!` ] { character-range } `]`
                character class (must be non-empty)
    `{` pattern-list `}`
                pattern alternatives
    c           matches character c (c != `*`, `**`, `?`, `\`, `[`, `{`, `}`)
    `\` c       matches character c

character-range:
    c           matches character c (c != `\`, `-`, `]`)
    `\` c       matches character c
    lo `-` hi   matches character c for lo <= c <= hi

pattern-list:
    pattern { `,` pattern }
                comma-separated (without spaces) patterns

The query is folded using the configured folding transformer (Options.Folder) and matched against the folded words in the index.
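ErrPrefix implies that a query needs some literal text before its first wildcard. The sketch below shows one way such a check could look. It is not the package's code: `literalPrefix` and `errLeadingWildcard` are hypothetical names, and treating `*`, `?`, `[`, and `{` as the wildcard openers is an assumption drawn from the grammar above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errLeadingWildcard is a hypothetical error playing the role of ErrPrefix.
var errLeadingWildcard = errors.New("query starts with a wildcard")

// literalPrefix returns the text of query that precedes the first unescaped
// glob metacharacter, with backslash escapes resolved. It returns
// errLeadingWildcard when the query begins with a metacharacter.
func literalPrefix(query string) (string, error) {
	var b strings.Builder
	escaped := false
	for _, r := range query {
		if escaped {
			// `\` c matches character c literally.
			b.WriteRune(r)
			escaped = false
			continue
		}
		if r == '\\' {
			escaped = true
			continue
		}
		// Assumed set of characters that open a wildcard term.
		if strings.ContainsRune("*?[{", r) {
			if b.Len() == 0 {
				return "", errLeadingWildcard
			}
			break
		}
		b.WriteRune(r)
	}
	return b.String(), nil
}

func main() {
	for _, q := range []string{"app*", `\*star?`, "gr{a,e}y", "*ple"} {
		p, err := literalPrefix(q)
		fmt.Printf("%q -> %q, %v\n", q, p, err)
	}
}
```

A literal prefix of this kind is commonly used to narrow a sorted index to a candidate range before the full glob is applied; whether Idx does this is not stated in the documentation.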

type Options

type Options struct {
	// Folder returns a [transform.Transformer] that performs folding (e.g.
	// case folding, whitespace folding, etc.) on index entries.
	Folder func() transform.Transformer

	// ScannerOptions are the options to use when reading the .idx file.
	ScannerOptions *ScannerOptions
}

Options are options for the idx data.

type Scanner

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner scans an index from start to end.

func NewScanner

func NewScanner(r io.ReadCloser, options *ScannerOptions) (*Scanner, error)

NewScanner returns a new index scanner that scans the index from start to end. The Scanner assumes ownership of the reader; close it with the Close method.

func NewScannerFromIfoPath

func NewScannerFromIfoPath(ifoPath string, options *ScannerOptions) (*Scanner, error)

NewScannerFromIfoPath returns a new index scanner for the .idx file, given the path to the .ifo file.

func (*Scanner) Close

func (s *Scanner) Close() error

Close closes the underlying reader.

func (*Scanner) Err

func (s *Scanner) Err() error

Err returns the first error encountered.

func (*Scanner) Scan

func (s *Scanner) Scan() bool

Scan advances the scanner to the next index entry. It returns false when the scan stops, either because the end of the index was reached or because an error occurred.

func (*Scanner) Word

func (s *Scanner) Word() *Word

Word returns the index entry that the most recent call to Scan advanced to.
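The Scan, Word, Err, and Close methods suggest the usual Go scanning loop. The sketch below shows that loop against a stand-in so that it runs without an .idx file. `wordScanner`, `sliceScanner`, and `collect` are hypothetical names, the local `Word` type only mirrors the documented fields, and copying each entry is a defensive assumption in case a scanner reuses the value it returns.

```go
package main

import "fmt"

// Word mirrors the fields of the Word type documented in this package.
type Word struct {
	Word   string
	Offset uint64
	Size   uint32
}

// wordScanner is a hypothetical interface with the same method set as the
// documented Scanner.
type wordScanner interface {
	Scan() bool
	Word() *Word
	Err() error
	Close() error
}

// sliceScanner is a stand-in backed by a slice, used only for this sketch.
type sliceScanner struct {
	words []Word
	pos   int
}

func (s *sliceScanner) Scan() bool {
	if s.pos >= len(s.words) {
		return false
	}
	s.pos++
	return true
}
func (s *sliceScanner) Word() *Word  { return &s.words[s.pos-1] }
func (s *sliceScanner) Err() error   { return nil }
func (s *sliceScanner) Close() error { return nil }

// collect drains s, copies every entry, and closes s before returning.
func collect(s wordScanner) (words []Word, err error) {
	defer func() {
		// Report a Close error only if nothing else failed first.
		if cerr := s.Close(); err == nil {
			err = cerr
		}
	}()
	for s.Scan() {
		words = append(words, *s.Word())
	}
	// Scan returns false on both end of input and error, so check Err.
	return words, s.Err()
}

func main() {
	s := &sliceScanner{words: []Word{
		{Word: "apple", Offset: 0, Size: 120},
		{Word: "banana", Offset: 120, Size: 95},
	}}
	words, err := collect(s)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range words {
		fmt.Printf("%s offset=%d size=%d\n", w.Word, w.Offset, w.Size)
	}
}
```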

type ScannerOptions

type ScannerOptions struct {
	// OffsetBits are the number of bits in the offset fields. Valid values for
	// OffsetBits are either 32 or 64.
	OffsetBits int
}

ScannerOptions are options for scanning an .idx file.

type Word

type Word struct {
	// Word is the word as it appears in the index.
	Word string

	// Offset is the offset in the .dict file at which the corresponding entry appears.
	Offset uint64

	// Size is the total size of the corresponding .dict file entry.
	Size uint32
}

Word is an .idx file entry.
