mpcompress.token_codecs

UniformTokenCodec

UniformTokenCodec(alphabet_size, **kwargs)

Uniform token codec for compression.

This codec assumes a uniform distribution over the alphabet, so every token value has probability 1/alphabet_size, and encodes tokens using uniform quantization. It extends CompressionModel to provide compression and decompression of discrete tokens.

Parameters:

Name	Type	Description	Default
`alphabet_size`	`int`	Size of the token alphabet (number of possible values).	required
`**kwargs`	`dict`	Additional keyword arguments passed to parent class.	`{}`
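
The uniform assumption fixes the ideal cost of a token at log2(alphabet_size) bits, independent of its value. A minimal sketch of that arithmetic follows; `uniform_bits` is a hypothetical helper written for illustration and is not part of mpcompress.

```python
import math


def uniform_bits(num_tokens: int, alphabet_size: int) -> float:
    """Ideal code length, in bits, of num_tokens symbols under a uniform model.

    Hypothetical helper for illustration only; not an mpcompress function.
    """
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be a positive integer")
    # Each symbol has probability 1 / alphabet_size, so it costs
    # -log2(1 / alphabet_size) = log2(alphabet_size) bits.
    return num_tokens * math.log2(alphabet_size)


# A 16 x 16 token map over a 1024-entry alphabet.
bits = uniform_bits(num_tokens=16 * 16, alphabet_size=1024)
print(bits)
```

An actual bitstring may be slightly longer than this ideal figure, for example because of byte alignment.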

compress

compress(tokens)

Compress tokens to a bitstring.

Note: tokens must not have a batch dimension.

Parameters:

Name	Type	Description	Default
`tokens`	`Tensor`	Input tokens to compress. Shape should be (H, W, ...) without batch dimension.	required

Returns:

Name	Type	Description
`coded_unit`	`dict`	Dictionary containing: - "strings" (dict): Dictionary with key "t" containing a nested list with the encoded bitstring. The nested structure keeps the API consistent. - "pstate" (dict): Dictionary with key "t_shape" containing the original shape of tokens as a tuple.
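
To make the shape of the returned dictionary concrete, here is a minimal sketch that packs tokens at a fixed bit width and wraps the result in the documented structure. It is not the library's implementation: `sketch_compress`, the fixed-width packing, the flat-list input, and the exact nesting depth `[[payload]]` are all assumptions made for illustration.

```python
def sketch_compress(flat_tokens, shape, alphabet_size):
    """Pack tokens at a fixed bit width (illustrative stand-in, not mpcompress code)."""
    # Bits needed to represent the largest token value, alphabet_size - 1.
    width = max(1, (alphabet_size - 1).bit_length())
    acc = 0
    for token in flat_tokens:
        if not 0 <= token < alphabet_size:
            raise ValueError(f"token {token} is outside the alphabet")
        # Append the token to the low end of the accumulator.
        acc = (acc << width) | token
    num_bits = width * len(flat_tokens)
    payload = acc.to_bytes((num_bits + 7) // 8, "big")
    return {
        # Assumed nesting depth; the documentation only says "nested list".
        "strings": {"t": [[payload]]},
        "pstate": {"t_shape": tuple(shape)},
    }


# A 2 x 2 token map (no batch dimension), given here as a flat list.
coded_unit = sketch_compress([3, 0, 2, 1], shape=(2, 2), alphabet_size=4)
print(coded_unit)
```

Keeping the shape in "pstate" rather than inside the bitstring means the payload holds token data only, and the decoder is handed the shape separately.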

decompress

decompress(strings, pstate, **kwargs)

Decompress a bitstring to tokens.

Parameters:

Name	Type	Description	Default
`strings`	`dict`	Dictionary with key "t" containing a nested list with the encoded bitstring. The nested structure keeps the API consistent.	required
`pstate`	`dict`	Dictionary with key "t_shape" containing the original shape of tokens as a tuple.	required
`**kwargs`	`dict`	Additional keyword arguments (unused).	`{}`

Returns:

Name	Type	Description
`task_feats`	`dict`	Dictionary with key "tokens" containing the decompressed tokens, with the shape given in pstate["t_shape"]. Note: the tokens do not have a batch dimension.
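
The role of "t_shape" on the decoding side can be sketched as follows. This is an illustrative decoder for a fixed-width, big-endian packing, not the library's implementation: `sketch_decompress`, `_reshape`, the explicit `alphabet_size` argument, and the `[[payload]]` nesting are assumptions, and nested Python lists stand in for tensors.

```python
import math


def _reshape(flat, shape):
    """Rebuild a nested list of the given shape from a flat list."""
    if len(shape) <= 1:
        return list(flat)
    step = math.prod(shape[1:])
    return [_reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def sketch_decompress(strings, pstate, alphabet_size):
    """Unpack fixed-width tokens (illustrative stand-in, not mpcompress code)."""
    payload = strings["t"][0][0]  # assumed nesting depth
    shape = pstate["t_shape"]
    count = math.prod(shape)  # number of tokens to recover
    width = max(1, (alphabet_size - 1).bit_length())
    mask = (1 << width) - 1
    acc = int.from_bytes(payload, "big")
    flat = []
    for _ in range(count):
        # Tokens come off the low end, so they appear in reverse order.
        flat.append(acc & mask)
        acc >>= width
    flat.reverse()
    return {"tokens": _reshape(flat, shape)}


task_feats = sketch_decompress(
    {"t": [[b"\xc9"]]}, {"t_shape": (2, 2)}, alphabet_size=4
)
print(task_feats)
```

Without the stored shape the decoder could not tell how many tokens the payload holds, since the final byte may contain padding bits.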

forward

forward(tokens)

Forward pass to compute uniform likelihoods.

Parameters:

Name	Type	Description	Default
`tokens`	`Tensor`	Input tokens of any shape.	required

Returns:

Name	Type	Description
`output`	`dict`	Dictionary containing: - "likelihoods" (dict): Dictionary with key "t" containing uniform likelihoods of shape matching tokens. - "tokens" (torch.Tensor): Original input tokens.
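
A minimal sketch of the documented output structure, using nested Python lists in place of tensors so that it runs without PyTorch. `sketch_forward` is a hypothetical stand-in; the real method operates on a `torch.Tensor` and may differ in details such as dtype.

```python
def sketch_forward(tokens, alphabet_size):
    """Return uniform likelihoods shaped like tokens (illustrative stand-in)."""
    p = 1.0 / alphabet_size

    def fill(x):
        # Recurse through nested lists; replace every leaf token with p.
        return [fill(v) for v in x] if isinstance(x, list) else p

    return {"likelihoods": {"t": fill(tokens)}, "tokens": tokens}


output = sketch_forward([[3, 0], [2, 1]], alphabet_size=4)
print(output["likelihoods"]["t"])
```

Because the likelihoods do not depend on the token values, a rate term computed from them is constant for a given shape and alphabet size.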