rowvoi Documentation

Interactive disambiguation of rows in a dataset using value-of-information policies.

Overview

The rowvoi package provides tools for interactively disambiguating rows in a dataset. Given a small set of candidate rows, it helps answer questions such as:

  • Which columns (features) must be observed to uniquely distinguish these rows?

  • How much information does a given feature provide about which row is correct?

  • Under a noise model and frequency priors, which feature should we acquire next to maximize expected reduction in uncertainty?

  • How does a greedy feature acquisition policy compare with the exact minimal key in practice?
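The second question above can be made concrete with a small, library-independent sketch. It is not rowvoi's API: the function name `expected_information_gain` is hypothetical, and the sketch assumes a uniform prior over the candidate rows and noise-free observations, so it ignores the noise model and frequency priors mentioned above. Under those assumptions, the expected reduction in uncertainty from observing a feature equals the entropy of that feature's values across the candidates.

```python
import math
from collections import Counter


def expected_information_gain(values):
    """Expected entropy reduction (in bits) about which row is correct.

    `values` holds one feature's value for each candidate row.
    Assumes a uniform prior over rows and noise-free observation
    (a simplification; hypothetical helper, not part of rowvoi).
    """
    n = len(values)
    counts = Counter(values)
    # Uncertainty before observing: log2(n) bits for n equally likely rows.
    prior_entropy = math.log2(n)
    # After observing value v, the c rows sharing v remain equally likely.
    expected_posterior = sum((c / n) * math.log2(c) for c in counts.values())
    return prior_entropy - expected_posterior


# A feature that splits three rows into groups of 2 and 1 is informative,
# one that separates all three is maximally so, and a constant one is useless.
print(expected_information_gain([1, 1, 2]))
print(expected_information_gain([5, 6, 7]))
print(expected_information_gain([3, 3, 3]))
```

A feature with a distinct value in every candidate row attains the maximum of log2(n) bits, which is why such a column forms a one-column key on its own.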

Installation

pip install rowvoi

For development:

uv pip install -e ".[dev,docs]"

Quick Start

Finding Minimal Keys

import pandas as pd
from rowvoi import minimal_key_greedy, minimal_key_exact

df = pd.DataFrame({
    "A": [1, 1, 2],
    "B": [3, 4, 3],
    "C": [5, 6, 7]
})

# Find minimal distinguishing columns for rows 0 and 1
print(minimal_key_greedy(df, [0, 1]))  # ['B']
print(minimal_key_exact(df, [0, 1]))   # ['B']
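To illustrate what a minimal key is, here is a brute-force sketch that does not use rowvoi. The helper `brute_force_minimal_key` is hypothetical and is not claimed to match how `minimal_key_exact` is implemented. It assumes `rows` are distinct positional indices and takes the data as a plain mapping from column name to a list of values.

```python
from itertools import combinations


def brute_force_minimal_key(columns, rows):
    """Return a smallest list of column names that tells all `rows` apart.

    `columns` maps column name -> list of values; `rows` are distinct
    positional indices. Hypothetical helper for illustration only.
    """
    names = list(columns)
    # Try column subsets in order of increasing size, so the first hit is minimal.
    for size in range(len(names) + 1):
        for key in combinations(names, size):
            # Project each row onto the chosen columns; a key gives distinct tuples.
            projected = {tuple(columns[c][r] for c in key) for r in rows}
            if len(projected) == len(rows):
                return list(key)
    return None  # at least two rows agree on every column


data = {"A": [1, 1, 2], "B": [3, 4, 3], "C": [5, 6, 7]}
print(brute_force_minimal_key(data, [0, 1]))     # ['B']
print(brute_force_minimal_key(data, [0, 1, 2]))  # ['C']
```

For rows 0 and 1, column A is identical, while B and C each differ, so either one is a valid one-column key; this sketch returns B only because it tries columns in order. A pandas DataFrame can be passed as `df.to_dict("list")`. The search is exponential in the number of columns, which is the usual motivation for a greedy alternative.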
