Tabula Rasa Documentation

Production Table Knowledge LLM: Teaching LLMs to accurately answer questions about tabular data

Tabula Rasa is a machine learning framework designed to help large language models (LLMs) accurately understand and answer questions about tabular data. Built on modern transformer architectures, it provides tools for training, evaluation, and deployment of table-aware language models.

Features

  • 🎯 Table-Aware LLMs: Specialized models for understanding tabular data

  • 🚀 Modern Architecture: Built on PyTorch and Transformers

  • 📊 Comprehensive Evaluation: Tools for assessing model performance on table QA tasks

  • 🔧 Easy Integration: Simple API for training and inference

  • 📈 Production Ready: Optimized for deployment in production environments

Quick Start

Installation

pip install tabula-rasa

For a development installation:

git clone https://github.com/gojiplus/tabula-rasa.git
cd tabula-rasa
pip install -e ".[dev]"

Basic Usage

from tabula_rasa import TabulaRasa

# Initialize the model
model = TabulaRasa()

# Define the table as column names plus a list of rows
table = {
    "columns": ["Name", "Age", "City"],
    "rows": [
        ["Alice", 30, "New York"],
        ["Bob", 25, "San Francisco"],
    ]
}

# Ask a question
question = "What is Alice's age?"
answer = model.answer(question, table)
print(answer)  # Output: 30
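
Each row in the table layout above must line up positionally with the "columns" list, so a row with a missing or extra value can silently shift every field. The following is a minimal sketch of a sanity check you could run before passing a table to the model. The helper name table_to_records is hypothetical and is not part of the tabula_rasa API; it assumes only the {"columns": [...], "rows": [...]} layout shown in the example.

```python
def table_to_records(table):
    """Convert a {"columns": [...], "rows": [[...], ...]} table into a list of dicts.

    Hypothetical helper for illustration; not part of tabula_rasa.
    Raises ValueError if any row's length differs from the number of columns.
    """
    columns = table["columns"]
    records = []
    for index, row in enumerate(table["rows"]):
        if len(row) != len(columns):
            raise ValueError(
                f"row {index} has {len(row)} values, expected {len(columns)}"
            )
        # Pair each value with its column name by position.
        records.append(dict(zip(columns, row)))
    return records


table = {
    "columns": ["Name", "Age", "City"],
    "rows": [
        ["Alice", 30, "New York"],
        ["Bob", 25, "San Francisco"],
    ],
}

records = table_to_records(table)
print(records[0]["Age"])  # prints 30
```

Checking the row widths up front turns a malformed table into an explicit error rather than a wrong answer, and the resulting records are convenient for comparing a model's answer against the cell it should have come from.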
