PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding
from Language Models is a simple yet effective constrained decoding method for large pre-trained auto-regressive language models.
It works by warping the model's prediction scores and integrates trivially with the existing greedy and beam search algorithms used in auto-regressive decoding from language models.
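Concretely, "warping" here means overwriting the score of any token that fails PICARD's check with $-\infty$, so that the search can never select it. Writing $s_{t,v}$ for the score the model assigns to vocabulary token $v$ at step $t$ (notation introduced here for illustration), the warped score is:

$$
\tilde{s}_{t,v} =
\begin{cases}
s_{t,v} & \text{if appending } v \text{ keeps the hypothesis valid,}\\
-\infty & \text{otherwise.}
\end{cases}
$$

Because greedy and beam search both rank tokens purely by score, a token with score $-\infty$ is simply never chosen, which is why the method composes with either search strategy without modification.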
Autoregressive models are a class of machine learning (ML) models that predict the next element in a sequence from the elements that precede it.
Autoregression is a statistical technique used in time-series analysis that assumes the current value of a time series is a function of its past values; autoregressive models use similar mathematical techniques to determine the probabilistic correlation between elements in a sequence. Formally, let $x_t$ denote the value of the series at time $t$. An autoregressive model of order $p$, written AR($p$), predicts $x_t$ as a linear combination of the $p$ preceding values:

$$
x_t = c + \sum_{i=1}^{p} \varphi_i \, x_{t-i} + \varepsilon_t,
$$

where $c$ is a constant, $\varphi_1, \dots, \varphi_p$ are the model coefficients, and $\varepsilon_t$ is a noise term.
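To make the definition concrete, here is a minimal sketch of a single AR($p$) prediction step in Python; the coefficients and series values are made up for illustration:

```python
def ar_predict(history, coeffs, c=0.0):
    """Predict the next value of an AR(p) series from its last p observations.

    `coeffs` holds (phi_1, ..., phi_p); phi_1 multiplies the most recent value.
    """
    p = len(coeffs)
    return c + sum(phi * x for phi, x in zip(coeffs, reversed(history[-p:])))

# Example: an AR(2) model with illustrative coefficients 0.6 and 0.3.
series = [1.0, 1.2, 1.5]
print(ar_predict(series, [0.6, 0.3]))  # 0.6*1.5 + 0.3*1.2 = 1.26
```

An auto-regressive language model applies the same idea to tokens: each new token is predicted from the tokens generated so far, rather than from real-valued measurements.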
By design, PICARD requires access to the prediction scores of a probabilistic auto-regressive model. While most LLMs are of that nature, they are usually exposed as a black-box API that returns only generated text, which effectively rules out using this technique.
While PICARD is typically paired with beam search, it is not the only search strategy; in fact there are two:
In essence, greedy search prioritizes immediate gains, while beam search explores a wider range of options, potentially leading to better solutions at the cost of increased computational complexity. This is demonstrated in the following figure:
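To make the same trade-off concrete in code, here is a toy, self-contained comparison of the two strategies; the probability table is invented for illustration and is unrelated to any particular model:

```python
import math

def greedy_search(step_logprobs, steps=3):
    """Greedy decoding: commit to the single best token at each step."""
    seq = []
    for _ in range(steps):
        dist = step_logprobs(seq)
        seq.append(max(dist, key=dist.get))
    return seq

def beam_search(step_logprobs, beam_size=2, steps=3):
    """Beam search: keep the beam_size best partial hypotheses alive."""
    beams = [([], 0.0)]  # (token sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + [tok], score + lp)
            for seq, score in beams
            for tok, lp in step_logprobs(seq).items()
        ]
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_size]
    return beams

# Toy distribution: "a" looks slightly better locally, but choosing "b"
# first unlocks a much better continuation.
def toy(seq):
    if seq and seq[-1] == "b":
        return {"a": math.log(0.9), "b": math.log(0.1)}
    return {"a": math.log(0.55), "b": math.log(0.45)}

print(greedy_search(toy))      # ['a', 'a', 'a'] -- locally optimal at every step
print(beam_search(toy)[0][0])  # ['b', 'a', 'a'] -- higher total probability
```

Greedy search never recovers from its first locally optimal choice, whereas the beam keeps the initially weaker "b" hypothesis alive long enough for its better continuation to pay off.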
PICARD hooks into this decoding loop as a function called at each generation step. Its arguments are the token ids of the current hypothesis and, for each vocabulary token, the log-softmax scores predicted by the model's language modeling head.
PICARD's strategy is composed of the following steps, applied at every generation step (a code sketch follows the list):

1. Take the top-k highest-scoring candidate tokens for the current hypothesis.
2. Tentatively append each candidate to the hypothesis and check, by incremental lexing and parsing of the detokenized text, whether the result can still be completed to valid output.
3. Set the scores of candidates that fail the check to −∞, leaving the scores of valid candidates untouched.
4. Hand the warped scores back to the surrounding greedy or beam search, which continues as usual.
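Below is a minimal sketch of such a hook written against Hugging Face's `LogitsProcessor` interface. The class name, the `is_valid_prefix` predicate, and the default `top_k` are assumptions made for illustration; this is not the authors' implementation.

```python
import torch
from transformers import LogitsProcessor

class PicardStyleProcessor(LogitsProcessor):
    """Sketch of a PICARD-style score warper.

    `is_valid_prefix` is a hypothetical user-supplied predicate deciding
    whether a decoded string can still be extended to valid output.
    """

    def __init__(self, tokenizer, is_valid_prefix, top_k=2):
        self.tokenizer = tokenizer
        self.is_valid_prefix = is_valid_prefix
        self.top_k = top_k

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        # One row per hypothesis: the token ids generated so far, plus the
        # scores the language modeling head assigns to every vocabulary token.
        for row, hypothesis in enumerate(input_ids):
            top = torch.topk(scores[row], self.top_k).indices.tolist()
            for token_id in top:
                candidate = torch.cat([hypothesis, hypothesis.new_tensor([token_id])])
                text = self.tokenizer.decode(candidate, skip_special_tokens=True)
                if not self.is_valid_prefix(text):
                    # Warp the score so neither greedy nor beam search
                    # can ever select this continuation.
                    scores[row, token_id] = float("-inf")
        return scores
```

A processor like this is passed to `model.generate(..., logits_processor=LogitsProcessorList([processor]))`, which is exactly why the approach composes with both greedy and beam search. A fuller version would also mask every token outside the checked top-k.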
PICARD supports the following modes, in increasing order of strictness:

- off: no checking; decoding proceeds as if PICARD were disabled.
- lexing: candidates are checked only at the lexical level, rejecting tokens that cannot begin or continue a valid token of the output language.
- parsing without guards: candidates are additionally checked for grammatical validity by the incremental parser.
- parsing with guards: the parser also enforces contextual constraints (in PICARD's original text-to-SQL setting, for example, that referenced columns belong to a table that is in scope).
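If one were to expose these modes in code, a simple enum is a natural shape; the member names below merely mirror the list above and are otherwise hypothetical:

```python
from enum import Enum

class PicardMode(Enum):
    """Hypothetical names for the four checking modes described above."""
    OFF = "off"                                         # no checking at all
    LEXING = "lexing"                                   # lexical checks only
    PARSING_WITHOUT_GUARDS = "parsing_without_guards"   # adds grammatical checks
    PARSING_WITH_GUARDS = "parsing_with_guards"         # adds contextual checks
```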