Mathematical Structures in Computer Science

Paper

Extending relational algebra with similarities

MELITA HAJDINJAKa1 and GAVIN BIERMANa2

a1 Faculty of Electrical Engineering, University of Ljubljana, Slovenia Email: Melita.Hajdinjak@fe.uni-lj.si

a2 Microsoft Research, Cambridge, United Kingdom Email: gmb@microsoft.com

Abstract

In this paper we propose various extensions to the relational model to support similarity-based querying. We build upon the -relation model, where tuples are assigned values from an arbitrary semiring , and its associated positive relational algebra $\text{RA}^{+}_{\mathcal{K}}$. We consider a recently proposed extension to $\text{RA}^{+}_{\mathcal{K}}$ using a monus operation on the semiring to support negative queries, and show how, surprisingly, it fails for important ‘fuzzy’ semirings. Instead, we suggest using a negation operator. We also consider the identities satisfied by the relational algebra $\text{RA}^{+}_{\mathcal{K}}$. We show that moving from a semiring to a particular form of lattice (a De Morgan frame) yields a relational algebra that satisfies all the classical (positive) relational algebra identities. We claim that to support real-world similarity queries realistically, one must move from tuple-level annotations to attribute-level annotations. We show in detail how our De Morgan frame-based model can be extended to support attribute-level annotations and give worked examples of similarity queries in this setting.

(Received January 26 2011)

(Revised November 15 2011)

(Online publication April 25 2012)