Categorical Query Languagelogo

Home | Download | Getting Started | Manual | Wiki | Papers | Screen Shots | Github | Google Group | YouTube | Conexus | Contact


Transparent Denormalization

Many data integration tasks require denormalizing a schema (adding redundant attributes) to increase query performance. The different copies of a piece of data must then be kept in sync, an error-prone task. In CQL, equational constraints can transparently enforce that the redundant copies of a piece of data are all the same.

In the CQL example below (built-in to the IDE with name Denormalize), the normalized schema contains information about males and their mothers. The denormalized schema contains an additional redundant attribute, the name of each male's mother, as well as an equation specifying how the redundant attribute is derived. When the normalized data is loaded into the denormalized schema, the value of the redundant attribute is automatically computed. The equation linking the redundant data to the master data will be respected by every CQL operation on the denormalized schema, ensuring that the redundant attribute can never become out of sync.

We begin by specifying a normalized source schema containing males and females, a foreign key indicating the mother of each male, and a string attribute for the name of each male and female:

typeside Ty = literal { 
	java_types
		String = "java.lang.String"
	java_constants
		String = "return input[0]"
}

schema NormalizedSchema = literal : Ty {
	entities
		Male 
		Female
	foreign_keys
		mother : Male -> Female
  	attributes
  		female_name : Female -> String
  		male_name   : Male   -> String 
}


Here is some sample data and its view in the IDE:

instance NormalizedData = literal : NormalizedSchema {
	generators
		Al Bob Charlie : Male
		Ellie Fran : Female
	equations
		Al.male_name = Albert 
		Al.mother = Ellie
		
		Bob.male_name = George
		Bob.mother = Ellie
		
		Charlie.male_name = Charles	
		Charlie.mother = Fran

		Ellie.female_name = Elaine
		Fran.female_name = Francine
}

denormalization 0

Next, we specify the denormalized schema by importing the normalized schema, adding an attribute for each male's mother's name, and an equation stating that the attribute must equal each male's mother's name:

schema DeNormalizedSchema = literal : Ty {
	imports
		NormalizedSchema
	attributes
		mother_name : Male -> String
  	observation_equations
  		forall m:Male. mother_name(m) = female_name(mother(m))
}


Finally, we import the normalized data onto the denormalized schema and view it in the IDE:

instance DeNormalizedData = literal : DeNormalizedSchema {
	imports
		NormalizedData
}

denormalization 1

A screen shot of the entire development is shown below:

denormalization 2