@evangertis ・ Jan 16,2022 ・ 3 min read ・ 2854 views ・ Originally posted on faun.pub
Why do I care?
Relationships in data can often be arranged in a hierarchical structure. As data scientists, our goal is not only to understand the data but also to see the connections between pieces of information. Defining concept hierarchies by hand can be tedious and time-consuming for a user or a domain expert. Fortunately, many hierarchies are implicit in the database schema and can be defined automatically at the schema definition level. Concept hierarchies can then be used to transform the data into multiple levels of granularity. For example, data mining patterns in sales data may be found at the level of regions or countries, in addition to individual branch locations.
What is a concept?
A group of records that have been assigned a label.
What is a concept hierarchy?
A concept hierarchy arranges concepts in a hierarchical order, from specific (low-level) concepts to more general (high-level) ones.
Let’s get deeper
1- Ordering of the attributes at the schema level by a user or expert
Let us assume that the following set of attributes is given:
Department, College, School
and an expert defines the hierarchy as follows:
Department -> College -> School
Here the three attributes correspond to three concepts, and the expert's ordering tells the system how to build the hierarchy among them: departments roll up into colleges, and colleges roll up into schools.
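As a minimal sketch, the expert's ordering could be stored as a plain list and queried for the next level up. The names HIERARCHY, parent_level, and roll_up_path are hypothetical and chosen only for this illustration.

```python
# Schema-level hierarchy as given by the expert, most specific level first.
# (Hypothetical representation: a plain ordered list of attribute names.)
HIERARCHY = ["Department", "College", "School"]


def parent_level(attribute, hierarchy=HIERARCHY):
    """Return the next more general attribute, or None at the top level."""
    index = hierarchy.index(attribute)
    return hierarchy[index + 1] if index + 1 < len(hierarchy) else None


def roll_up_path(attribute, hierarchy=HIERARCHY):
    """Return every level from `attribute` up to the top of the hierarchy."""
    return hierarchy[hierarchy.index(attribute):]


print(parent_level("Department"))
print(roll_up_path("College"))
```

A list is enough here because a schema-level hierarchy of this kind is a total order; a partial order would need a graph instead.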
2- Ordering by adding hierarchy within a level
For example, college could be further divided into:
Science-oriented
Health-oriented
Humanities-oriented
3- Ordering by set grouping or value grouping
The values of an attribute such as age can be grouped into ranges that form a level of the hierarchy:
{20–39}, {40–59}, {60–82}
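The grouping above can be sketched as a simple lookup from a raw age to its group label. The function name age_group and the choice to return None for ages outside every range are assumptions made for this example.

```python
# The three age ranges from the text, as inclusive (low, high) pairs.
AGE_GROUPS = [(20, 39), (40, 59), (60, 82)]


def age_group(age, groups=AGE_GROUPS):
    """Map a raw age to the label of the range that contains it."""
    for low, high in groups:
        if low <= age <= high:
            return f"{low}-{high}"
    return None  # assumption: ages outside every range get no label


print(age_group(25))
print(age_group(61))
```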
4- Ordering by decoding operation
Suppose the data in an attribute is a set of email addresses, such as:
dmbrook@cs.sfu.ca
Concept hierarchies can be created by separating different components of the email data.
dmbrook -> cs -> sfu -> ca
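A minimal sketch of this decoding step, assuming every address has exactly the form local-part@domain. The function name email_hierarchy is hypothetical.

```python
def email_hierarchy(address):
    """Split an email address into its components, most specific first."""
    local_part, separator, domain = address.partition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # The domain labels are already ordered from specific to general.
    return [local_part] + domain.split(".")


print(" -> ".join(email_hierarchy("dmbrook@cs.sfu.ca")))
```

Each position in the returned list can then serve as one level of the hierarchy: the user, the department, the university, and the country.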
5- Ordering by data clustering and data distribution analysis
6- Ordering by use of rules
A hierarchy among the values of an attribute profit (for an item X) can be derived from rules that compare the item's price and cost against profit thresholds.
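A rule set of this kind might look like the sketch below. The two thresholds and the three level names are invented purely for illustration and do not come from any dataset.

```python
# Hypothetical thresholds on the margin (price minus cost); illustration only.
LOW_MARGIN_LIMIT = 50
HIGH_MARGIN_LIMIT = 250


def profit_level(price, cost):
    """Assign an item to a profit concept using simple threshold rules."""
    margin = price - cost
    if margin < LOW_MARGIN_LIMIT:
        return "low_profit_margin"
    if margin <= HIGH_MARGIN_LIMIT:
        return "medium_profit_margin"
    return "high_profit_margin"


print(profit_level(price=100, cost=80))
print(profit_level(price=1000, cost=500))
```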
How do I do it?
Suppose a user selects a set of location-oriented attributes (street, country, province or state, and city) from the AllElectronics database, but does not specify the hierarchical ordering among the attributes.
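First, sort the attributes in ascending order of the number of distinct values in each attribute. The idea is that an attribute at a higher concept level usually has fewer distinct values than one at a lower level. This results in the following (where the number of distinct values per attribute is shown in parentheses):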
country (15),
province or state (365),
city (3567),
and street (674,339).
Second, generate the hierarchy from the top down according to the sorted order, with the first attribute at the top level and the last attribute at the bottom level. Finally, the user can examine the generated hierarchy and, when necessary, modify it to reflect the desired semantic relationships among the attributes. In this example, the generated hierarchy (country, province or state, city, street) already matches the way these locations contain one another, so no modification is needed.
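The two steps above can be sketched in a few lines. The helper names count_distinct and generate_hierarchy are hypothetical, and the records are assumed to be dictionaries keyed by attribute name.

```python
def count_distinct(records, attributes):
    """Count the distinct values of each attribute across all records."""
    return {name: len({record[name] for record in records}) for name in attributes}


def generate_hierarchy(distinct_counts):
    """Order attributes top level first: fewest distinct values to most."""
    return sorted(distinct_counts, key=distinct_counts.get)


# Distinct-value counts from the example in the text.
counts = {
    "street": 674339,
    "country": 15,
    "province_or_state": 365,
    "city": 3567,
}
hierarchy = generate_hierarchy(counts)
print(hierarchy)
```

The result is only a proposal based on a heuristic, which is why the final review by the user remains part of the procedure.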