Welcomemobile versionlogin
To World Of Information Technologies
RIBBON
FAQ
PICTURES
ABOUT AUTHOR
SMITHY
development
devices
science
technologies
4 difficulties when working with unbalanced hierarchies
2015-01-17 00:09 | | comments - 0development
What is a balanced hierarchy
Balanced hierarchy - it's such a hierarchy in which all branches of the structure have a predetermined number of levels.
Example. Balanced 3-level hierarchy:
Element 1
Element 1.1
Element 1.1.1
Element 1.2
Element 1.2.1
Element 1.2.2
Element 1.2.3
Element 1.3
Element 1.3.1
Element 1.3.2
Element 2
Element 2.1
Element 2.1.1
Element 3
Element 3.1
Element 3.1.1
Element 3.1.2

The classic mechanism for organizing a balanced hierarchy in a DB table.

What is a unbalanced hierarchy
Unbalanced hierarchy - it's such a hierarchy, which has a heterogeneous structure and has no restrictions on the number of levels in its branches.
Example. Unbalanced hierarchy:
Element 1
Element 1.1
Element 1.2
Element 1.2.1
Element 1.2.1.1
Element 1.2.1.2
Element 1.2.2
Element 1.2.3
Element 1.3
Element 1.3.1
Element 1.3.2
Element 2
Element 3
Element 3.1

The classic mechanism for organizing a unbalanced hierarchy in a DB table.

Difficulty 1. Not all external applications can work with unbalanced hierarchies
In case if is very necessary connect to a data source for some very important program, and it does not know how to handle unbalanced hierarchy, one of the best solutions it is convert an unbalanced hierarchy to balanced. The main trouble is that unbalanced hierarchy has no physical limitations on the number of levels, but balanced hierarchy must have a exact number of levels. Therefore, usually, the developer before writing the conversion mechanism determines the maximum level for a hierarchy (or by fact, or by theoretical maximum), and then realise conversion mechanism. For convert to a balanced hierarchy, I use mainly two approaches: SQL select query with a complex structure and recursive procedure (stored procedure in the database, or analysis algorithm in the client application) forming a separate table.
The advantage of the first approach is that the transformation can be performed using only a single query and there is no need to increase the number of database objects. The disadvantage is the complexity of the query (and consequently its productivity is low), and its complexity increases in arithmetic progression, depending on the maximum number of levels of hierarchy.
The advantage of the second approach is that work with the generated static table is much faster than working with complex queries. The disadvantage of this approach is that every time you change a table with an unbalanced hierarchy must rebuild the entire table with a balanced hierarchy (which is unacceptable in systems where the structure of an unbalanced hierarchy is constantly changing).

Difficulty 2. Getting elements of unbalanced hierarchy by a particular level
Often the problem arises to select from the table not all members of the hierarchy, and only a specific level. And selecting the 1st level it's simple task (select all elements whose "parent is null"), then with the other levels is more difficult. The best solution, I believe, is to add an attribute with the level number. Update the field with the level number is better using trigger (DB feature). If use trigger mechanism is not possible, to update the level number of the element/s you can use recursive procedure or a recursive algorithm on the client application.

Difficulty 3. Removing elements from unbalanced hierarchy
One of the regular tasks, is removing elements from hierarchy, and the biggest problem - it does not prevent the emergence of so-called "gaps" (when there are elements in the table hierarchy that contain references to non-existing elements). The best solution for removing members of a hierarchy, in my opinion, is to scan and deleting all nested elements of hierarchy using recursive algorithm. From experience, in complex systems, sometimes (rarely, but cases have been), due to the nature of work of some databases, gaps can sometimes appear. Therefore, I would recommend, include "scanning algorithms hierarchies for gaps" in maintenance of storage mechanisms.

Difficulty 4. Sorting elements of unbalanced hierarchy
Sorting of unbalanced hierarchies, probably, one of the most common problem and at the same time not having beautiful solutions (I have not found a universal solution). To solve this problem, I have done the following:
  1. determine all possible sort combinations for a specific hierarchy (by name, by date, by rating, and so on)
  2. for each sort combination, add sorting attribute to the table with the hierarchy
  3. make a recursive algorithm, which updating sorting indices
But, when using the described approach, there is a serious performance problem when adding new elements of the hierarchy (in the simplest case, there is a need rebuild all sorting attributes). And then, to avoid strong performance reduction, I use a little trick. If during the full rebuilding of sorting attributes, use not dense filling of the sequence ( 1, 2, 3, 4, and so on), and rarefied sequence (100, 200, 300, 400 and so on), sorting attribute of new elements can be calculate on sorting attribute from parent element and adjacent elements, without the need to recalculate the sorting attribute of the entire hierarchy. Of course, this a mechanism does not eliminate the need to periodically recalculate the sorting attributes of the entire hierarchy, but allows to optimize algorithms.

Conclusion
Of course, here described not all of the difficulties that arise when working with unbalanced hierarchies. I have described only the ones that I decide on a regular basis. I would be glad if you, in the comments, offer more interesting solutions to the described problems.
sql databases author's material
comments - 0 (kyrylo%20kovalenko)
first to last | last to firstregistration/login
<1>