|
Hi,
If I understood you correctly, you have already created tables where you can handle dynamic amount of attributes and values. Querying such tables is often a bit tricky if you have several levels etc, but CTE[^] is a helpful tool when handling dynamic hierarchies.
On the other hand another possibility could be to use XML data type. This would allow you flatten the relational hierarchy and still be able to use highly dynamic values.
|
|
|
|
|
Yeah, XML data type, someone mentioned that... Have to have a look, thanks!!
I though I knew about CTE already but.. didn't know ou can use them for recursive queries!!! mm... thanks (again!)!
A train station is where the train stops. A bus station is where the bus stops. On my desk, I have a work station....
_________________________________________________________
My programs never have bugs, they just develop random features.
|
|
|
|
|
No problem
|
|
|
|
|
Hi guys, I've learnt 'bout database designing and I got a difficulty that you can help.
Here the question is: a relation R has 3 attributes, A,B,C and A is a primary key. Then I want to insert new attribute D. In each tuple of R, value of attribute D is assigned by (value of B + value of C).
My task is using Relational algebra, writing a expression to do so.
Below is an example
R - before doing anything
A B C
1 10 9
2 3 7
3 9 6
4 12 0
5 1 5
R - after inserting attribute D
A B C D
1 10 9 19
2 3 7 10
3 9 6 15
4 12 0 12
5 1 5 6
Anyway, thanks alot for reading my question and helping my solve this problem
|
|
|
|
|
It's not quite clear what your actual question is but concerning calculated columns. It's typically best to avoid columns which are calculations based on the values of the same (or other) rows. This is because you have to take care that when the original values change, also the result of the calculation is changed.
If you're using SQL Server, then things are a bit different since the database has the ability to use computed columns (see: http://msdn.microsoft.com/en-us/library/ms191250.aspx[^]).
If your question is about updating the value, then:
- first insert the new column
- then update the existing rows. Something like:
UPDATE R
SET D = A + B
|
|
|
|
|
Duc Huy Nguyen wrote: My task is using Relational algebra, writing a expression to do so.
That sounds complicated. I'd rephrase that to "you need to add a column to the table, that holds the sum of columns B + C". You'd need to alter the table[^], ending up with something like this;
BEGIN TRAN
CREATE TABLE [R]
(
A INT,
B INT,
C INT
)
INSERT INTO [R]
SELECT 1, 10, 9 UNION ALL
SELECT 2, 3, 7 UNION ALL
SELECT 3, 9, 6 UNION ALL
SELECT 4, 12, 0 UNION ALL
SELECT 5, 1, 5
ALTER TABLE [R]
ADD D AS [B] + [C]
SELECT *
FROM [R]
ROLLBACK
I are Troll
|
|
|
|
|
You shouldn't include calculated columns in your table like this, IIRC it breaks first normal form as the columns are no longer atomic. You should put the calculation into your select statements -or- [better] create a View with the calculated column in it.
|
|
|
|
|
May I ask how it's no longer atomic? The "new" value (the result from the calculation) is a new atomic fact, even though it's derived from other data - it's not a concatenation of both original columns and you cannot deduce the original values from it.
I are Troll
|
|
|
|
|
Eddy Vluggen wrote: May I ask how it's no longer atomic?
The calculated column relies on other data, it isn't atomic. The classic examples use concatenation, but it is realy a relies upon relationship.
|
|
|
|
|
Keith Barrow wrote: The calculated column relies on other data, it isn't atomic
It relies on other data, that's true, but you cannot decompose the value. It's a new atomic fact, not a composition of other facts. Yes, you can deduce it from other data, but in this case only in one direction (as opposed to a birthdate and an age)
Keith Barrow wrote: The classic examples use concatenation, but it is realy a relies upon relationship.
Any concatenation of characters would fall under this category. The question is whether the fact that's being described is atomic, not the data.
Whether or not a column is computed (and whether or not the value is stored or recomputed), doesn't matter much for the relational model; that's merely an implementation-detail.
I are Troll
|
|
|
|
|
Hmmmm, I get your point, but I don't undertand how:
"hello " + "world" = "hello world"
Is not a new fact, but:
2 + 5 = 7
is.
|
|
|
|
|
That's not information, that's data. I can't model data. Your first example may be a single fact, contained in a varchar field. The second might be too, but could just as well be a derived fact. Like having two apples and five banana's equalling seven pieces of fruit. The numbers of fruit in store is a different fact, and it might be fetching totals from an apple-database and a banana-database.
An atomic piece of data is a bit. An atomic piece of information, is dependent on context - the key of the table.
Take a look at your name - under what circumstance would you see the "data" as a single entity of atomic data? Which different parts of your name are used under those circumstances?
I are Troll
|
|
|
|
|
In the relational model, there is is no concept of DEcompose the value !
If a value can be composed from other values, COMPOSE it, or you will denormalize the DB !!!
This is (should be) only allowed, if the normal form takes some time for calculation, and the application is time critical.
If this is the case, you should use INSERT and UPDATE triggers to ensure the data consistence
|
|
|
|
|
Klaus-Werner Konrad wrote: If a value can be composed from other values, COMPOSE it
If the original data can be reconstructed from that resulting data, you'll be able to reverse that process.
Klaus-Werner Konrad wrote: , or you will denormalize the DB !!!
Not according to the theories of Codd.
Klaus-Werner Konrad wrote: This is (should be) only allowed, if the normal form takes some time for calculation, and the application is time critical.
1) Allowed by whom? If it's my responsibility, then I'll model as I think is correct - if it were something that could simply be "decided" once and for all, they'd put it in a servicepack and be done with it.
2) Has nothing to do with speed.
3) Whether a column is calculated doesn't say anything about the fact "when" it's going to be calculated.
Klaus-Werner Konrad wrote: If this is the case, you should use INSERT and UPDATE triggers to ensure the data consistence
That would be modeled as a calculated column in the conceptual model. The physical model might specify your solution, but it may also choose from a whole galaxy of other options. Like not including the column at all, and have the calculation in code.
I are Troll
|
|
|
|
|
Well - I'll take your signature as the truth ...
|
|
|
|
|
Eddy Vluggen wrote: If the original data can be reconstructed from that resulting data, you'll be
able to reverse that process.
Bullshit - you cannot reverse the result of a calculation - nevertheless you can have a calculation of other colums ...
Eddy Vluggen wrote: Not according to the theories of Codd.
You will at least violate 2NF ...
Eddy Vluggen wrote: 1) Allowed by whom? If it's my responsibility, then I'll model as I think is
correct - if it were something that could simply be "decided" once and
for all, they'd put it in a servicepack and be done with it.
Ok, it's YOUR decision, but HERE we're talking about NORMALIZED databases !!!
Eddy Vluggen wrote: 2) Has nothing to do with speed.
Let's assume it's somethig like SUM( SQRT( PI * DRM) )
Depending on your indices and the sort of query this can be a REALLY time consuming operation, and in such a case a denormalization would be valid (=accepted), but colA +colB isn't such a case, 'cause it's cheap to calc.
Eddy Vluggen wrote: 3) Whether a column is calculated doesn't say anything about the fact "when"
it's going to be calculated.
See my response above
Eddy Vluggen wrote: That would be modeled as a calculated column in the conceptual model.
The relational model from Codd don't know 'calculated column', as this is AGAINST the model by itself ...
|
|
|
|
|
Klaus-Werner Konrad wrote: bullsh*t - you cannot reverse the result of a calculation - nevertheless you can have a calculation of other colums ...
From mp/h to m/s and back.
Klaus-Werner Konrad wrote: You will at least violate 2NF ...
Second normal form is about finding the complete key. It doesn't make statements on atomicity of facts.
Klaus-Werner Konrad wrote: Ok, it's YOUR decision, but HERE we're talking about NORMALIZED databases !!!
I know. Stop shouting.
Klaus-Werner Konrad wrote: Let's assume it's somethig like SUM( SQRT( PI * DRM) ) Depending on your indices and the sort of query this can be a REALLY time consuming operation, and in such a case a denormalization would be valid (=accepted), but colA +colB isn't such a case, 'cause it's cheap to calc.
Let me repeat that; Normalization has nothing to do with speed. It's about creating a theoretical correct model.
A wise man might decide to optimize by denormalizing when implementing the physical layer.
Klaus-Werner Konrad wrote: The relational model from Codd don't know 'calculated column', as this is AGAINST the model by itself ...
Name the rule that it breaks.
I are Troll
|
|
|
|
|
Eddy Vluggen wrote:
Klaus-Werner Konrad wrote: bullsh*t - you cannot reverse the
result of a calculation - nevertheless you can have a calculation of other
colums ...
From mp/h to m/s and back.
This is not a calculation - it's a conversion, and every conversion can be reverted !
Eddy Vluggen wrote:
Klaus-Werner Konrad wrote: You will at least violate 2NF
...
Second normal form is about finding the complete key. It
doesn't make statements on atomicity of facts.
Wikipedia:
"No non-prime attribute in the table is functionally dependent on a proper subset of a candidate key"
Eddy Vluggen wrote:
Klaus-Werner Konrad wrote: Ok, it's YOUR decision, but HERE
we're talking about NORMALIZED databases !!!
I know. Stop
shouting.
Sorry - I'm just old fashioned (You've ever connected via a 300baud accoustic modem ? No Internet, no HTML, no chance to enhance your msg than put *stars* or _underlines_ around it ?)
And - *just* in _these_ days I prefer plain text messages, 'cause evertything else can carry some unwanted payload ...
Eddy Vluggen wrote:
Klaus-Werner Konrad wrote: Let's assume it's somethig like
SUM( SQRT( PI * DRM) ) Depending on your indices and the sort of query this
can be a REALLY time consuming operation, and in such a case a denormalization
would be valid (=accepted), but colA +colB isn't such a case, 'cause it's cheap
to calc.
Let me repeat that; Normalization has nothing to do
with speed. It's about creating a theoretical correct model. A wise man
might decide to optimize by denormalizing when implementing the physical layer.
So you repeat my statement (see underlined portions of my original post) ...
Eddy Vluggen wrote:
Klaus-Werner Konrad wrote: The relational model from Codd
don't know 'calculated column', as this is AGAINST the model by itself
...
Name the rule that it breaks.
As said before: NF2
|
|
|
|
|
Klaus-Werner Konrad wrote: This is not a calculation - it's a conversion, and every conversion can be reverted !
A conversion is a specific type of calculation. A multiplication with a constant is another example; not a conversion, but still reversible.
Klaus-Werner Konrad wrote: So you repeat my statement (see underlined portions of my original post) ...
No, I did not repeat you, but you're correct to state that in such a case it would be defend able to denormalize.
Klaus-Werner Konrad wrote: As said before: NF2
I was looking for argumentation. As said before, it doesn't. All non-atomic facts have been removed before it can be called 1NF. The 2NF rule states that the fact that's described (possibly the result of a calculation) should be identifyable with the same key that you use to identify the tupel.
Take a "Departments" table - it may very well have a fact that states how many employees it currently harbors. The 2NF rule states that the "count" fact belong to the object that we use to identify a single tupel in departments-table. It's the key that gives the fact a context.
The fact "32" as a "count" would lose context (and no longer be information) if we cannot link it to a specific department. It's not relevant for the conceptual model how that number gets there - it may be a small operation, it may be a long operation. When describing the domain-model, we do include the information that it's information that can be deduced from other data - saves the person who has to implement the physical layer a bit time.
I are Troll
|
|
|
|
|
The relational model was developed to deal with issue of data inconsistency and data redundancy.
Calculated columns are redundant and also they can cause data inconsistency problems; if the values used in the calculation are changed, the calculate column will be inconsistent/incorrect.
|
|
|
|
|
If the original values change then, there will be data inconsistency (i.e. the calculated columns will not be correct).
|
|
|
|
|
SilimSayo wrote: If the original values change then, there will be data inconsistency (i.e. the calculated columns will not be correct).
Incorrect isn't the same as inconsistent, and up-to-date is not the same as incorrect.
I are Troll
|
|
|
|
|
|
SilimSayo wrote: Semantics!!!!
The exact definition is important, and not just a semantical detail. And no, I do not appreciate hit&run comments, as it's a lousy way of communicating.
I are Troll
|
|
|
|
|
Suppose I have a calculated field for age computed from dob and the current date. If this field is not updated and you query someone's age, it will return incorrect results. For example, if the query returns someone's [current] age as 64 when it should have been 65, the result would be incorrect.
|
|
|
|
|