Click to See Complete Forum and Search --> : alter live table


shuvo
April 7th, 2007, 12:13 AM
I am working on a web project, which uses sql server 2005. The database designer has designed the database in a unique way.When he needs to add a field, he is adding columns to the the table by dynamically adding columns to it.Mind it there could be millions of rows in that table . He is altering the table by writing a trigger in another table,I mean if a row is added to that table, he is adding a column to this table.I objected to this kind of design and adviced to keep these data in row format, cause adding columns in a live database sounds really odd to me.What could be the bad effects?Is there any many risk,if the database is designed like this?Please comment.

Shuja Ali
April 7th, 2007, 01:17 AM
First bad affect will be the performance. When you add a column to a database table which has say million rows, it will take some time for the database to alter the table. Second thing is that, if this column is used in searching some data in the table then this column will need to be indexed.

I do not see any reason why a column should be added when same thing can be done by adding a new row.

exterminator
April 7th, 2007, 06:57 AM
Very, very wierd! Can you expand on how the data is being managed? I mean with some examples on the table structure and processes (triggers etc)?

TheCPUWizard
April 7th, 2007, 10:21 AM
Dynamic Table Schemas are almost always a bad design.

Why not super-normalize your data so the main table has 3 columns: RecNo, ColNo, Value:. Then have another table which contains the definitions of all the columns.

Then all your app has to do is add one row to the column def table to accomplish your task???

shuvo
April 9th, 2007, 11:46 PM
the requirement is to store different skill for different professions in a table.Its normal to store skill list as rows.but as we have to search for employee skills, where the table will be millions of rows,the designer is adding columns to the skills table , i mean if a skill is added, then he is adding a column to the skill table.

exterminator
April 9th, 2007, 11:59 PM
the requirement is to store different skill for different professions in a table.Its normal to store skill list as rows.but as we have to search for employee skills, where the table will be millions of rows,the designer is adding columns to the skills table , i mean if a skill is added, then he is adding a column to the skill table.Hmm, so this added column is just a 0 or 1 column to tell if an employee has a skill or not.. right? Makes sense. But there are better ways to do that. 3 tables - employee, skills and another one just keeping the employee skill mapping - having 2 columns employee_id and skill_id. That way the column count remains static throughout. The existing queries on client side or in the stored procedures will not need any modifications which might be the case with the earlier design - where the schema is changing constantly upon definition of a new skill. Does that make sense?

Shuja Ali
April 10th, 2007, 12:27 AM
Hmm, so this added column is just a 0 or 1 column to tell if an employee has a skill or not.. right? Makes sense. But there are better ways to do that. 3 tables - employee, skills and another one just keeping the employee skill mapping - having 2 columns employee_id and skill_id. That way the column count remains static throughout. The existing queries on client side or in the stored procedures will not need any modifications which might be the case with the earlier design - where the schema is changing constantly upon definition of a new skill. Does that make sense?
I like it. :thumb: This is how it should be designed.

shuvo
April 10th, 2007, 01:48 AM
I already designed the database a month ago, like u proposed.but my senior changed it and made it dynamic.its really odd.he says, he is doing for saving time when the search is done.Is there any chance that, the trigger will fail to add a column to a table, when the table already has millions of rows.this kind of things used to happen in foxpro and other old databases,without rollback safety.I heard sql 2005 is very high performing and safe.Whats the possiblity of a diseaster happening, something like the table getting corrupted?

Shuja Ali
April 10th, 2007, 01:57 AM
An alter Table statement can surely fail if you are trying to add a Column with a Not Null constraint or a Date time stamp. You should make your senior understand that this the way it should be. I don't understand, how your seniors solution will improve the performance.

shuvo
April 10th, 2007, 02:09 AM
as we propose:

ID SKill

1001 Skill5
1001 Skill6
1001 Skill7

So u see 3 rows are generated for each employee.

but with dynamic database,


ID Skill 5 skill 6 skill 7 skill 8

1001 True False True False

So only one row is generated for a single employee.So when you search for skills of employees, the search span will be lot less.As our table has too many rows, search speed will be lot less.This is the logic he is explaining.
How much difference would it really make?Is it worth taking the risk of altering the table on the fly?

exterminator
April 10th, 2007, 02:36 AM
Have you or he done any kind of profiling to see if it's a real performance caveat? What was the test that you do? What were the sample queries fired?

Maintenance is a huge problem in such system. If you have an application layer - any user adding a new skill has to change the database schema and then change the query to read that extra column. Why would the users of your database want to let you change code like that?

I don't think there would be any major performance issues. You might try putting an index on the skill-employee map table on employee_id as well to see how the profiling result varies.

Shuja Ali
April 10th, 2007, 03:03 AM
Imagine this scenario. The database is backed up every day in the evening for your application. Now one morning couple of users added two more skills, which in turn added two more columns to the employee table. And then they entered the data. After sometime the database crashed on the same day. How are going to retain the design of the table that was modified today? Has your DBA/Senior checked this kind of scenario.

Coming back to the performance issue that you might face. When you have properly designed tables with proper indexes, you will surely have minimal performance drawbacks.

davide++
April 10th, 2007, 05:35 AM
Hi all.

First, Exterminator's solution is the best way to describe the set of skills for employees: a simple and normalizated many-to-many relationship between employees' table and skills' table.
Designing employees' table with boolean fields, one field for each skill, is a common example of bad de-normalization: when you define a new skill you have to add a new field, and the main goal of normalization is to avoid changes of data structure.
I'm wondering about how this de-normalization is managed: changing the structure of tables at run time is a very strange technique, that takes more problems than advantages, as said. For example, for Oracle an alter table invalidates the stored procedures that use the table. Are you sure that the same thing doesn't happen for SqlServer?

exterminator
April 10th, 2007, 12:27 PM
For example, for Oracle an alter table invalidates the stored procedures that use the table. Are you sure that the same thing doesn't happen for SqlServer?Are you really sure about that david? It would be a real problem need to recompile all the stored procedures again. Any reference document?

davide++
April 11th, 2007, 03:44 AM
Hi exterminator.
Yes, I'm sure: few minutes ago I've executed an ALTER TABLE statement, and now four packages are invalid :)
This is rigth: using an ALTER TABLE command you change the structure of tables, so the stored functions that use these tables may be wrong after changes; for example, you can drop a column that is used in a stored function, so the stored function cannot work yet.

In fact this is a problem, and in Oracle changing the structure of tables at runtime is a technique never used (or used very rarely).

cjard
April 16th, 2007, 04:28 PM
as we propose:

ID SKill

1001 Skill5
1001 Skill6
1001 Skill7

So u see 3 rows are generated for each employee.

but with dynamic database,


ID Skill 5 skill 6 skill 7 skill 8

1001 True False True False

So only one row is generated for a single employee.So when you search for skills of employees, the search span will be lot less.As our table has too many rows, search speed will be lot less.This is the logic he is explaining.
How much difference would it really make?Is it worth taking the risk of altering the table on the fly?

Altering the table on the fly to store this, is moronic. Noone with an ounce of sense when it comes to database design, would do this..

If you need to use data in this fashion, you store it row-ar and use a PIVOT query to turn it columnar.

Let me put another notion to you:

in a table of 1000 employees and 50 skills, you have 50,000 boolean cells and all of them are populated. If an employee can have on average 3 skills, then youre using 50,000 cells to store 3000 bits of intersting info.. 47000 cells are False for no good reason

alternately the emp_id:skill_id table has 3000 rows - much more efficient


Other reasons why adding a column is undesirable:

How will you index it? Counting all employees with a particular skill requires a full table scan unless you index all 50 columns individually. In oracle, a table can have at most 32 indexes

Indexing the skill_id coliumn is easy, one-index.. Inserts, updates and deletes are quick

Most rdbms are limited to a max number of columns per table - once you get more than 1000 skills, youre screwed on oracle and sqlserver

-

Do the company a favour and get someone with a brain (i.e. you) to do the databases. Tables grow vertically (rows added) not horizontally (columns added)