Sunday, March 25, 2012

Database for datawarehouse

Hi,
Does anyone have any information on comparison between different databases
for datawarehousing? I am working on a study to develop a datawarehousing
solution for my client, so I would like to get some information on
competitive aspects of SQL Server 2005 over DB2 and Oracle.
Thanks all~
SQL 2005 offer a better package then the others. (at a lower price then the
2 others)
In 1 product you have a database engine, OLAP engine, data mining engine,
report engine and ETL engine.
Analysis Services (7.0, 2000... and 2005...) is the most OLAP server used.
(http://www.olapreport.com/market.htm)
From a performance point of view the 3 RDBMS servers are near identical
(www.tpc.org / TCP-H benchmark)
From a 3rd party support Microsoft is better, you'll found a lot of tools &
product compatible with AS 2000/2005 & SQL 2000/2005.
But you'll found a lot of articles which push one of these vendors...
You have to setup a complete list of needs & requirements of what your
client want.
But from my experience, this doesn't really helps you in your process.
Today the internal expertise & knowledge of a particular database/technology
is most important, if you know and allready use the product you are able to
provide better & faster results.
Also... get the demo versions of the tools or ask the vendors for a proof
of concept.
if you are really neutral, let the sales rep. to do their jobs.
"Stella" <stellajst@.hotmail.com> wrote in message
news:ulhZuoP5FHA.884@.TK2MSFTNGP14.phx.gbl...
> Hi,
> Does anyone have any information on comparison between different databases
> for datawarehousing? I am working on a study to develop a datawarehousing
> solution for my client, so I would like to get some information on
> competitive aspects of SQL Server 2005 over DB2 and Oracle.
> Thanks all~
>
|||There is a technical paper with detailed comparison between Oracle and
SQL Server from wisdomforce site:
http://wisdomforce.com/dweb/resource...0g_compare.pdf
As far as I remember it contains a lot of OLAP info
J=E9j=E9 wrote:
> SQL 2005 offer a better package then the others. (at a lower price then t=
he
> 2 others)
> In 1 product you have a database engine, OLAP engine, data mining engine,
> report engine and ETL engine.
> Analysis Services (7.0, 2000... and 2005...) is the most OLAP server used.
> (http://www.olapreport.com/market.htm)
> From a performance point of view the 3 RDBMS servers are near identical
> (www.tpc.org / TCP-H benchmark)
> From a 3rd party support Microsoft is better, you'll found a lot of tools=
&
> product compatible with AS 2000/2005 & SQL 2000/2005.
> But you'll found a lot of articles which push one of these vendors...
> You have to setup a complete list of needs & requirements of what your
> client want.
> But from my experience, this doesn't really helps you in your process.
> Today the internal expertise & knowledge of a particular database/technol=
ogy
> is most important, if you know and allready use the product you are able =
to[vbcol=seagreen]
> provide better & faster results.
> Also... get the demo versions of the tools or ask the vendors for a proof
> of concept.
> if you are really neutral, let the sales rep. to do their jobs.
>
> "Stella" <stellajst@.hotmail.com> wrote in message
> news:ulhZuoP5FHA.884@.TK2MSFTNGP14.phx.gbl...
ses[vbcol=seagreen]
ing[vbcol=seagreen]
|||Jeje,
"SQL 2005 offer a better package then the others. (at a lower price
then the 2 others) In 1 product you have a database engine, OLAP
engine, data mining engine, report engine and ETL engine."
In your opinion... ;-)
Yes, SQL Server is pretty good, and I've read up on 2005 and it's
pretty good too....I've used the beta and it seems to do what the book
says...
But I recently did a project where the design points were 10,000 users,
180M inserts per day and 15TB of disk.....and there is no question SQL
Server has no references at that level and therefore would not make the
cut under the 'proven technology' selection criteria.
So, which database is 'better' is best prefixed by asking the question,
what are you thinking of doing with it, and Stella has not provided
such information apart from 'data warehousing'...';-)
Nowadays, under 1TB of disk (100GB of data for SQL Server) you can
hardly go wrong with any of these three...
Best Regards
Peter
|||Thanks all for your help!
To be more specific, I'm actually looking for a comparison between databases
of 5TB to 8TB in size, running on Linux or Windows. Would anyone have ever
come across any cases of data-warehousing with these criteria?
Thanks~
"Peter Nolan" <peter@.peternolan.com> wrote in message
news:1131641995.382364.174340@.z14g2000cwz.googlegr oups.com...
> Jeje,
> "SQL 2005 offer a better package then the others. (at a lower price
> then the 2 others) In 1 product you have a database engine, OLAP
> engine, data mining engine, report engine and ETL engine."
> In your opinion... ;-)
> Yes, SQL Server is pretty good, and I've read up on 2005 and it's
> pretty good too....I've used the beta and it seems to do what the book
> says...
> But I recently did a project where the design points were 10,000 users,
> 180M inserts per day and 15TB of disk.....and there is no question SQL
> Server has no references at that level and therefore would not make the
> cut under the 'proven technology' selection criteria.
> So, which database is 'better' is best prefixed by asking the question,
> what are you thinking of doing with it, and Stella has not provided
> such information apart from 'data warehousing'...';-)
> Nowadays, under 1TB of disk (100GB of data for SQL Server) you can
> hardly go wrong with any of these three...
> Best Regards
> Peter
>
|||Hi Stella,
the major metrics to provide are...
1. Volume of source data because each of the databases expands data by
a specific amount and most of them now also support varying levels of
compression. 5 to 8 TB means different sizes on different databases.
2. Number of rows in large tables.
3. Number of users.
As the size, complexity and number of users goes up
Oracle/DB2/Teradata/Sybase IQ come more into their own over MySQL and
SQL Server. I think you will find few examples of 5-8TB databases
running on SQL Server and not that many on windows/oracle. For
DB2/oracle they are commonplace.
For IQ there are almost none because you get compression in IQ.....so
something that is 8TB disk in Oracle is about 800GB disk in IQ. IQ now
holds more of the top 10 records than any other database...amazingly it
is a very well kept secret of out friends at Sybase.
I think it is unrealistic to think that as a consulting provider to a
client you would be given the benefit of years of experience in using
these databases across many projects in short appends on a newsgroup.
Long term experience across a range of databases provides a set of hard
won skills which are of great value to clients should they want someone
to assist in the database decision.
You can ask the vendors. Each vendor produces detailed papers for their
database.
Each of MySQL, SQL Server, Oracle, DB2, Teradata, Sybase IQ have their
strenghts and weaknesses. No vendor is going to tell you anything about
their weaknesses and vendors (and biased consultants) spread plenty of
mis-information and out of date information about competitive products.
For SQL Server (we are on a MSFT newsgroup) MSFT has done an excellent
job of documenting project REAL.
http://www.microsoft.com/sql/solutio...ojectreal.mspx
This is one of the most open presentations of a BI effort any of the
vendors has produced. MSFT are to be congratulated on their open-ness
on this one!!
My FAQs page discusses some of the considerations.
http://www.peternolan.com/FAQs/tabid/136/Default.aspx but it is only
intended to provide advice for smaller clients. Larger clients are
expected to have staff to advise on these decisions or pay for advice
on these decisions. I have been paid by clients to prove that the
database they wanted to use would work for what they wanted to do with
it.
Having said all that...
Many people consider what database to use as setp 1 of a BI effort. It
is one of the most ineffective places to start. It is far better to
start with what business the business is in and to design and build a
working prototype before deciding on what database to use.
It is now possible to build large scale prototypes and to have the the
ETL completely transportable from one database to the other with only
the cost of converting the table definitions.
Of course, none of the database vendors will tell anyone that it is
possible to migrate a DW between databases with trivial effort!! It is
not in their interests. And MSFT/Oracle have tied their ETL tools
tightly to the database so if you write your ETL in one of these two
ETL products you will live with the database forever.
So...It is now possible to postpone the 'which database' decision
until very late in the process, which also means it is possible to
postpone the database license payment until late in the process.
Indeed, in some clients we have built the large scale prototype on 2
competing databases so that we could really compare apples with apples.
No amount of 'competitive information' is equal to actually running two
databases side by side, latest release, with the vendors competing for
the business. The mix of HW/OS/RDBMS/DISK and versions of all these
makes it pretty much impossible to compare two solutions on
paper...though many try.
I recommend to my clients that if they are undecided about which
database then they are best advised to remain undecided until they
complete the development of the protoype and do a side by side
comparison to prove to themselves which one they feel is best for them.
It is not lost on my clients that the database vendors are far more
likely to offer incentives when the decision process is clear and
imminent and the weaknesses of their product have been exposed in large
scale prototype... ;-)
Peter
www.peternolan.com

No comments:

Post a Comment