Looks like my data warehouse project is moving to Teradata next year (from SQL Server 2005).
I'm looking for resources about best practices on Teradata - from limitations of its SQL dialect to idioms and conventions for getting queries to perform well - particularly if they highlight things which are significantly different from SQL Server 2005. Specifically tips similar to those found in The Art of SQL (which is more Oracle-focused).
My business processes are currently in T-SQL stored procedures and rely fairly heavily on SQL Server 2005 features like PIVOT, UNPIVOT, and Common Table Expressions to produce about 27m rows of output a month from a 4TB data warehouse.
Teradata is a MPP system and its capabilities allow [us] to process a huge amount of data with great performance since it works by the concept of nodes (amps), being scalable and offering great reliability about data recovering, massive processing managing, data backup, etc.
Oracle is the tailor-made system for transaction processing because of its great architectural flexibility whereas Teradata's OLAP is very powerful for slicing and dicing data for analysis. Oracle is mainly used as an online back-end application.
Teradata automatically distributes the data among various AMPs based on primary index value. This ensure the parallel processing of all the transactions perform on the table. The distribution of the data can vary based on uniqueness of primary index.
Teradata has unconditional parallelism[ix] whereas Oracle has Conditional. It gives Teradata an advantage with OLAP as the performance is exceptional to achieve a single answer faster than a non-parallel system. Parallelism uses multiple processors working together to accomplish a task quickly.
One place to start is here: http://www.teradataforum.com/
This might be a little late, but there are a few things which I can warn you about Teradata which I have learned.
Use the most recent version as often as possible. For V12 the optimizer was re-written and the database performs much better now. Try to realize that SQL Server and Teradata are very different beasts, most of the concepts will not transition well. Do not underestimate the importance of a primary index. The locks that teradata uses are very primitive when compared to other databases. Do NOT use TERA mode. You do not have any code which is legacy, ANSI mode is far superior and is widely encouraged. Join indexes are very helpful tools, but they do not provide all the answers. Parallelism, take the time to understand how FASTLOAD, MULTILOAD, and TPUMP works and find out how one can leverage it with their ETL strategy. If you are attempting to run a query which needs to be performant, do not use any casts, the optimizer will not use statistics to generate the best execution plan. Working with dates are going to be a pain, just a warning. Teradata is very DDL oriented, try to understand all the syntax related when creating a table. Compression is a wonderful tool, if you have any values which are repeated in a table, make use of it. There are not many tools available with Teradata, be prepared to build a lot. The tools that exist are very expensive.
Unfortunately, I do not know much about SQL Server, so I cannot say what tools in SQL Server appear in Teradata.
Hope this helps
I would also look into the recently launched Teradata Developer Exchange as well as the TeradataForum and forums on Teradata's main website.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With