Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What do I need to know about databases?

In general, I think I do alright when it comes to coding in programming languages, but I think I'm missing something huge when it comes to databases.

I see job ads requesting knowledge of MySQL, MSSQL, Oracle, etc. but I'm at a loss to determine what the differences would be.

You see, like so many new programmers, I tend to treat my databases as a dumping ground for data. Most of what I do comes down to relatively simple SQL (INSERT this, SELECT that, DELETE this_other_thing), which is mostly independent of the engine I'm using (with minor exceptions, of course, mostly minor tweaks for syntax).

Could someone explain some common use cases for databases where the specific platform comes into play?

I'm sure things like stored procedures are a big one, but (a) these are mostly written in a specific language (T-SQL, etc) which would be a different job ad requirement than the specific RDBMS itself, and (b) I've heard from various sources that stored procedures are on their way out and that in a lot of cases they shouldn't be used now anyway. I believe Jeff Atwood is a member of this camp.

Thanks.


The above concepts do not vary much for MySQL, SQL Server, Oracle, etc.

With this question, I'm mostly trying to determine the important difference between these. I.e. why would a job ad demand n years experience with MySQL when most common use cases are relatively stable across RDBMS platforms.

CRUD statements, joins, indexes.. all of these are relatively straightforward within the confines of a certain engine. The concepts are easily transferable if you know a different RDBMS.

What I'm looking for are the specifics which would cause an employer to specify a specific engine rather than "experience using common database engines."

like image 224
Junior Programmer Avatar asked Feb 06 '10 16:02

Junior Programmer


4 Answers

I believe that the essential knowledge about databases should be:

  • What database are for?
  • Basic CRUD Operations
  • SELECT queries with JOINs
  • Normalization
  • Basic Indexing
  • Referential Integrity with Foreign Key Constraints
  • Basic Check Constraints

The above concepts do not vary much between MySQL, SQL Server, Oracle, Postgres, and other relational database systems. However you'd find a different set of concepts for the now-popular NoSQL databases, such as CouchDB, MongoDB, SimpleDB, Cassandra, Bigtable, and many others.

like image 171
Daniel Vassallo Avatar answered Oct 23 '22 04:10

Daniel Vassallo


After the CRUD statements, to be an effective DB programmer I think some of the most important things to understand are JOIN statements. Understand the difference between LEFT and RIGHT, OUTER and INNER joins, and know when to use each. Most importantly, know what the database is actually constructing when it performs a JOIN.

For me, the Wikipedia article was very helpful.

Also, indexing is very important - this is how relational databases can perform fast queries. Understand how to use them and what happens under the hood.

Wikipedia article on DB indexing.

You should also know how to construct a many-to-one relationship (using foreign keys) and a many-to-many relationship (using join tables).

I know that in your question you're asking about specific DB implementations, but if you're to be taken literally and you only know about SELECT, INSERT, UPDATE, and DELETE, then the above concepts will be far more valuable than learning the intricacies of a particular implementation.

like image 23
danben Avatar answered Oct 23 '22 04:10

danben


It's not just stored procs and functions. Each database has fundamental differences and quirks that are important to understand even though SQL works more or less the same.

Examples:

  • Oracle and MySQL handle locking differently, in different situations.
  • Oracle doesn't have autoincrementing primary keys like MySQL and SQL Server.
  • Subtle vendor-specific behavior, like the way Oracle does sorting for VARCHARs differently depending on locale.

If you really want to improve your applications, you eventually have to become familiar with the details about how your specific database works. Most of the time it doesn't make a lot of difference, but when it does matter, it usually makes a big difference, especially when it comes to performance.

like image 1
Ken Liu Avatar answered Oct 23 '22 03:10

Ken Liu


Some things which seem to come up when talking with my Database-keen colleagues:

  • Row vs page vs table locking escalation when doing multiple complex joins, implies sometimes doing very different things on different vendors dbs. This is where the theory is really hitting the tarmac and often it is non-intuitive.
  • Differences between how cursors are best used on different vendor db implementations
  • Odd stuff in the stored proc language variants, like how best to handle failure cases
  • Differences in how temporary tables and views are best used depending on the underlying implementations.

All of these kind of things don't really matter until you are trying to solve something that either has to - Run very fast - Contain lots and lots of data - Gets very big and complex (i.e. multiple queries hitting same tables simultaneously)

These are the kinds of things that DBAs should be helping with, so depends on if you are aiming to be a DBA or a programmer. None of the above have really hurt me yet, because I've not worked on db-intensive systems, but I've worked near a few, and the programmers on those end up knowing a lot about the internals, restrictions, and good features about the specific database they are using.

Best way to get knowledge like that (other than on the job) is to read the manuals or hang out with people that already know and ask them about it.

like image 1
DaveC Avatar answered Oct 23 '22 02:10

DaveC