I'm creating a database for myself and I have a registration form that asks people for their name, password, email, mobile, age, and location. It's hopefully going to be like a dating site.
So far I have (for my database)
create table members(
user VARCHAR(16),
password VARCHAR(16),
email VARCHAR(320),
mobile VARCHAR(15),
age INT(3)
#location ???
INDEX(user(6));
)
EDIT forgot to ask the question
What datatype should be used for the location
and does the database look safe enough?
The location is by country and is selected by a drop-down menu.
Does the database look safe enough?
No. And it's going to take a bit to explain.
Let's start with all those limits. Over-enthusiastic use of column limits is a very common problem.
How those limits work is often misunderstood, for example you asked about security, or you might think you're saving space. The real problem is it hard codes unnecessary limits about how the rest of the software behaves into the database, and your limits are pretty stingy.
You asked whether this makes the table more secure. Column limits aren't really about security, though I guess they theoretically keep someone from filling up your disk, but that's not what the limits you have are about.
For example, you have passwords limited to 16 characters. That's not a decision the database should be making, but a security consideration. When you're looking into password security later on, you'll find that 16 character passwords are barely adequate. You want something more like 64 or 128, and that will require an expensive alter table
.
More importantly, you're storing passwords in the clear. This is a big security problem.
Then you've got email (presumably email addresses) set to 320 characters?! That's a hell of an email address! But users only get 16 for their names?
One common misunderstanding is that these limits reduce the amount of disk used. They don't. varchar
will only store as much as necessary for that row. age INT(3)
doesn't use less space than age INT
, it's a fixed size. Ok, it makes sure that you don't have any 12938 year olds on your dating site It doesn't even do that. It's just how many fields get displayed which is definitely not something that should be in your schema. MySQL does some strange things.
You could use an unsigned tinyint
which stores from 0 to 255 in 1 byte... but once you're worrying about individual bytes it gets silly. And this is all moot, don't store their age at all. Store their birthday. Because people get older.
Enough about the wrong way to use limits, what's the right way?
Limits are for enforcing data integrity (and technical limitations, see the comments). That's about it. You want to make a schema that's flexible enough to support whatever your application wants to become, while also ensuring that the data is what it says it is without having to constantly second guess it.
A better schema might look like this:
create table members(
id primary key auto_increment,
username varchar(64) unique,
password_hash varchar(128),
email varchar(64) unique,
mobile varchar(32),
birthday datetime,
location integer references(locations),
index(birthday)
)
Your table was missing a primary key and that's a big problem. Usernames can change, and you don't want everything that refers to that user to break if it does. Instead, use a simple auto incrementing integer. "But I won't let users change their names!" Yeah, remember what I said about hard coding limits on the software into the data model? How you build your schema has effects for years.
Originally I pulled all the limits off the fields and switched them to unlimited text
. None of them have a compelling reason for a limit. Unless you have a really good reason, just use text
or varchar
. Limits are handled in the data model, changeable by the programmers and designers. A text
or varchar
field will only use as much space as it needs.
...but @PaulSpiegel pointed out in the comments that MySQL's has limitations on how big a field it will index. I'm used to Postgres which has no such limits. So I switched them to varchar
and picked generous limits. Technical limitations can be a reason to use a limit.
username
(not user
because that can refer to the entire user, not just their name) and email
have been marked unique
. This is about data integrity, you don't want two people with the same username, and you want to make sure each account has a single point of contact (you might argue this is putting behavior in the database, and you might be right, but it's easier to remove a unique index than to add one).
Then we come to the glaring security problem. NEVER STORE PASSWORDS! Ever. Ever ever ever. Instead, store a hash of the password. If you don't know what I'm talking about, stop right now and read Salted Password Hashing - Doing it Right.
Instead of storing the age
, we store their birthday
as a datetime
. Storing a user's age isn't looking ahead, what happens next year? With their birthday you can calculate their age, and even give them birthday presents! By storing it as a datetime
you can do all sorts of date calculations on it using MySQL's confusing date and time functions.
And you asked about what to do with location. Location can mean a lot of things, and it can get pretty complicated. This isn't a decision you need to make right now, so it's best to make it so it can be expanded and changed later. Put it in its own table, and reference it with a foreign key. We'll come back to it later, which is the whole point.
Finally, indexes. There's no need to be so skimpy with your disk space that you're only storing the first 6 characters of a username! Looking people up by username is going to be very, very common, give it a full index. But we don't need one, declaring a column unique
gives it an index.
Indexes can improve query performance, but they can also eat disk space and slow down inserts. Rather than go crazy with the indexes up front, wait until you see what queries you're going to make and what the performance is like. The only explicit index I've put in is birthday
because I'm pretty sure a dating site is doing to want to limit by age.
The missing piece here is your data model. This is the code on top of your data, for example a Member class. It would handle all the things a member can do, including access to the database and what the limitations should be. The model is the only thing that touches the database, the rest of the code calls methods on the model. This lets the database change without worrying about affecting the entire project.
This is known as Model-View-Controller or MVC and it's the basic way data driven applications are coded these days. Ruby On Rails is a good example. Look into MVC.
Ok, location. We've made location
its own table. That makes it an abstract concept rather than some hard coded fields in the members
table.
Location can get really complicated really fast. So we'll keep it pretty simple. Start with some basic information that other users are going to want to know: who's nearby. The minimum is postal code and country, you can figure a lot out from that. You'll probably also want to keep the city and state, because that's another way people want to locate people.
create table locations (
id integer primary key,
city text,
province text,
country text,
postal_code text
);
Make a Location model to encapsulate and manage the locations data.
Now you can go into as much detail as you like with managing location data without messing up your members table. You can do what @PaulSpiegel suggested and make a table of all the countries and their names to reference in the rest of your code and to ensure they're using a real country (data integrity again). You can use their postal codes to get their city and province. You can store GPS data if they'll give it to you.
So, uhhh... if you're just starting out this is probably going to seem overwhelming. Data modelling is complicated. It's not necessarily hard, just there's a lot of moving parts and things to consider so your application isn't limited by your schema. You'll probably have to make a few applications with overly simplistic schemas to really understand.
Let's see if I can boil it down.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With