I work for a company and we are always accessing an external site for information. The site was developed by an antiquated software development company who doesn't even have a website. They pretty much have a monopoly in my state since the content provider for the database only uses this extremely dysfunctional site to upload their data. The problem with this website is that it is so slow it's not even functional.
Having controlled for things like Connection speed and Browser type, it is clear that the problem lies within the website itself. So, I am thinking about redoing the site and then offering it to the content provider as a means for uploading their data. Basically, this project requires a very large database to store hundreds of thousands of names, addresses, and other types of data.
My only experience with databases is MySql, and really my only experience with dynamic content is PHP. So, yeah, I'm trying to figure out if the old PHP + MySQL combination is suitable for storing and representing large amounts of data. I have only done this on small projects, but I think the whole HTML templates with placeholders for the dynamic content would work fine.
Of course, I truly don't know why this website is so slow. Maybe it's not the DB at all. Maybe it's the server or something else. But the key thing I am trying to accomplish is to improve upon the speed and functionality of this site. I have no experience with other types of databases, so any tips / advice you can offer for doing a project like this would be greatly appreciated. Also, any tips regarding how to generally make a fast and functional site that would need to represent dynamic data from an extremely large database would also be helpful.
*Edit: I am learning python so if you think this would be a better side-scripting language then I can certainly try to implement something different than the initial plan above.
You are using a MyISAM table and the space required for the table exceeds what is permitted by the internal pointer size. MyISAM permits data and index files to grow up to 256TB by default, but this limit can be changed up to the maximum permissible size of 65,536TB (2567 − 1 bytes).
Row Size Limit Examples The MySQL maximum row size limit of 65,535 bytes is demonstrated in the following InnoDB and MyISAM examples. The limit is enforced regardless of storage engine, even though the storage engine may be capable of supporting larger rows.
There are a few points to bring up with your question.
The first item that I notice is that you mentioned storing "hundreds of thousands" of rows. This isn't very big. My 386 server with 512M ram can handle that many rows with only a small delay. If you engineer your database correctly, then your real issue is going to be about the bandwidth (requests per second) that are going to be coming to your server.
MySQL can scale, but it is not great at it. Facebook does use PHP/MySQL for some of their services, but for the faster and more bandwidth intensive tasks they use hadoop. They also use PHP, but for their faster applications they use hip-hop to convert it to c++ and compile it.
MySQL can scale, but if you don't configure it correctly then it will fail miserably when the tables get too large. PostgreSQL scales better out of the box, but either does fine if configured correctly.
It looks like your application is bigger than you are used to, but really isn't "that" big. I would worry more about response time than database scaling in your particular situation.
If you do a good design, you can have very large databases in MySQL (this other question may help you). Define properly the keys and indexes, optimize your queries (EXPLAIN
is your friend here). Select a good database engine.
There are a lot of things to do to get the best from MySQL.
EDIT: some more ideas...
It's also quite important the way you structure your data and the tables to make easy to write them, recover them or finding a compromise between both (depending on the use case).
Also if it grows you can use a cluster, partition your data between several MySQL databases in several machines (using a Master-slave schema for instance) it's another option.
To improve the performance you should also think on using some kind of cache for repetitive queries.
There is also tools like Vitess that help to scale a MySQL database in a similar way as the NoSQL databases.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With