I am working with a team of developers on a website. The website will be using classes. I am in charge of creating the data access layer for the classes. There is an understanding that all user input will be escaped upon retrieval (from the post or get). Having little control over the input level (unless I personally review everyone's code), I thought it would be cool to throw in escaping on my end as well (right before it hits the database). The problem is that I don't know how to use mysql_real_escape_string without adding even more slashes.
Since user input may very well contain a slash I can't check to make sure there are slashes in it. I might be able to check for all the things that need escaping and make sure they have a slash in front of them, but that doesn't seem like the best way to do it.
Any suggestions?
Have you considered not escaping the data until it hits the data access layer? I ask because there are some gotchas with the approach your team is taking:
If you need to display form data back to the user straight from the request, you have to de-escape the data (because ' is not special to HTML) and then re-escape the data (because < is special). If you need to display form data to the user pulled from the database, you mustn't do that de-escape step (because it was done by the database, when the data was saved), but still must do the HTML escape step. If you make a mistake and do the wrong procedure, you corrupt data or, worse, introduce security problems.
What does \' mean to your database? How should a ' or \ be escaped, if at all? If you ever change your database engine, or even change its settings, those rules may change. And then you have a bunch of escaping/de-escaping code to find. Missing a single escape/de-escape may lead to SQL injection.
There is another way: Let whichever layer needs the data escaped escape it itself. Data is always passed between layers in raw, unescaped form. So your data access layer does all database escaping. Your HTML output code does all HTML escaping. When you decide you want to generate PDFs, your PDF code does all PDF escaping.
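To make the "each layer escapes for itself" idea concrete, here is a minimal sketch of an HTML output layer. It is written in Python rather than the asker's PHP purely for illustration; `render_comment` and the sample strings are hypothetical names, not anything from the site in question.

```python
import html


def render_comment(raw_text: str) -> str:
    """HTML output layer: takes raw text and does its own HTML escaping."""
    return "<p>" + html.escape(raw_text) + "</p>"


# Raw value, exactly as the user typed it.
raw = "O'Reilly <b>rocks</b>"

# Hypothetical result of database-style escaping applied too early,
# at input time, before anyone knew where the value would end up.
pre_escaped = "O\\'Reilly <b>rocks</b>"

print(render_comment(raw))          # markup neutralised, no stray characters
print(render_comment(pre_escaped))  # a literal backslash leaks into the page
```

The renderer only behaves correctly when it is handed the raw value. Given the pre-escaped value, it would have to know that someone upstream added backslashes and strip them first, which is exactly the de-escape step described above.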
There is no reliable way to decide automatically whether to escape if you don't know whether the input has already been escaped. You can attempt to analyze it, but the analysis will never be dependable, and you will run into double backslash pairs and the like.
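A small sketch of why the analysis cannot work. `naive_sql_escape` is a hypothetical, simplified stand-in that handles only backslashes and single quotes; the real mysql_real_escape_string covers more characters. Python is used only for illustration.

```python
def naive_sql_escape(value: str) -> str:
    """Simplified backslash-style escaper (illustrative only, not a real API)."""
    return value.replace("\\", "\\\\").replace("'", "\\'")


# Escaping once is what the database layer wants.
once = naive_sql_escape("O'Reilly")

# Escaping a second time piles up backslashes.
twice = naive_sql_escape(once)

# A user who literally typed a backslash followed by a quote.
typed_literally = "O\\'Reilly"

print(once)                     # O\'Reilly
print(twice)                    # O\\\'Reilly
print(typed_literally == once)  # True
```

The last comparison is the core of the problem: an already-escaped `O'Reilly` and a raw value that genuinely contains a backslash are the same string, so no inspection of the value can tell them apart.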
Decide once that data sent to your access layer must be raw (unescaped), and handle the escaping in one place. If you do, the other developers will not have to worry about it (they probably don't want to anyway) and it will be much easier to move to another database in the future. It will also give you the freedom to move over to prepared statements at any time.
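As a rough sketch of what "one place" could look like, the class below accepts raw values only and lets the driver bind them through placeholders. It uses Python's sqlite3 module with an in-memory database as a stand-in for the asker's PHP/MySQL setup (where PDO or mysqli prepared statements play the same role); `UserStore` and the `users` table are hypothetical.

```python
import sqlite3
from typing import List


class UserStore:
    """Hypothetical data access layer: callers pass raw, unescaped values."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")

    def add(self, name: str) -> None:
        # The value is bound to the ? placeholder and never spliced
        # into the SQL text, so no manual escaping is needed.
        self.conn.execute("INSERT INTO users (name) VALUES (?)", (name,))

    def find(self, name: str) -> List[str]:
        rows = self.conn.execute(
            "SELECT name FROM users WHERE name = ?", (name,)
        )
        return [row[0] for row in rows]


store = UserStore(sqlite3.connect(":memory:"))
store.add("O'Reilly")
store.add("x'); DROP TABLE users; --")

print(store.find("O'Reilly"))
```

Because callers never escape anything, the value read back is identical to the value stored, and swapping the database engine only touches this one class.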
Edit: Forgot this:
"Having little control over the input level (unless I personally review everyone's code)"
I think it is worth the pain to have them discover it themselves if you just make it very clear that escaping is something that belongs to the database layer and should not be done elsewhere.