Keep the logic in the DB..
William (188 points) | Tue, 2005-09-20 12:02This tip is a quick reminder for those people who do a bit of PHP/MySQL development. This example is from a project I am currently working on.
I have a table of author's profiles that uses a field called 'auth_name'. This is a varchar(255) type field.

On the page about authors I do:
SELECT authorid, auth_name
FROM authorp
ORDER BY auth_name
and populate a HTML SELECT box with an array of results.
The client suddenly decided last week she would like the names sorted by surname rather than first name. This is potentially tricky because I only store the fullname as a single string.
While it may be possible to fix this problem by sorting the array within PHP, using a combination of string and array functions, it would be much more sensible to do this within MySQL.
1 - It is probably easier to do the sorting, while preserving the index.
2 - Performance. If you do the work in MySQL, the results will be cached for the next page view. If you do the work in PHP, you will need to pull out the results array, and then use more memory sorting the array _every_ time the page is viewed.
What does the winning query look like:
SELECT authorid, auth_name,
SUBSTRING(auth_name,LOCATE(' ',auth_name)) AS surname
FROM authorp
ORDER BY surname
So long as you use EXACTLY the same query (spaces, newlines etc) the result will come from the cache.
How does the extra code work ?
I have assumed that author names only have one forename, however they may have two surnames.
eg: F Scott Fitzgerald will be sorted on the 'S'.
The extra bit of SQL:
SUBSTRING(auth_name,LOCATE(' ',auth_name))
works by looking at the 'auth_name' field, and LOCATEs the first space.
In the above example of F Scott Fitzgerald, the result would be 2.
The 'auth_name' is then put into the SUBSTRING command, along with the parameter 2.
This will return a subsection of 'auth_name', starting at position 2, eg 'Scott Fitzgerald'.
This value is stored in a variable called 'surname'.
Finally at the end I use ORDER BY surname, to do the sort. This result is stored in the MySQL cache, ready for the next person.
Since I have not actually changed any of the fields returned by MySQL, (authorid and auth_name), I do not have to change any of my PHP code.
The speed impact will obviously be negligible for small data sets, however the more data you have and the more frequently it is being viewed the bigger the impact.
Will
PS For a bonus point, what does the F in F Scott Fitzgerald stand for ?
Commenting on this Blog entry is closed.
Looks like a good tip to me but what happens if an author has a double surname?
I guess it would be hyphenated so maybe no problem. I guess the rule of thumb is to always capture forenames and surnames in separate cells?
I believe the F stands for Francis?
----------------------
http://mijlee.com
----------------------
The LOCATE function should return the position of the first space, so it will sort on the first letter of the first half of the surname. Like I mentioned in the example F Scott Fitzgerald would be sorted by the S in Scott.
Correct on the Francis. His full name is actually, Francis Scott Key Fitzgerald
--
http://www.macscan.net/
Plus DB's are optimised to do such things, they will more happily chug through a few thousand records than our old friend php.
I would have probably reversed the string and then looked for the first space, just incase my posh friends with 10 middle names decide to sign up. I would get the last word they put in their name, which i hope could be taken as a surname.
This was obviously a bad planning job, but why use the solution that will potentially give bad data? In this particular case, it wouldn't be too much work to rename auth_name to auth_firstname, create and auth_lastname row and set up the forms to work accordingly. I'm not saying I would do that for free, but if I could convince the client that it'd be less work in the future, I'm sure they'd pay for the time. If they wouldn't pay for it, I'd probably do it with SUBSTRING and tell them "I told you so" when they get bad data. :)
The initial design of the page did not use a selection box at all. The auhor's name was merely there for display purposes. We were using ISBN numbers to tie all the data up.
This method works for now, and only took 5 mins to develop.
As we near the end of the development it is getting far to hectic to make such changes.
Instead I am blogging about it here....
--
http://www.macscan.net/
yes of course it would be more powerful to seperate to auth_firstname and auth_lastname, but you can use the substring method for anything else, so treat this as an example. :)
--
dp
...might be Jr. or III :)
Similarly, I was looking at a book just today whose author is named John Twelve Hawks, where both of the last two words are the last name.
Nice to see a few PHP/MySQL tips here!
I would happily write some more PHP/MySQL articles.
If anyone has any requests for articles or problems to be solved, please let me know.
--
http://www.macscan.net/
Yes, it's great seeing MySQL and PHP tips on creative bits. I am struggling on grasping the concept of Joins in MySQL. I am more of a visual learner so it's hard for me to read about it and not see how it's working. I know its a big subject and you don't have to go really indepth. Just any tips that helped you understand Joins would be much appreciated!
BENLEIVIAN.com
yeah I been having issues with learning how to implement joins as well.
Think I am getting it though so maybe I'll post up a basic example at some point. or maybe Will will beat me to it ;)
What sort of joins are talking about specifically?
----------------------
http://mijlee.com
----------------------
I hate to say it but joins in MySQL don't work as they should in some cases. Especially when they get slightly more complex.
I work with a "real" rdbms (Teradata) and when trying to do similar, ANSI standard type things in MySQL it either blows up, or doesn't work as it should. So as long as your joins are fairly simple, you should be ok. As soon as it becomes more complex, hold on for dear life :)
If anyone would like some SQL help, let me know, i'm more than willing to help out...
Btw, i think the first step to building quality database driven applications, is designing the database correctly. If that's done properly, the rest is a breeze...
Hey Rob
I worked with Teradata too, but this is an unfair comparison.
Teradata is meant for multi-gigabyte datawarehousing where applications are decision support where the work load is ad hoc queries grovelling through billions of rows joining many tables. Speed is not such a big factor here, since this is analysis type of work, and not transactional.
MySQL is meant for medium sized databases, where speed is the main criterion. The workload is mainly many concurrent users, e.g. web sites.
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com
--
Drupal development and customization: 2bits.com
Personal: Baheyeldin.com
did you have a look at w3schools.com/sql ? it also explains the joins quite well!
I would definitely want to see more php/sql articles at creativebits, especially sql cause good sql articles are so hard to find!
I too am delighted to see php/MySQL on this site, though i never expected it.
Keep it coming.
Nonetheless it is also a good example of how oversimplifing the beginning can potentially lead to a small (in this case) problem later. Author_Fname, Author_Mname, and Author_Lname would clearly be better, especially if one were to resort anyway.
Who knows what the client will ask for later. Not them.
I know I'll be putting a few DBAs off-side by saying this, but my professional feeling is that this is a bad idea. In my opinion, it adds another layer of code that you need to maintain to any programming scenario. Perhaps it's the amount of time I've spent with Ruby on Rails lately, but my opinion is that databases are for storing your data - leave the logic to whatever language/framework you choose to program with.
I'd rather be able to switch my data-storage solution fairly quickly and easily, without having to re-write triggers and other business logic that someone has chosen to implement in their database rather than their application code.
My 2 cents, anyhow...
This is a valid point, along with the whole issue of database abstractions layers...
Personally I do not use a DB abstraction layer and believe in writing my app for one type of database. That way I can make it take advantage of whatever DB system I am running on. In 6 years of writing DB driven web apps I have never had to move an app to a different database. People like Jeremy Zawodny, MySQL guru for Yahoo feel that DB abstractions layers must die. Rasmus Lerdof agreed in a PHP talk that DB abstraction were pretty stupid.
MySQL is a very fast database, but it is not the be all and end all of databases. A developer should use the correct tool for the job.
This can also mean looking at what DB is on offer at the host. The vast majority of PHP apps are written with MySQL in mind. If a client wants his website on a £10/month host then you will probably have to use MySQL.
MySQL has less features than most other DBs. This also means it is easier to learn what it can do, as well easier to setup and administer.
--
http://www.macscan.net/
DB logic does indeed add another layer of code, but that's one of the points about using a database rather than file storage. If you take your argumentation to the extreme, "SELECT * FROM tbl" would be the only select query you need, an operation which could be done much faster by simply reading a CSV or binary file.
Limiting the data set is what databases are designed to do, and doing so in the database itself rather than the client code both takes advantage of the database's optimizations and limits traffic—both disk reads and network transfers.
If you organize the SQL queries properly in one or more separate files like a program, it's also much easier in the long run to maintain the code and port the client program to another language or the SQL code to another database system.
Kjetil Valen
kjetil . valen (a) skir . no
Leo Valen
leo (a) code.coop
jr or III ... haha got me there..
I really don't try to be a pessimist all the time, I promise!
I think the whole point here is that it's much better to use logic in MySQL to trim out all the unneeded data, than to do it in PHP.
It's sad, but most PHP developers I have met know very little SQL, and as such, end up writing bulky PHP to extract info from a MySQL result, when a few JOINS and some basic SQL logic would have been all you needed in the first place.
I say we encourage this type of thinking :)
Nate Cavanaugh
http://alterform.com
http://shift22.com