Illinois State University



Bhalchandra

ITK478

Position Paper: MySQL vs PostgreSQL for a Small Scale E-Commerce Business.

Database management systems are important for any business, in this paper I am trying to evaluate open source DBMS options MySQL and PostgreSQL from the point of view of a small E-Commerce business. Both these database are industrial strength databases and enjoy wide following. Craigslist, Yahoo, Wikipedia, Flickr use MySQL and Genetech, BASF, Skype use PostgreSQL[4]. Any E -commerce business irrespective of it's size needs to store data about 1) Products offered 2) Customer related information 3) Order status 4) Visitor browser behavior etc. also some times there is a need to embed a database in the software solutions that the company is offering.

The Business I am considering for the comparison is say .The primary activity of this business is to create animated children books and make these e-books available to schools/ public libraries/individuals for an annual fee. Schools / Public libraries in turn make this content available to their students/ patrons for free. These books are basically short picture books with a dash of animation and interactivity and are aimed at age group 4 to 8. Because of animation the books are engaging and are loved by kids. The site has a shopping cart application thru which it offers to sell the CDs of the animated stories with related interactive content . The company also offers the same books as reading material for schools to test kids ability to comprehend reading material (In the line of Scholastic Lexile testing) along with quizzes related to mathematics, the product records and manipulates scores on individual/class basis.

Individual users directly access the site after authentication (User ID and Password combination)and as far as students and library patrons are considered access is gained in the following way:

Typically a patron of a public library accesses the web site of the library, on the home page of the library an option like “kids zone” is available on clicking that option user is authenticated and then directed to the home page(English) of the site of . On the Home Page(English) there are options to to select other languages like French or Spanish, on selection of one of these options you are taken to that version of Home Page. On any version of Home Page there are options like Stories, Audio Books, Favorites, Language learning, Puzzles and Games,Index, Contact Us, Help ...etc. .

Let us take a look at the data storage needs of this site let us assume the site has about 150 animated books each swf file taking up about 5MB space. In addition to that there is about 800 MB worth of MP3 audio files and about 200 MB of Puzzles and games. The site services about 1000 public libraries(about 10% of public libraries) and let us assume that about 200 kids from each library access this site on once a month basis. The site identifies each user uniquely and is able to store and retrieve the users favorites( Favorites are stored in the database and no cookies are used) Let us assume that each user has 10 entries in his or her favorites that results in a favorites table with 500*10*1000 rows let us assume similar volume is there for school market .Let us assume that the site needs to store records of shopping cart related purchases for at least 1 year and there are details of about 10000 transactions in the database at any point of time.

To calculate the quantum of daily downloads let us assume 1 user downloads between 5 MB to 10 MB of data and there are 5000 users (about 1%) visiting the site daily, this leads us to the figure of approximately 25 to 50 GB per day.

The nature of the content is more static than dynamic, this is natural as we don't expect the story content to change depending on the user, also, the site does not store user related scores because of simple nature of the games . Site delivers stories in about 20 to 30 seconds for a DSL connection(768Kbps) but since the first page is available for reading in about 10 seconds user is happy with the speed. Games being small in size load in less than 10 seconds.

As far as the dynamic web pages are concerned Shopping Cart application is obviously dynamic in nature and only here the response is critical.

Let us assume that there is a huge potential for growth with almost 100,000 + public/private schools and 9000 + public libraries with almost 35% of checked out material being children related [7] let us assume that the site has its own infrastructure. We can imagine the system configuration for the web site is going to be multi tiered, for each layer there are many open source and proprietary options available keeping up in the spirit of open source we can imagine our e-commerce application has web server like Apache at the view layer, PHP for business logic layer and either PostgreSQL or MySQL as DBMS forming the data layer.

There are many articles available that compare MySQL and Postgre SQL only problem is many of them are no good as the content pertains to older versions and no longer applicable, their only utility is to glean criteria for comparison.

Let us look at some of the factors which are considered for comparison and consider relevance of these factors from our business's point of view. The versions we are considering for comparison are PostgreSQL8.1 and MySQL5.0

Licensing Policy : In case of MySQL if you are using it for an Open Source project then the database software can be used free of cost. However if you are an Independent Software Vendor or Value Added Reseller or an Original Equipment Manufacturer and if you want to use MySQL with your product then you will have to buy the commercial license. Even for the commercial license the cost is very low compared to Proprietary products like Oracle.

If you want to use MySQL as an embedded database in your software product then you have to pay some price per license or copy.

In case of Postgre SQL the licensing is very liberal whatever be your application open source or commercial the cost is zero. This factor is of considerable importance for us [8], [10].

Supported Operating Systems: There is not much distinction here, both DBMS support Windows, Unix and Linux some of the 64 bit operating systems are also supported by both [8], [11].

Programming Language and Interfaces you can use:

In case of PostgreSQL you can use C,C++, C#, Java, VB, PHP among others. Several different client interfaces are available. Among them are ODBC, JDBC, an embedded SQL interface in C.

In case of MySQL API available for C, PHP, Perl, C++, Python, Java, Ruby among others. Drivers are available for JDBC, ODBC[8], [11].

Database Upper Limits:

PostgreSQL[1] MySQL[11][6]

Maximum Database Size Unlimited Unlimited

Max. Table Size 32TB OS and DB Engine dependent

Max. Row Size 1.6TB 64535 bytes(MyISAM)

Max. Field Size 1 GB All fields add up to 64535

Max Rows per Table Unlimited Limited by table space(64TB)

Max. Columns per Table 250 – 1600 1000

Max. Indexes per Table Unlimited 64

Data Integrity:

In case of MySQL it was expected that the application program will ensure that inconsistent data will not enter in the database and as such nothing was done at database level to ensure consistency of data resulting in hardship for many, this changed in MySQL version 5.0 now server enforced data integrity thru strict SQL Mode however MySQL Client can over ride this strict SQL Mode so there is still a possibility of inconsistent data getting into database.

PostgreSQL has been consistent in enforcing data integrity at server level without allowing any loophole[1].

Database Engines :

In case of MySQL there are many options for database engines are available like MyISAM, InnoDB..etc each with its own peculiarities. Read performance of MyISAM is excellent and is instrumental in giving MySQL the aura of speed however it does not support foreign keys and ACID compliance is not possible .These issues are taken care of in InnoDB but it is relatively slow also To InnoDB was recently acquired by Oracle and future of its licensing is uncertain. MySQL people are in the process of developing their own database engine “Falcon”

Database Engine is integral part of PostgreSQL[1], [11].

SQL Support

PostgreSQL supports a subset of both SQL92 and SQL 99 standards and almost 150 of mandatory 164 features

Similarly MySQL. Also supports sub set of SQL-92 and 99 [8], [6].

Stored Procedures ,Triggers and Views

Postgre SQL supports all the above

My SQL started supporting stored procedures, triggers and views from version 5.0[8], [11].

Locking:

In case of multi user environment data integrity can be ensured by DBMS enforcing table , page or row level locking. Out of these locks row level lock offers maximum concurrent data access as only affected rows are locked and not the entire table or data page. This kind of granularity is least restrictive for the users

There is another approach called Multi Version Concurrency Control that ensures data integrity in case of long running read transaction without having to lock the read data.

Postgre SQL supports both row level locking and MVCC

My SQL: Here the InnoDB engine supports row level locking and MVCC. MyISAM engine supports table level locking [8], [11].

Transactional DDL:

In case of PostgreSQL any table modification irrespective of how large it is can be rolled back except for operations like create or destroy database or table space.

In MySQL this feature is not available for MyISAM, even in InnoDB any change is immediately committed[1].

Replication:

Replication is a process where in you keep two or more databases in sync , in part or as a whole. One database acts as a master and other act as slaves. Replication is supported by both MySQL and PostgreSQL [5].

Database backup

PostgreSQL provides command like pg_dump that creates a text file which has the entire content of database at that moment. Using this text file you can create database which matches exactly with its state at the time of dump.

In MySQL similar command mysqldump is available[5].

Speed :[5].

Here we can not easily compare the two DBMS s as there are many issues involved pertaining to tuning and optimizing with related hardware but according to Reuven Learner “From the benchmarks I have seen, it appears that MySQL is indeed faster than PostgreSQL when working with small number of clients or with read only data. However all the comparisons I have seen over the last few years indicate that as more clients are added to the system,PostgreSQL handles the load better”.

Ease of Administration:

This feature is again not easy to compare without getting your hands dirty, in absence of that if we go by what Reuven Learner has said “Both MySQL and PostgreSQL are amazingly easy to administer especially in small and medium size cases you (optionally) change a few configuration options, start the server and walk away”[5].

Support:

Popularity wise MySQL is far more popular than PostgreSQL as such here is far more support available in the form of books, tutorials, mailing lists etc for MySQL than PostgreSQL. Commercially also support options for MySQL are more than PostgreSQL[5].

Conclusion:

Though there are many more factors on which we can compare the DBM Systems I believe as far as the technical issues are concerned , from our application's point of view both the options are equally good . On issues like ease of administration, support MySQL is tad better than PostgreSQL, however if we are going to embed any DBMS in our application then the money that we will have to shell out for licensing will be an important determinant of our DBMS solution. MySQL people are very flexible with regard to price they can give you discount over the list price or can work out the price as percentage of your list price, in case of PostgreSQL the cost is zero however the concept of branding is applicable here also, MySQL being more known, there could be reluctance on our customer's part to use PostgreSQL as such I believe MySQL is better choice .

References:

1

2 MySQL Deserves a Double Take, Reuven Learner, Linux Journal Archive, Volume 2007, Issue 156 (April 2007) P.18

3 Open-source databases, Part II: PostgreSQL, Reuven Learner, Linux Journal archive, Volume 2007 , Issue 157 (May 2007) P.16

4 Surfing the Net for Software Engineering Notes, Mark Doernhoefer, ACM SIGSOFT Software Engineering Notes archive

Volume 31 , Issue 5 (September 2006) P6-15

5 Open-source databases,Part3;choosing a database, Reuven Learner, Linux Journal Archive,Volume 2007, Issue 158 (June 2007) P.17

6 Research Paper, Comparison of the Enterprise Functionalities of Open Source Database Management Systems, Fabalabs Software GmbH, Honauerst. 4 A-4020 Linz

7 U.S. Department of Education, Institute of Education Sciences Public Libraries in the United States: Fiscal Year 2004

Library Statistics Program

8.

9

10

11

| |

| |

| |

| | |

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download