11th Jan

Oracle vs Microsoft SQL Server

Monday, January 11, 2016 - 13:06
0

Performance comparison in a real business application 

 

Why?

Last year we had a project in which we had to upgrade one of our client’s Clavis Security BackOffice system. One of the most important requirements was to reach a better performance in the next 5 years. The previous system used MSSQL and the new uses Oracle. We wanted to know how only the database engine influences the performance. There are plenty of performance tests on the internet between MSSQL and Oracle but unfortunately these are basically synthetic tests so we cannot get appropriate conclusions how our system works with these different database engine. That’s why we decided to make our own test with a real business application to compare the performance. This test is unique because it is not “sterile” at all and runs real business functionalities.

The blog doesn’t contains all details of the test only the most relevant aspects. If you’re interested please contact us for the full documentation (it is available only in Hungarian).

What?

Performance

Basically we would have liked to measure the performance of the database engines. The methodology was that we defined a test scenario (see it below!) and ran it with more and more thread number till one of the database reached its limit. The limit was specified by WebLogic timeout which was set up to 300 sec/database request. If the running of the test caused at least once timeout exception then the database engine reached the limit (other error was not acceptable - if caused we fixed the bug and ran the test again).

The measured thread numbers were the following:

“Loops” means how many times a thread had to run the test. With this setting we could influence the runtime of the full test.

Test scenario

The main concept of the collection of the test scenario was that it has to reflect a real client serving process. So the steps of the test are the following:

  • Login

  • Client searching

  • Client choosing

  • Account choosing

  • Cash-in transaction

    • Initializing

    • Processing

  • Investment unit - Buy transaction

    • Initializing

    • Security searching

    • Security choosing

    • Transaction processing

  • Investment unit - Sell transaction

    • Initializing

    • Security searching

    • Security choosing

    • Transaction processing

  • Checking the actual balances of all sub-accounts of the chosen account

  • Listings all balances of one of the sub-accounts

  • Listings all transactions on the transaction worksheet for the actual value date

  • Execution of the account balance report for the chosen account

  • Execution of the account history report for the chosen account

  • Logout

The lists and reports at the end of the list use data of the same tables that the transactions are working with and some of them can run several seconds. This is important first of all for the test of the MSSQL with RCI vs RCSI (see it below).

Isolation level settings in MSSQL

In the past there was only the simple read committed isolation level (RCI) in MSSQL till the version 2005. This method is very unuseful in parallel processes first of all because the engine locks the read records till the ends of the transaction. That’s why another session couldn’t work till the end of the locking process of the previous session. The newer read committed snapshot isolation level (RCSI) eliminates this problem so if this feature is switched on the parallel processes have to work much more faster.

In our tests we measure the performance with RCI and RCSI as well. Why we did that if the RCSI is definitely better? The answer is simple: we have several developments which were written before the introduction of MSSQL 2005 and sometimes the code uses the RCI “read-locking” feature so we cannot eliminate it so simple. But we would like to know how much advantage the RCSI has because if it is significant it’s worth to upgrade our older codes. (Meanwhile this upgrade procedure was begun partly based on results of these tests).

How?

Hardware and software environment

You can see in the following table the database and application server hardware and software environments.

Additional informations:

  • In fact the WebLogic runs on a virtualized hardware with the following differences compared to physical hardware:

    • Memory size: 12 GB

    • Operating system: Oracle Linux v6 Update 5

  • Additional softwares on application server:

    • JVM version: JRE 1.6.0 64 bit

    • Oracle driver class: oracle.jdbc.xa.client.OracleXADataSource

    • MSSQL driver class: com.microsoft.sqlserver.jdbc.SQLServerXADataSource

    • JavaScript engine: rhino1_6R1 (2004) - this is hardly used by Clavis because it is the main script language of the parametrization

  • We had to use our free hardwares and the configuration was specified on the basis of the previously runned tests. That’s why the application server hardware is 2 times stronger than the database server. We couldn’t reach the limits of the database servers otherwise.

 

Porting Clavis to MSSQL

Basically the new Clavis system uses the Oracle database engine, so we had to port the system to MSSQL as well in those objects which were used by the functionalities above. The system was written in JavaEE and the GUI is running as a browser thin client. In the porting procedure we tested the functionality and if we got an SQL exception we have tried to rewrite it to common SQL (e.g. coalesce instead of nvl). If it wasn’t possible we extended the SQL call to MSSQL compatible script but we took care that the functions work the same.

After the code worked without errors we tested the SQL calls of the test scenario in both case we optimized the scripts, the statistics and the server parameters according to the execution plans and resource usage. We took advantage of the special features of the database engines (e.g. index with included columns in MSSQL or BITMAP index in Oracle) but we avoided the usage of the hints and those features which are totally missing from the other engine (e.g. we didn’t set up the result cache under Oracle).

We developed detailed logging and reports to be able to check that all called SQL in the test is the same in both case (or there are only known differences).

 

Testing device

 

The test scenario was automatized in JMeter tool. After each call we checked whether there was any error or not. The test was successful if there were no errors. We ran the scenario parallel with different settings (thread number, rump-up time, etc.). While the Oracle test was running the MSSQL servers (database and application) were stopped and vice versa.

 

Database up-filling

 

Before we made the tests we filled up the database to a level of the biggest client database size. These data were consistent but randomly generated so it was near to a real business database. You can find the up-filling method in another blog post. The initial data of each table was the same in case of the 2 database engine. Another blog post will be available in this topic soon.

 

Analysis

 

After the different tests we analyzed the following statistics:

  • Run time of full test

  • Average response time

  • Maximal response time

  • Variance of response time

  • Relative variance of response time

  • Median of response time

  • Top decile of response time

  • Full response time graphs

  • Numbers of timeout exceptions

  • Database host processor, memory and disk monitoring

We made the conclusion based on this data.

Results

Oracle vs MSSQL with RCSI

Time statistics

Database HOSTs loading

The pictures show the loadings of the high level tests.

MSSQL:

Oracle:

MSSQL with RC - time statistics

Response time graph for the low level test:

 
 

Conclusions

Performance

Briefly summarized the winner is definitely the Oracle especially in the high level test. The MSSQL with RCI is not recommended in hardly multiple thread using systems. The disadvantages of the locking mechanism are definitely reflected even in the low level test: the response time of one request wasn’t adequate. The RCI reached its limit already in the middle level test that’s why we can’t measure this case with high and extreme level. The further part of this chapter is about only the Oracle and the MSSQL with RCSI.

The MSSQL with RCSI is quite good but the Oracle beat it. Both database engine reached its limit in the extreme level test but the Oracle reached it much later. The main problem of the MSSQL was the maximal response time in the high and extreme level tests. The main reason of these peaks were primary the too much hardware resource (especially the processor) usage. Very interesting the differences of the usage of the hardware in general as well; it is quite different: Oracle used 3 times more memory than the MSSQL and significantly less CPU. In this case it is recommended that the engine uses as much memory as it can. The conclusion is that the memory handling in Oracle is more efficient than in MSSQL that’s why delivered the latter worse results.

Other experiences

MSSQL:

  • The locking mechanism in MSSQL can potentially cause deadlocks. We experienced this In our test because of the following statement:

The problem is that in this case MSSQL chose page lock so it locked some records which were not modified. Fortunately we could avoid this problem (otherwise we couldn’t ensure the error-free operation) with rewriting the statement to the following:

After that MSSQL chose row lock so it locked only the 2 modified rows that’s why the deadlock was eliminated as well. There are several other cases in which MSSQL can cause deadlock unreasonably; you can soon read about this topic in a new blog spot.

 

  • The programming possibilities in MSSQL (Transact-SQL) are much poorer than in Oracle (PL/SQL). This is not such a real news because there are couple of articles on the internet which compare this 2 language but we had to port several procedures and other PL/SQL codes to MSSQL in the beginning of the test and we could see the Transact-SQL problematique close up.

 

Oracle:

  • If we have a query (e.g. in a function which returns a refcursor) with an argument and we would like to use this in 2 cases: once the argument gets a specified value once “all” (e.g. NULL) value and there is the following condition in the WHERE clause: tab.col in (arg, null). Oracle sometimes creates only one execution plan in both cases so it will use Full Table Scan which is good for “all” case but very bad in specific case. We couldn’t find a really good solution for this problem.

  • In some cases Oracle requires the histogram statistics otherwise the performance can unsatisfying. It is not a problem because the 11g gathers histograms automatically for the appropriate indexes if the server settings are good so it is highly recommended to check regularly that these statistics exist and are up-to-date.

 

 

 

Comments (0)
0