WA Performance Health Check Template



WA Performance Health Check

The WA Performance Health Check is designed to identify potential bottlenecks that may arise during a period of heavy WA usage. Load testing was conducted with JMeter to simulate a group of users running Search for Sections. Search for Sections is a resource-intensive process and therefore usually provides accurate data on how the Colleague server(s) and WA web server(s) will perform during heavy usage.

JMeter testing is not meant to predict how many students can register at one time. Although Datatel has designed the JMeter scripts to simulate actual WA users, there is no substitute for a live registration. The JMeter script also runs only Search for Sections and does not encompass all the WA processes that users may run during a period of heavy WA usage.

Configuration

Before any testing was conducted, the existing server, listener, and web server software settings were captured. These original settings and any changes made during the testing are listed below.

Production App Listener     Before testing      After testing
Max Concurrent Connections  100                 1000
JVM Arguments               <none>              -server -Xmx1024m
JVM Path                    D:\Java\jdk1.6.25   D:\Java\jdk1.6.25\jre (to take advantage of -server option)
Registry Pool Size          5
Connection Pool Size        20
Web Overflow Pool Size      3
UI Overflow Pool Size       <none>
DMI Client Sockets Max      <none>              490
Client Socket Timeout       <none>              600
Socket Recycle Interval     <none>              1000
Security Token Expires      600

Production DB Listener      Before testing      After testing
Max Concurrent Connections  100
JVM Path                    D:\Java\jdk1.6.25   D:\Java\jdk1.6.25\jre (to take advantage of -server option)
JVM Arguments               <none>              recommend -Xmx2048m

Production App Server       Before testing      After testing
Number of CPU cores         4
Speed of CPUs               3.2 GHz
Physical Memory             3.75 GB

Production DB Server        Before testing      After testing
Number of CPU cores         8
Speed of CPUs               2.7 GHz
Physical Memory             8 GB

WA Web Server               Before testing      After testing
Number of CPU cores         4
Speed of CPUs               2.8 GHz
Physical Memory             4 GB

ServletExec 5 ISAPI         Before testing                    After testing
JVM_OPTIONS                 4096 K minimum, 65536 K maximum   524288 K minimum, 1048576 K maximum; implemented workaround from doc 6088

JMeter testing

At rest

[Figures: App Server, DB Server, Web Server]

Baseline test

We ran 100 threads in 30 seconds through JMeter. Each thread consists of 21 requests that simulate a student, including menu requests and a Search for Sections of the 2011 term for all Music Private Instruction (MUPI) classes, which returned just over 200 sections. This totals 2,100 requests over the course of 30 seconds.

[Figures: App Server, DB Server, Web Server]

About one third of the way through, the test stopped. JMeter appeared to stop getting threads through, and the web server was responding extremely slowly. We ran an iisreset to restore web server functionality.

The laptop we were using seemed to be having trouble as well, so we moved JMeter to a more powerful machine. We then ran 75 threads in 30 seconds with the same criteria as above (totaling 1,575 requests).

[Figures: App Server, DB Server, Web Server, JMeter Results]

The web server was the clear bottleneck. Requests averaged 12.7 seconds each. The paging requests and initial web requests are all inflated because the web server was under such a heavy load.

JMeter Test 1

Made the following changes:

AppListener
- Max Concurrent Connections = 1000 (to prevent an arbitrary wait time to get into the Connection Pool, we set this number high enough that it won’t come into play)
- JVM Options: allocated 1 GB max heap size (to make sure it always has enough memory)

WSPM parameters
- DMI Client Sockets Max = 490 (the number of persistent sockets between the web server and the App Server)
- Client Socket Timeout = 600 (how many seconds the socket remains open before closing due to inactivity)
- Socket Recycle Interval = 1000 (how many transactions to process through a socket before recycling the socket)

ServletExec
- Implemented workaround from Doc 6088

Because the web server has plenty of free memory, I set the minimum heap size at 512 MB and the max heap at 1 GB.
I set the minimum at 512 MB so the JVM has that much from the beginning and only incurs the overhead of acquiring more memory if usage ever climbs above that.

[Figures: App Server, DB Server, Web Server, JMeter Results, SA Valet Statistics]

The throughput between the web server and App Server has increased significantly. The response time has improved dramatically as well: the average execution time is now 4.5 seconds.

The SA Valet statistics show that the number of licenses could be increased to reduce the average time each request waits to get a license. However, the App Server CPU is maxed out, so increasing the licenses would be of no help. Therefore, the bottleneck at this point is the App Server CPU.

We also checked WAGC, and it was set to run every 12 hours. We checked the WWW files, and WWW.STATE.LOCAL was 2 GB (though not damaged yet). We stopped WAGC and restarted it with Clear Files = Yes (to clear the WWW files). We also changed the JVM path on the App Listener to the JRE instead of the JDK so we could add “-server” to the JVM options. We ran the test again. The CPU usage on all the servers was similar, but the response times went down further.

[Figure: JMeter Results]

WAGC

The WAGC process runs in the background and periodically deletes expired records from the WWW files. It is very important that WAGC is always running. If it is stopped for an extended period of time, records will accumulate in the WWW files, which will adversely affect WA performance. WAGC can be bounced independently of any of the app listeners, but it will be halted if the Production_DB_Listener is bounced. WAGC should be restarted immediately after a bounce of the Production_DB_Listener.

A complete list of all the WWW files is as follows:
- WWW.TOKENS
- WWW.STATE
- WWW.STATE.FVAR
- WWW.STATE.LOCAL
- WWW.STATE.TX
- WWW.STATE.ARGS
- WWW.EVENT.SEQUENCE
- WWW.MENU.CACHE
- WWW.SCREEN.DEBUG
- WWW.QUERY.RESULTS
- WWW.QUERY.RSLT.SPEC
- WWW.QUERY.SAVELIST

These WWW files are used to store temporary data for all WA requests.
The original block size and modulo of each file determine the initial size of the file at the OS level. As WA requests are processed, more and more records are added to the WWW files, which can push one or more files into level 1 overflow. When overflow occurs, Unidata asks the operating system for more space to store the data in the file. Eventually WAGC will run and remove expired records from the WWW files. This reduction in records can take the WWW file(s) out of overflow. As more WA requests are processed, one or more of the WWW files can be pushed into overflow again. When this occurs, Unidata does not reuse the overflow space it obtained previously. Instead, it asks for more overflow space, and the size of the WWW file(s) grows further at the operating system level.

All of these files are static Unidata files. A static Unidata file has a size limit of 2 GB. If a static file reaches 2 GB, Unidata will no longer be able to write to the file. As the WWW files go in and out of overflow, the size of each file could eventually reach the 2 GB limit. The only way to reduce the files back down to their original size (given their block size and modulo) is to run a CLEAR.FILE command against them.

WAGC contains a Clear Files flag that runs CLEAR.FILE against all the WWW files and reduces them back down to their original size. The Clear Files flag will need to be used with WAGC occasionally to keep the size of the WWW files under control. How often the WWW files need to be cleared depends on the amount of WA traffic. During a heavy registration period, it may be necessary to clear the files each night.

When WAGC is started with the Clear Files flag set to Yes, all app listeners in the environment should be stopped.
Below are the proper steps for clearing the WWW files.
1. Stop all app listeners in the environment.
2. Stop WAGC if it is already running.
3. Start WAGC with the Clear Files flag set to Yes.
4. Start all the app listeners again.

When using the Clear Files flag, WAGC clears the WWW files only when it first starts up. It does not continue to clear the files every time WAGC executes.

If desired, WAGC can be set up to run in a Windows batch job and then scheduled to clear the WWW files nightly using Windows Scheduler. AnswerNet Document 5401 outlines the steps to call WAGC and start it up manually from within a batch job.

Licenses

It is important to understand license usage, as it can affect WA performance. If the app listener servicing WA requests tries to obtain additional licenses (due to heavy load) but no licenses are available, WA performance will suffer. Ensuring that the app listener servicing WA requests has access to all the licenses it needs is an important part of a successful registration.

There are a total of four pools of licenses that each app listener (app listeners include UI Web and UI 4 listeners) can draw licenses from:
- Registry Pool Size (located on the Properties form of each listener)
- Connection Pool Size (DMI Defaults)
- Web Overflow Pool Size (DMI Defaults)
- UI Overflow Pool Size (DMI Defaults)

The Registry Pool Size is unique to each listener, but the values of the three pools on DMI Defaults are used by all app listeners in that environment. For example, if the Connection Pool Size is 20, then each app listener in the environment can use up to 20 licenses from the Connection Pool.

App listeners only use licenses as they are needed. When an app listener starts, it automatically obtains one license. Additional licenses are only used if there is enough demand to warrant them.
There is no way to reserve licenses for an app listener or make the app listener obtain a certain number of licenses when it starts up.

The table below shows the predicted and maximum license usage for each app listener in production. The predicted license usage is just that: a prediction of the licenses each listener will use during a heavy WA load. It is entirely possible that more licenses could be used if there is heavy demand on that app listener.

App Listener              Predicted   Maximum
Production_APP_LISTENER   28          28
Production_APP_UI         5           28
Total                     33          56

The predicted license usage was calculated using the following:

Connection Pool Size + Web Overflow + UI Overflow + Registry Pool Size
20 + 3 + 0 + 5 = 28

C73 has a total of 74 licenses. Given the predicted license usage above, this leaves 41 licenses for UI users and other applications that need to connect to Colleague, like SA Valet. Again, these numbers are only predictions, and actual license usage by the app listeners can vary. If license usage becomes an issue during registration, consider shutting down listeners that are not needed, such as listeners from other environments. The ‘listuser’ command can be used to list all the licenses that are currently in use. If certain UI users show up in the list as using multiple licenses, consider asking those users to log out of their additional UI sessions.

Summary

The JMeter response times in the above tests are not intended to predict the actual response time for students during registration. The load generated by JMeter is manufactured by scripts and is intended to expose bottlenecks, not to predict true response times for a user.
- The baseline overall average for all 1,575 total requests was 12.7 seconds.
- After performance tuning, the average for all 1,575 requests is 3.5 seconds (72% improvement).

To further increase performance:
- The current bottleneck is the App Server CPU.
As you consider new hardware for your App Server, please focus on adding more cores to further distribute the load being placed on the App Server.
- During our planning meeting, Greg mentioned that occasionally the system ran out of licenses. One option to mitigate this problem is to reduce the size of the Connection Pool. I offer this as an option because the App Server CPU is already maxed out. Therefore, decreasing your Connection Pool may not reduce your performance, and may buy you a few extra licenses in the process (until you upgrade your App Server).

License Recommendations:

As a guide for how many licenses you’ll want during peak registration:
- Start with your maximum number of UI users. This does not include Portal/WebAdvisor users; it is only the users logging in to UI 4 / UI Web / UI Desktop.
- Add the total predicted licenses for your AppListeners as described in the previous section. Currently, this is 33.
- Add any phantom processes you expect to run during the day (when others are logged in). This includes the garbage collector, which runs as a phantom.
- Add another 5 for SA Valet, which uses licenses to connect to each environment.

Update your udtconfig file (located on your Colleague Server where Unidata is installed, usually under … /IBM/udxx/include). Tuning the following parameters (specifically the GLM settings) will help I/O performance on the WWW files, which live in Unidata. Once these parameters are changed, Unidata will need to be stopped and started for them to take effect.
- NUSERS should be your number of licenses * 1.25. For you, that would be 74 * 1.25 = 92.5, rounded up to 93, but if you will be adding 30 or so users soon, then leaving it at 128 should be fine.
- N_GLM_GLOBAL_BUCKET should be NUSERS * 3, rounded up to the next prime number. You currently have 101.
Please increase this to 128 * 3 = 384, rounded up to the next prime number, which is 389.
- N_GLM_SELF_BUCKET should be 53.

NOTE: Increasing these settings will allocate more memory for Unidata to use for its read locks, which should improve overall performance across Unidata. The Global Memory Structure acts similarly to a Unidata file: it needs to be sized appropriately or it runs the risk of going into overflow, thus degrading performance. Increasing the sizes of these parameters will prevent this overflow. The GLOBAL_BUCKET is a global setting, whereas the SELF_BUCKET parameter increases memory per process.

If you continue with ServletExec to host WebAdvisor, I would consider upgrading your ServletExec version from ServletExec 5 ISAPI to ServletExec 6 AS. As of February 2009, New Atlanta no longer recommends using ServletExec ISAPI with IIS 6 or higher. Please see: In addition, this means that our JVM options tuning in ServletExec will likely have no effect, because ServletExec is running within the IIS worker process (and is therefore subject to limitations imposed by IIS and its AppPool). If you use ServletExec AS, the Java process will run in its own isolated process, whose heap size can then be tuned accordingly.

Maintenance Recommendations:
- The Clear Files flag should be used on WAGC prior to the start of registration. Also, consider using the Clear Files flag on WAGC each night of registration. Even if the WWW files are not close to the 2 GB limit, smaller WWW files result in quicker WA requests.
- Monitor license usage during registration. Try to ensure that the listener servicing WA requests has access to all the licenses it needs. If no more licenses are available, consider stopping listeners that are not needed or asking UI users to log out of extra sessions they may have open.
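
The sizing arithmetic used in the license and udtconfig recommendations above can be rechecked whenever the license count changes. The following is only a rough sketch of those rules of thumb in Python. The function names are illustrative and are not part of Unidata, Colleague, or any Datatel tool, and "the next prime" is assumed here to mean the smallest prime greater than or equal to NUSERS * 3.

```python
import math


def is_prime(n: int) -> bool:
    """Return True if n is prime (trial division, fine for small values)."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True


def next_prime_at_or_above(n: int) -> int:
    """Smallest prime >= n (assumed reading of 'the next prime #')."""
    while not is_prime(n):
        n += 1
    return n


def recommended_nusers(total_licenses: int) -> int:
    """NUSERS rule of thumb: number of licenses * 1.25, rounded up."""
    return math.ceil(total_licenses * 1.25)


def recommended_global_bucket(nusers: int) -> int:
    """N_GLM_GLOBAL_BUCKET rule of thumb: NUSERS * 3, then the next prime."""
    return next_prime_at_or_above(nusers * 3)


def listener_max_licenses(connection_pool: int, web_overflow: int,
                          ui_overflow: int, registry_pool: int) -> int:
    """Maximum licenses one app listener can draw from the four pools."""
    return connection_pool + web_overflow + ui_overflow + registry_pool


if __name__ == "__main__":
    # Values taken from this report.
    print(recommended_nusers(74))              # 93
    print(recommended_global_bucket(128))      # 389
    print(listener_max_licenses(20, 3, 0, 5))  # 28
```

These helpers only reproduce the arithmetic. The resulting values still have to be entered in udtconfig by hand, and Unidata must be stopped and started for them to take effect.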