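1. Problem
A colleague migrated the EAS server from Windows to CentOS. Afterwards, starting the service from the management console hung: after 20 minutes the service still had not started.
I checked admin/logs/admin.log; the relevant content is as follows: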
Find java home: /kingdee/eas/ibm-jdk/jre
Set java home: /kingdee/eas/ibm-jdk
[2017-12-07 13:35:14,887 INFO]Execute base path [/kingdee/eas/ibm-jdk/bin] cmd [java, -Xmx4096m, -Xms2048m, -XX:MaxPermSize=768m, -XX:PermSize=768m, -version] to check enough JVMHeap.
[2017-12-07 13:35:15,122 INFO]Process exitValue: 0
[2017-12-07 13:35:15,123 INFO]linuxOSTmpPath is [/kingdee/eas/admin/workspace]
[2017-12-07 13:35:15,126 INFO]Begin execute command [df -k /kingdee/eas/admin/workspace | awk '{FS="[ \t]+"}{print $4;}' | sed '1d'] !
[2017-12-07 13:35:15,160 INFO]Execute command [df -k /kingdee/eas/admin/workspace | awk '{FS="[ \t]+"}{print $4;}' | sed '1d'] success !
[2017-12-07 13:35:15,161 INFO]TemporaryDirectory is [/kingdee/eas/admin/workspace] and freeTempSpace is [65463568KB].
[2017-12-07 13:35:15,162 INFO]Module [apusic4.0.3] isn't exists, try find module [apusic4]
[2017-12-07 13:35:15,163 INFO]Module [apusic4] isn't exists, try find module [apusic]
[2017-12-07 13:35:15,164 INFO]Get appengine[id:1872326553] by key[apusic:127.0.0.1/6888:admin:server1]
[2017-12-07 13:35:15,165 INFO][/kingdee/apusic] versio is not 403
[2017-12-07 13:35:15,168 INFO]Module [apusic4.0.3] isn't exists, try find module [apusic4]
[2017-12-07 13:35:15,168 INFO]Module [apusic4] isn't exists, try find module [apusic]
[2017-12-07 13:35:15,169 INFO]Get appengine[id:1872326553] by key[apusic:127.0.0.1/6888:admin:server1]
[2017-12-07 13:35:15,170 INFO][/kingdee/apusic] versio is not 403
[2017-12-07 13:35:15,172 INFO]Module [apusic4.0.3] isn't exists, try find module [apusic4]
[2017-12-07 13:35:15,172 INFO]Module [apusic4] isn't exists, try find module [apusic]
[2017-12-07 13:35:15,173 INFO]Get appengine[id:1872326553] by key[apusic:127.0.0.1/6888:admin:server1]
[2017-12-07 13:35:15,178 INFO]Start application server [/kingdee/eas/server/profiles/server1/bin] cmd[/bin/sh, -C, startserver.sh, nohup].....
[2017-12-07 13:35:15,216 ERROR]rm: cannot remove `/kingdee/apusic/common/jsf-api.jar': No such file or directory
[2017-12-07 13:35:15,216 INFO]isServer1 : server1
[2017-12-07 13:35:15,227 ERROR]rm: cannot remove `/kingdee/apusic/lib/operamasks-impl.jar': No such file or directory
[2017-12-07 13:35:15,258 INFO]Begin to make instance template dir ./workspace....
[2017-12-07 13:35:15,259 INFO]../workspace exists,remove then create it...
[2017-12-07 13:35:15,258 ERROR]rm: cannot remove `/kingdee/apusic/lib/ext/operamasks-third-party.jar': No such file or directory
[2017-12-07 13:35:15,259 INFO]End to make instance template dir ./workspace....
[2017-12-07 13:35:15,383 ERROR]JVMJ9GC032E System configuration does not support option '-Xlp'
[2017-12-07 13:35:15,383 ERROR]JVMJ9VM015W Initialization error for library j9gc23(2): Failed to initialize; unable to parse command line
[2017-12-07 13:35:15,385 ERROR]Could not create the Java virtual machine.
The log contains one key message: JVMJ9GC032E System configuration does not support option '-Xlp'
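In a longer log, messages like this can be surfaced by filtering for ERROR entries. The following is only a minimal sketch, assuming every entry carries its level as "ERROR]" right after the timestamp, as in the log above; the function name show_errors is hypothetical.

```shell
#!/bin/sh
# show_errors FILE: print only the ERROR-level entries of an admin.log-style file.
# Assumption: the level appears literally as " ERROR]" after the timestamp.
show_errors() {
    # -F treats the pattern as a fixed string, so "]" needs no escaping.
    grep -F ' ERROR]' "$1"
}

# Example: show_errors admin/logs/admin.log
```

The harmless "rm: cannot remove" entries show up as well, so the output still has to be read for the entry that actually stops the JVM.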
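My first guess was a JVM parameter problem. In the configuration file eas/server/profiles/server1/bin/set-server-env.sh, the JVM_CUSTOM_PARAMS setting did indeed specify -Xlp. After removing that option, the service started successfully.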
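The fix itself was a manual edit of the file. Purely as an illustration, the same removal could be sketched in shell, assuming JVM_CUSTOM_PARAMS is a whitespace-separated option string. The helper strip_xlp and the sample value are hypothetical, since the post does not show the real contents of the variable.

```shell
#!/bin/sh
# strip_xlp PARAMS: print PARAMS with every standalone -Xlp token removed.
# Assumption: options are separated by whitespace and contain no glob characters.
strip_xlp() {
    out=""
    for opt in $1; do
        [ "$opt" = "-Xlp" ] && continue
        out="${out:+$out }$opt"
    done
    printf '%s\n' "$out"
}

# Hypothetical sample value, built from the heap options seen in the log.
JVM_CUSTOM_PARAMS="-Xmx4096m -Xlp -Xms2048m"
JVM_CUSTOM_PARAMS=$(strip_xlp "$JVM_CUSTOM_PARAMS")
echo "$JVM_CUSTOM_PARAMS"
```

Only the exact token -Xlp is dropped; every other option is passed through unchanged and in its original order.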
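2. Analysis
Because the server runs on the IBM JDK, I looked up the description of the relevant JVM options in IBM's SDK and Runtime guide (http://lampwww.epfl.ch/java/jre-ibm-1.5/docs/zh_Hans/sdkandruntimeguide.lnx.zh_Hans.htm):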
-Xlp Requests the JVM to allocate the Java heap with large pages. If large pages are not available, the JVM will not start, displaying the error message GC: system configuration does not support option --> '-Xlp'. The JVM uses shmget() to allocate large pages for the heap. Large pages are supported by systems running Linux kernels v2.6 or higher, or earlier kernels where large page support has been backported by the distribution. By default, large pages are not used. See Configuring large page memory allocation.
You can enable large page support, on systems that support it, by starting Java with the -Xlp option.
Large page usage is primarily intended to provide performance improvements to applications that allocate a lot of memory and frequently access that memory. The large page performance improvements are mainly caused by the reduced number of misses in the Translation Lookaside Buffer (TLB). The TLB maps a larger virtual memory range and thus causes this improvement.
Large page support must be available in the kernel, and enabled, to allow Java to use large pages.
To configure large page memory allocation, first ensure that the running kernel supports large pages. Check that the file /proc/meminfo contains the following lines:
HugePages_Total:
HugePages_Free:
Hugepagesize:
The number of pages available and their sizes vary between distributions.
If large page support is not available in your kernel, these lines will not exist in the /proc/meminfo file. In this case, you must install a new kernel containing support for large pages.
If large page support is available, but not enabled, HugePages_Total will be 0. In this case, your administrator must enable large page support. Check your operating system manual for more instructions.
For the JVM to use large pages, your system must have an adequate number of contiguous large pages available. If large pages cannot be allocated, even when enough pages are available, possibly the large pages are not contiguous. Configuring the number of large pages at bootup will create them contiguously.
Large page allocations will only succeed if the JVM has root access. To use large pages, either run Java as root or set the suid bit of the Java executable.
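From this description it became clear that the EAS service could not start because large page support was not enabled in the operating system. I checked /proc/meminfo on the system, and it was indeed consistent with what the documentation describes.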
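The /proc/meminfo check described in the quoted documentation can be sketched as a small shell function. This is only an illustration of the rule stated there (no HugePages_Total line means the kernel lacks support, a value of 0 means supported but not enabled). The function name large_page_status and its three labels are hypothetical.

```shell
#!/bin/sh
# large_page_status FILE: classify large page support from a meminfo-style file.
# Prints one of: unsupported, disabled, enabled.
large_page_status() {
    # The second field of the HugePages_Total line is the number of reserved pages.
    total=$(awk '/^HugePages_Total:/ {print $2}' "$1")
    if [ -z "$total" ]; then
        echo "unsupported"   # line missing: kernel has no large page support
    elif [ "$total" -eq 0 ]; then
        echo "disabled"      # supported, but no large pages are reserved
    else
        echo "enabled"
    fi
}

# Example: large_page_status /proc/meminfo
```

A result of "enabled" only means that pages are reserved. According to the quoted text, the JVM additionally needs enough contiguous pages and root access before -Xlp can succeed.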