Plan for installing Ingres 2006 Release 2 SP2 on sockeye
From PTAGISWiki
Background
In response to a tagging file that crashes Ingres in production whenever we try to load it, we are bringing or Ingres installation up to the latest patch level in the hopes that the crash will no longer occur.
Overall plan
- install Ingres 2006 Release 2 SP2
- category selection is Product: Ingres Database > Release: 2006 > Platform: Sun Solaris SPARC
- downloaded filename is ingres2006-9.1.2-100-com-sun-solaris-sparc-32-64bit.tar
- md5sum: 9b85f3c4d4245fdd7dc341a76b8683f6 ingres2006-9.1.2-100-com-sun-solaris-sparc-32-64bit.tar
- install patch 13548
- perform installation on sockeye, pitblade, blueback, bay, jack
- maintain documentation on sockeye:/home/ptagdev/ptagis3/dba_chores/InstallIngres/Install_Ingres2006_SP2 as copied from upgrade_to_Ingres2006
Email from Ingres tech support
Here is the relevant message from Ingres tech support on issue 139541:
You don't need to install the SP1 first, you can go straight to the SP2 and then install the patch 13548 just after. Yes, you have a point, the service pack that you need is Ingres 2006 Release 2 SP2. It is always strongly recommended that you install it first on a test environment before going through the production one. The main difference of patch install and service pack install are: 1) The service pack works as an upgrade installation, it means a new version installed over a previous existent installation preserving all data and server configuration (CBF). 2) There is no formal uninstall procedure on service pack installation. If needed, an uninstall must be done by an OS copy of the II_SYSTEM directory without DATA, WORK, JNL, DMP, CKP and LOG (logfile) locations. Installing a patch: 1) Have a ckpdb of all your database. If possible an offline one would be better. 2) ingstop 3) If you want you can have an OS copy at this point 4) Install the patch 5) ingstart 6) Do tests 7) If any problem, you can uninstall the patch by following the guide lines into patch.doc Installing the Service Pack: 1) Have a ckpdb of all your database. If possible an offline one would be better. 2) ingstop 3) Have an OS copy of II_SYSTEM directory without DATA, JNL, DMP, WORK, CKP, LOG locations. If you want you can also have an OS copy of these locations separated. 4) Install the Service pack over your existent installation. 5) ingstart 6) Do tests 7) If any problem, you can uninstall the Service pack by coping back the files copied before on (3)
Documentation included in SP2
Documentation in patch 13585
Installation script/history
Development detailed plan
database checkpoint of sockeye
- root crontab runs /usr/ingres/run_snapckp.sh snapckp
- found that it was last run on Aug 29, 2009
- activated it for running on sockeye Oct 20, 2009
filesystem backup of II_SYSTEM on sockeye
filesystem backup of II_SYSTEM on pitblade
install SP2 on sockeye
- SP2 was installed and a test interrogation file was loaded
- I copied a file from loaded/interrogation/TST/2007 to interrogation/hold
- set up p3_daily_3.sh to run in pittag's crontab
/usr/pit/pittag/bin/p3_daily_3.sh: STARTED @ Mon Oct 26 09:00:01 PST 2009 idl_dea_noplot.sh: STARTED @ Mon Oct 26 09:00:02 PST 2009 /usr/pit/pittag/bin/idl_dea_noplot.sh: 26/10/09 09:00 495 tags or test tags in 2460 records to load.. idl_dea_noplot.sh: Running in SERIAL mode.. fdc_maintainCountSRRT.pl: STARTED @ Mon Oct 26 09:00:35 PST 2009 fdc_maintainCountSRRT.pl: FINISHED @ Mon Oct 26 09:00:36 PST 2009 /usr/pit/pittag/bin/idl_dea_noplot.sh: FINISHED @ Mon Oct 26 09:00:36 PST 2009 /usr/pit/pittag/bin/p3_daily_3.sh: FINISHED @ Mon Oct 26 09:00:36 PST 2009
- Then ran an interrogation summary report on the load
cd /home/ptagdev/ptagis3/source/reports report -upittag ptagis3 int_load_summary -fryantest.out cat ryantest.out
PACIFIC STATES MARINE FISHERIES COMMISSION
PIT Tag Database
INTERROGATION DATA LOADER PERFORMANCE SUMMARY
FOR INTERROGATION DATA FILES UPLOADED 10/26/09
-0- The loader ran in SERIAL mode -0-
+----+-+--------------------------+---------------------+----------+
| S |A| Number of Rows | Time | Average |
| i |B| Inserted | | Rate |
| t |/+-------+--------+---------+----------+----------+ |
| e |S| Obs | Test | Fail | Start | End | Rows/Min |
+----+-+-------+--------+---------+----------+----------+----------+
TST S 57 9 0 10:00:17 10:00:29 330
+------+-------+--------+---------+---------------------+
| Sum | 57| 9 | 0 | Elapsed |
+------+-------+--------+---------+---------------------+==========+
| Overall: 66 rows | 0.2 min | 330 |
+---------------------------------+---------------------+==========+
Tested connectivity from pitblade via netutil and found that I had to reset the installation password on pitblade even though I set the password to the same value.
install patch on sockeye
- step by step actions taken to install patch 13585 on sockeye
start ingres on sockeye
install SP2 on pitblade
install patch on pitblade
start ingres on pitblade
Production detailed plan
database checkpoint of blueback
filesystem backup of II_SYSTEM on blueback
filesystem backup of II_SYSTEM on bay and jack
install SP2 on blueback
install patch on blueback
start ingres on blueback
install SP2 on bay
install patch on bay
start ingres on bay
install SP2 on jack
install patch on jack
start ingres on jack
Troubleshooting
The service pack and patch are installed on blueback and now jack is unable to connect. It returns this error when it attempts to run the query-runner.pl:
jack:D2:root: > ./query-runner.pl -u rday -r blah
DBI connect('blueback::ptagis3;-Gptagis_rpt','',...) failed: E_LQ0001 Failed to connect to DBMS session.
E_LC0001 GCA protocol service (GCA_REQUEST) failure.
Internal service status E_GC0142 -- The installation password for the
remote vnode is invalid.. at ./query-runner.pl line 72
Deep recursion on subroutine "main::catch" at ./query-runner.pl line 232, <DEF> line 41.
Deep recursion on subroutine "main::log_end" at ./query-runner.pl line 282, <DEF> line 41.
^C
However, testing the connection through netutil yields a success from jack to blueback.
Looking over the settings in netutil on bay and jack, they are identical. bay is able to run queries without difficulty using the same perl script.
I tried ingstop and ingstart on jack, but that didn't help anything.
I tried using netutil to set the installation password again, but that didn't help.
I found the user ingres on jack can connect without difficulty, but root cannot. Currently query-builder.pl is run as root. User rday can connect from jack to blueback just fine.
I tried running "mkvalidpw" as root on jack. It successfully reinstalled the binary. I then did an ingstop and ingstart on jack. The situation was unchanged after this: root cannot connect, but other users can.
checking environment
If I run on bay with the absolute minimum environment, it seems to work.
bay:D2:rday: > su - Password: Sun Microsystems Inc. SunOS 5.10 Generic January 2005 # cd /usr/local/pitweb/ptagis-1.0/bin # ./query-runner.pl -u rday -r blah # setenv setenv: not found # env HOME=/ HZ= LANG=en_US.UTF-8 LOGNAME=root MAIL=/var/mail/root PATH=/usr/sbin:/usr/bin SHELL=/sbin/sh TERM=xterm TZ=US/Pacific
On jack, the experience is different:
jack:D2:rday: > su -
Password:
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
You have mail.
# cd /usr/local/pitweb/ptagis-1.0/bin
# ./query-runner.pl -u rday -r blah
DBI connect('blueback::ptagis3;-Gptagis_rpt','',...) failed: E_LQ0001 Failed to connect to DBMS session.
E_LC0001 GCA protocol service (GCA_REQUEST) failure.
Internal service status E_GC0142 -- The installation password for the
remote vnode is invalid.. at ./query-runner.pl line 72
Deep recursion on subroutine "main::catch" at ./query-runner.pl line 232, <DEF> line 28.
Deep recursion on subroutine "main::log_end" at ./query-runner.pl line 282, <DEF> line 28.
^C# env
HOME=/
HZ=
LANG=en_US.UTF-8
LOGNAME=root
MAIL=/var/mail/root
PATH=/usr/sbin:/usr/bin
SHELL=/sbin/sh
TERM=xterm
TZ=US/Pacific
I made a copy of query-runner.pl that displays the full perl environment before trying to connect to the database. Here is the run on bay:
bay:D2:root: > ./query-runner-test.pl -u rday -r blah ENVIRONMENT key: BAY_DBA_NAME value: pittag key: BAY_NODE_DB value: blueback::ptagis3 key: BLUEBACK_DBA_NAME value: pittag key: BLUEBACK_NODE_DB value: ptagis3 key: CEW_OUTPUT_ROOT value: /home/pittag/out/cewOUT key: CEW_SRC value: /home/pittag/bin/CEW key: CLASSPATH value: /usr/ingres/ing2006r2/ingD2/ingres/lib/iijbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/iijbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/iijbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar: key: EDITOR value: vi key: HOME value: /home/rday key: II_SYSTEM value: /usr/ingres/ing2006r2/ingD2 key: ING_SET value: set lockmode session where readlock=nolock key: JACK_DBA_NAME value: pittag key: JACK_NODE_DB value: blueback::ptagis3 key: LANG value: en_US.UTF-8 key: LD_LIBRARY_PATH value: /lib:/usr/ccs/lib:/usr/ucblib:/usr/ingres/ing2006r2/ingD2/ingres/lib:/opt/SUNWspro:/usr/openwin/lib:/usr/dt/lib:/usr/ingres/SUNWspro key: LOGNAME value: root key: MAIL value: /var/mail//rday key: OLDPWD value: /usr/local/pitweb/ptagis-1.0/bin key: PATH value: /usr/ingres/ing2006r2/ingD2/ingres/bin:/usr/ingres/ing2006r2/ingD2/ingres/utility:/usr/ingres/ing2006r2/ingD2/ingres/files:/usr/ingres/ing2006r2/ingD2/ingres/lib:/usr/ingres/ing2006r2/ingD2/ingres/SUNWspro/bin:/usr/bin:/usr/local/bin:/usr/local/mysql/bin:/usr/ccs/bin:/usr/sbin:/usr/local/bin:/usr/ingres/ing2006r2/ingD2/bin:/usr/ingres/ing2006r2/ingD2/utility key: PITBLADE_DBA_NAME value: pittag key: PITBLADE_NODE_DB value: sockeye::ptagis3 key: PROMPT_COMMAND value: echo -ne "\033]0;"; uname -n; echo -ne "\007" key: PWD value: /home/pittag/bin/CEW key: SEBASTES_DBA_NAME value: pittag key: SEBASTES_NODE_DB value: blueback::ptagis3 key: SHELL value: /bin/bash key: SHLVL value: 2 key: SOCKEYE_DBA_NAME value: pittag key: SOCKEYE_NODE_DB value: ptagis3 key: SSH_CLIENT value: 10.1.2.59 42618 22 key: SSH_CONNECTION value: 10.1.2.59 42618 172.16.100.41 22 key: SSH_TTY value: /dev/pts/5 key: SUDO_COMMAND value: /bin/bash key: SUDO_GID value: 30 key: SUDO_UID value: 118 key: SUDO_USER value: rday key: TERM value: xterm key: TERM_INGRES value: vt100fx key: TZ value: US/Pacific key: USER value: root key: _ value: /bin/env
And here is the run on jack:
jack:D2:root: > ./query-runner-test.pl -u rday -r blah
ENVIRONMENT
key: BAY_DBA_NAME value: pittag
key: BAY_NODE_DB value: blueback::ptagis3
key: BLUEBACK_DBA_NAME value: pittag
key: BLUEBACK_NODE_DB value: ptagis3
key: CEW_OUTPUT_ROOT value: /home/pittag/out/cewOUT
key: CEW_SRC value: /home/pittag/bin/CEW
key: CLASSPATH value: /usr/ingres/ing2006r2/ingD2/ingres/lib/iijdbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/iijdbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/iijdbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/iijdbc.jar:/usr/ingres/ing2006r2/ingD2/ingres/lib/edbc.jar:
key: EDITOR value: vi
key: HOME value: /home/rday
key: II_SYSTEM value: /usr/ingres/ing2006r2/ingD2
key: ING_SET value: set lockmode session where readlock=nolock
key: JACK_DBA_NAME value: pittag
key: JACK_NODE_DB value: blueback::ptagis3
key: LANG value: en_US.UTF-8
key: LD_LIBRARY_PATH value: /lib:/usr/ccs/lib:/usr/ucblib:/usr/ingres/ing2006r2/ingD2/ingres/lib:/opt/SUNWspro:/usr/openwin/lib:/usr/dt/lib:/usr/ingres/SUNWspro
key: LOGNAME value: root
key: MAIL value: /var/mail//rday
key: PATH value: /usr/ingres/ing2006r2/ingD2/ingres/bin:/usr/ingres/ing2006r2/ingD2/ingres/utility:/usr/ingres/ing2006r2/ingD2/ingres/files:/usr/ingres/ing2006r2/ingD2/ingres/lib:/usr/ingres/ing2006r2/ingD2/ingres/SUNWspro/bin:/usr/bin:/usr/local/bin:/usr/local/mysql/bin:/usr/ccs/bin:/usr/sbin:/usr/local/bin:/usr/ingres/ing2006r2/ingD2/bin:/usr/ingres/ing2006r2/ingD2/utility
key: PITBLADE_DBA_NAME value: pittag
key: PITBLADE_NODE_DB value: sockeye::ptagis3
key: PROMPT_COMMAND value: echo -ne "\033]0;"; uname -n; echo -ne "\007"
key: PWD value: /home/pittag/bin/CEW
key: SEBASTES_DBA_NAME value: pittag
key: SEBASTES_NODE_DB value: blueback::ptagis3
key: SHELL value: /bin/bash
key: SHLVL value: 2
key: SOCKEYE_DBA_NAME value: pittag
key: SOCKEYE_NODE_DB value: ptagis3
key: SSH_CLIENT value: 10.1.2.59 54410 22
key: SSH_CONNECTION value: 10.1.2.59 54410 172.16.100.42 22
key: SSH_TTY value: /dev/pts/1
key: SUDO_COMMAND value: /bin/bash
key: SUDO_GID value: 30
key: SUDO_UID value: 118
key: SUDO_USER value: rday
key: TERM value: xterm
key: TERM_INGRES value: vt100fx
key: TZ value: US/Pacific
key: USER value: root
key: _ value: /bin/env
DBI connect('blueback::ptagis3;-Gptagis_rpt','',...) failed: E_LQ0001 Failed to connect to DBMS session.
E_LC0001 GCA protocol service (GCA_REQUEST) failure.
Internal service status E_GC0142 -- The installation password for the
remote vnode is invalid.. at ./query-runner-test.pl line 76
Deep recursion on subroutine "main::catch" at ./query-runner-test.pl line 236, <DEF> line 40.
Deep recursion on subroutine "main::log_end" at ./query-runner-test.pl line 286, <DEF> line 40.
^C
classpath defines iijdbc.jar incorrectly on bay
A diff of those reveals that there is a difference in the classpath variable. Bay seems to have a typo in the path to the ingres DAS driver class and so is probably still using the EDBC driver. Jack doesn't have that typo and is probably using the IIJDBC driver.
The bay classpath includes this incorrect path:
/usr/ingres/ing2006r2/ingD2/ingres/lib/iijbc.jar
I need to track down where that is being defined. The correct path is:
/usr/ingres/ing2006r2/ingD2/ingres/lib/iijdbc.jar
The path typo is not in the weblogic start (/usr/local/bea81/user_projects/domains/mydomain/startManagedWeblogic.sh) file as I expected.
The CLASSPATH typo is in bay:/usr/ingres/set_ing_install.sh
The CLASSPATH typo is a problem, but is not related to this particular issue. The perl query-runner.pl does not use the CLASSPATH variable.
root@jack cannot connect to blueback, but ingres@jack can
As a temporary measure, I rewrote query-runner.pl to drop permissions to ingres from root before connecting to blueback. This causes the output of reports to be written as ingres instead of root, so I had to chmod -R ingres on all the files in /usr/local/pitweb/ptagis-1.0/cvswd.
This should allow both app servers to work until Ingres tech support can determine a better solution.
More details at Ingres Tech Support Issue 141589
As a temporary measure, I altered query-runner.pl to setuid to ingres. This required changing permissions on the files in the user's report directories from root to ingres. This enabled me to get jack back into service while the problem is being resolved.
Resolution: bad vnode on jack
Ingres tech support suggested running netutil as root to see if it had a vnode entry that was preventing it from connecting. I did so and found that it had a private vnode entry for user root only. I removed that private vnode entry leaving the normal global vnode. That immediately fixed the problem.
I left the modifications to query-runner.pl in place since it makes sense to run that process with the least privileges required.
