CheckCheckPoint
From PTAGISWiki
Checkpoint Check -- Running Ckpdb
Manual Mode Operation
To run 'snapckp' manually (since automatic launch is disabled right now):
Make sure no sessions are active in the Ingres installation (not just 'user' DBMS server!!). It's handy to be logged in as 'ptagdev' for this step, so you can use the aliases defined there.
So - as 'ptagdev' - on 'blueback'
> see_fdl <-- Shows the most recent FDVL activity
> jld <-- Shows the most recent IDL activity ("jay-ell-dee")
> tv <-- Defines TERM_INGRES=vt100fx ("tee-vee")
> ipm <-- Look at sessions. Remember, this doesn't show sessions in 'loader' servers
If everything looks OK ...
Become 'root' on 'blueback' then ...
# cd /usr/ingres
# ./batch_ckp.sh
This will launch snapckp as a batch job, then begin tail -f on /var/log/snapckp_log, so you can see what's going on. Sometimes, snapckp will fail to remove all of the Ingres sessions, and it will abort. That's why it's important to watch the log and make sure the job is actually running. If it fails, you'll know about it in a few seconds; simply launch it again. Only rarely have I had to launch it more than twice; never more than three times. As soon as you see 'sync sync sync' in the log file, you can be sure the job is launched and running.
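If you would rather not eyeball the log, the "wait for 'sync sync sync'" check can be sketched as a small shell function. This is only a sketch: the function name wait_for_sync and its arguments are made up for illustration, and it assumes the marker appears on a single line of the log. Because the log may still hold markers from earlier runs, the third argument lets you skip the lines that were already there before the launch.

```shell
#!/bin/sh
# Hypothetical helper (not part of the snapckp scripts).
# $1 = log file
# $2 = seconds to wait before giving up (default 60)
# $3 = number of pre-existing log lines to ignore (default 0)
wait_for_sync() {
    log=$1
    timeout=${2:-60}
    skip=${3:-0}
    waited=0
    while :; do
        # Only look at lines written after the first $skip lines.
        if tail -n "+$((skip + 1))" "$log" 2>/dev/null | grep -q 'sync sync sync'; then
            return 0
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

A possible way to use it from a second terminal: note the current length with `wc -l < /var/log/snapckp_log` just before launching, then run `wait_for_sync /var/log/snapckp_log 60 <that number> && echo "snapckp is running"`.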
When snapckp is finished, you'll have to deal with the fact that it may have blown away pooled database connections used by deployed web apps. This is accomplished easily by running the following script (many thanks to Ryan for setting it up); again as 'root' on 'blueback' ...
# cd /usr/ingres
# ./rshRedeploy.sh
Any questions, my cell phone is 253-380-9661
-Doug-
Automatic Mode Recovery
'snapckp' has been running automatically only about half of the time in the recent past. For that reason, here's what I do every morning:
In order to make use of the aliases mentioned below, log in as 'ptagdev' on 'blueback' ... (Dave knows the password if you don't!!)
>idb <-- Runs 'infodb ptagis3'
blueback:ptagdev > idb
==================Thu Apr 27 15:38:56 2006 Database Information=================
Database : (ptagis3,pittag) ID : 0x3E77B57C Collation : default
Extents : 13 Last Table Id : 1538154
Config File Version Id : 0x00060001 Database Version Id : 6
Mode : DDL ALLOWED, ONLINE CHECKPOINT ENABLED
Status : VALID,JOURNAL,CKP,DUMP,ROLL_FORWARD,CFG_BACKUP
The Database has been Checkpointed.
The Database is Journaled.
Journals are valid from checkpoint sequence : 941
----Journal information---------------------------------------------------------
Checkpoint sequence : 943 Journal sequence : 26222
Current journal block : 886 Journal block size : 16384
Initial journal size : 4 Target journal size : 512
Last Log Address Journaled : <1048133482:30895:5176>
----Dump information------------------------------------------------------------
Checkpoint sequence : 943 Dump sequence : 693
Current dump block : 0 Dump block size : 16384
Initial dump size : 4 Target dump size : 512
Last Log Address Dumped : <0:0:0>
----Checkpoint History for Journal----------------------------------------------
Date                      Ckp_sequence  First_jnl  Last_jnl  valid  mode
----------------------------------------------------------------------------
Thu Apr 20 23:00:15 2006           941      25956     26135      1  ONLINE
Tue Apr 25 23:00:28 2006           942      26136     26179      1  ONLINE
Wed Apr 26 23:00:14 2006           943      26180     26222      1  ONLINE
----Checkpoint History for Dump-------------------------------------------------
Date                      Ckp_sequence  First_dmp  Last_dmp  valid  mode
----------------------------------------------------------------------------
Tue May 17 06:53:23 2005           712          0         0      0  ONLINE
Tue May 17 06:54:11 2005           713          0         0      0  ONLINE
Tue May 17 06:56:14 2005           714          0         0      0  ONLINE
Thu May 19 20:33:58 2005           715          0         0      0  ONLINE
Thu May 19 20:36:31 2005           716          0         0      0  ONLINE
Thu Apr 20 23:00:15 2006           941          0         0      1  ONLINE
Tue Apr 25 23:00:28 2006           942          0         0      1  ONLINE
Wed Apr 26 23:00:14 2006           943          0         0      1  ONLINE
----Cluster Journal History-----------------------------------------------------
Node ID Current Journal Current Block Last Log Address
------------------------------------------------------------
None.
----Extent directory------------------------------------------------------------
Location       Flags       Physical_path
------------------------------------------------------------------
db1            ROOT,DATA   /usr/db1/ingII/ingres/data/default/ptagis3
ii_journal     JOURNAL     /usr/arch/ingII/ingres/jnl/default/ptagis3
ii_checkpoint  CHECKPOINT  /usr/ckp/ingII/ingres/ckp/default/ptagis3
ii_dump        DUMP        /usr/arch/ingII/ingres/dmp/default/ptagis3
ii_work        WORK        /usr/arch/ingII/ingres/work/default/ptagis3
db2            DATA        /usr/db2/ingII/ingres/data/default/ptagis3
db3            DATA        /usr/db3/ingII/ingres/data/default/ptagis3
db4            DATA        /usr/db4/ingII/ingres/data/default/ptagis3
db5            DATA        /usr/db5/ingII/ingres/data/default/ptagis3
db6            DATA        /usr/db6/ingII/ingres/data/default/ptagis3
db7            DATA        /usr/db7/ingII/ingres/data/default/ptagis3
db8            DATA        /usr/db8/ingII/ingres/data/default/ptagis3
work1          WORK        /usr/wrk/ingII/ingres/work/default/ptagis3
===================================================================
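See if snapckp ran automatically: the last row under 'Checkpoint History for Journal' should be dated last night (in the sample above, Wed Apr 26 23:00:14 2006, sequence 943). If so, great!! Otherwise ...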
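To save scrolling through the whole infodb listing, the newest journal checkpoint can be pulled out with awk. This is a rough sketch, not part of the existing aliases: the function name last_journal_ckp is invented here, and it assumes the output keeps the layout shown in the sample (a 'Checkpoint History for Journal' header followed by rows of ten whitespace-separated fields with the year in the fifth).

```shell
#!/bin/sh
# Hypothetical helper: read infodb output on stdin and print the date and
# sequence number of the last row in "Checkpoint History for Journal".
# Exits non-zero if no such row is found.
last_journal_ckp() {
    awk '
        # Start of the section we care about.
        /^ *----Checkpoint History for Journal/ { in_sec = 1; next }
        # Any other "----Title" header ends the section; plain dashed
        # separator lines do not match because a dash follows the dashes.
        in_sec && /^ *----[A-Za-z]/ { in_sec = 0 }
        # Data rows: weekday month day time year seq first last valid mode.
        in_sec && NF == 10 && $5 ~ /^[0-9][0-9][0-9][0-9]$/ {
            last = $1 " " $2 " " $3 " " $4 " " $5 " seq=" $6
        }
        END { if (last != "") print last; else exit 1 }
    '
}
```

Since 'idb' just runs 'infodb ptagis3', usage would be `infodb ptagis3 | last_journal_ckp`. Fed the sample listing above, it prints `Wed Apr 26 23:00:14 2006 seq=943`; compare that date with last night's.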
>tv <--- (That's lower-case tee-vee) Sets TERM_INGRES to vt100fx and exports it
> ipm
See if any user reports are running
>see_fdl <-- Shows recent FDVL activity: Make sure that users aren't currently busy
blueback:ptagdev > tv
TERM_INGRES = vt100fx
blueback:ptagdev > see_fdl
total 21880
-rw-r--r-- 1 ptagdev other 11236 Apr 27 14:06 2006_032040_E.L
-rw-rw-rw- 1 daemon other 1583 Apr 27 14:04 2006_032040_E.V
-rw-r--r-- 1 ptagdev other 11235 Apr 27 13:07 2006_032025_E.L
-rw-rw-rw- 1 daemon other 1582 Apr 27 13:06 2006_032025_E.V
-rw-r--r-- 1 ptagdev other 1990 Apr 27 13:04 2006_032017_P.L
-rw-rw-rw- 1 daemon other 1555 Apr 27 13:03 2006_032017_P.V
-rw-r--r-- 1 ptagdev other 2694 Apr 27 12:48 2006_031958_E.L
-rw-r--r-- 1 ptagdev other 2774 Apr 27 12:48 2006_031957_E.L
-rw-r--r-- 1 ptagdev other 3390 Apr 27 12:48 2006_031956_E.L
-rw-r--r-- 1 ptagdev other 2254 Apr 27 12:47 2006_031955_E.L
-rw-r--r-- 1 ptagdev other 2254 Apr 27 12:47 2006_031954_E.L
-rw-r--r-- 1 ptagdev other 1855 Apr 27 12:47 2006_031953_E.L
-rw-r--r-- 1 ptagdev other 3574 Apr 27 12:47 2006_031952_E.L
-rw-rw-rw- 1 daemon other 1578 Apr 27 12:47 2006_031958_E.V
-rw-rw-rw- 1 daemon other 1557 Apr 27 12:46 2006_031957_E.V
-rw-rw-rw- 1 daemon other 1557 Apr 27 12:46 2006_031956_E.V
-rw-rw-rw- 1 daemon other 1557 Apr 27 12:46 2006_031955_E.V
-rw-rw-rw- 1 daemon other 1557 Apr 27 12:46 2006_031954_E.V
-rw-rw-rw- 1 daemon other 1555 Apr 27 12:46 2006_031953_E.V
blueback:ptagdev >
Does this mean that users are "currently busy"???
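One way to make "currently busy" less of a judgment call is to pick a time window and count the files that changed inside it. The sketch below is an assumption-laden illustration, not an existing alias: the name recent_files is invented, the directory that see_fdl lists is not given on this page so it is passed in as an argument, the 15-minute default is arbitrary, and it relies on a find that supports -mmin (GNU and BSD find do; older system finds may not).

```shell
#!/bin/sh
# Hypothetical helper: print how many regular files under a directory
# were modified less than N minutes ago.
# $1 = directory to inspect (the FDVL directory; path not given here)
# $2 = window in minutes (default 15, an arbitrary choice)
recent_files() {
    dir=$1
    mins=${2:-15}
    # -mmin -N matches files modified less than N minutes ago.
    # tr strips the padding some wc implementations add.
    find "$dir" -type f -mmin "-$mins" | wc -l | tr -d ' '
}
```

A result of 0 would suggest that nothing has been written recently; anything else is a reason to look more closely before launching.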
>jld <-- (That's lower-case jay-ell-dee) Extracts IDL start/stop times from $JOB_LOG: Make sure IDL isn't running
If it looks like the system is quiet ...
> su - root
# cd /usr/ingres
# ./run_snapckp.sh <-- Database is unavailable for just a few seconds (though ckp takes well over an hour)
In another x-term, again on 'blueback':
> tail -f /var/log/snapckp_log <-- Just so you can see what's going on
Occasionally (maybe 1 time out of 10) snapckp will fail to remove sessions from the server when launched manually. If that happens, you'll find out within about 30 seconds: your './run_snapckp.sh' will "complete" and you'll see the '#' prompt again.
If that happens, just launch it again (still as 'root'!!)
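The "launch it again" routine could be wrapped in a small retry loop. Treat this as a sketch only: the function name retry is invented, and it assumes the launch script returns a non-zero exit status when it fails to clear the sessions, which this page does not confirm. If the script exits 0 either way, the loop will stop after the first attempt and you are back to watching the log.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, stopping at the
# first success.
# $1 = maximum number of attempts; remaining arguments = the command
retry() {
    max=$1
    shift
    n=1
    while [ "$n" -le "$max" ]; do
        if "$@"; then
            echo "succeeded on attempt $n"
            return 0
        fi
        echo "attempt $n failed" >&2
        n=$((n + 1))
    done
    return 1
}
```

For example, as 'root' in /usr/ingres: `retry 3 ./run_snapckp.sh`. Three attempts matches the most relaunches reported in the manual-mode notes.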
That's all there is to it.
Hope things go great for you while I'm away. I'll have my cell phone handy (253-380-9661) - don't hesitate to call if anything comes up (We'll be staying at the Embassy Suites Hotel in downtown Philadelphia).
-Doug-
