CheckCheckPoint

From PTAGISWiki

Checkpoint Check -- Running Ckpdb

Manual Mode Operation

To run 'snapckp' manually (since automatic launch is disabled right now):

Make sure no sessions are active anywhere in the Ingres installation (not just in the 'user' DBMS server!!). It's handy to be logged in as 'ptagdev' for this step, so you can use the aliases defined there.

So - as 'ptagdev' - on 'blueback'

 
> see_fdl   Shows the most recent FDVL activity
> jld       Shows the most recent IDL activity ("jay-ell-dee")

> tv        Defines TERM_INGRES=vt100fx ("tee-vee")
> ipm       Look at sessions. Remember, this doesn't show sessions in 'loader' servers

If everything looks OK ...

Become 'root' on 'blueback' then ...

 
# cd /usr/ingres
# ./batch_ckp.sh

This launches snapckp as a batch job, then begins tail -f on /var/log/snapckp_log so you can see what's going on. Sometimes snapckp fails to remove all of the Ingres sessions and aborts, which is why it's important to watch the log and make sure the job is actually running. If it fails, you'll know about it in a few seconds; simply launch it again. Only rarely have I had to launch it more than twice, and never more than three times. As soon as you see 'sync sync sync' in the log file, you can be sure the job is launched and running.
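The "watch the log for 'sync sync sync'" check can also be scripted, for example from a second terminal. What follows is only a minimal sketch: wait_for_marker is a hypothetical helper (not part of snapckp or batch_ckp.sh), the 60-second timeout is an assumption, and it assumes a POSIX sh with 'tail -n +N' and 'grep -F' available. It looks only at lines added after a given line number, so a marker left over from an earlier run is not counted.

```shell
#!/bin/sh
# Hypothetical helper: poll a log file until a marker string shows up in the
# lines written after start_line, or give up after roughly timeout seconds.
wait_for_marker() {
    log=$1
    marker=$2
    timeout=$3
    start_line=${4:-0}      # number of lines already in the log; 0 = whole file
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # Only inspect lines after start_line; -F treats the marker literally.
        if tail -n +"$((start_line + 1))" "$log" 2>/dev/null \
            | grep -F -- "$marker" >/dev/null; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example (assumed usage, second terminal on 'blueback'):
#   before=$(wc -l < /var/log/snapckp_log)    # taken just before launching
#   ... launch snapckp in the other terminal ...
#   wait_for_marker /var/log/snapckp_log 'sync sync sync' 60 "$before" \
#       && echo "snapckp is running" || echo "no marker yet; check the log"
```

A non-zero return only means the marker was not seen in time; the log itself remains the place to find out why.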

When snapckp is finished, you'll have to deal with the fact that it may have blown away the pooled database connections used by deployed web apps. That's easily fixed by running the following script (many thanks to Ryan for setting it up), again as 'root' on 'blueback' ...

# cd /usr/ingres
# ./rshRedeploy.sh

Any questions, my cell phone is 253-380-9661

-Doug-

Automatic Mode Recovery

'snapckp' has been running automatically only about half of the time in the recent past. For that reason, here's what I do every morning:

In order to make use of the aliases mentioned below, log in as 'ptagdev' on 'blueback' ... (Dave knows the password if you don't!!)

>idb <-- Runs 'infodb ptagis3'

blueback:ptagdev > idb
==================Thu Apr 27 15:38:56 2006 Database Information=================

    Database : (ptagis3,pittag)  ID : 0x3E77B57C  Collation : default
    Extents  : 13    Last Table Id : 1538154
    Config File Version Id : 0x00060001   Database Version Id : 6
    Mode     : DDL ALLOWED, ONLINE CHECKPOINT ENABLED
    Status   : VALID,JOURNAL,CKP,DUMP,ROLL_FORWARD,CFG_BACKUP

               The Database has been Checkpointed.
               The Database is Journaled.

               Journals are valid from checkpoint sequence : 941

----Journal information---------------------------------------------------------
    Checkpoint sequence :        943    Journal sequence :             26222
    Current journal block :      886    Journal block size :           16384
    Initial journal size :         4    Target journal size :            512
    Last Log Address Journaled : <1048133482:30895:5176>
----Dump information------------------------------------------------------------
    Checkpoint sequence :        943    Dump sequence :                  693
    Current dump block :           0    Dump block size :              16384
    Initial dump size :            4    Target dump size :               512
    Last Log Address Dumped : <0:0:0>
----Checkpoint History for Journal----------------------------------------------
    Date                      Ckp_sequence  First_jnl   Last_jnl  valid  mode
    ----------------------------------------------------------------------------
    Thu Apr 20 23:00:15 2006            941      25956     26135      1  ONLINE
    Tue Apr 25 23:00:28 2006            942      26136     26179      1  ONLINE
    Wed Apr 26 23:00:14 2006            943      26180     26222      1  ONLINE
----Checkpoint History for Dump-------------------------------------------------
    Date                      Ckp_sequence  First_dmp   Last_dmp  valid  mode
    ----------------------------------------------------------------------------
    Tue May 17 06:53:23 2005            712          0         0      0  ONLINE
    Tue May 17 06:54:11 2005            713          0         0      0  ONLINE
    Tue May 17 06:56:14 2005            714          0         0      0  ONLINE
    Thu May 19 20:33:58 2005            715          0         0      0  ONLINE
    Thu May 19 20:36:31 2005            716          0         0      0  ONLINE
    Thu Apr 20 23:00:15 2006            941          0         0      1  ONLINE
    Tue Apr 25 23:00:28 2006            942          0         0      1  ONLINE
    Wed Apr 26 23:00:14 2006            943          0         0      1  ONLINE
----Cluster Journal History-----------------------------------------------------
    Node ID   Current Journal   Current Block   Last Log Address
    ------------------------------------------------------------
    None.
----Extent directory------------------------------------------------------------
    Location                          Flags             Physical_path
    ------------------------------------------------------------------
    db1                               ROOT,DATA         /usr/db1/ingII/ingres/data/default/ptagis3
    ii_journal                        JOURNAL           /usr/arch/ingII/ingres/jnl/default/ptagis3
    ii_checkpoint                     CHECKPOINT        /usr/ckp/ingII/ingres/ckp/default/ptagis3
    ii_dump                           DUMP              /usr/arch/ingII/ingres/dmp/default/ptagis3
    ii_work                           WORK              /usr/arch/ingII/ingres/work/default/ptagis3
    db2                               DATA              /usr/db2/ingII/ingres/data/default/ptagis3
    db3                               DATA              /usr/db3/ingII/ingres/data/default/ptagis3
    db4                               DATA              /usr/db4/ingII/ingres/data/default/ptagis3
    db5                               DATA              /usr/db5/ingII/ingres/data/default/ptagis3
    db6                               DATA              /usr/db6/ingII/ingres/data/default/ptagis3
    db7                               DATA              /usr/db7/ingII/ingres/data/default/ptagis3
    db8                               DATA              /usr/db8/ingII/ingres/data/default/ptagis3
    work1                             WORK              /usr/wrk/ingII/ingres/work/default/ptagis3
===================================================================

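See if snapckp ran automatically: the last entry under 'Checkpoint History for Journal' should be from the previous night (in the listing above, made on Thu Apr 27, it is Wed Apr 26 23:00:14 2006). If so, great!! Otherwise ...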
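Reading the date of the latest checkpoint out of the infodb listing can be sketched in a few lines of shell. This is an illustration only: last_ckp_date is a hypothetical helper, and it assumes infodb output laid out like the listing above (section headers starting with '----' in column one, history rows starting with a weekday name) and a POSIX awk.

```shell
#!/bin/sh
# Hypothetical helper: read infodb output on stdin and print the date of the
# last row in the "Checkpoint History for Journal" section.
last_ckp_date() {
    awk '
        # Start collecting at the journal checkpoint history header.
        /^----Checkpoint History for Journal/ { in_hist = 1; next }
        # Any other section header ends the section.
        /^----[A-Za-z]/ { in_hist = 0 }
        # History rows begin with a weekday name; keep the date fields.
        in_hist && $1 ~ /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun)$/ {
            last = $1 " " $2 " " $3 " " $4 " " $5
        }
        END {
            if (last == "") exit 1      # no history rows found
            print last
        }
    '
}

# Example (assumed usage, as 'ptagdev' on 'blueback'):
#   infodb ptagis3 | last_ckp_date
```

Comparing the printed date with today's date is still left to the person running it.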

>tv <--- (That's lower-case tee-vee) Sets TERM_INGRES to vt100fx and exports it

>ipm

See if any user reports are running

>see_fdl <-- Shows recent FDVL activity: Make sure that users aren't currently busy

blueback:ptagdev > tv
TERM_INGRES = vt100fx
blueback:ptagdev > see_fdl
total 21880
-rw-r--r--   1 ptagdev  other      11236 Apr 27 14:06 2006_032040_E.L
-rw-rw-rw-   1 daemon   other       1583 Apr 27 14:04 2006_032040_E.V
-rw-r--r--   1 ptagdev  other      11235 Apr 27 13:07 2006_032025_E.L
-rw-rw-rw-   1 daemon   other       1582 Apr 27 13:06 2006_032025_E.V
-rw-r--r--   1 ptagdev  other       1990 Apr 27 13:04 2006_032017_P.L
-rw-rw-rw-   1 daemon   other       1555 Apr 27 13:03 2006_032017_P.V
-rw-r--r--   1 ptagdev  other       2694 Apr 27 12:48 2006_031958_E.L
-rw-r--r--   1 ptagdev  other       2774 Apr 27 12:48 2006_031957_E.L
-rw-r--r--   1 ptagdev  other       3390 Apr 27 12:48 2006_031956_E.L
-rw-r--r--   1 ptagdev  other       2254 Apr 27 12:47 2006_031955_E.L
-rw-r--r--   1 ptagdev  other       2254 Apr 27 12:47 2006_031954_E.L
-rw-r--r--   1 ptagdev  other       1855 Apr 27 12:47 2006_031953_E.L
-rw-r--r--   1 ptagdev  other       3574 Apr 27 12:47 2006_031952_E.L
-rw-rw-rw-   1 daemon   other       1578 Apr 27 12:47 2006_031958_E.V
-rw-rw-rw-   1 daemon   other       1557 Apr 27 12:46 2006_031957_E.V
-rw-rw-rw-   1 daemon   other       1557 Apr 27 12:46 2006_031956_E.V
-rw-rw-rw-   1 daemon   other       1557 Apr 27 12:46 2006_031955_E.V
-rw-rw-rw-   1 daemon   other       1557 Apr 27 12:46 2006_031954_E.V
-rw-rw-rw-   1 daemon   other       1555 Apr 27 12:46 2006_031953_E.V
blueback:ptagdev > 

Does this mean that users are "currently busy"???


>jld <-- (That's lower-case jay-ell-dee) Extracts IDL start/stop times from $JOB_LOG: Make sure IDL isn't running

If it looks like the system is quiet ... >su - root

# cd /usr/ingres
# ./run_snapckp.sh <-- Database is unavailable for just a few seconds (though ckp takes well over an hour)

In another x-term, again on 'blueback' > tail -f /var/log/snapckp_log <-- Just so you can see what's going on

Occasionally (maybe 1 time out of 10) snapckp will fail to remove sessions from the server when launched manually. If that happens, you'll find out within about 30 seconds - your './run_snapckp.sh' will "complete" and you'll see the '#' prompt again.

If that happens, just launch it again (still as 'root'!!)
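The "launch it again if it comes straight back" habit could be wrapped up roughly as below. This is a sketch under stated assumptions, not how snapckp is meant to be driven: retry_if_quick is a hypothetical helper, it treats "returned in under 30 seconds" as the only sign of failure (the exit status of run_snapckp.sh is not documented here), the three-attempt limit simply mirrors the experience described on this page, and it assumes a date command that supports +%s.

```shell
#!/bin/sh
# Hypothetical helper: run a command, and if it returns sooner than
# min_seconds, treat that as a failed launch and try again.
retry_if_quick() {
    max_attempts=$1
    min_seconds=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        start=$(date +%s)               # assumes date(1) supports +%s
        "$@"
        elapsed=$(( $(date +%s) - start ))
        if [ "$elapsed" -ge "$min_seconds" ]; then
            return 0                    # ran long enough to count as a real run
        fi
        echo "attempt $attempt returned after ${elapsed}s; treating it as failed" >&2
        attempt=$((attempt + 1))
    done
    return 1                            # every attempt came back too quickly
}

# Example (assumed usage, as 'root' on 'blueback'):
#   cd /usr/ingres && retry_if_quick 3 30 ./run_snapckp.sh
```

Watching /var/log/snapckp_log is still the real confirmation; elapsed time is only a rough stand-in for it.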

That's all there is to it.

Hope things go great for you while I'm away. I'll have my cell phone handy (253-380-9661) - don't hesitate to call if anything comes up (We'll be staying at the Embassy Suites Hotel in downtown Philadelphia).

-Doug-
