Difference between revisions of "RepServer Throughput Script"

From SybaseWiki
Jump to: navigation, search
m
m
Line 3: Line 3:
 
o average throughput per queue (mb per minute)<br>
 
o average throughput per queue (mb per minute)<br>
 
o average throughput combined (all queues)<br>
 
o average throughput combined (all queues)<br>
o estimated time for a replicate to catch up<br>
+
o estimated time for a replicate to catch up (Re-sync ETA)<br>
 
o current queue sizes<br>
 
o current queue sizes<br>
 
There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.<br>
 
There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.<br>
Line 13: Line 13:
 
2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.<br>
 
2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.<br>
 
3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run.
 
3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run.
 +
4) Re-sync ETA: this value is provided based on the current throughput average for the period specified.  For example, this means that running rs_throughput.sh 15 may give a different Re-Sync ETA compared to running rs_throughput.sh 20
 
<pre>
 
<pre>
 
#!/usr/bin/ksh
 
#!/usr/bin/ksh

Revision as of 12:14, 20 February 2009

If you want to know just how much data repserver is processing, then this is the tool.
This script will tell you:
o average throughput per queue (mb per minute)
o average throughput combined (all queues)
o estimated time for a replicate to catch up (Re-sync ETA)
o current queue sizes
There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.
The real maximum throughput can be established by noting when the replicates start to fall behind in conjunction with the current throughput figures.
This script can be particularly useful in helping to ascertain if you need to throw more CPU at your repserver box.

Notes:
1) You will need to cron this script with the insert option (rs_throughput.sh insert) and allow it to collect at least 2 samples before being able to run it interactively to obtain throughput information. The script runs silently when run with the insert option.
2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.
3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run. 4) Re-sync ETA: this value is provided based on the current throughput average for the period specified. For example, this means that running rs_throughput.sh 15 may give a different Re-Sync ETA compared to running rs_throughput.sh 20

#!/usr/bin/ksh
#
# Script: rs_throughput.sh
# Author: Bob Holmes - email: cambob@gmail.com
# Date  : 19/09/08
# Version : 1.0
# Usage : Run "rs_throughput.sh help" for full usage information.
# Description : This script enables monitoring and reporting
#               of repserver throughput.
# --------------------------
#    ***    Variables for customisation can be found by searching for "!!" ***
# --------------------------
# Modification history:
# 19/02/09 : Bob Holmes : Script modified to enable easier customisation prior
#                         to release on Sybase wiki. (modifications not tested)
# --------------------------
# Comments: This script relies on the reporting..rs_throughput table for data
#           collection and reporting.  The definition for this table can be
#           found at the end of this script. 
##############################################################################
#                           setup environment                                #
##############################################################################
#
# save input variables
if ! [[ -z $1 ]]
then
  export option1=$1
else
  export option1="all"
fi

if ! [[ -z $2 ]]
then
  export option2=$2
else
  export option2="15" # default number of minutes on which to report throughput
fi

# configure environment
HOSTNAME=`hostname`
. $HOME/admin/.syb_cfg.sh $HOSTNAME  # !! This line sets up the environment from
                                     # !! from an external shell script.


##############################################################################
#                           config variables                                 #
##############################################################################
# config:
primary_ds="prim_ds"                   # !! string used to identify the inbound queue
# queues to query on:-
q_list="prim_ds.db1 prim_ds.db2 replicate_ds.db1 replicate_ds.db2" # !!
insert_filter="db1 db2"                # !! data will only be collected for queue names which have this keyword in
REP_SRV="repsrv_rs"                    # !! the actual repserver from which to get sqm info
RPT_SRV="reporting_ds"                 # !! dataserver which manages the reporting database
RPT_DB="reporting"                     # !! reporting database name
RECIPIENTS="user@domain"               # !! email config (separate addresses with commas)
ADMINDIR="/opt/home/sybase/admin"      # !! sybase admin directory

# static:
SQMFILE="$ADMINDIR/rs_sqm_tmp1.tmp"
DATE=$(date "+%d/%m/%y %H:%M:%S")

# config aware environment variables: # !! You may need to replace the line below with the equivalent
                                      # !! for your own environment. 
RISQLCMD="isql -Usa -P`grep "$REP_SRV," $HOME/admin/.servers | cut  -d , -f 4` -S$REP_SRV -w1000"
ISQLCMD="isql -Usa -P${SYBPASS} -S${RPT_SRV} -w200 -D${RPT_DB}"  


##############################################################################
#                             functions                                      #
##############################################################################

check_previous_instance()
{
# Note: next line greps for 8 instances as tests proved that less could cause the script to exit incorrectly
if [ $(ps -ef | grep rs_throughput | egrep -v "vi|grep" | grep -v "sh -c" | wc -l | awk '{print $1}') -gt 8 ]
then
  # previous instance still running - probably hanging
  echo Previous instance still running.
  exit
fi
}

fn_checkusage()
{
if [[ -z $option1 ]]
then
  printf " Usage: ./rs_throughput.sh <all>|<q_name>|<help> [<minutes to display>]\n"
  exit
fi
}

bcalc()
{
awk 'BEGIN{EQUATION='"$*"';printf("%0.1f\n",EQUATION); exit}'
}

fn_housekeeping()
{
if [ -e $SQMFILE ]
then
  rm $SQMFILE
fi
$ISQLCMD <<-EOF
set nocount on
go
delete from rs_throughput where sample_time <= dateadd(dd,-28,getdate())
go
EOF
}

fn_insert_thruput_data()
{
# 1) get current segment usage/progress info
#connect to repserver, get sqm info - put into file
$RISQLCMD << eof > $SQMFILE
admin who,sqm
go
quit
eof

# 2) insert new segment data into rs_throughput table
#date
#echo "Queue Name, First segment, Last segment, Next segment"
cat $SQMFILE | egrep $insert_filter | sed 's/Awaiting Message/AwaitingMessage/; s/Awaiting Wakeup/AwaitingWakeup/; s/\(\.[0-9][0-

9]*\)\.[0-9]/\1/' | grep -v ":0 ${primary_ds}" | awk '{print $4,$14,$15,$16}' | while read q_name first_seg last_seg next_seg
do
  q_size=$(bcalc "$last_seg - $first_seg")
$ISQLCMD <<-EOF
set nocount on
go
declare @mb_processed numeric(8,2)
declare @prev_sample datetime
declare @mins numeric(8,2)
declare @tput numeric(8,2)
if not exists (select 1 from rs_throughput where q_name = "${q_name}")
insert rs_throughput values ("${q_name}", getdate(), ${first_seg}, ${last_seg}, ${next_seg}, ${q_size}, 0, 0)
else
begin
-- get previous sample_time as key
select @prev_sample = max(sample_time) from rs_throughput where q_name = "${q_name}"
-- get difference in mb between current next segment and previous next segment
select @mb_processed = (select ${next_seg} - next from rs_throughput where sample_time = @prev_sample and q_name = "${q_name}")

select @prev_sample=dateadd(ss,-10,@prev_sample) -- # must substract 10 seconds to ensure the next datediff doesn't reflect 2m 58s 

as only 2mins!
select @mins = datediff(mi,@prev_sample,getdate())
if @mins > 0
begin
  select @tput = round((@mb_processed / @mins),2)
  insert rs_throughput values ("${q_name}", getdate(), ${first_seg}, ${last_seg}, ${next_seg}, ${q_size}, @mb_processed, @tput)
end
end
go
EOF
done
}

fn_report_thruput()
{
if [[ $option1 = [0-9]* ]]
then
  option2=$option1
  option1="all"
fi

if ! [[ "$option1" = "all" ]]
then
  q_list=$option1
fi

printf " -------------------------------------\n"
printf " Run date $DATE\n"
printf " -------------------------------------\n"

for q_name in $q_list
do
$ISQLCMD <<-EOF
set nocount on
declare @mb_avg numeric(8,2)
declare @size numeric(8,2)
declare @eta datetime
select convert(char(19),sample_time) sample_time, q_name, size q_size, mb_processed, thruput "thruput (mb/min)"  from rs_throughput
--select convert(char(19),sample_time) sample_time, q_name, size q_size, mb_processed, thruput "thruput (mb/min)", dateadd(mi,

(round((size/thruput),0)),getdate()) ETA from rs_throughput
where sample_time >= dateadd(mi,-${option2},getdate()) and q_name like "%${q_name}%"

--get mb average
select @mb_avg = convert(numeric(8,2),avg(thruput)) from rs_throughput
where sample_time >= dateadd(mi,-${option2},getdate()) and q_name like "%${q_name}%"
--get current queue size
select @size = size from rs_throughput where q_name like "%${q_name}%" having sample_time=max(sample_time)
select @eta = dateadd(mi,round((@size/@mb_avg),0),getdate())
print ""
print " Average for ${q_name}: %1! mb/min,  Re-sync ETA: %2!", @mb_avg, @eta
print ""
go
EOF
done
}

fn_show_help_text()
{
printf "Usage examples:\n"
printf "rs_throughput.sh (runs with default values showing all queues over 15 minutes)\n"
printf "rs_throughput.sh <q_name>|<minutes>\n"
printf "rs_throughput.sh <q_name> <minutes>\n"
printf "rs_throughput.sh size|queue (shows current queue sizes)\n"
printf "rs_throughput.sh max (shows highest ever throughput for queue history)\n"
printf "rs_throughput.sh help (shows this helptext)\n"
printf "rs_throughput.sh insert # this is for cron only (inserts data to $RPT_DB..rs_throughput table)\n"
}

fn_show_max_thruput()
{
$ISQLCMD <<-EOF
print ""
set rowcount 0
set nocount on
declare @q_name char(22)
declare @sample_time datetime
declare @max_thruput numeric(8,2)
select q_name, sample_time, thruput into #throughput from rs_throughput where 1=0
print "Maximum throughput per queue:"
-- get list of queues
select distinct q_name, 0 as processed into #queues from rs_throughput
set rowcount 1
select @q_name=q_name from #queues where processed = 0
while @@rowcount != 0
begin
  insert #throughput select q_name, sample_time, thruput from rs_throughput
  where sample_time in
    (select sample_time from rs_throughput where q_name = @q_name having thruput = max(thruput))
  and q_name = @q_name
  update #queues set processed = 1 where q_name = @q_name
  select @q_name=q_name from #queues where processed = 0
end
set rowcount 0
select * from #throughput order by thruput desc
go
EOF
echo
}

fn_show_current_q_size()
{
$RISQLCMD << eof > $SQMFILE
admin who,sqm
go
quit
eof
printf "Current Stable Queue usage:\n"
cat $SQMFILE | egrep $insert_filter | sed 's/Awaiting Message/AwaitingMessage/; s/Awaiting Wakeup/AwaitingWakeup/; s/\(\.[0-9][0-

9]*\)\.[0-9]/\1/' | grep -v ":0 ${primary_ds}" | awk '{print $4,$14,$15,$16}' | while read q_name first_seg last_seg next_seg
do
  q_size=$(bcalc "$last_seg - $first_seg")
  printf "$q_name: $q_size\n"
done
}

fn_show_combined_avg_thruput()
{
$ISQLCMD <<-EOF
set rowcount 0
set nocount on
declare @q_name varchar(25)
declare @mb_avg numeric(8,2)
declare @mb_combined_avg numeric(8,2)
select @mb_combined_avg=0
-- get list of queues

select distinct q_name, 0 as processed into #queues from rs_throughput
set rowcount 1
select @q_name=q_name from #queues where processed = 0
while @@rowcount != 0
begin
  select @mb_avg = convert(numeric(8,2),avg(thruput)) from rs_throughput
  where sample_time >= dateadd(mi,-${option2},getdate()) and q_name = @q_name
  select @mb_combined_avg = @mb_combined_avg + @mb_avg
  update #queues set processed=1 where q_name = @q_name
  select @q_name=q_name from #queues where processed = 0
end
print " Combined thruput average: %1! mb/min", @mb_combined_avg
print ""
go
EOF
}

##############################################################################
#                             main program                                   #
##############################################################################

check_previous_instance
fn_checkusage

case "$option1" in
        help) fn_show_help_text;
              exit;;

      insert) fn_insert_thruput_data;
              exit;;

         max) fn_show_max_thruput;
              exit;;

       queue|size) fn_show_current_q_size;
              exit;;

           *) fn_report_thruput;
              fn_show_combined_avg_thruput;;
esac

fn_housekeeping


### TABLE DEFINTION FOR RAW DATA COLLECTION
### Use the table definition below to create the table which this script 
### requires in order to insert queue data and furthermore query this
### data in order to provide throughput information.
#create table rs_throughput (
#q_name varchar(25), 
#sample_time datetime, 
#first numeric(8,2), 
#last numeric(8,2), 
#next numeric(8,2), 
#size numeric(8,2), 
#mb_processed numeric(8,2),
#thruput numeric(8,2)
#)

# end program