Difference between revisions of "RepServer Throughput Script"
m |
m |
||
Line 3: | Line 3: | ||
o average throughput per queue (mb per minute)<br> | o average throughput per queue (mb per minute)<br> | ||
o average throughput combined (all queues)<br> | o average throughput combined (all queues)<br> | ||
− | o estimated time for a replicate to catch up<br> | + | o estimated time for a replicate to catch up (Re-sync ETA)<br> |
o current queue sizes<br> | o current queue sizes<br> | ||
There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.<br> | There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.<br> | ||
Line 13: | Line 13: | ||
2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.<br> | 2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.<br> | ||
3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run. | 3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run. | ||
+ | 4) Re-sync ETA: this value is provided based on the current throughput average for the period specified. For example, this means that running rs_throughput.sh 15 may give a different Re-Sync ETA compared to running rs_throughput.sh 20 | ||
<pre> | <pre> | ||
#!/usr/bin/ksh | #!/usr/bin/ksh |
Revision as of 12:14, 20 February 2009
If you want to know just how much data repserver is processing, then this is the tool.
This script will tell you:
o average throughput per queue (mb per minute)
o average throughput combined (all queues)
o estimated time for a replicate to catch up (Re-sync ETA)
o current queue sizes
There is also a maximum recorded throughput option but this appears not to work well as some of the figures have been suspicious.
The real maximum throughput can be established by noting when the replicates start to fall behind in conjunction with the current throughput figures.
This script can be particularly useful in helping to ascertain if you need to throw more CPU at your repserver box.
Notes:
1) You will need to cron this script with the insert option (rs_throughput.sh insert) and allow it to collect at least 2 samples before being able to run it interactively to obtain throughput information. The script runs silently when run with the insert option.
2) The script automatically determines the elapsed interval in minutes between samples so you can cron the script to your preferred interval, be it 3 minutes or 10 minutes, the mb/min figure will be calculated correctly.
3) In addition to configuring the script to your environment, you will also need to create the rs_throughput table prior to first run.
4) Re-sync ETA: this value is provided based on the current throughput average for the period specified. For example, this means that running rs_throughput.sh 15 may give a different Re-Sync ETA compared to running rs_throughput.sh 20
#!/usr/bin/ksh # # Script: rs_throughput.sh # Author: Bob Holmes - email: cambob@gmail.com # Date : 19/09/08 # Version : 1.0 # Usage : Run "rs_throughput.sh help" for full usage information. # Description : This script enables monitoring and reporting # of repserver throughput. # -------------------------- # *** Variables for customisation can be found by searching for "!!" *** # -------------------------- # Modification history: # 19/02/09 : Bob Holmes : Script modified to enable easier customisation prior # to release on Sybase wiki. (modifications not tested) # -------------------------- # Comments: This script relies on the reporting..rs_throughput table for data # collection and reporting. The definition for this table can be # found at the end of this script. ############################################################################## # setup environment # ############################################################################## # # save input variables if ! [[ -z $1 ]] then export option1=$1 else export option1="all" fi if ! [[ -z $2 ]] then export option2=$2 else export option2="15" # default number of minutes on which to report throughput fi # configure environment HOSTNAME=`hostname` . $HOME/admin/.syb_cfg.sh $HOSTNAME # !! This line sets up the environment from # !! from an external shell script. ############################################################################## # config variables # ############################################################################## # config: primary_ds="prim_ds" # !! string used to identify the inbound queue # queues to query on:- q_list="prim_ds.db1 prim_ds.db2 replicate_ds.db1 replicate_ds.db2" # !! insert_filter="db1 db2" # !! data will only be collected for queue names which have this keyword in REP_SRV="repsrv_rs" # !! the actual repserver from which to get sqm info RPT_SRV="reporting_ds" # !! dataserver which manages the reporting database RPT_DB="reporting" # !! reporting database name RECIPIENTS="user@domain" # !! email config (separate addresses with commas) ADMINDIR="/opt/home/sybase/admin" # !! sybase admin directory # static: SQMFILE="$ADMINDIR/rs_sqm_tmp1.tmp" DATE=$(date "+%d/%m/%y %H:%M:%S") # config aware environment variables: # !! You may need to replace the line below with the equivalent # !! for your own environment. RISQLCMD="isql -Usa -P`grep "$REP_SRV," $HOME/admin/.servers | cut -d , -f 4` -S$REP_SRV -w1000" ISQLCMD="isql -Usa -P${SYBPASS} -S${RPT_SRV} -w200 -D${RPT_DB}" ############################################################################## # functions # ############################################################################## check_previous_instance() { # Note: next line greps for 8 instances as tests proved that less could cause the script to exit incorrectly if [ $(ps -ef | grep rs_throughput | egrep -v "vi|grep" | grep -v "sh -c" | wc -l | awk '{print $1}') -gt 8 ] then # previous instance still running - probably hanging echo Previous instance still running. exit fi } fn_checkusage() { if [[ -z $option1 ]] then printf " Usage: ./rs_throughput.sh <all>|<q_name>|<help> [<minutes to display>]\n" exit fi } bcalc() { awk 'BEGIN{EQUATION='"$*"';printf("%0.1f\n",EQUATION); exit}' } fn_housekeeping() { if [ -e $SQMFILE ] then rm $SQMFILE fi $ISQLCMD <<-EOF set nocount on go delete from rs_throughput where sample_time <= dateadd(dd,-28,getdate()) go EOF } fn_insert_thruput_data() { # 1) get current segment usage/progress info #connect to repserver, get sqm info - put into file $RISQLCMD << eof > $SQMFILE admin who,sqm go quit eof # 2) insert new segment data into rs_throughput table #date #echo "Queue Name, First segment, Last segment, Next segment" cat $SQMFILE | egrep $insert_filter | sed 's/Awaiting Message/AwaitingMessage/; s/Awaiting Wakeup/AwaitingWakeup/; s/\(\.[0-9][0- 9]*\)\.[0-9]/\1/' | grep -v ":0 ${primary_ds}" | awk '{print $4,$14,$15,$16}' | while read q_name first_seg last_seg next_seg do q_size=$(bcalc "$last_seg - $first_seg") $ISQLCMD <<-EOF set nocount on go declare @mb_processed numeric(8,2) declare @prev_sample datetime declare @mins numeric(8,2) declare @tput numeric(8,2) if not exists (select 1 from rs_throughput where q_name = "${q_name}") insert rs_throughput values ("${q_name}", getdate(), ${first_seg}, ${last_seg}, ${next_seg}, ${q_size}, 0, 0) else begin -- get previous sample_time as key select @prev_sample = max(sample_time) from rs_throughput where q_name = "${q_name}" -- get difference in mb between current next segment and previous next segment select @mb_processed = (select ${next_seg} - next from rs_throughput where sample_time = @prev_sample and q_name = "${q_name}") select @prev_sample=dateadd(ss,-10,@prev_sample) -- # must substract 10 seconds to ensure the next datediff doesn't reflect 2m 58s as only 2mins! select @mins = datediff(mi,@prev_sample,getdate()) if @mins > 0 begin select @tput = round((@mb_processed / @mins),2) insert rs_throughput values ("${q_name}", getdate(), ${first_seg}, ${last_seg}, ${next_seg}, ${q_size}, @mb_processed, @tput) end end go EOF done } fn_report_thruput() { if [[ $option1 = [0-9]* ]] then option2=$option1 option1="all" fi if ! [[ "$option1" = "all" ]] then q_list=$option1 fi printf " -------------------------------------\n" printf " Run date $DATE\n" printf " -------------------------------------\n" for q_name in $q_list do $ISQLCMD <<-EOF set nocount on declare @mb_avg numeric(8,2) declare @size numeric(8,2) declare @eta datetime select convert(char(19),sample_time) sample_time, q_name, size q_size, mb_processed, thruput "thruput (mb/min)" from rs_throughput --select convert(char(19),sample_time) sample_time, q_name, size q_size, mb_processed, thruput "thruput (mb/min)", dateadd(mi, (round((size/thruput),0)),getdate()) ETA from rs_throughput where sample_time >= dateadd(mi,-${option2},getdate()) and q_name like "%${q_name}%" --get mb average select @mb_avg = convert(numeric(8,2),avg(thruput)) from rs_throughput where sample_time >= dateadd(mi,-${option2},getdate()) and q_name like "%${q_name}%" --get current queue size select @size = size from rs_throughput where q_name like "%${q_name}%" having sample_time=max(sample_time) select @eta = dateadd(mi,round((@size/@mb_avg),0),getdate()) print "" print " Average for ${q_name}: %1! mb/min, Re-sync ETA: %2!", @mb_avg, @eta print "" go EOF done } fn_show_help_text() { printf "Usage examples:\n" printf "rs_throughput.sh (runs with default values showing all queues over 15 minutes)\n" printf "rs_throughput.sh <q_name>|<minutes>\n" printf "rs_throughput.sh <q_name> <minutes>\n" printf "rs_throughput.sh size|queue (shows current queue sizes)\n" printf "rs_throughput.sh max (shows highest ever throughput for queue history)\n" printf "rs_throughput.sh help (shows this helptext)\n" printf "rs_throughput.sh insert # this is for cron only (inserts data to $RPT_DB..rs_throughput table)\n" } fn_show_max_thruput() { $ISQLCMD <<-EOF print "" set rowcount 0 set nocount on declare @q_name char(22) declare @sample_time datetime declare @max_thruput numeric(8,2) select q_name, sample_time, thruput into #throughput from rs_throughput where 1=0 print "Maximum throughput per queue:" -- get list of queues select distinct q_name, 0 as processed into #queues from rs_throughput set rowcount 1 select @q_name=q_name from #queues where processed = 0 while @@rowcount != 0 begin insert #throughput select q_name, sample_time, thruput from rs_throughput where sample_time in (select sample_time from rs_throughput where q_name = @q_name having thruput = max(thruput)) and q_name = @q_name update #queues set processed = 1 where q_name = @q_name select @q_name=q_name from #queues where processed = 0 end set rowcount 0 select * from #throughput order by thruput desc go EOF echo } fn_show_current_q_size() { $RISQLCMD << eof > $SQMFILE admin who,sqm go quit eof printf "Current Stable Queue usage:\n" cat $SQMFILE | egrep $insert_filter | sed 's/Awaiting Message/AwaitingMessage/; s/Awaiting Wakeup/AwaitingWakeup/; s/\(\.[0-9][0- 9]*\)\.[0-9]/\1/' | grep -v ":0 ${primary_ds}" | awk '{print $4,$14,$15,$16}' | while read q_name first_seg last_seg next_seg do q_size=$(bcalc "$last_seg - $first_seg") printf "$q_name: $q_size\n" done } fn_show_combined_avg_thruput() { $ISQLCMD <<-EOF set rowcount 0 set nocount on declare @q_name varchar(25) declare @mb_avg numeric(8,2) declare @mb_combined_avg numeric(8,2) select @mb_combined_avg=0 -- get list of queues select distinct q_name, 0 as processed into #queues from rs_throughput set rowcount 1 select @q_name=q_name from #queues where processed = 0 while @@rowcount != 0 begin select @mb_avg = convert(numeric(8,2),avg(thruput)) from rs_throughput where sample_time >= dateadd(mi,-${option2},getdate()) and q_name = @q_name select @mb_combined_avg = @mb_combined_avg + @mb_avg update #queues set processed=1 where q_name = @q_name select @q_name=q_name from #queues where processed = 0 end print " Combined thruput average: %1! mb/min", @mb_combined_avg print "" go EOF } ############################################################################## # main program # ############################################################################## check_previous_instance fn_checkusage case "$option1" in help) fn_show_help_text; exit;; insert) fn_insert_thruput_data; exit;; max) fn_show_max_thruput; exit;; queue|size) fn_show_current_q_size; exit;; *) fn_report_thruput; fn_show_combined_avg_thruput;; esac fn_housekeeping ### TABLE DEFINTION FOR RAW DATA COLLECTION ### Use the table definition below to create the table which this script ### requires in order to insert queue data and furthermore query this ### data in order to provide throughput information. #create table rs_throughput ( #q_name varchar(25), #sample_time datetime, #first numeric(8,2), #last numeric(8,2), #next numeric(8,2), #size numeric(8,2), #mb_processed numeric(8,2), #thruput numeric(8,2) #) # end program