Overview
Comment: | Gate test launch based on journal load. Values from load calc seem wrong. Should be 0-1.0 but seeing integers 0, 1, 2 ... |
---|---|
SHA1: | 2635b582e7273323b216f9f1266066a0 |
User & Date: | mrwellan on 2024-07-10 18:10:05 |
Context
2024-07-10
20:11 | Force values to be real in journal stats collection. Still broken though. check-in: c906466bb0 user: matt tags: v1.81-journal-based-throttling | |
18:10 | Gate test launch based on journal load. Values from load calc seem wrong. Should be 0-1.0 but seeing integers 0, 1, 2 ... check-in: 2635b582e7 user: mrwellan tags: v1.81-journal-based-throttling | |
17:44 | Added journal based statical droop based throttling of queries. check-in: fc6b05f924 user: mrwellan tags: v1.81-journal-based-throttling | |
Changes
Modified runs.scm from [0cd899f860] to [dadc9aecb3].
︙
	   (registry-mutex         (runs:dat-registry-mutex runsdat))
	   (flags                  (runs:dat-flags runsdat))
	   (keyvals                (runs:dat-keyvals runsdat))
	   (run-info               (runs:dat-run-info runsdat))
	   (all-tests-registry     (runs:dat-all-tests-registry runsdat))
	   (run-limits-info        (runs:dat-can-run-more-tests runsdat))
	   ;; (runs:can-run-more-tests run-id jobgroup max-concurrent-jobs)) ;; look at the test jobgroup and tot jobs running
	   (have-resources         (and (if *journal-stats*
	                                    (let* ((dbfname (conc (dbfile:run-id->dbnum run-id) ".db"))
	                                           (stats   (tt:get-journal-stats))
	                                           (load    (or (alist-ref dbfname stats equal?) 0)))
	                                      (if (> load 0.1) ;; dbs too busy to start more tests
	                                          (begin
	                                            (debug:print-info 0 *default-log-port* "Gating launch due to db load "load" based on journal file observations for "dbfname)
	                                            #f)
	                                          #t))
	                                    #t) ;; if journal monitoring not started do not gate
	                                (car run-limits-info)))
	   (num-running            (list-ref run-limits-info 1))
	   (num-running-in-jobgroup(list-ref run-limits-info 2))
	   (max-concurrent-jobs    (list-ref run-limits-info 3))
	   (job-group-limit        (list-ref run-limits-info 4))
	   ;; (prereqs-not-met (rmt:get-prereqs-not-met run-id waitons hed item-path mode: testmode itemmaps: itemmaps))
	   ;; (prereqs-not-met (mt:lazy-get-prereqs-not-met run-id waitons item-path mode: testmode itemmap: itemmap))
	   (fails                  (if (list? prereqs-not-met) ;; TODO: rename fails to failed-prereqs
︙
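The gate added to have-resources boils down to a threshold test on a per-database load value. A minimal stand-alone sketch of that decision follows; launch-allowed? and the sample alist are hypothetical names for illustration and are not part of runs.scm. It assumes the load alist maps a db file name to a number that is meant to lie in 0-1.0.

```scheme
;; Hypothetical helper, not part of runs.scm.
;; stats     : alist such as (("1.db" . 0.05) ("2.db" . 0.4))
;; threshold : loads strictly above this value gate the launch
(define (launch-allowed? dbfname stats threshold)
  (let* ((entry   (assoc dbfname stats))       ; assoc compares keys with equal?
         (db-load (if entry (cdr entry) 0)))   ; no entry => treat the db as idle
    (<= db-load threshold)))                   ; mirrors (if (> load 0.1) #f #t)

(define sample-stats '(("1.db" . 0.05) ("2.db" . 0.4)))

(launch-allowed? "1.db" sample-stats 0.1) ; => #t
(launch-allowed? "2.db" sample-stats 0.1) ; => #f
(launch-allowed? "3.db" sample-stats 0.1) ; => #t
```

Note that with the integer values reported in the check-in comment (0, 1, 2 ...), any nonzero value is already above 0.1, so a gate of this shape would block launches whenever a single journal hit is recorded for the db.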
Modified tcp-transportmod.scm from [98a778bd3e] to [8195ca9d01].
︙
    (tt:write-load-tracking dbdir)
    (mutex-unlock! *journal-stats-mutex*)
    (thread-sleep! (/ (random 1000) 100.0))
    (loop)))

;; call this to start a thread that is keeping the journal-stats up to date.
(define (tt:start-stats dbdir)
  (thread-start!
   (make-thread
    (lambda ()(tt:journal-stats-run dbdir))
    "Journal stats collection thread")))

(define (tt:get-journal-stats)
  (let* ((result    (make-jstats))
         (hitcounts (jstats-jcount result)))
︙
	(debug:print 0 *default-log-port* "INFO: *journal-stats* not set."))
    ;; convert to normalized alist
    (let ((tot  (min (jstats-count result) 1)) ;; avoid divide by zero
	  (hits (jstats-jcount result)))       ;; 1.db => count
      (hash-table-map
       hits
       (lambda (fname hitcount)
	 (cons fname (/ hitcount tot)))))))

;; megatest> (import tcp-transportmod)
;; megatest> (tt:write-load-tracking ".mtdb")
;; megatest> (hash-table-keys *journal-stats*)
;; (172060297)
;; megatest> (jstats->alist (hash-table-ref *journal-stats* 172060297))
;; ((count . 1) (jcount . #<hash-table (1)>))
;; megatest> (jstats-jcount (hash-table-ref *journal-stats* 172060297))
;; #<hash-table (1)>
;; megatest> (hash-table->alist (jstats-jcount (hash-table-ref *journal-stats* 172060297)))
;; (("1.db" . 4))
)
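A possible contributor to the integer values mentioned in the check-in comment: tot is computed as (min (jstats-count result) 1), which can never exceed 1. For any count of 1 or more the division then returns the raw hit count, and for a count of 0 the divisor is 0, so the divide-by-zero guard does not hold either. The sketch below shows the same normalization with max in place of min; normalize-hits and its alist input are hypothetical stand-ins for the hash-table based code, and the change is a suggestion rather than what the branch ends up doing.

```scheme
;; Hypothetical stand-alone version of the normalization step.
;; hits  : alist of (dbfile . hit-count)
;; count : number of sampling passes recorded so far
(define (normalize-hits hits count)
  (let ((tot (max count 1)))                   ; max keeps the divisor at 1 or more
    (map (lambda (pair)
           (cons (car pair)
                 ;; exact->inexact forces a real instead of an exact ratio
                 (exact->inexact (/ (cdr pair) tot))))
         hits)))

(normalize-hits '(("1.db" . 4)) 8) ; => (("1.db" . 0.5))
(normalize-hits '(("1.db" . 4)) 0) ; => (("1.db" . 4.0)), no divide by zero
```

The REPL transcript in the diff shows count 1 alongside 4 hits for "1.db", so the ratio can still exceed 1.0 if hits accumulate faster than count; whether that is intended is not clear from this check-in.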