Comment: | Added checks for vector instead of just true before accessing some data structs that are generated on the fly and can fail due to communication errors |
SHA1: | 4e1162ffe9db11db49179b4e117b2245 |
User & Date: | mrwellan on 2014-12-08 13:28:04 |
17:36 | Added a few more defensive layers to calls that *may* be part of the crash-on-startup-at-weird-random-times bug. check-in: 1510977b0a user: mrwellan tags: v1.60
13:28 | Added checks for vector instead of just true before accessing some data structs that are generated on the fly and can fail due to communication errors. check-in: 4e1162ffe9 user: mrwellan tags: v1.60
12:39 | Fixed call where :state and :status were not aliased to -state and -status. Improved watchdog exit so it no longer waits a gratuitous five seconds before exiting. check-in: a834ac5f9e user: mrwellan tags: v1.60
Modified rmt.scm from [9cda1b0f8b] to [c506e20989].
︙
(define (rmt:send-receive cmd rid params #!key (attemptnum 1)) ;; start attemptnum at 1 so the modulo below works as expected
  ;; clean out old connections
  (mutex-lock! *db-multi-sync-mutex*)
  (let ((expire-time (- (current-seconds) (server:get-timeout) 10))) ;; don't forget the 10 second margin
    (for-each
     (lambda (run-id)
       (let ((connection (hash-table-ref/default *runremote* run-id #f)))
         (if (and (vector? connection)
                  (< (http-transport:server-dat-get-last-access connection) expire-time))
             (begin
               (debug:print-info 0 "Discarding connection to server for run-id " run-id ", too long between accesses")
               ;; SHOULD CLOSE THE CONNECTION HERE
               (case *transport-type*
                 ((nmsg)(nn-close (http-transport:server-dat-get-socket
                                   (hash-table-ref *runremote* run-id)))))
︙
                  ((http)(condition-case
                          (http-transport:client-api-send-receive run-id connection-info cmd params)
                          ((commfail)(vector #f "communications fail"))))
                  ((nmsg)(condition-case
                          (nmsg-transport:client-api-send-receive run-id connection-info cmd params)
                          ((timeout)(vector #f "timeout talking to server"))))
                  (else (exit))))
         (success (if (vector? dat) (vector-ref dat 0) #f))
         (res     (if (vector? dat) (vector-ref dat 1) #f)))
    (if (vector? connection-info)(http-transport:server-dat-update-last-access connection-info))
    (if success
        (begin
          ;; (mutex-unlock! *send-receive-mutex*)
          (case *transport-type*
            ((http) res) ;; (db:string->obj res))
            ((nmsg) res))) ;; (vector-ref res 1)))
        (begin ;; let ((new-connection-info (client:setup run-id)))
︙ | ︙ |