Steve Azzopardi authored
This is what happens for jobs like `304673282`. In the logs for `304673282` (full analysis [here](https://gitlab.com/gitlab-org/gitlab-runner/issues/4147#job-id-304673282)) we can see that the state was updated to `timedout`, but we never see an error log like `execution took longer than`. We log that error in only [1 place](https://gitlab.com/gitlab-org/gitlab-runner/blob/e8591acf243af158dda7a48db302b147dc5b1ae6/common/build.go#L323-326), and we update the state to `timedout` in only [1 place](https://gitlab.com/gitlab-org/gitlab-runner/blob/e8591acf243af158dda7a48db302b147dc5b1ae6/common/build.go#L322). Both happen in [handleError](https://gitlab.com/gitlab-org/gitlab-runner/blob/e8591acf243af158dda7a48db302b147dc5b1ae6/common/build.go#L315-332), which is called in 2 places; the one we are interested in is [when the context (timeout) is done/canceled](https://gitlab.com/gitlab-org/gitlab-runner/blob/e8591acf243af158dda7a48db302b147dc5b1ae6/common/build.go#L359). The same path is taken when the job is canceled, in which case we update the state to `canceled`.

When the context is done we handle the error, which in this case means creating the `execution took longer than` error and updating the state. For `304673282`, however, that error is never returned and the job keeps running. This is because of [`<-buildFinish`](https://gitlab.com/gitlab-org/gitlab-runner/blob/e8591acf243af158dda7a48db302b147dc5b1ae6/common/build.go#L374): we don't have any timeout while waiting for our scripts to finish, whether that is the build script, the upload artifact script, or the cleanup script. A process can therefore hang forever and the build never finishes, even though the state is already `timedout` and we are still burning minutes.

The only solution is to put a timeout on the wait for the build to finish and then terminate the job altogether. The one problem is that we may end up with an environment that is not clean, but maybe we can fix that as well.
reference https://gitlab.com/gitlab-org/gitlab-runner/issues/4147