Bug 565854 - close sshd channel gracefully
Summary: close sshd channel gracefully
Status: RESOLVED FIXED
Alias: None
Product: JGit
Classification: Technology
Component: JGit (show other bugs)
Version: 5.8   Edit
Hardware: PC Linux
: P3 normal (vote)
Target Milestone: 5.9   Edit
Assignee: Project Inbox CLA
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-08-06 03:31 EDT by Michal Pasierb CLA
Modified: 2023-03-08 12:12 EST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michal Pasierb CLA 2020-08-06 03:31:36 EDT
Secure Shell RFC https://tools.ietf.org/html/rfc4254#section-5.3 says SSH_MSG_CHANNEL_CLOSE shall be used to close a channel. Current JGit implementation closes ssh channel immediately when Apache SSH library is in use so it doesn't initiate SSH_MSG_CHANNEL_CLOSE. This introduces a race condition between the client closing the channel (and possibly session) and server sending SSH_MSG_CHANNEL_DATA. Usually the server is fast enough and no errors occur.

When JGit is used with Gerrit I get following exceptions when race is lost. On server side:

[2020-08-06 09:13:22,168] [sshd-SshServer[68f69ca3]-nio2-thread-3] WARN  org.apache.sshd.server.session.ServerSessionImpl : exceptionCaught(ServerSessionImpl[some_user@/127.0.0.1:50860])[state=Opened] IOException: Broken pipe
[2020-08-06 09:13:22,353] [sshd-SshServer[68f69ca3]-nio2-thread-2] WARN  org.apache.sshd.server.session.ServerSessionImpl : exceptionCaught(ServerSessionImpl[some_user@/127.0.0.1:50868])[state=Opened] IOException: Connection reset by peer
[2020-08-06 09:13:23,246] [SSH git-upload-pack /test (some_user)] ERROR com.google.gerrit.sshd.BaseCommand : Internal server error (user some_user account 1000001) during git-upload-pack '/test'
org.apache.sshd.common.SshException: flush(ChannelOutputStream[ChannelSession[id=0, recipient=0]-ServerSessionImpl[some_user@/127.0.0.1:50888]] SSH_MSG_CHANNEL_DATA) length=0 - stream is already closed
	at org.apache.sshd.common.channel.ChannelOutputStream.flush(ChannelOutputStream.java:167)
	at com.google.gerrit.sshd.BaseCommand$TaskThunk.run(BaseCommand.java:467)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        ...

This is logged by client:

2020-08-06 09:13:12,598 [sshd-JGitSshClient[76fc5687]-nio2-thread-3] WARN  org.eclipse.jgit.internal.transport.sshd.JGitClientSession - exceptionCaught(JGitClientSession[some_user@j.example.com/127.0.0.1:29418])[state=Opened] IOException: flush() Stream has been closed

Propsed fix closes the channel gracefully - initiates SSH_MSG_CHANNEL_CLOSE and waits for receive from server side.
Comment 1 Eclipse Genie CLA 2020-08-06 03:42:20 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/167332
Comment 2 Thomas Wolf CLA 2020-08-07 15:31:45 EDT
JGit flushes the output stream at the end of a fetch/push. Gerrit then flushes again at [1]. That second flush then gets the error, because the client disconnected the channel already as soon as the application protocol said it was OK to do so, no more data coming.

Perhaps Gerrit simply needs a way for a command (git-upload-pack/git-recieve-pack) to indicate to this TaskThunk that errors in these outer flushes can be ignored, or that they are not needed.

[1] https://gerrit.googlesource.com/gerrit/+/refs/heads/master/java/com/google/gerrit/sshd/BaseCommand.java#476
Comment 3 Thomas Wolf CLA 2020-08-09 06:06:42 EDT
Or perhaps it should simply be the command's responsibility to flush its streams if it produces any output. Then TaskThunk could simply always ignore exceptions from those extra flushes.
Comment 4 Thomas Wolf CLA 2020-08-28 17:54:05 EDT
One problem I have with this waiting is that sending back an exit code is actually optional for a server according to in RFC 4254, section 6.10.[1] On the other hand, it's OK for a client to request closing a channel at any time (section 5.3).

So the proper way to shut down the channel from the client when not interested in the exit code would be indeed to close gracefully via channel.close(false). If the server then still tries to send data after having replied with its own SSH_MSG_CHANNEL_CLOSE (which it must do) and then logs any failures, then that's clearly the server's problem.

[1] https://tools.ietf.org/html/rfc4254#section-6.10
Comment 5 Eclipse Genie CLA 2020-08-31 15:25:39 EDT
New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/168537
Comment 6 Thomas Wolf CLA 2020-09-01 04:34:50 EDT
(In reply to Eclipse Genie from comment #5)
> New Gerrit change created: https://git.eclipse.org/r/c/jgit/jgit/+/168537

This resolves the problem nicely in the ApacheSshProtocol2Test in https://git.eclipse.org/r/c/jgit/jgit/+/167204 .
Comment 8 Michal Pasierb CLA 2020-09-07 11:58:33 EDT
Bug status changed to RESOLVED FIXED but the error is still in code. Waste of my time. I already said my piece in comments of https://git.eclipse.org/r/c/jgit/jgit/+/167332.
Comment 9 Thomas Wolf CLA 2020-09-07 17:07:24 EDT
(In reply to Michal Pasierb from comment #8)
> Bug status changed to RESOLVED FIXED but the error is still in code.

As I explained I think that the problem is Gerrit's double flush. JGit 5.9.0 does send SSH2_MSG_CHANNEL_CLOSE; if that comes before the flushes, the server will get a SshChannelClosedException.

IMO the outer flushes in TaskThunk[1] should be wrapped in

  try {
    out.flush();
    err.flush();
  } catch (SshChannelClosedException e) {
    // Ignore -- it's OK here
  }

at least for commands that are known to flush the streams themselves, like UploadPack.

(Verified by adding such outer flushes in the SshTestGitServer, and adding an uninterruptible 1 second delay before them.)

Also compare [2]: Gerrit's BaseCommand already tries to ignore some exceptions that indicate that the client just dropped off. So the above proposal seems perfectly fine to me. Maybe that handleError() method needs updating; checking exception messages is awfully fragile, and in any case the code does not seem to be in-line with the latest Apache MINA sshd: it never raises an SshException with message "Already closed". It does raise a WindowClosedException, see [3].

[1] https://gerrit.googlesource.com/gerrit/+/refs/heads/master/java/com/google/gerrit/sshd/BaseCommand.java#476
[2] https://gerrit.googlesource.com/gerrit/+/refs/heads/master/java/com/google/gerrit/sshd/BaseCommand.java#333
[3] https://github.com/apache/mina-sshd/blob/master/sshd-core/src/main/java/org/apache/sshd/common/channel/WindowClosedException.java#L28
Comment 10 Matthias Sohn CLA 2023-03-08 12:12:23 EST
Fix this by ignoring SshChannelClosedException in
https://gerrit-review.googlesource.com/c/gerrit/+/362914