Saturday, 6 June 2020

AWS EC2 + Tomcat (JSR-356) + Secure Websockets + Cloudflare is giving : java.io.IOException: Unable to unwrap data, invalid status [CLOSED]

In this article I want to discuss about an error I have faced while I was trying a websocket POC for a requirement. The requirement was to create secure (wss) websocket server using tomcat's implementation of JSR-356 with domain name URL. I have used below technologies to achieve the same.


- AWS EC2 instance to run tomcat.
- AWS Elastic IP for assigning static IP to the EC2 instance.
- Tomcat 9.* server.
- Cloudflare DNS service to map our EC2 instance static IP to our domain name URL.

Below are different steps in order to achieve the same:

1) Write websocket server (API) using tomcat's implementation of JSR-356.

You will find many articles on the web to do this. For reference take a look at this:
Websocket example using tomcat
In this example we are creating an echo websocket server which responds back the same message to client what it has received. You can write the client using javascript or you can use tomcat websocket client library for Java.

2) Tomcat configuration for deployment- server.xml and cloudflare origin CA certificate.

As the main requirement is to create secure weboscket, I have configured HTTPS connector in tomcat server.xml file.

<Connector port="443" protocol="org.apache.coyote.http11.Http11Nio2Protocol"
connectionTimeout="-1" maxConnections="-1" acceptCount="5000" 
maxThreads="15000" scheme="https" secure="true" SSLEnabled="true"           
clientAuth="false" sslProtocol="TLSv1.2"  keystoreFile="/home/keystore" 
keystorePass="pswd123" >
<UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" /> 
</Connector> 

As we want to access this site via HTTPS, we need a valid signed certificate.
Here, the keystore is the cloudflare origin CA certificate generated from cloudflare. I will explain this in details in next section.


3) AWS EC2 Instance configuration.


I have assigned elastic IP to my AWS EC2 instance and for the inbound rule I have added HTTPS (port 443) in security group of instance, so that it can accept incoming requests over HTTPS/WSS.


4) Cloudflare DNS mapping and origin CA certificate.

In cloudflare DNS configuration section for my domain, I have added below configuration:


Name is the what I want to access my api with. e.g: myapi.mydomain.com
The IP address is elastic IP I have associated with EC2 instance.
Proxy status is "proxied". Let's discuss a bit more about this proxied option.

One good feature of cloudflare is, it provides free ssl for your site. Generally you have to pay for getting ssl certificate for your domain but if you are using cloudflare you don't have to do that as using below technique cloudflare add ssl to your site.



we are using Full (Strict) ssl mode in our site, Where cloudflare generates an Origin CA cert for us which is installed in tomcat server.xml configuration. Cloudflare consider that as trusted certificate. In our case traffic is proxied via cloudflare so the Browser will see the cloudflare signed certificate.

The summary is, traffic between Browser and cloudflare is secured with a cert valid for world while traffic between cloudflare and Origin Server(tomcat in our case) is signed with ckoudflare Origin CA certficate which only cloudflare consider a valid cert, not the outside world. These are steps to generate and configure cloudflare origin CA certificate.


5) Running the application

After all this configuration and setup I am starting the tomcat server and trying to access my websocket api using websocket client with this URL.

wss://myapi.mydomain.com?param1=hello

Once I connect to the websocket from client, after 2-3 minutes I see that the server is closing the connection from server side and I see below log in tomcat:

java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
java.io.IOException: java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
  at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:315)
  at org.apache.tomcat.websocket.WsRemoteEndpointImplBase.sendMessageBlock(WsRemoteEndpointImplBase.java:258)
  at org.apache.tomcat.websocket.WsSession.sendCloseMessage(WsSession.java:612)
  at org.apache.tomcat.websocket.WsSession.doClose(WsSession.java:497)
  at org.apache.tomcat.websocket.WsSession.doClose(WsSession.java:459)
  at org.apache.tomcat.websocket.server.WsHttpUpgradeHandler.upgradeDispatch(WsHttpUpgradeHandler.java:176)
  at org.apache.coyote.http11.upgrade.UpgradeProcessorInternal.dispatch(UpgradeProcessorInternal.java:54)
  at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:59)
  at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
  at org.apache.tomcat.util.net.Nio2Endpoint$SocketProcessor.doRun(Nio2Endpoint.java:1675)
  at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
  at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
  at java.base/java.lang.Thread.run(Thread.java:830)
Caused by: java.io.IOException: Unable to unwrap data, invalid status [CLOSED]
  at org.apache.tomcat.util.net.SecureNio2Channel$1.completed(SecureNio2Channel.java:959)
  at org.apache.tomcat.util.net.SecureNio2Channel$1.completed(SecureNio2Channel.java:898)
  at java.base/sun.nio.ch.Invoker.invokeUnchecked(Invoker.java:127)
  at java.base/sun.nio.ch.Invoker$2.run(Invoker.java:219)
  at java.base/sun.nio.ch.AsynchronousChannelGroupImpl$1.run(AsynchronousChannelGroupImpl.java:112)
  ... 4 more

I have searched a lot to find the possible cause for the error but couldn't found anything which can solve my problem. I have switched from HTTPS to HTTP and it has started working, it was not dropping connection in that case. So after doing lot of debugging and research I suspected on the cloudflare "proxied" behavior for HTTPS which may not be suitable for Websocket protocol which is a full duplex TCP connection between 2 system. Here cloudflare acts as a proxy between 2 system when we want to use HTTPS. To confirm my doubt I made below 2 changes and the error got resolved.

6) Resolution steps:

    1) Remove cloudflare proxied setting and make it DNS only.

I changed DNS config like below: 


With this option cloudflare will stop acting as a proxy for our api URL mapped to EC2 instance and it will only add DNS entry for routing.

Now next question can come to your mind is, how we will have HTTPS support for our api if we don't use cloudflare inbuilt SSL support? The answer is second step.

  2) Buy valid SSL certificate for your domain and configure in tomcat.

I have removed cloudflare origin CA certificate and place the new certificate I bought from GoDaddy. There are free options also available like freessl and letsencrypt where you can buy certificate for your site.

After applying above 2 steps, my secure webscoket server connection using domain name URL started working fine and the connection was up and running without a problem. Each webscoket connection remain established and drop or disconnect from server side was not found after that.

Now the question is, Do the cloudflare is not supporting proxy for webscoket ? No, it is not like that. It is supporting as per what is mentioned here: Link1, Link2 . Here it is clearly mentioned that secure websocket connection with proxied mode should work without a proeblem. But I am not sure what is the cause of issue in my case. It might be possible that tomcat's Websocket library has issue with this proxied mode. But that's a guess.

Update - Actual cause and resoultion:
I found the actual reason in this stackoverflow post:

https://stackoverflow.com/questions/39668410/whats-disconnecting-my-websocket-connection-cloudflare-apaches-mod-proxy

Here it is mentioned that if the connection remain idle for 100 second then cloudflare is closing it. That's exactly what is happening in my case. Below are possible solutions:

1) Buy their enterprise plan and change the setting as mentioned for timeout.
2) Add heartbeat/ping logic in your websocket API to keep it alive.

That's it for now...
I would love to hear your experience on this topic.

Please post your comments and doubts!!!