In this post, I will explain how Thrift works internally by following the consecutive steps throughout the generated code. The code snippets in this post are based on the generated Thrift code, but are simplified to only show the core functionality.
- The first step is to generate the Thrift code from the IDL.
- The second step details how the client makes the remote procedure call (RPC) to the server.
- Finally, the third step explains how the server receives the RPC message and returns the reply.
- Additional information on client and server transport.
Generating the Java code
The Thrift IDL example that comes with Thrift 0.9 defines a Calculator service. I will only focus on the add function for the synchronous clients. Additional Thrift features such as remote exceptions, one-way calls or async RPCs are easy to understand once the basic add function is understood.
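The Java code that underlies the client and the server is generated with the Thrift compiler, for example by running thrift -r --gen java tutorial.thrift on the tutorial IDL file.
The generated code is located in gen-java.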
Client side
The TBinaryProtocol is passed to the constructor of the client. This protocol is used for all communication (incoming and outgoing) between client and server.
The client makes an RPC to the server using two arguments and expects an integer as result.
The generated code transforms this function call into a sequence of sending and receiving information. The main information to be sent is the name of the function to be called (“add”) and the arguments: 1 and 2.
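The split of one function call into a send step and a receive step can be sketched roughly as follows. This is not the generated code: LoopbackClientSketch, sendAdd, recvAdd and the in-memory queue that plays the role of the connection and the server are all names I made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the generated client: add() is split into
// a send step and a receive step.
final class LoopbackClientSketch {
    // Assumption: an in-memory queue replaces the protocol and the socket.
    private final Deque<Object> wire = new ArrayDeque<>();

    int add(int num1, int num2) {
        sendAdd(num1, num2);
        return recvAdd();
    }

    private void sendAdd(int num1, int num2) {
        // The function name and the arguments are what gets sent.
        wire.addLast("add");
        wire.addLast(num1);
        wire.addLast(num2);
        fakeServer();
    }

    private int recvAdd() {
        // The reply holds the integer result.
        return (Integer) wire.removeFirst();
    }

    // Assumption: a fake server that consumes the request and pushes a reply.
    private void fakeServer() {
        String name = (String) wire.removeFirst();
        int a = (Integer) wire.removeFirst();
        int b = (Integer) wire.removeFirst();
        if (!name.equals("add")) {
            throw new IllegalStateException("unknown function: " + name);
        }
        wire.addLast(a + b);
    }
}
```

Calling new LoopbackClientSketch().add(1, 2) walks through both steps and returns the sum computed by the fake server.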
The arguments (and, similarly, the results) of each service are wrapped in a Java class. The class add_args (derived from TBase) is a placeholder for the arguments of the RPC. The function sendBase is implemented by TServiceClient, the parent of Calculator.Client.
sendBase writes the header of the RPC to the protocol, then the instance of the argument class (add_args) writes the values (1 and 2) to the TProtocol, then the parent class writes the tail of the RPC and, finally, the transport layer is called to send the message to the server.
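To make the order of these writes concrete, here is a minimal sketch that uses only java.io. It is loosely modeled on a binary layout (name, message type, sequence id, then typed fields and a stop byte), but SendBaseSketch, its constants and its helper are my own assumptions, not the Thrift classes, and the byte layout should not be read as a wire-format specification.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Simplified stand-in for the sendBase sequence; not the Thrift classes.
final class SendBaseSketch {
    static final byte TYPE_CALL = 1; // assumed marker for a call message
    static final byte TYPE_I32 = 8;  // assumed marker for a 32-bit integer field
    static final byte TYPE_STOP = 0; // assumed marker for the end of a struct

    static byte[] sendAdd(int seqId, int num1, int num2) {
        // The byte buffer plays the role of the transport.
        ByteArrayOutputStream transport = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(transport)) {
            // 1. Header: function name, message type, sequence id.
            byte[] name = "add".getBytes(StandardCharsets.UTF_8);
            out.writeInt(name.length);
            out.write(name);
            out.writeByte(TYPE_CALL);
            out.writeInt(seqId);
            // 2. The argument struct writes its own fields.
            writeI32Field(out, (short) 1, num1);
            writeI32Field(out, (short) 2, num2);
            out.writeByte(TYPE_STOP);
            // 3. Tail: nothing to write in this sketch; flush to "send".
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return transport.toByteArray();
    }

    private static void writeI32Field(DataOutputStream out, short id, int value)
            throws IOException {
        out.writeByte(TYPE_I32);
        out.writeShort(id);
        out.writeInt(value);
    }
}
```

With this layout, SendBaseSketch.sendAdd(0, 1, 2) produces 27 bytes: 12 for the header, 7 for each argument and 1 for the stop byte.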
As mentioned before, the argument class add_args is a placeholder for the arguments of the RPC. In the case of “add”, it holds two integers. The class has convenience methods such as getters and setters, and deals with optional fields etc. The main functionality of the class is to write the values of the arguments to a TProtocol. Since a TProtocol supports different schemes (StandardScheme and TupleScheme), the argument class must implement a read and a write for each scheme.
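The symmetry between writing and reading such an argument class can be illustrated with a reduced version that handles a single scheme. AddArgsSketch is a hypothetical class of mine, the field markers are assumed values, and the real add_args works against a TProtocol rather than raw byte arrays.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical, reduced argument class: it holds the two integers and
// knows how to write itself to, and read itself from, a byte stream.
final class AddArgsSketch {
    private static final byte TYPE_STOP = 0; // assumed end-of-struct marker
    private static final byte TYPE_I32 = 8;  // assumed 32-bit integer marker

    int num1;
    int num2;

    AddArgsSketch() { }

    AddArgsSketch(int num1, int num2) {
        this.num1 = num1;
        this.num2 = num2;
    }

    byte[] write() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            // Each field is written as: type, field id, value.
            out.writeByte(TYPE_I32); out.writeShort(1); out.writeInt(num1);
            out.writeByte(TYPE_I32); out.writeShort(2); out.writeInt(num2);
            out.writeByte(TYPE_STOP);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static AddArgsSketch read(byte[] bytes) {
        AddArgsSketch args = new AddArgsSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            while (true) {
                byte type = in.readByte();
                if (type == TYPE_STOP) {
                    break;
                }
                if (type != TYPE_I32) {
                    throw new IllegalStateException("unsupported field type: " + type);
                }
                short id = in.readShort();
                int value = in.readInt();
                // Fields are matched by id; unknown ids are ignored.
                if (id == 1) {
                    args.num1 = value;
                } else if (id == 2) {
                    args.num2 = value;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return args;
    }
}
```

Reading back what was written, as in AddArgsSketch.read(new AddArgsSketch(1, 2).write()), yields an object that holds the same two integers.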
In this example, the concrete TProtocol used is TBinaryProtocol. Each TProtocol is associated with a TTransport.
In this example, the TTransport is implemented by TSocket. This class communicates over sockets and uses standard Java IO streams.
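The idea of a transport that only moves bytes over standard Java IO streams can be sketched as follows. StreamTransportSketch and its method names are assumptions of mine, not the TSocket API; a socket-based variant would pass in the streams obtained from a java.net.Socket.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical minimal transport: it moves bytes and knows nothing about
// messages, which are the responsibility of the protocol layer.
final class StreamTransportSketch {
    private final InputStream in;
    private final OutputStream out;

    StreamTransportSketch(InputStream in, OutputStream out) {
        this.in = in;
        this.out = out;
    }

    void write(byte[] data) {
        try {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    void flush() {
        try {
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads exactly len bytes, or fails if the stream ends first.
    byte[] readAll(int len) {
        byte[] buffer = new byte[len];
        int offset = 0;
        try {
            while (offset < len) {
                int n = in.read(buffer, offset, len - offset);
                if (n < 0) {
                    throw new IllegalStateException("stream closed early");
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer;
    }
}
```

Because the class only depends on InputStream and OutputStream, the same code can be exercised with in-memory streams when no socket is available.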
Server side
The server side of a Thrift code base consists of 2 components:
1. The handler, which implements the actual service (ping, add, …)
2. The server, which takes care of the communication with the different clients.
The handler
The handler is the only class that needs to be coded by the end user. This class implements the services defined in the Thrift IDL. These services are declared in the generated interface Calculator.Iface.
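A handler for the two functions mentioned above could look like the following sketch. CalculatorIfaceSketch is a stand-in I wrote so that the example compiles without Thrift; the real handler implements the generated interface, whose methods also declare Thrift exceptions.

```java
// Stand-in for the generated service interface, reduced to the two
// functions mentioned in this post. Hypothetical name, not generated code.
interface CalculatorIfaceSketch {
    void ping();

    int add(int num1, int num2);
}

// The handler holds the business logic and nothing else: no protocol,
// no transport, no threading.
final class CalculatorHandlerSketch implements CalculatorIfaceSketch {
    @Override
    public void ping() {
        // Nothing to compute; the call returning is the answer.
    }

    @Override
    public int add(int num1, int num2) {
        return num1 + num2;
    }
}
```

Everything else on the server side exists to get the arguments to this class and its return value back to the client.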
The server
The server connects the handler, the processor, the protocols and the transports.
The Processor exposes a map of names associated with functions (ProcessFunction). Each name corresponds to the name of a function (“ping”, “add”, …) and the associated functions will wrap the call to the appropriate service in the handler.
Inside the Calculator.Processor, there is a class for each service (ProcessFunction). Each class implements the getResult function which is responsible for calling the service in the handler.
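The map from function names to process functions can be sketched like this. ProcessorMapSketch, its nested interfaces and the int-array arguments are simplifications of mine; the generated processor works with argument and result classes instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical reduction of a processor: a map from function names to
// small objects that know which handler method to call.
final class ProcessorMapSketch {
    // Stand-in for the service interface implemented by the handler.
    interface Handler {
        void ping();

        int add(int num1, int num2);
    }

    // One implementation per service function.
    interface ProcessFn {
        int getResult(Handler handler, int[] args);
    }

    private final Map<String, ProcessFn> processMap = new HashMap<>();
    private final Handler handler;

    ProcessorMapSketch(Handler handler) {
        this.handler = handler;
        processMap.put("ping", (h, args) -> {
            h.ping();
            return 0; // ping has no result; 0 is a placeholder
        });
        processMap.put("add", (h, args) -> h.add(args[0], args[1]));
    }

    int process(String name, int... args) {
        ProcessFn fn = processMap.get(name);
        if (fn == null) {
            throw new IllegalArgumentException("Invalid method name: '" + name + "'");
        }
        return fn.getResult(handler, args);
    }
}
```

Looking up "add" returns the function that forwards to the handler's add method, while an unknown name is rejected before the handler is touched.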
The TServerSocket creates an instance of a standard Java server socket.
The TSimpleServer starts listening to the TServerSocket.
The server waits for incoming messages. Each message is decoded, the appropriate service in the handler is called and the result is returned. Building the results follows the same logic as building the arguments (add_args and add_result classes).
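The decode, call and reply cycle for a single message can be sketched with plain byte arrays. ServeOnceSketch, its constants and its byte layout are assumptions of mine for illustration; in particular, carrying the return value in a field with id 0 mirrors how I understand result classes to be laid out, but the sketch is not the Thrift wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical one-shot server: decode a call, invoke the handler, encode the reply.
final class ServeOnceSketch {
    static final byte CALL = 1, REPLY = 2, I32 = 8, STOP = 0; // assumed markers

    interface Adder {
        int add(int num1, int num2);
    }

    // Client-side helper so that the sketch has a request to work on.
    static byte[] encodeAddCall(int seqId, int num1, int num2) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeHeader(out, "add", CALL, seqId);
            writeI32Field(out, 1, num1);
            writeI32Field(out, 2, num2);
            out.writeByte(STOP);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static byte[] handle(byte[] request, Adder handler) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
             DataOutputStream out = new DataOutputStream(buffer)) {
            // Decode the header: name, message type, sequence id.
            byte[] nameBytes = new byte[in.readInt()];
            in.readFully(nameBytes);
            String name = new String(nameBytes, StandardCharsets.UTF_8);
            byte type = in.readByte();
            int seqId = in.readInt();
            if (type != CALL || !name.equals("add")) {
                throw new IllegalArgumentException("unsupported message: " + name);
            }
            // Decode the arguments, matching fields by id.
            int num1 = 0;
            int num2 = 0;
            for (byte t = in.readByte(); t != STOP; t = in.readByte()) {
                if (t != I32) {
                    throw new IllegalArgumentException("unsupported field type: " + t);
                }
                short id = in.readShort();
                int value = in.readInt();
                if (id == 1) {
                    num1 = value;
                } else if (id == 2) {
                    num2 = value;
                }
            }
            // Call the service, then build the result the same way as the arguments.
            int sum = handler.add(num1, num2);
            writeHeader(out, name, REPLY, seqId);
            writeI32Field(out, 0, sum);
            out.writeByte(STOP);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    private static void writeHeader(DataOutputStream out, String name, byte type, int seqId)
            throws IOException {
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
        out.writeByte(type);
        out.writeInt(seqId);
    }

    private static void writeI32Field(DataOutputStream out, int id, int value)
            throws IOException {
        out.writeByte(I32);
        out.writeShort(id);
        out.writeInt(value);
    }
}
```

The reply reuses the function name and the sequence id of the request, which is what allows a client to match a reply to the call it made.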
The dispatching of the incoming message to the service in the handler is done by the Processor.
For the function add, fn is an instance of type Calculator.Processor.add<...>, which extends ProcessFunction.
The process function is parameterized with the handler class (CalculatorHandler) and the argument class (Calculator.add_args).
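The role of the two type parameters can be sketched with a small generic class. All names ending in Sketch are hypothetical, and returning Object is a simplification of mine; the real process function returns a result class.

```java
// Hypothetical reduction of a generic process function:
// I is the handler type, T is the argument type.
abstract class ProcessFunctionSketch<I, T> {
    private final String methodName;

    protected ProcessFunctionSketch(String methodName) {
        this.methodName = methodName;
    }

    String getMethodName() {
        return methodName;
    }

    // Each concrete function decides which handler method to call.
    abstract Object getResult(I iface, T args);
}

// Stand-in for the handler interface.
interface AdderSketch {
    int add(int num1, int num2);
}

// Stand-in for the argument class of add.
final class AddArgsHolderSketch {
    final int num1;
    final int num2;

    AddArgsHolderSketch(int num1, int num2) {
        this.num1 = num1;
        this.num2 = num2;
    }
}

// The process function for add binds both type parameters.
final class AddFunctionSketch extends ProcessFunctionSketch<AdderSketch, AddArgsHolderSketch> {
    AddFunctionSketch() {
        super("add");
    }

    @Override
    Object getResult(AdderSketch iface, AddArgsHolderSketch args) {
        return iface.add(args.num1, args.num2);
    }
}
```

Thanks to the type parameters, getResult can call the handler and unpack the arguments without any casts.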
Additional Information
Server
The server binds the TProtocol and the TTransport together: it listens for incoming messages using the chosen protocol and passes each message to the processor.
- TSimpleServer: single-threaded, blocking IO server
- TThreadPoolServer: multi-threaded, blocking IO server
- TNonblockingServer: single-threaded, non-blocking IO server (THsHaServer adds a pool of worker threads)
TTransport
The transport used on the client side must correspond to the one used on the server side. The following list is limited to socket-based transports.
- TSocket: simple socket communication
- TSSLTransport: secure socket communication
- TSaslTransport: Simple Authentication and Security Layer (SASL). SASL supports a series of challenges and responses, with mechanisms such as ANONYMOUS, PLAIN, DIGEST-MD5 and GSSAPI. GSSAPI supports Kerberos. In the Java world, the javax.security.sasl package is used; I haven’t found a Thrift C/C++ SASL client.