I started reading Martin Fowler's well-known book, Patterns of Enterprise Application Architecture.
I should mention that I am reading a translation into my native language, which may be the cause of my confusion.
The book gives these definitions (translated back into English):
Response time - the amount of time it takes to process some external request.
Latency - the minimum amount of time before getting any response.
To me these look the same. Could you please explain the difference?
Response time is the sum of processing time and any latencies encountered. Processing time is usually the time the server takes between receiving the last byte of the request and returning the first byte of the response.
Throughput: Throughput measures the overall performance of the system. For transaction processing systems, throughput is typically measured in transactions per second (TPS) or transactions per minute (TPM).
Response time: Response time measures the performance of an individual transaction or query.
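To make the contrast concrete, here is a minimal sketch that computes both numbers from the same set of transactions. The timestamps and the function names `response_times` and `throughput_tps` are made up for illustration; throughput is taken here as completed transactions divided by the observed time window, which is one common way to measure it.

```python
# Hypothetical (start, end) timestamps in seconds for five transactions.
transactions = [(0.0, 0.5), (0.25, 0.75), (0.5, 1.5), (1.0, 1.25), (1.5, 2.0)]


def response_times(txns):
    """Per-transaction metric: how long each individual transaction took."""
    return [end - start for start, end in txns]


def throughput_tps(txns):
    """System-wide metric: transactions completed per second of wall-clock time."""
    window = max(end for _, end in txns) - min(start for start, _ in txns)
    return len(txns) / window


times = response_times(transactions)
tps = throughput_tps(transactions)
print("response times (s):", times)
print("throughput (TPS):", tps)
```

Overlapping transactions are what make the two metrics diverge: the system completes 2.5 transactions per second even though no single transaction finishes in under 0.25 seconds.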
API latency is how long it takes your infrastructure to respond to an API request. In other words, it's the period of time between when the request arrives at your server and when the client receives the first byte of the response.
One way of looking at this is to say that transport latency + processing time = response time.
Transport latency is the time it takes for a request to be transmitted to the processing component and for the response to be transmitted back. The processing time is then added on top of that.
As an example, say that 5 people each try to print a single sheet of paper at the same time, and the printer takes 10 seconds to process (print) each sheet.
The person whose print request is processed first sees a latency of 0 seconds and a processing time of 10 seconds, so a response time of 10 seconds.
The person whose print request is processed last sees a latency of 40 seconds (waiting for the 4 people ahead of them) and a processing time of 10 seconds, so a response time of 50 seconds.
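The printer example can be sketched in a few lines of code. This is only an illustration of the arithmetic above, assuming a single printer that handles jobs strictly one at a time; `printer_queue` is a hypothetical helper name.

```python
def printer_queue(num_jobs, seconds_per_job):
    """Return (latency, processing_time, response_time) for each queued job.

    Assumes one printer that processes jobs one after another, in order.
    """
    results = []
    for position in range(num_jobs):
        # Latency: time spent waiting for the jobs ahead in the queue.
        latency = position * seconds_per_job
        # Response time: waiting time plus the job's own processing time.
        response_time = latency + seconds_per_job
        results.append((latency, seconds_per_job, response_time))
    return results


results = printer_queue(num_jobs=5, seconds_per_job=10)
for person, (latency, processing, response) in enumerate(results, start=1):
    print(f"person {person}: latency={latency}s "
          f"processing={processing}s response={response}s")
```

Every job has the same processing time, so the whole difference in response time between the first and last person comes from latency.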
As Martin Kleppmann says in his book Designing Data-Intensive Applications:
Latency is the duration that a request is waiting to be handled, during which it is latent, awaiting service. It is mostly used for diagnostic purposes, e.g. investigating latency spikes.
Response time is the time between a client sending a request and receiving a response. It is the sum of round-trip latency and service time. It is used to describe the performance of an application.
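A rough sketch of that decomposition follows. The `RequestTiming` class, its field names, and the sample numbers are all hypothetical; treating round-trip latency as network transit in both directions plus time spent queued is an assumption made here for illustration, not something the book prescribes in this form.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical timing breakdown for one request, in milliseconds."""
    network_out_ms: float    # client -> server transit
    queue_wait_ms: float     # time the request sits waiting to be handled
    service_ms: float        # time the server actively spends on the request
    network_back_ms: float   # server -> client transit

    @property
    def round_trip_latency_ms(self) -> float:
        # Assumption: everything that is not active service counts as latency.
        return self.network_out_ms + self.queue_wait_ms + self.network_back_ms

    @property
    def response_time_ms(self) -> float:
        # What the client observes: latency plus service time.
        return self.round_trip_latency_ms + self.service_ms


timing = RequestTiming(network_out_ms=20.0, queue_wait_ms=15.0,
                       service_ms=40.0, network_back_ms=20.0)
print("round-trip latency (ms):", timing.round_trip_latency_ms)
print("response time (ms):", timing.response_time_ms)
```

Keeping the components separate is what makes latency useful for diagnosis: a spike in `queue_wait_ms` points at overload, while a rise in `service_ms` points at the handler itself.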
This article is a good read on the difference, and is best summarized with this simple equation,
Latency + Processing Time = Response Time
where
If processing time is reasonably short, as it usually is in well-designed systems, then for practical purposes response time and latency can feel the same in terms of perceived passage of time. That said, to be precise, use the terms as defined and don't conflate the two.