++++++++++++++++++++
asyncio performance
++++++++++++++++++++

Random notes about tuning asyncio for performance. "Performance" can mean two
different goals, which might be incompatible:

* Number of concurrent requests per second
* Request latency in seconds: min/average/max time to complete a request


Architecture: Worker processes
==============================

Because of its GIL, CPython is basically only able to use 1 CPU. To increase
the number of concurrent requests, one solution is to spawn multiple worker
processes. See for example:

* `Gunicorn <http://docs.gunicorn.org/en/stable/design.html>`_
* `API-Hour <http://pythonhosted.org/api_hour/>`_
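A minimal sketch of this architecture (the handler and the worker count are
hypothetical stand-ins for a real server): each worker process runs its own
event loop, so several CPU cores can be used despite the GIL.

```python
import asyncio
import multiprocessing


async def handle_requests(worker_id):
    # Placeholder for real request handling done with asyncio.
    await asyncio.sleep(0)
    return worker_id


def worker(worker_id, queue):
    # One event loop per worker process.
    queue.put(asyncio.run(handle_requests(worker_id)))


def run_workers(count):
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(i, queue))
             for i in range(count)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return sorted(queue.get() for _ in procs)


if __name__ == "__main__":
    print(run_workers(4))
```

In practice the master process usually also shares the listening socket with
the workers (as Gunicorn does), rather than giving each worker its own port.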


Stream limits
=============

* `limit parameter of StreamReader/open_connection()
  <https://docs.python.org/dev/library/asyncio-stream.html#streamreader>`_
* `set_write_buffer_limits() low/high water mark on writing for transports
  <https://docs.python.org/dev/library/asyncio-protocol.html#asyncio.WriteTransport.set_write_buffer_limits>`_
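
A small self-contained sketch of the ``limit`` parameter (the echo server,
port and payload here are made up for the example): ``limit`` caps the
``StreamReader`` buffer, and ``readline()``/``readuntil()`` raise
``LimitOverrunError`` when no separator is found within that many bytes.

```python
import asyncio


async def main():
    async def handle(reader, writer):
        # Tiny echo-in-uppercase server used only for this demonstration.
        data = await reader.readline()
        writer.write(data.upper())
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    # limit= bounds the client's read buffer (the default is 64 KiB).
    reader, writer = await asyncio.open_connection("127.0.0.1", port,
                                                   limit=1024)
    writer.write(b"ping\n")
    await writer.drain()
    reply = await reader.readline()

    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply.decode()


if __name__ == "__main__":
    print(asyncio.run(main()))
```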

aiohttp uses ``set_write_buffer_limits(0)`` for backpressure support and
implements its own buffering, see:

* `aio-libs/aiohttp#1369 <https://github.com/aio-libs/aiohttp/pull/1478/files>`_
* `Some thoughts on asynchronous API design in a post-async/await world
  <https://vorpus.org/blog/some-thoughts-on-asynchronous-api-design-in-a-post-asyncawait-world/>`_
  (November, 2016) by Nathaniel J. Smith
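
A rough sketch of that idea — not aiohttp's actual code: with the transport's
write buffer limits set to zero, ``pause_writing()`` is called as soon as any
written data is left unsent, and the protocol keeps its own application-level
buffer instead.

```python
import asyncio


class FlowControlProtocol(asyncio.Protocol):
    def __init__(self):
        self.transport = None
        self.paused = False
        self.buffer = bytearray()  # application-level write buffer

    def connection_made(self, transport):
        self.transport = transport
        # Zero high-water mark: pause as soon as anything is unsent.
        transport.set_write_buffer_limits(high=0, low=0)

    def pause_writing(self):
        # The transport buffer went above the high-water mark (0 bytes).
        self.paused = True

    def resume_writing(self):
        # The transport buffer drained: flush our own buffer.
        self.paused = False
        if self.buffer:
            self.transport.write(bytes(self.buffer))
            self.buffer.clear()

    def send(self, data):
        if self.paused:
            self.buffer += data  # buffer in the application instead
        else:
            self.transport.write(data)
```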


TCP_NODELAY
===========

Since Python 3.6, asyncio sets the ``TCP_NODELAY`` option on newly created
sockets. It disables Nagle's algorithm, which coalesces small writes into
larger segments: with the option set, data is sent to the peer as soon as
possible. This typically reduces latency, at the cost of sending more small
packets and so using the network less efficiently.

See `Nagle's algorithm <https://en.wikipedia.org/wiki/Nagle%27s_algorithm>`_.

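A quick way to check this behaviour (the throwaway local server below is only
scaffolding for the check):

```python
import asyncio
import socket


async def nodelay_enabled():
    def on_client(reader, writer):
        pass  # server side: nothing to do for this check

    server = await asyncio.start_server(on_client, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)

    # Read the option back from the transport's underlying socket.
    sock = writer.get_extra_info("socket")
    value = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return bool(value)


if __name__ == "__main__":
    print(asyncio.run(nodelay_enabled()))
```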
TCP_QUICKACK
============

(This option is not used by asyncio by default.)

The ``TCP_QUICKACK`` option can be used to send acknowledgements as early as
possible, rather than delaying them as normal TCP operation allows. The
option is not permanent: subsequent TCP processing can switch the socket out
of quickack mode again, so the option has to be set again after each receive
where quick acknowledgements are wanted.
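
A minimal Linux-only sketch (``socket.TCP_QUICKACK`` is not available on
other platforms, and the helper name here is made up):

```python
import socket


def enable_quickack(sock):
    # Not sticky: a server wanting immediate ACKs has to call this again
    # after each receive (Linux only).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK, 1)


if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_quickack(sock)
    print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_QUICKACK))
    sock.close()
```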


Tune the Linux kernel
=====================

Linux TCP sysctls:

* ``/proc/sys/net/ipv4/tcp_mem``
* ``/proc/sys/net/core/rmem_default`` and ``/proc/sys/net/core/rmem_max``:
  the default and maximum size of the receive socket buffer
* ``/proc/sys/net/core/wmem_default`` and ``/proc/sys/net/core/wmem_max``:
  the default and maximum size of the send socket buffer
* ``/proc/sys/net/core/optmem_max``: the maximum amount of option memory
  buffers
* ``net.ipv4.tcp_no_metrics_save``
* ``net.core.netdev_max_backlog``: the maximum number of packets queued on
  the INPUT side when the interface receives packets faster than the kernel
  can process them
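
These sysctls can be inspected from Python by reading ``/proc`` directly
(Linux only; ``read_sysctl`` is a hypothetical helper, not a stdlib function):

```python
from pathlib import Path


def read_sysctl(name):
    # e.g. read_sysctl("net/core/rmem_max"); values are returned as
    # strings, the buffer-size entries are single integers in bytes.
    return Path("/proc/sys", name).read_text().strip()


if __name__ == "__main__":
    for name in ("net/core/rmem_default", "net/core/rmem_max",
                 "net/core/wmem_default", "net/core/wmem_max"):
        print(name, "=", read_sysctl(name))
```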