[703] The right way to avoid the AutoReconnect exception in pymongo - 白红宇

Published: 2019-03-13

Where the problem comes from

Running the following code on Windows triggers the problem. If you are not on Windows, you can stop reading here.

import time

from pymongo import MongoClient

mongodb_setting = dict(
    host='127.0.0.1',
    port=27017,
    username='root',
    password='root',
    authSource='admin',
)
database_name = 'test'
db = MongoClient(**mongodb_setting)[database_name]
table = db['test_table']


def get_some_info():
    now = time.time()
    table.find_one({})
    print(time.time() - now)


def do_something():
    get_some_info()  # first query
    time.sleep(600)  # do something else for ten minutes
    get_some_info()  # second query


do_something()

The second query raises pymongo.errors.AutoReconnect.

The official documentation describes it as:

exception pymongo.errors.AutoReconnect(message='', errors=None)

Raised when a connection to the database is lost and an attempt to auto-reconnect will be made.

In order to auto-reconnect you must handle this exception, recognizing that the operation which caused it has not necessarily succeeded. Future operations will attempt to open a new connection to the database (and will continue to raise this exception until the first successful connection is made).

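Roughly: pymongo will reconnect to MongoDB automatically, but we have to handle this exception ourselves.

I still don't understand why we have to handle the exception if the driver already reconnects automatically. If anyone knows, please explain!

Debugging traced the raise to line 262 of pool.py: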

def _raise_connection_failure(address, error, msg_prefix=None):
    """Convert a socket.error to ConnectionFailure and raise it."""
    host, port = address
    # If connecting to a Unix socket, port will be None.
    if port is not None:
        msg = '%s:%d: %s' % (host, port, error)
    else:
        msg = '%s: %s' % (host, error)
    if msg_prefix:
        msg = msg_prefix + msg
    if isinstance(error, socket.timeout):
        raise NetworkTimeout(msg)
    elif isinstance(error, SSLError) and 'timed out' in str(error):
        # CPython 2.6, 2.7, PyPy 2.x, and PyPy3 do not distinguish network
        # timeouts from other SSLErrors (https://bugs.python.org/issue10272).
        # Luckily, we can work around this limitation because the phrase
        # 'timed out' appears in all the timeout related SSLErrors raised
        # on the above platforms. CPython >= 3.2 and PyPy3.3 correctly raise
        # socket.timeout.
        raise NetworkTimeout(msg)
    else:
        raise AutoReconnect(msg)

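Approaches

1. Do exactly what the official documentation says: catch AutoReconnect and send the same request again. This is a lot of work, since it basically means wrapping every function, such as insert_one() and find_one(). (This is my own understanding; if there is a better way, please let me know, thanks!)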
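One way to avoid rewriting each call by hand is a small retry decorator. The following is only a sketch: the name `retry_on_autoreconnect` and its parameters are hypothetical, and the `ImportError` fallback is a stand-in class so the snippet also runs where pymongo is not installed.

```python
import functools
import time

try:
    from pymongo.errors import AutoReconnect
except ImportError:
    # Stand-in so the sketch runs without pymongo installed (assumption).
    class AutoReconnect(Exception):
        pass


def retry_on_autoreconnect(max_attempts=3, delay_seconds=0.5):
    """Hypothetical decorator: re-run the call when AutoReconnect is raised."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except AutoReconnect:
                    if attempt == max_attempts:
                        raise  # give up after the last attempt
                    time.sleep(delay_seconds)
        return wrapper
    return decorator
```

Decorating get_some_info() from the example above with `@retry_on_autoreconnect()` would repeat the query after a dropped connection. Since the documentation warns that the failed operation "has not necessarily succeeded", blindly retrying writes such as insert_one() may apply them twice, so this fits reads better than writes.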

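2. A side note: when catching AutoReconnect as in approach 1, every occurrence costs about 20 s of waiting before the exception is raised. For example, find_one() blocks for 20 s and then raises AutoReconnect; we handle it and call find_one() again, which takes about 0.020 s. A single query therefore takes more than 20 seconds, which is painful. Look at the timeout options in the initializer of class MongoClient in mongo_client.py: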

- `connectTimeoutMS`: (integer or None) Controls how long (in milliseconds) the driver will wait during server monitoring when connecting a new socket to a server before concluding the server is unavailable. Defaults to ``20000`` (20 seconds).
- `serverSelectionTimeoutMS`: (integer) Controls how long (in milliseconds) the driver will wait to find an available, appropriate server to carry out a database operation; while it is waiting, multiple server monitoring operations may be carried out, each controlled by `connectTimeoutMS`. Defaults to ``30000`` (30 seconds).

connectTimeoutMS defaults to 20 s. My earlier workaround was to set both connectTimeoutMS and socketTimeoutMS to 1000 ms and then handle NetworkTimeout instead of AutoReconnect. That was painful too (Windows really burned me).
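The workaround just described might look roughly like the sketch below. The helper names `short_timeout_options` and `find_one_with_retry` are hypothetical; only `connectTimeoutMS`, `socketTimeoutMS`, and `NetworkTimeout` come from pymongo.

```python
def short_timeout_options(timeout_ms=1000):
    """Hypothetical helper: keyword arguments that cap both timeouts."""
    return {
        "connectTimeoutMS": timeout_ms,  # opening a new socket
        "socketTimeoutMS": timeout_ms,   # waiting for a reply on an open socket
    }


def find_one_with_retry(collection, query, retries=1):
    """Run find_one, retrying when the driver reports a network timeout."""
    # Imported lazily so the module loads without pymongo installed.
    from pymongo.errors import NetworkTimeout

    for attempt in range(retries + 1):
        try:
            return collection.find_one(query)
        except NetworkTimeout:
            if attempt == retries:
                raise  # out of retries
```

The options would be passed as `MongoClient(host='127.0.0.1', port=27017, **short_timeout_options())`. One drawback of this approach is that a 1 s socketTimeoutMS also aborts legitimately slow queries, which is part of why it is uncomfortable as a permanent fix.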

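3. In the end I went back to the socket connection itself. An AutoReconnect exception means the connection taken from the pool is already dead. If the pooled connections stayed connected to the MongoDB server, there would be no reconnect exception, so the socket keepalive (heartbeat) must be at fault. Socket keepalive depends on a few parameters:

TCP_KEEPIDLE: how many seconds a connection may carry no data before the first keepalive probe is sent

TCP_KEEPINTVL: how many seconds to wait before resending a keepalive probe that got no response

TCP_KEEPCNT: how many unanswered probes are allowed before the connection is closed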
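To make the three parameters concrete, here is a minimal sketch that sets them on a plain socket using Python's standard socket module. The function name `enable_keepalive` and the default values are assumptions for illustration, not pymongo's own code; the constants are looked up with getattr because not every platform defines all of them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, count=9):
    """Turn on TCP keepalive and apply whichever tuning options exist here."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),         # unanswered probes before closing
    ):
        option = getattr(socket, name, None)
        if option is None:
            continue  # this platform does not expose the option
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
        except OSError:
            pass  # option exists but cannot be set on this platform
    return applied
```

The returned dictionary lists the options that were actually applied, which makes it easy to see what the current platform supports.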

Solution

Find the relevant code in the pymongo source: it is around line 126 of pool.py, and the key function is _set_keepalive_times(sock).

_MAX_TCP_KEEPIDLE = 300
_MAX_TCP_KEEPINTVL = 10
_MAX_TCP_KEEPCNT = 9

if sys.platform == 'win32':
    try:
        import _winreg as winreg
    except ImportError:
        import winreg

    try:
        with winreg.OpenKey(
                winreg.HKEY_LOCAL_MACHINE,
                r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters") as key:
            _DEFAULT_TCP_IDLE_MS, _ = winreg.QueryValueEx(key, "KeepAliveTime")
            _DEFAULT_TCP_INTERVAL_MS, _ = winreg.QueryValueEx(
                key, "KeepAliveInterval")
            # Make sure these are integers.
            if not isinstance(_DEFAULT_TCP_IDLE_MS, integer_types):
                raise ValueError
            if not isinstance(_DEFAULT_TCP_INTERVAL_MS, integer_types):
                raise ValueError
    except (OSError, ValueError):
        # We could not check the default values so do not attempt to override.
        def _set_keepalive_times(dummy):
            pass
    else:
        def _set_keepalive_times(sock):
            idle_ms = min(_DEFAULT_TCP_IDLE_MS, _MAX_TCP_KEEPIDLE * 1000)
            interval_ms = min(_DEFAULT_TCP_INTERVAL_MS,
                              _MAX_TCP_KEEPINTVL * 1000)
            if (idle_ms < _DEFAULT_TCP_IDLE_MS or
                    interval_ms < _DEFAULT_TCP_INTERVAL_MS):
                sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                           (1, idle_ms, interval_ms))
else:
    def _set_tcp_option(sock, tcp_option, max_value):
        if hasattr(socket, tcp_option):
            sockopt = getattr(socket, tcp_option)
            try:
                # PYTHON-1350 - NetBSD doesn't implement getsockopt for
                # TCP_KEEPIDLE and friends. Don't attempt to set the
                # values there.
                default = sock.getsockopt(socket.IPPROTO_TCP, sockopt)
                if default > max_value:
                    sock.setsockopt(socket.IPPROTO_TCP, sockopt, max_value)
            except socket.error:
                pass

    def _set_keepalive_times(sock):
        _set_tcp_option(sock, 'TCP_KEEPIDLE', _MAX_TCP_KEEPIDLE)
        _set_tcp_option(sock, 'TCP_KEEPINTVL', _MAX_TCP_KEEPINTVL)
        _set_tcp_option(sock, 'TCP_KEEPCNT', _MAX_TCP_KEEPCNT)

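The definition of _set_keepalive_times() differs between Windows and non-Windows systems. Let's look at Windows first.

1. The code first looks up two values, KeepAliveTime and KeepAliveInterval, under SYSTEM\CurrentControlSet\Services\Tcpip\Parameters in the registry. On my Windows 10 machine neither value exists, so _set_keepalive_times is defined as a no-op (pass). With no keepalive probes being sent, an idle connection eventually goes stale and the next operation raises AutoReconnect!

2. After adding the two values above, they are still compared against pymongo's caps. The registry values are in milliseconds.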

_MAX_TCP_KEEPIDLE = 300
_MAX_TCP_KEEPINTVL = 10
…………
        def _set_keepalive_times(sock):
            idle_ms = min(_DEFAULT_TCP_IDLE_MS, _MAX_TCP_KEEPIDLE * 1000)
            interval_ms = min(_DEFAULT_TCP_INTERVAL_MS,
                              _MAX_TCP_KEEPINTVL * 1000)
            if (idle_ms < _DEFAULT_TCP_IDLE_MS or
                    interval_ms < _DEFAULT_TCP_INTERVAL_MS):
                sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                           (1, idle_ms, interval_ms))

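sock.ioctl() runs only when at least one of the registry values is larger than pymongo's corresponding cap. (PS: this check cost me a lot of time!)

In other words, KeepAliveInterval must be greater than 10 * 1000,

or KeepAliveTime must be greater than 300 * 1000.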
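The check is easy to misread, so here is the same arithmetic as a standalone function that can be tried with different registry values. The function name `keepalive_decision` and the constant names are hypothetical; the logic mirrors the excerpt above.

```python
MAX_TCP_KEEPIDLE_S = 300   # same cap as _MAX_TCP_KEEPIDLE in the excerpt
MAX_TCP_KEEPINTVL_S = 10   # same cap as _MAX_TCP_KEEPINTVL in the excerpt


def keepalive_decision(registry_idle_ms, registry_interval_ms):
    """Return (would_call_ioctl, idle_ms, interval_ms) for given registry values."""
    idle_ms = min(registry_idle_ms, MAX_TCP_KEEPIDLE_S * 1000)
    interval_ms = min(registry_interval_ms, MAX_TCP_KEEPINTVL_S * 1000)
    # ioctl is only reached when clamping actually lowered one of the values.
    would_call_ioctl = (idle_ms < registry_idle_ms
                        or interval_ms < registry_interval_ms)
    return would_call_ioctl, idle_ms, interval_ms


print(keepalive_decision(60000, 20000))    # (True, 60000, 10000)
print(keepalive_decision(60000, 1000))     # (False, 60000, 1000)
print(keepalive_decision(7200000, 1000))   # (True, 300000, 1000)
```

The second call shows the trap: both values are perfectly reasonable, yet neither exceeds its cap, so sock.ioctl() is never called.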

Final solution

Developing on Windows has far too many pitfalls.

Steps:

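1. Press Win+R, type regedit, and press Enter.

2. Navigate to

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

3. Right-click an empty area > New > QWORD.

4. Name it KeepAliveTime and set the value to 60000 (decimal).

5. Add another value named KeepAliveInterval and set it to 20000 (decimal).

Done!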
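To confirm the registry values are visible from Python, a small check along these lines can help. It is a sketch: the function name `read_keepalive_registry` is hypothetical, and it reads the same key and value names that pool.py reads.

```python
import sys

REG_PATH = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"


def read_keepalive_registry():
    """Return the two keepalive values, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, REG_PATH) as key:
        for name in ("KeepAliveTime", "KeepAliveInterval"):
            try:
                values[name], _value_type = winreg.QueryValueEx(key, name)
            except OSError:
                values[name] = None  # the value is not present in the registry
    return values


print(read_keepalive_registry())
```

Because the pool.py code shown earlier reads the registry at module level, the Python process has to be restarted before pymongo picks up the new values.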

Source: https://blog.csdn.net/dslkfajoaijfdoj/article/details/83717238
