
Need Help with Troubleshooting Machine Learning Model Deployment on My Web Project

Hello everyone,

I'm currently working on a school project where I've built a website that predicts the number of wildfires based on various weather parameters. The website allows users to choose from one of six pre-trained machine learning models or to train their own models using a subset of the weather parameters.

However, I've encountered a problem. When I select the neural network or XGBoost model and either attempt a prediction or try to train it, the page loads for about 5 minutes before displaying a "something went wrong" message. I'm not sure whether this is caused by excessive computational demand that would require more web workers, or whether there's another solution I could try.

Below are the server logs from the time of the issue when I was trying to predict something:

2023-12-19 18:09:09 Tue Dec 19 18:09:09 2023 - received message 0 from emperor
2023-12-19 18:09:09 SIGINT/SIGTERM received...killing workers...
2023-12-19 18:09:10 worker 2 buried after 1 seconds
2023-12-19 18:09:11 worker 1 buried after 2 seconds
2023-12-19 18:09:11 goodbye to uWSGI.
2023-12-19 18:09:11 VACUUM: unix socket /var/sockets/taminator.pythonanywhere.com/socket removed.
2023-12-19 18:09:26 *** Starting uWSGI 2.0.20 (64bit) on [Tue Dec 19 18:09:14 2023] ***
2023-12-19 18:09:26 compiled with version: 9.4.0 on 22 July 2022 18:35:26
2023-12-19 18:09:26 os: Linux-5.15.0-1044-aws #49~20.04.1-Ubuntu SMP Mon Aug 21 17:09:32 UTC 2023
2023-12-19 18:09:26 nodename: blue-liveweb5
2023-12-19 18:09:26 machine: x86_64
2023-12-19 18:09:26 clock source: unix
2023-12-19 18:09:26 pcre jit disabled
2023-12-19 18:09:26 detected number of CPU cores: 4
2023-12-19 18:09:26 current working directory: /home/Taminator
2023-12-19 18:09:26 detected binary path: /usr/local/bin/uwsgi
2023-12-19 18:09:26 *** dumping internal routing table ***
2023-12-19 18:09:26 [rule: 0] subject: path_info regexp: \.svgz$ action: addheader:Content-Encoding:gzip
2023-12-19 18:09:26 *** end of the internal routing table ***
2023-12-19 18:09:26 chdir() to /home/Taminator/
2023-12-19 18:09:26 your processes number limit is 512
2023-12-19 18:09:26 your memory page size is 4096 bytes
2023-12-19 18:09:26 detected max file descriptor number: 123456
2023-12-19 18:09:26 building mime-types dictionary from file /etc/mime.types...
2023-12-19 18:09:26 567 entry found
2023-12-19 18:09:26 lock engine: pthread robust mutexes
2023-12-19 18:09:26 thunder lock: disabled (you can enable it with --thunder-lock)
2023-12-19 18:09:26 uwsgi socket 0 bound to UNIX address /var/sockets/taminator.pythonanywhere.com/socket fd 3
2023-12-19 18:09:26 Python version: 3.10.5 (main, Jul 22 2022, 17:09:35) [GCC 9.4.0]
2023-12-19 18:09:26 *** Python threads support is disabled. You can enable it with --enable-threads ***
2023-12-19 18:09:26 Python main interpreter initialized at 0x55b8a2542e70
2023-12-19 18:09:26 your server socket listen backlog is limited to 100 connections
2023-12-19 18:09:26 your mercy for graceful operations on workers is 60 seconds
2023-12-19 18:09:26 setting request body buffering size to 65536 bytes
2023-12-19 18:09:26 mapped 501384 bytes (489 KB) for 2 cores
2023-12-19 18:09:26 *** Operational MODE: preforking ***
2023-12-19 18:09:26 initialized 54 metrics
2023-12-19 18:09:26 2023-12-19 18:09:18.831578: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-12-19 18:09:26 2023-12-19 18:09:18.879280: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-12-19 18:09:26 2023-12-19 18:09:18.879325: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-12-19 18:09:26 2023-12-19 18:09:18.880973: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-12-19 18:09:26 2023-12-19 18:09:18.889248: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-12-19 18:09:26 2023-12-19 18:09:18.889528: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.#012To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-12-19 18:09:26 2023-12-19 18:09:22.081113: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-12-19 18:09:26 /usr/local/lib/python3.10/site-packages/flask_sqlalchemy/__init__.py:872: FSADeprecationWarning: SQLALCHEMY_TRACK_MODIFICATIONS adds significant overhead and will be disabled by default in the future.  Set it to True or False to suppress this warning.#012  warnings.warn(FSADeprecationWarning(
2023-12-19 18:09:26 /home/Taminator/.local/lib/python3.10/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator LinearRegression from version 1.3.0 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:#012https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations#012  warnings.warn(
2023-12-19 18:09:26 /home/Taminator/.local/lib/python3.10/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator DecisionTreeRegressor from version 1.3.0 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:#012https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations#012  warnings.warn(
2023-12-19 18:09:26 /home/Taminator/.local/lib/python3.10/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator RandomForestRegressor from version 1.3.0 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:#012https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations#012  warnings.warn(
2023-12-19 18:09:26 /home/Taminator/.local/lib/python3.10/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator SVR from version 1.3.0 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:#012https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations#012  warnings.warn(
2023-12-19 18:09:26 /home/Taminator/.local/lib/python3.10/site-packages/sklearn/base.py:348: InconsistentVersionWarning: Trying to unpickle estimator StandardScaler from version 1.3.0 when using version 1.3.2. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to:#012https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations#012  warnings.warn(
2023-12-19 18:09:26 WSGI app 0 (mountpoint='') ready in 11 seconds on interpreter 0x55b8a2542e70 pid: 1 (default app)
2023-12-19 18:09:26 *** uWSGI is running in multiple interpreter mode ***
2023-12-19 18:09:26 gracefully (RE)spawned uWSGI master process (pid: 1)
2023-12-19 18:09:26 spawned uWSGI worker 1 (pid: 27, cores: 1)
2023-12-19 18:09:26 spawned 2 offload threads for uWSGI worker 1
2023-12-19 18:09:26 spawned uWSGI worker 2 (pid: 30, cores: 1)
2023-12-19 18:09:26 metrics collector thread started
2023-12-19 18:09:26 spawned 2 offload threads for uWSGI worker 2
2023-12-19 18:09:40 announcing my loyalty to the Emperor...
2023-12-19 18:09:40 announcing my loyalty to the Emperor...

I would greatly appreciate any insights or suggestions on how to resolve this issue. Thank you in advance for your help!

It looks like the InconsistentVersionWarnings are the issue. Check the versions of packages mentioned in those messages and ensure that you are using the correct ones. Also see https://help.pythonanywhere.com/pages/MachineLearningInWebsiteCode/
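One way to act on this is to compare the installed package versions with the versions the models were pickled under. A minimal sketch, assuming the models were saved with scikit-learn 1.3.0 as the warnings above report; `EXPECTED` and `find_mismatches` are hypothetical names for illustration, not part of any library:

```python
from importlib import metadata

# Versions the pickled models were saved with, taken from the warnings above.
EXPECTED = {"scikit-learn": "1.3.0"}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{package}: models expect {wanted}, installed is {installed}")
```

Running this in the same environment as the web app (same Python version and virtualenv, if any) matters, because a console can see different packages than the website does.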

I changed sklearn to version 1.3.0 and nothing changed. However, I did get the following error code this time: 504-backend. I checked the server logs and found the lines below. The error logs show no errors.

2023-12-22 13:45:34 announcing my loyalty to the Emperor...
2023-12-22 13:56:28 Fri Dec 22 13:56:27 2023 - *** HARAKIRI ON WORKER 2 (pid: 42, try: 1) ***
2023-12-22 13:56:28 Fri Dec 22 13:56:27 2023 - HARAKIRI !!! worker 2 status !!!
2023-12-22 13:56:28 Fri Dec 22 13:56:27 2023 - HARAKIRI [core 0] 10.0.0.75 - POST / since 1703252786
2023-12-22 13:56:28 Fri Dec 22 13:56:27 2023 - HARAKIRI !!! end of worker 2 status !!!
2023-12-22 13:56:28 DAMN ! worker 2 (pid: 42) died, killed by signal 9 :( trying respawn ...
2023-12-22 13:56:28 Respawned uWSGI worker 2 (new pid: 49)
2023-12-22 13:56:28 spawned 2 offload threads for uWSGI worker 2
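
The `since 1703252786` field in the HARAKIRI line above is the Unix time at which the request started, so it can be compared with the time the worker was killed. A quick check, assuming the log's timestamps are in UTC (an assumption; the log does not state its time zone):

```python
from datetime import datetime, timezone

# Values copied from the HARAKIRI log lines above.
request_started = datetime.fromtimestamp(1703252786, tz=timezone.utc)
# Assumption: the log's wall-clock timestamps are UTC.
worker_killed = datetime(2023, 12, 22, 13, 56, 27, tzinfo=timezone.utc)

elapsed_seconds = (worker_killed - request_started).total_seconds()
print(request_started.isoformat())  # 2023-12-22T13:46:26+00:00
print(elapsed_seconds)              # 601.0
```

Under that assumption, the `POST /` request had been running for about ten minutes when uWSGI killed the worker, which suggests the request was hanging rather than failing quickly.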

If it's a 5XX error, then there should definitely be something in the error log. Maybe you checked a bit too quickly; the logs can sometimes take a minute to arrive.

Yeah, I overlooked something in the error logs.

2023-12-22 13:45:44,704: Exception on / [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 558, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 822, in _read_query_result
    result.read()
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 1200, in read
    first_packet = self.connection._read_packet()
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 748, in _read_packet
    raise err.OperationalError(
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2077, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1525, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1523, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1509, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "/home/Taminator/mysite/website.py", line 272, in index
    user_id = get_user_id()
  File "/home/Taminator/mysite/website.py", line 196, in get_user_id
    if current_user and current_user.is_authenticated:
  File "/usr/local/lib/python3.10/site-packages/werkzeug/local.py", line 278, in __get__
    obj = instance._get_current_object()
  File "/usr/local/lib/python3.10/site-packages/werkzeug/local.py", line 407, in _get_current_object
    return self.__local()  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/flask_login/utils.py", line 26, in <lambda>
    current_user = LocalProxy(lambda: _get_user())
  File "/usr/local/lib/python3.10/site-packages/flask_login/utils.py", line 384, in _get_user
    current_app.login_manager._load_user()
  File "/usr/local/lib/python3.10/site-packages/flask_login/login_manager.py", line 355, in _load_user
    user = self._user_callback(user_id)
  File "/home/Taminator/mysite/website.py", line 44, in load_user
    return User.query.get(int(user_id))
  File "<string>", line 2, in get
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/deprecations.py", line 401, in warned
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 943, in get
    return self._get_impl(ident, loading.load_on_pk_identity)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/query.py", line 947, in _get_impl
    return self.session._get_impl(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 2896, in _get_impl
    return db_load_fn(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/loading.py", line 530, in load_on_pk_identity
    session.execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/orm/session.py", line 1696, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/util/compat.py", line 207, in raise_
    raise exception
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.10/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 558, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 822, in _read_query_result
    result.read()
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 1200, in read
    first_packet = self.connection._read_packet()
  File "/home/Taminator/.local/lib/python3.10/site-packages/pymysql/connections.py", line 748, in _read_packet
    raise err.OperationalError(
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
[SQL: SELECT user.id AS user_id, user.email AS user_email, user.password AS user_password, user.checkbox_value AS user_checkbox_value, user.default_model AS user_default_model 
FROM user 
WHERE user.id = %(pk_1)s]
[parameters: {'pk_1': 1}]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

[edited by admin]

Have a look at this help page, esp. the section "Dealing with OperationalError 2013".
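
For reference, error 2013 on a pooled connection is commonly handled by recycling connections before the server closes them. A minimal sketch for Flask-SQLAlchemy, assuming a version that supports the `SQLALCHEMY_ENGINE_OPTIONS` config key (2.4 or later) and assuming the MySQL server drops idle connections after roughly 300 seconds. The timeout value and the `engine_options` helper are illustrative and not taken from this thread:

```python
def engine_options(idle_timeout_seconds=300, safety_margin_seconds=20):
    """Build SQLAlchemy engine options that avoid reusing stale connections.

    pool_recycle: discard pooled connections older than this many seconds.
    pool_pre_ping: test each connection with a lightweight ping before use.
    """
    return {
        "pool_recycle": idle_timeout_seconds - safety_margin_seconds,
        "pool_pre_ping": True,
    }


# Hypothetical usage in the Flask app's setup code, before SQLAlchemy(app):
# app.config["SQLALCHEMY_ENGINE_OPTIONS"] = engine_options()
print(engine_options())
```

Both `pool_recycle` and `pool_pre_ping` are standard SQLAlchemy `create_engine` parameters; the right recycle interval depends on the server's actual idle timeout.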

I've noticed that the max_allowed_packet size is different from what I expected. Currently, on my local machine where everything operates smoothly, this setting is at 64MB. How can I increase this value to accommodate larger data transfers?

You cannot increase the max_allowed_packet size.
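
Since the limit is fixed, one workaround for large writes is to split them into several smaller statements. Note that the failing query in the traceback earlier is a single-row SELECT, so packet size may not be the cause there. A rough sketch, assuming the large transfers are multi-row inserts; `chunk_rows` and the 1 MB budget are hypothetical, and the size estimate is only approximate:

```python
def chunk_rows(rows, max_bytes=1_000_000):
    """Yield lists of rows whose combined estimated size does not exceed max_bytes.

    Size is estimated from the UTF-8 length of each value's string form, so
    leave generous headroom below the server's real max_allowed_packet.
    A single row larger than max_bytes is yielded on its own.
    """
    batch, batch_size = [], 0
    for row in rows:
        row_size = sum(len(str(value).encode("utf-8")) for value in row)
        if batch and batch_size + row_size > max_bytes:
            yield batch
            batch, batch_size = [], 0
        batch.append(row)
        batch_size += row_size
    if batch:
        yield batch
```

Each yielded batch can then be sent as its own insert statement, so no single statement approaches the packet limit.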