Help with updating MySQL with Scheduled Tasks

What I'm trying to do is scrape some news sites every few minutes. I've tried everything I could think of in views.py, and I've also tried running a scheduled task without SQL that should have returned updated values to views.py, but to no avail. My task logs showed that the tasks were running successfully, but new values only reached my template two or three times. I read on the forums that one might be able to use MySQL in a task to scrape content from a site, update a MySQL field, and then display it in a template. However, when I wrote the following code in my scheduled task, I got the error output shown below. If anyone can help me with scraping in a task (with or without SQL), it would certainly help with my blood pressure lol. The first block is the code I had without making a virtual environment, and the second is the one with a virtual environment.

no virtual environment

from models import newsarticlestest
from datetime import datetime

def save_current_datetime():
    now = datetime.now()
    now_str = now.strftime("%Y-%m-%d %H:%M:%S")
    new_entry = newsarticlestest(tester=now_str)
    new_entry.save()

save_current_datetime()

Traceback (most recent call last):
  File "/home/finsee/mysite/pages/sqltester.py", line 1, in <module>
    from models import newsarticlestest
  File "/home/finsee/mysite/pages/models.py", line 7, in <module>
    class salarydata(models.Model):
  File "/usr/local/lib/python3.10/site-packages/django/db/models/base.py", line 127, in __new__
    app_config = apps.get_containing_app_config(module)
  File "/usr/local/lib/python3.10/site-packages/django/apps/registry.py", line 260, in get_containing_app_config
    self.check_apps_ready()
  File "/usr/local/lib/python3.10/site-packages/django/apps/registry.py", line 137, in check_apps_ready
    settings.INSTALLED_APPS
  File "/usr/local/lib/python3.10/site-packages/django/conf/__init__.py", line 87, in __getattr__
    self._setup(name)
  File "/usr/local/lib/python3.10/site-packages/django/conf/__init__.py", line 67, in _setup
    raise ImproperlyConfigured(
django.core.exceptions.ImproperlyConfigured: Requested setting INSTALLED_APPS, but settings are not configured. You must either define the environment variable DJANGO_SETTINGS_MODULE or call settings.configure() before accessing settings.

2023-12-04 13:30:37 -- Completed task, took 22.85 seconds, return code was 1.

making the virtual environment

import os
import django
os.environ.setdefault('DJANGO_SETTINGS_MODULE', '/home/finsee/mysite/mysite/settings.py')
django.setup()
from models import newsarticlestest
from datetime import datetime

def save_current_datetime():
    now = datetime.now()
    now_str = now.strftime("%Y-%m-%d %H:%M:%S")
    new_entry = newsarticlestest(tester=now_str)
    new_entry.save()

save_current_datetime()

Traceback (most recent call last):
  File "/home/finsee/mysite/pages/sqltester.py", line 4, in <module>
    django.setup()
  File "/usr/local/lib/python3.10/site-packages/django/__init__.py", line 19, in setup
    configure_logging(settings.LOGGING_CONFIG, settings.LOGGING)
  File "/usr/local/lib/python3.10/site-packages/django/conf/__init__.py", line 87, in __getattr__
    self._setup(name)
  File "/usr/local/lib/python3.10/site-packages/django/conf/__init__.py", line 74, in _setup
    self._wrapped = Settings(settings_module)
  File "/usr/local/lib/python3.10/site-packages/django/conf/__init__.py", line 183, in __init__
    mod = importlib.import_module(self.SETTINGS_MODULE)
  File "/usr/local/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named '/home/finsee/mysite/mysite/settings'

2023-12-04 13:02:17 -- Completed task, took 9.52 seconds, return code was 1.

[edit by admin: formatting]

If you want to run Django code outside your website (which is a good way to do what you're trying to do), I'd recommend that instead of trying to set the DJANGO_SETTINGS_MODULE environment variable and set things up manually, you look into creating a custom Django management command. There's a bit of boilerplate that you need to write, but it's much less fiddly and error-prone than the manual approach, and at the end you'll be able to schedule a task to do something like

cd /home/finsee/mysite/; python manage.py my_custom_command

...and it will do the work you need to do.

I was really hoping to just have some scraping done in the background on PA. I've since found a workaround where I clear my Django cache in views.py, and I was able to display my data in the template. The display still isn't working quite right, but I think it'll do.

Generally it's better to separate the code doing the scraping from the code that serves your views, so it looks like the direction Giles suggested is the right one. Take a look at https://blog.pythonanywhere.com/198/ for an example that is not Django, but is probably still relevant as a pattern.
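To make the separation a bit more concrete, here is a minimal sketch of the general idea, not the code from the linked post: the scheduled task only writes scraped results to a database, and the view only reads the latest rows. It uses the standard-library sqlite3 module purely as a stand-in so the example is self-contained; the table name headlines and all function names are made up for illustration, and on PythonAnywhere the same shape would apply with MySQL or Django models.

```python
import sqlite3
from datetime import datetime


def init_store(conn):
    """Create the table shared by the scraper and the web code."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headlines ("
        "fetched_at TEXT NOT NULL, title TEXT NOT NULL)"
    )
    conn.commit()


def store_headlines(conn, titles, fetched_at=None):
    """Writer side: called by the scheduled task after it has scraped."""
    stamp = (fetched_at or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    conn.executemany(
        "INSERT INTO headlines (fetched_at, title) VALUES (?, ?)",
        [(stamp, title) for title in titles],
    )
    conn.commit()


def latest_headlines(conn, limit=10):
    """Reader side: called by the view; it never scrapes, it only reads."""
    rows = conn.execute(
        "SELECT title FROM headlines ORDER BY rowid DESC LIMIT ?",
        (limit,),
    ).fetchall()
    return [title for (title,) in rows]


if __name__ == "__main__":
    # An in-memory database keeps the demo self-contained; a real task and
    # a real website run in separate processes and need a shared database.
    demo = sqlite3.connect(":memory:")
    init_store(demo)
    store_headlines(demo, ["First headline", "Second headline"])
    print(latest_headlines(demo))
```

Because the view reads from the database on every request, it picks up whatever the most recent task run stored, without the web code having to do any scraping itself.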