Adventures In Import-land, Part II


KeyError: 'GOOGLE_APPLICATION_CREDENTIALS'

It was way too early in the morning for this error. See if you can spot the problem. I hadn't had my coffee before trying to debug the code I'd written the night before, so it will probably take you less time than it did me.

app.py:

from dotenv import load_dotenv
from file_handling import initialize_constants

load_dotenv()
#...

file_handling.py:

import os
from google.cloud import storage

UPLOAD_FOLDER=None
DOWNLOAD_FOLDER = None

def initialize_cloud_storage():
    """
    Initializes the Google Cloud Storage client.
    """
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"]
    storage_client = storage.Client()
    bucket_name = #redacted
    return storage_client.bucket(bucket_name)

def set_upload_folder():
    """
    Determines the environment and sets the path to the upload folder accordingly.
    """
    if os.environ.get("FLASK_ENV") in ["production", "staging"]:
        UPLOAD_FOLDER = os.path.join("/tmp", "upload")
        os.makedirs(UPLOAD_FOLDER, exist_ok=True)
    else:
        UPLOAD_FOLDER = os.path.join("src", "upload_folder")
    return UPLOAD_FOLDER

def initialize_constants():
    """
    Initializes the global constants for the application.
    """
    UPLOAD_FOLDER = set_upload_folder()
    DOWNLOAD_FOLDER = initialize_cloud_storage()
    return UPLOAD_FOLDER, DOWNLOAD_FOLDER
  
DOWNLOAD_FOLDER = initialize_cloud_storage()

def write_to_gcs(content: str, file: str):
    "Writes a text file to a Google Cloud Storage file."
    blob = DOWNLOAD_FOLDER.blob(file)
    blob.upload_from_string(content, content_type="text/plain")

def upload_file_to_gcs(file_path: str, gcs_file: str):
    "Uploads a file to a Google Cloud Storage bucket"
    blob = DOWNLOAD_FOLDER.blob(gcs_file)
    with open(file_path, "rb") as f:
        blob.upload_from_file(f, content_type="application/octet-stream")

See the problem?

This was just the discussion of a recent Pybites article.

When app.py imported initialize_constants from file_handling, the Python interpreter ran

DOWNLOAD_FOLDER = initialize_cloud_storage()

and looked for GOOGLE_APPLICATION_CREDENTIALS in the environment, but load_dotenv hadn't added it to the environment from the .env file yet.

Typically, configuration variables, secret keys, and passwords are stored in a file called .env and then read as environment variables rather than as plain text, using a package such as python-dotenv, which is what's being used here.
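For anyone who hasn't used python-dotenv, here is a rough, self-contained sketch of what load_dotenv does under the hood. The real package handles quoting, variable interpolation, and more; load_env_file and the file contents below are illustrative only.

```python
import os
import tempfile

def load_env_file(path: str) -> None:
    """Minimal stand-in for dotenv.load_dotenv: read KEY=VALUE lines
    and add them to os.environ without overwriting existing values."""
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo: write a throwaway .env file and load it.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as fh:
    fh.write("GOOGLE_APPLICATION_CREDENTIALS=/path/to/creds.json\n")
    env_path = fh.name

os.environ.pop("GOOGLE_APPLICATION_CREDENTIALS", None)  # start from a clean slate
load_env_file(env_path)
print(os.environ["GOOGLE_APPLICATION_CREDENTIALS"])  # /path/to/creds.json
```

The key detail for this story is the timing: nothing lands in os.environ until the load function actually runs.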

So, I had a few options.

I could call load_dotenv before importing from file_handling:

from dotenv import load_dotenv
load_dotenv()

from file_handling import initialize_constants

But that's not very Pythonic.

I could call initialize_cloud_storage inside both upload_file_to_gcs and write_to_gcs:

def write_to_gcs(content: str, file: str):
    "Writes a text file to a Google Cloud Storage file."
    DOWNLOAD_FOLDER = initialize_cloud_storage()
    blob = DOWNLOAD_FOLDER.blob(file)
    blob.upload_from_string(content, content_type="text/plain")

def upload_file_to_gcs(file_path: str, gcs_file: str):
    "Uploads a file to a Google Cloud Storage bucket"
    DOWNLOAD_FOLDER = initialize_cloud_storage()
    blob = DOWNLOAD_FOLDER.blob(gcs_file)
    with open(file_path, "rb") as f:
        blob.upload_from_file(f, content_type="application/octet-stream")

But this violates the DRY principle. Plus, we really shouldn't be initializing the storage client multiple times. In fact, we are already initializing it twice in the way the code was originally written.

Going Global

So what about this?

DOWNLOAD_FOLDER = None
 
def initialize_constants():
    """
    Initializes the global constants for the application.
    """
    global DOWNLOAD_FOLDER
    UPLOAD_FOLDER = set_upload_folder()
    DOWNLOAD_FOLDER = initialize_cloud_storage()
    return UPLOAD_FOLDER, DOWNLOAD_FOLDER

Here, we're declaring DOWNLOAD_FOLDER as having global scope.

This will work here.

This will work here, because upload_file_to_gcs and write_to_gcs are in the same module. But if they were in a different module, it would break.

Why does it matter?

Well, let's go back to how Python handles imports. Remember that Python runs any code outside of a function or class at import. That applies to variable (or constant) assignment as well. So if upload_file_to_gcs and write_to_gcs were in another module and imported DOWNLOAD_FOLDER from file_handling, they would be importing it while it was assigned a value of None. It wouldn't matter that by the time it was needed, it would no longer be assigned to None. Inside this other module, it would still be None.

What would be necessary in this scenario would be another function called get_download_folder.

def get_download_folder():
    """
    Returns the current value of the Google Cloud Storage bucket
    """
    return DOWNLOAD_FOLDER

Then, in this other module containing the upload_file_to_gcs and write_to_gcs functions, I would import get_download_folder instead of DOWNLOAD_FOLDER. By importing get_download_folder, you can get the value of DOWNLOAD_FOLDER after it has been assigned an actual value, because get_download_folder won't run until you explicitly call it. Which, presumably, wouldn't be until after you've let initialize_cloud_storage do its thing.
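Here is a minimal, self-contained demonstration of why the getter works where the direct import doesn't. The names config_demo, init, and get_value are invented stand-ins for file_handling, initialize_constants, and get_download_folder; the module is written to a temp directory so this runs as a single script.

```python
import os
import sys
import tempfile
import textwrap

# A hypothetical second module, written to disk so we can really import it.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "config_demo.py"), "w") as fh:
    fh.write(textwrap.dedent("""
        VALUE = None

        def init():
            global VALUE
            VALUE = "ready"

        def get_value():
            return VALUE
    """))
sys.path.insert(0, tmp)

from config_demo import VALUE, get_value, init  # VALUE is bound to None right here

init()              # rebinds config_demo.VALUE, not our local VALUE
print(VALUE)        # None  -- the import-time snapshot never changes
print(get_value())  # ready -- the getter reads the module attribute at call time
```

The `from ... import` statement copies the binding as it exists at import time; the getter defers the lookup until the value actually exists.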

I have another part of my codebase where I've done this. On my website, I have a tool that helps authors create fine-tunes of GPT 3.5 from their books. This Finetuner is BYOK, or 'bring your own key', meaning that users supply their own OpenAI API key to use the tool. I chose this route because charging authors to fine-tune a model and then charging them to use it, forever, is just not something that benefits either of us. This way, they can take their fine-tuned model and use it in any of the several other BYOK AI writing tools that are out there, and I don't have to maintain writing software on top of everything else. So the webapp's form accepts the user's API key, and after a valid form submit, starts a thread of my Finetuner application.

This application starts in the training_management.py module, which imports set_client and get_client from openai_client.py and passes the user's API key to set_client immediately. I can't import client directly, because client is None until set_client has been passed the API key, which happens after import.

from openai import OpenAI

client = None

def set_client(api_key: str):
    """
    Initializes the OpenAI API client with the user's API key
    """
    global client
    client = OpenAI(api_key=api_key)

def get_client():
    """
    Returns the initialized OpenAI client
    """
    return client

When the function that starts a fine-tuning job begins, it calls get_client to retrieve the initialized client. And by moving the API client initialization into another module, it becomes available for use by an AI-powered chunking algorithm I'm working on. Nothing amazing. Basically, just generating scene beats from each chapter to use as the prompt, with the actual chapter as the response. It still needs work, but it's available for authors who want to try it.

A Class Act

Now, we could go one step further from here. The code we've settled on so far relies on global names. Maybe we can get away with this. DOWNLOAD_FOLDER is a constant. Well, sort of. Remember, it's defined by initializing a connection to a cloud storage container. It's really a class instance. By rights, we should be encapsulating all of this logic inside another class.

So what might that look like? Well, it should initialize the upload and download folders, expose them as properties, and then use the functions write_to_gcs and upload_file_to_gcs as methods, like this:

class FileStorageHandler:
    def __init__(self):
        self._upload_folder = self._set_upload_folder()
        self._download_folder = self._initialize_cloud_storage()
    
    @property
    def upload_folder(self):
        return self._upload_folder
    
    @property
    def download_folder(self):
        return self._download_folder

    def _initialize_cloud_storage(self):
        """
        Initializes the Google Cloud Storage client.
        """
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"]
        storage_client = storage.Client()
        bucket_name = #redacted
        return storage_client.bucket(bucket_name)

    def _set_upload_folder(self):
        """
        Determines the environment and sets the path to the upload folder accordingly.
        """
        if os.environ.get("FLASK_ENV") in ["production", "staging"]:
            upload_folder = os.path.join("/tmp", "upload")
            os.makedirs(upload_folder, exist_ok=True)
        else:
            upload_folder = os.path.join("src", "upload_folder")
        return upload_folder

    def write_to_gcs(self, content: str, file_name: str):
        """
        Writes a text file to a Google Cloud Storage file.
        """
        blob = self._download_folder.blob(file_name)
        blob.upload_from_string(content, content_type="text/plain")

    def upload_file_to_gcs(self, file_path: str, gcs_file_name: str):
        """
        Uploads a file to a Google Cloud Storage bucket.
        """
        blob = self._download_folder.blob(gcs_file_name)
        with open(file_path, "rb") as file_obj:
            blob.upload_from_file(file_obj)

Now, we can initialize an instance of FileStorageHandler in app.py and assign UPLOAD_FOLDER and DOWNLOAD_FOLDER from the properties of the class.

from dotenv import load_dotenv
from file_handling import FileStorageHandler

load_dotenv()

folders = FileStorageHandler()

UPLOAD_FOLDER = folders.upload_folder
DOWNLOAD_FOLDER = folders.download_folder

Key takeaway

In the example, the error arose because initialize_cloud_storage was called at the top level in file_handling.py. This resulted in Python attempting to access environment variables before load_dotenv had a chance to set them.

I had been thinking of module-level imports as "everything at the top runs at import." But that's not true. Or rather, it's true, but not precise. Python executes code based on indentation, and functions are indented within the module. So, it's fair to say that every line that isn't indented is at the top of the module. In fact, it's even called that: top-level code, which is defined as basically anything that's not part of a function, class, or other code block.

And top-level code runs when the module is imported. It's not enough to bury an expression beneath some functions; it will still run immediately when the module is imported, whether you are ready for it to run or not. Which is really what the argument against global variables and state is all about: managing when and how your code runs.
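To see that concretely, here's a small sketch (buried_demo and its contents are invented for illustration): the append at the bottom of the module runs the moment the module is imported, even though a function definition sits above it.

```python
import os
import sys
import tempfile
import textwrap

# A made-up module whose only executable statement sits *below* its function.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "buried_demo.py"), "w") as fh:
    fh.write(textwrap.dedent("""
        EVENTS = []

        def helper():
            EVENTS.append("helper called")

        # Buried under a function, but still top-level: runs at import.
        EVENTS.append("ran at import")
    """))
sys.path.insert(0, tmp)

import buried_demo

print(buried_demo.EVENTS)  # ['ran at import'] -- helper() was defined, never called
```

Position in the file doesn't matter; indentation does.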

Understanding top-level code execution at import helped solve the initial error and design a more robust pattern.

Next steps

The downside of using a class is that if it gets instantiated again, a new instance is created, with a new connection to the cloud storage. To get around this, something to look into would be implementing something called a singleton pattern, which is outside the scope of this article.
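One common shape for that, sketched here with stand-in attributes rather than a real storage client, overrides __new__ so repeated construction returns the same instance:

```python
class FileStorageHandler:
    """Singleton sketch: every call returns the same instance, so the
    (imagined) cloud connection is only set up once. No real GCS calls here."""
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._initialized = False
        return cls._instance

    def __init__(self):
        if self._initialized:
            return  # __init__ runs on every call; skip the expensive setup
        self._initialized = True
        self.download_folder = object()  # imagine initialize_cloud_storage() here

a = FileStorageHandler()
b = FileStorageHandler()
print(a is b)                                  # True
print(a.download_folder is b.download_folder)  # True -- one connection, shared
```

A simpler alternative in Python is to just create one instance at module level and treat the module itself as the singleton, though that brings back the import-timing questions discussed above.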

Also, the code currently doesn't handle exceptions that might arise during initialization (e.g., issues with credentials or network connectivity). Adding robust error-handling mechanisms will make the code more resilient.

Speaking of robustness, I would be remiss if I didn't point out that a properly abstracted initialization method should retrieve the bucket name from a configuration or .env file instead of leaving it hardcoded in the method itself.
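One way that might look, assuming a made-up GCS_BUCKET_NAME variable in the .env file (the helper name is hypothetical too):

```python
import os

def get_bucket_name() -> str:
    """Hypothetical helper: read the bucket name from the environment
    (populated by load_dotenv) instead of hardcoding it in the method."""
    bucket_name = os.environ.get("GCS_BUCKET_NAME")
    if bucket_name is None:
        raise RuntimeError("GCS_BUCKET_NAME is not set; check your .env file")
    return bucket_name

os.environ["GCS_BUCKET_NAME"] = "my-demo-bucket"  # would normally come from .env
print(get_bucket_name())  # my-demo-bucket
```

Failing loudly with a clear message when the variable is missing is a friendlier version of the KeyError that started this whole adventure.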


