This repository provides two implementations of an HS code classification API, one built with FastAPI and one with Flask. Both classify product descriptions into HS codes, with predictions generated by trained Logistic Regression and Artificial Neural Network (ANN) models.
- ANN_model.h5: Trained Artificial Neural Network (ANN) model.
- ann_label_encoder.pkl: Label encoder for the ANN model.
- ann_vectorizer.pkl: Vectorizer for the ANN model.
- count_vectorizer.pkl: Bag of words vectorizer used in Logistic Regression.
- label_encoder.pkl: General label encoder.
- lr_bow.pkl: Trained Logistic Regression model with bag of words.
- main.py: FastAPI implementation of the HS code classification API.
- main_flask.py: Flask implementation of the HS code classification API.
- Client Request:
  - The client sends a POST request to the /predict endpoint with a JSON body containing a list of descriptions.
- Request Handling:
  - The API (built using either FastAPI or Flask) receives the request and routes it to the predict function.
- Preprocessing:
  - Descriptions are preprocessed using a custom function that:
    - Converts the text to lowercase.
    - Removes special characters.
    - Tokenizes the text.
    - Removes stop words.
    - Lemmatizes the tokens.
    - Joins the tokens back into a cleaned string.
- Vectorization:
  - The preprocessed descriptions are transformed using the appropriate vectorizer (ann_vectorizer.pkl for ANN or count_vectorizer.pkl for Logistic Regression).
- Prediction:
  - The model (either ANN or Logistic Regression) makes predictions based on the vectorized descriptions and outputs probabilities for each class (HS code).
- Top-N Prediction Extraction:
  - The top 3 class indices and their respective probabilities are extracted from the model's output.
- Label Decoding:
  - The top class indices are decoded into human-readable HS codes using the label encoder.
- Response Construction:
  - The API constructs a response containing the original descriptions and their corresponding top 3 predictions. Each prediction includes the HS code and its associated probability.
- Response:
  - The predictions are returned as a JSON response to the client.
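The preprocessing and top-N extraction steps above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names `preprocess` and `top_n`, the small `STOP_WORDS` set, and the sample labels are all assumptions. Lemmatization is left as a pluggable identity placeholder; in practice it would likely come from an NLP library such as NLTK, and the labels would come from the label encoder rather than a hand-written list.

```python
import re

# Tiny illustrative stop-word set (an assumption); the repository's actual list is not shown.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def preprocess(text, lemmatize=lambda token: token):
    """Clean one description following the preprocessing steps listed above."""
    text = text.lower()                                   # 1. lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)              # 2. replace special characters with spaces
    tokens = text.split()                                 # 3. tokenize on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]   # 4. remove stop words
    tokens = [lemmatize(t) for t in tokens]               # 5. lemmatize (identity placeholder here)
    return " ".join(tokens)                               # 6. join back into a cleaned string


def top_n(probabilities, labels, n=3):
    """Return the n most probable (label, probability) pairs, highest first."""
    ranked = sorted(range(len(probabilities)), key=lambda i: probabilities[i], reverse=True)
    return [(labels[i], probabilities[i]) for i in ranked[:n]]


cleaned = preprocess("The Cotton T-Shirts, for Men!")
# Placeholder labels and probabilities, for illustration only.
best = top_n([0.1, 0.5, 0.05, 0.35], ["6109", "6205", "8471", "6110"])
```

Sorting indices rather than values keeps each probability paired with its class index, which is what the label-decoding step needs.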
This endpoint accepts a list of product descriptions and returns the top 3 HS codes with their respective probabilities.
- URL: /predict
- Method: POST
- Content-Type: application/json
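The exact shape of the JSON response is not spelled out here, so the following is only a sketch of how the top 3 predictions might be paired with their descriptions. The function name `build_response` and the keys `results`, `description`, `predictions`, `hs_code`, and `probability` are hypothetical, and the sample values are made up for illustration.

```python
import json


def build_response(descriptions, predictions):
    """Pair each description with its top predictions.

    `predictions` holds one list of (hs_code, probability) tuples per description.
    All dictionary keys below are hypothetical; the API's actual field names may differ.
    """
    results = []
    for description, top in zip(descriptions, predictions):
        results.append({
            "description": description,
            "predictions": [
                {"hs_code": code, "probability": round(float(prob), 4)}
                for code, prob in top
            ],
        })
    return {"results": results}


# Made-up sample values, for illustration only.
example = build_response(
    ["cotton t-shirt"],
    [[("6109", 0.82), ("6110", 0.11), ("6205", 0.04)]],
)
print(json.dumps(example, indent=2))
```

Converting each probability with `float()` matters when the values come from NumPy arrays, since NumPy scalar types are not JSON serializable by default.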
Requirements:
- Python 3.7+
- FastAPI
- Flask
- scikit-learn
- TensorFlow
To clone the repository, use the following commands:
git clone https://github.com/Muhammad-Talha4k/hs_code_classification_api_with_fast-flask.git
cd hs_code_classification_api_with_fast-flask
Send a POST request to the /predict endpoint with the descriptions in a JSON request body of this form:
{
  "descriptions": [
    "Product description 1",
    "Product description 2"
  ]
}
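As a rough client-side sketch, the request above could be built with Python's standard library. The base URL `http://localhost:8000` and the helper name `build_predict_request` are assumptions; adjust the host and port to wherever the FastAPI or Flask app is actually served.

```python
import json
import urllib.request

# Assumed base URL: adjust host and port to wherever the API is actually running.
BASE_URL = "http://localhost:8000"


def build_predict_request(descriptions, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /predict endpoint."""
    body = json.dumps({"descriptions": descriptions}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_predict_request(["Product description 1", "Product description 2"])

# To send it against a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read().decode("utf-8")))
```

Building the request separately from sending it makes the payload easy to inspect before any server is running.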
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.