Understanding MLOps in the Context of a Movie Website
I. Introduction to MLOps (Machine Learning Operations)
A. Key Aspects
- Complexity
- Error Handling
- Handling New Data
- Deterministic Models
- Zero Drift
B. Context: Movie Website Architecture
- Database (SQL)
- Backend
- Frontend
- Recommender System (Model)
- Data flow: Database → Backend → Frontend
- Model: Retrieves data, calculates ratings, selects top movies
- Frontend displays recommendations to users
- Deterministic and Zero Drift:
- Changes in the database (e.g., new movies) don't affect other parts: the model applies the same fixed rule, so the same data always produces the same recommendations.
- White box model: Explains why specific recommendations are made.
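A minimal sketch of the deterministic "top movies" model described above, assuming a SQL database with hypothetical `movies` and `ratings` tables (the table names, column names, and sample rows are illustrative, not from the original system). It retrieves the data, calculates an average rating per movie, and selects the top entries. Because the ranking is a fixed rule, the average itself explains why a movie was recommended.

```python
import sqlite3


def top_movies(conn, n=3):
    """Return the n movies with the highest average rating.

    Ties are broken alphabetically by title so the output is fully
    deterministic for a given database state.
    """
    rows = conn.execute(
        """
        SELECT m.title, AVG(r.rating) AS avg_rating
        FROM movies AS m
        JOIN ratings AS r ON r.movie_id = m.id
        GROUP BY m.id, m.title
        ORDER BY avg_rating DESC, m.title ASC
        LIMIT ?
        """,
        (n,),
    ).fetchall()
    return [(title, round(avg, 2)) for title, avg in rows]


def build_demo_db():
    """Create an in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE ratings (movie_id INTEGER, rating REAL);
        INSERT INTO movies VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        INSERT INTO ratings VALUES (1, 4.0), (1, 5.0), (2, 3.0), (3, 5.0);
        """
    )
    return conn


if __name__ == "__main__":
    print(top_movies(build_demo_db(), n=2))
```

Calling `top_movies` twice on the same database returns the same list, which is the "zero drift" property: the output only changes when the data changes.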
II. Introduction of Machine Learning
A. Complexity Increase
- Motivation: loss of customers and decreased engagement when every user sees the same list.
- Move from recommending only top movies to personalized recommendations.
- Personalized Recommendation:
- Use the user's own history, or a collaborative recommender system that draws on similar users.
- Example: User A and B have similar preferences; recommend a movie liked by B to A.
- Technology Stack Changes:
- Backend: Node.js or Python
- Frontend: HTML, CSS, JS
- Emphasis on collecting quality data for personalized recommendations.
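The User A / User B example above can be sketched as a small user-based collaborative filter. This is one simple way to do it, not necessarily what a production system would use: similarity is cosine similarity over rating vectors (unrated movies count as 0), and the function names, the `min_rating` threshold, and the sample ratings are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {movie: rating} dicts.

    Movies a user has not rated are treated as 0.
    """
    shared = set(a) & set(b)
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(r * r for r in a.values()))
    norm_b = math.sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(ratings, user, min_rating=4.0):
    """Recommend movies the most similar user liked and `user` has not rated."""
    others = [u for u in ratings if u != user]
    if not others:
        return []
    neighbour = max(
        others, key=lambda u: cosine_similarity(ratings[user], ratings[u])
    )
    return sorted(
        movie
        for movie, rating in ratings[neighbour].items()
        if rating >= min_rating and movie not in ratings[user]
    )


if __name__ == "__main__":
    # Made-up ratings: A and B agree on two movies, C disagrees with A.
    ratings = {
        "A": {"Inception": 5, "Interstellar": 4},
        "B": {"Inception": 5, "Interstellar": 5, "Arrival": 5},
        "C": {"Inception": 1, "The Notebook": 5},
    }
    print(recommend(ratings, "A"))
```

Here B is A's closest neighbour, so the movie B liked that A has not rated is recommended to A. The result now depends on what other users have rated, which is why data quality matters more than in the top-movies model.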
B. Introduction of Data Warehouse
- Collecting user data on preferences and interactions.
- Introduction of Data Warehouse for handling data separately from the main database.
- Data Warehouse Importance:
- Building models separate from the main database.
- Integration of external data (e.g., IMDb ratings) for model enhancement.
- Extract, Transform, Load (ETL)
- An API is used to transfer data to the Data Warehouse.
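A rough sketch of the extract, transform, and load steps, using two SQLite databases to stand in for the main database and the Data Warehouse. Everything named here is hypothetical: the `interactions` and `fact_ratings` tables, the 1 to 5 rating range, and the `external_ratings` dict, which is a placeholder for external data such as IMDb ratings (no real IMDb API is called).

```python
import sqlite3


def extract(source):
    """Extract: read raw interaction rows from the main database."""
    return source.execute(
        "SELECT user_id, movie_title, rating FROM interactions"
    ).fetchall()


def transform(rows, external_ratings):
    """Transform: normalise titles, drop invalid ratings, attach external scores."""
    cleaned = []
    for user_id, title, rating in rows:
        if rating is None or not 1 <= rating <= 5:
            continue  # assumed valid range; out-of-range rows are discarded
        key = title.strip().lower()
        cleaned.append((user_id, key, float(rating), external_ratings.get(key)))
    return cleaned


def load(warehouse, rows):
    """Load: append the cleaned rows to the warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_ratings "
        "(user_id INTEGER, movie_title TEXT, rating REAL, external_rating REAL)"
    )
    warehouse.executemany("INSERT INTO fact_ratings VALUES (?, ?, ?, ?)", rows)
    warehouse.commit()
    return len(rows)


def run_etl(source, warehouse, external_ratings):
    """Run the three steps in order and return the number of rows loaded."""
    return load(warehouse, transform(extract(source), external_ratings))
```

Keeping the three steps as separate functions makes each one testable on its own, and the models can then be built from `fact_ratings` without querying the main database.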
C. Collaboration and Complexity
- More team collaboration.
- Involvement of data science teams in model building.
III. Error Handling and Drift Management
A. Error Debugging
- Errors can originate in the data, the code, or the model.
- Debugging becomes critical in a more complex system.
B. Handling New Data
- Use ETL to bring new data into the Data Warehouse.
- Data Science Team: Retraining the model based on the updated data.
C. Probabilistic Nature and Governance
- Model may recommend content inappropriate for certain audiences.
- Privacy policies need to be updated to reflect the user data being collected.
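One possible guardrail for the inappropriate-content risk is a rule-based filter applied after the model has produced its recommendations. The sketch below assumes each movie has a certificate label and maps labels to a minimum viewer age. The `RATING_MIN_AGE` mapping and the function name are illustrative assumptions, not an official rating scheme, and movies with a missing or unknown certificate are excluded.

```python
# Assumed mapping from certificate label to minimum viewer age (illustrative only).
RATING_MIN_AGE = {"G": 0, "PG": 8, "PG-13": 13, "R": 17}


def filter_for_audience(recommendations, certificates, viewer_age):
    """Keep only recommendations the viewer is old enough to see.

    Titles with a missing or unrecognised certificate are dropped, so the
    filter fails closed rather than open.
    """
    allowed = []
    for title in recommendations:
        min_age = RATING_MIN_AGE.get(certificates.get(title))
        if min_age is not None and viewer_age >= min_age:
            allowed.append(title)
    return allowed


if __name__ == "__main__":
    recs = ["Toy Film", "Gritty Film", "Unrated Film"]  # made-up titles
    certs = {"Toy Film": "G", "Gritty Film": "R"}
    print(filter_for_audience(recs, certs, viewer_age=10))
```

Because the filter is deterministic, it gives a predictable safety layer on top of a probabilistic model.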
D. Drift in Data
- Personal events (e.g., breakup) can shift preferences.
- External factors (e.g., celebrity endorsement) influence movie popularity.
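A simple way to notice drift like this is to compare the distribution of what users watched in a reference period against a recent period. The sketch below uses total variation distance over genre frequencies. The choice of metric, the 0.2 threshold, and the function names are assumptions for illustration, and real systems often use other statistics.

```python
from collections import Counter


def genre_distribution(events):
    """Turn a list of watched genres into relative frequencies."""
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {genre: count / total for genre, count in counts.items()}


def total_variation_distance(p, q):
    """Half the sum of absolute differences between two distributions (0 to 1)."""
    genres = set(p) | set(q)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in genres)


def has_drifted(reference_events, current_events, threshold=0.2):
    """Flag drift when the two periods differ by more than the assumed threshold."""
    distance = total_variation_distance(
        genre_distribution(reference_events),
        genre_distribution(current_events),
    )
    return distance > threshold


if __name__ == "__main__":
    # Made-up viewing logs: a shift from mostly romance to mostly action.
    last_month = ["romance"] * 6 + ["action"] * 4
    this_month = ["romance"] * 2 + ["action"] * 8
    print(has_drifted(last_month, this_month))
```

A drift flag would then be a trigger for the data science team to retrain the model on the updated Data Warehouse contents.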
IV. Multi-Model Approach
- Introduction of multiple models for different purposes (e.g., recommendations, relevant ads).
V. Conclusion
- MLOps introduces complexity in the form of personalized recommendations, data warehousing, collaboration, error handling, and drift management.
- It requires a multidisciplinary approach involving data engineering, data science, and continuous adaptation.