Need feedback on my Senior Thesis: An automated MLOps pipeline for AI news classification & summarization [D]

Hi everyone,

I'm currently a senior (4th-year undergrad) working on my graduation thesis. For my project, I decided to build an automated MLOps system that aggregates, classifies, and summarizes AI-related news.

Here’s a quick breakdown of how the system works:

  1. Data Ingestion: The system automatically scrapes news articles at scheduled intervals.
  2. Classification: It categorizes the scraped articles into four labels: Market, Solution & Use Case, Deep Dive, and Noise.
  3. Summarization: It then passes the relevant articles through the Gemini API to generate concise summaries.
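
A minimal sketch of how these three stages could fit together, under stated assumptions rather than the actual thesis code. The `Article` type, the keyword-based `classify` function, and the `run_pipeline` helper are all hypothetical names introduced for illustration. The keyword rules stand in for whatever trained classifier the system uses, and `summarize` is passed in as a plain callable so that a real Gemini API call (not shown here) or a test stub can be swapped in. It also assumes that "relevant" means any label other than Noise.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

# The four labels named in the post.
LABELS = ("Market", "Solution & Use Case", "Deep Dive", "Noise")


@dataclass
class Article:
    """Hypothetical record for one scraped article."""
    url: str
    text: str
    label: Optional[str] = None
    summary: Optional[str] = None


def classify(text: str) -> str:
    """Placeholder keyword classifier; a trained model would replace this."""
    lowered = text.lower()
    if "funding" in lowered or "acquisition" in lowered:
        return "Market"
    if "case study" in lowered or "deployed" in lowered:
        return "Solution & Use Case"
    if "architecture" in lowered or "benchmark" in lowered:
        return "Deep Dive"
    return "Noise"


def run_pipeline(
    articles: Iterable[Article],
    summarize: Callable[[str], str],
) -> List[Article]:
    """Classify every article, then summarize only the non-Noise ones."""
    results = []
    for article in articles:
        article.label = classify(article.text)
        # Assumption: "relevant" means any label other than Noise.
        if article.label != "Noise":
            article.summary = summarize(article.text)
        results.append(article)
    return results


if __name__ == "__main__":
    # Stub summarizer standing in for the Gemini API call.
    scraped = [
        Article("https://example.com/a", "Startup announces funding round"),
        Article("https://example.com/b", "Weather is nice today"),
    ]
    for item in run_pipeline(scraped, lambda text: text[:40]):
        print(item.label, "|", item.summary)
```

Injecting the summarizer as a callable keeps the external API out of the core loop, which makes the classification and filtering logic testable without network access or API keys.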

I've attached a diagram of my current deployment architecture below.

https://preview.redd.it/wqv7mg1e1kvg1.png?width=2410&format=png&auto=webp&s=60268ce337edaec085297f0b6d4566b0671b7efb

My ask: Honestly, my current setup still feels rudimentary. I don't have professional experience building production MLOps pipelines yet, so I'm nervous about presenting this and would really appreciate a reality check from you all.

  • What am I missing in this architecture?
  • Are there any best practices, tools, or steps (e.g., monitoring, CI/CD, data validation) I should add to make it more robust?
  • Any suggestions to level this up before my final defense?

I'm open to any critiques or advice you might have. Thank you so much in advance for your time and help!

submitted by /u/bigcityboys