MSNS-1927/README.md

Hi there 👋, I'm Siva

About me 🌟 :

  • 📈 I’m interested in Data Science.
  • 🎯 Currently working on real-world projects to build my skills.
  • ✌️ I’m willing to collaborate on Machine Learning projects.
  • 🎓 Pursuing a Bachelor of Technology in Artificial Intelligence and Data Science.
  • 📑 How to contact me:
  • 🌀 Pronouns: He/Him
  • 🥴 Fun fact: I cook when I'm stressed out.

My Projects 🥵 :

  • Multi-Label Toxic Comment Detection :
    • This project implements a multi-label toxic comment classification system using classical Natural Language Processing (NLP) and machine learning techniques. The goal is to automatically identify different types of toxic behavior in online comments, where a single comment may belong to multiple categories such as toxic, obscene, insult, threat, severe toxic, and identity hate.
    • The workflow includes data cleaning, feature engineering, and text vectorization using TF-IDF with unigrams and bigrams. A One-Vs-Rest Logistic Regression model is trained to handle the multi-label nature of the problem. Since the dataset is highly imbalanced, the model is evaluated using F1-score and ROC-AUC metrics rather than accuracy to ensure reliable performance assessment.
    • This project demonstrates a complete end-to-end machine learning pipeline for text classification, with a focus on interpretability, proper evaluation, and best practices in handling real-world NLP data. [https://github.com/msns-1927/Multi-Label-Toxic-Comment-Detection]
  • TelcoVision : AI-Powered Market Churn Prediction System :
    • Developed a machine learning system to predict telecom subscriber churn across India (2009–2025). Used XGBoost, LightGBM, and Ensemble models with circle-wise and operator-wise data for accurate forecasting. Built an interactive dashboard for real-time churn monitoring, risk segmentation, and business impact insights. Delivered a production-ready solution supporting data-driven retention and revenue optimization strategies. [https://github.com/msns-1927/telecom_subscriptions_churn_prediction_sys]
  • YouTube Video Downloader :
    • Developed a Python desktop application with Tkinter enabling easy YouTube video/audio downloads in multiple formats. Integrated yt-dlp and ffmpeg for efficient downloads, audio extraction, and conversion. Designed a simple interface with folder selection, error handling, and progress feedback, making content downloading accessible without command-line complexity. [https://github.com/msns-1927/yt_vd_downloader]
  • TED Talks Data Analysis & Visualization :
    • Preprocessed and analyzed the TED Talks dataset to explore patterns in speaker popularity and engagement.
    • Built and tuned binary classification models using hyperparameter optimization, achieving an accuracy improvement of 15% over baseline models.
    • Visualized topic trends and feature importance insights to guide future recommendation systems.
  • Marigold Harvesting Project (Community Service) :
    • Led a data collection and optimization initiative to improve yield prediction and resource allocation for agricultural efficiency.
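
For the Multi-Label Toxic Comment Detection project above, the TF-IDF plus One-Vs-Rest workflow might look roughly like the sketch below. This is a minimal illustration and not the repository's code: the toy comments, the two-label subset, `class_weight="balanced"`, and scoring on the training data are all assumptions made to keep the example short.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: a real run would load the full labelled dataset.
LABELS = ["toxic", "insult"]
texts = [
    "you are a stupid idiot",
    "what an idiot, total moron",
    "this is garbage and I hate it",
    "shut up, nobody wants you here",
    "thanks for the helpful answer",
    "great write-up, very clear",
    "I disagree but appreciate the detail",
    "could you share the source code",
]
# One row per comment, one column per label; a comment may carry several labels.
y = np.array([
    [1, 1],
    [1, 1],
    [1, 0],
    [1, 0],
    [0, 0],
    [0, 0],
    [0, 0],
    [0, 0],
])

# TF-IDF over unigrams and bigrams, then one binary logistic regression per label.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    OneVsRestClassifier(
        # class_weight="balanced" is an assumption, one common way to offset imbalance.
        LogisticRegression(class_weight="balanced", max_iter=1000)
    ),
)
model.fit(texts, y)

# Scored on the training data only to keep the sketch short;
# a real evaluation would use a held-out split.
pred = model.predict(texts)
proba = model.predict_proba(texts)

macro_f1 = f1_score(y, pred, average="macro", zero_division=0)
macro_auc = roc_auc_score(y, proba, average="macro")
print(f"macro F1: {macro_f1:.3f}  macro ROC-AUC: {macro_auc:.3f}")
```

Macro-averaged F1 and ROC-AUC weight every label equally, so a rare label such as threat counts as much as a common one, which is the reason to prefer them over accuracy on imbalanced data.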

Skills 🪄 :

  • Programming : Python, R (basics), SQL.
  • Machine Learning : Supervised/Unsupervised Learning, Feature Engineering, Model Evaluation, Ensemble Models, Model Deployment.
  • Frameworks & Tools : NumPy, Pandas, Matplotlib, Scikit-Learn, XGBoost, LightGBM, Jupyter Notebook, PyCharm, MS Office.
  • Visualization & Databases : Power BI, Looker Studio, Excel, PostgreSQL, pgAdmin4.
  • Soft Skills : Communication, Time Management, Team Leadership, Analytical Thinking, Attention to Detail.

Experience ✨ :

  • Data Scientist with Advanced Gen AI @ Innomatics Research Labs (Nov 2025 - Present) :
    • Actively engaging in assigned projects, products, and trainings related to Data Science, Artificial Intelligence, Machine Learning, and GenAI.
    • Applying theoretical concepts and putting best efforts into executing complex tasks to continuously upskill in advanced analytics and model development.
    • Acknowledged the highly confidential nature of the association and the requirement to keep all business information strictly confidential.
  • Associate Project Manager @ Excelerate (Oct 2025 - Nov 2025) :
    • Led team efforts to develop a comprehensive project plan for a Global Career Fair.
    • Directed the team's research and assessment of AI project management tools for task automation, resource allocation and risk management.
    • Compiled and presented the team's consolidated research and testing insights, delivering actionable AI integration recommendations to project associates.
  • Data Scientist @ Evoastra Ventures Inc. (Sep 2025 - Oct 2025) :
    • Delivered business-impact technical solutions by analyzing large datasets and optimizing data-driven workflows.
    • Collaborated closely with project leads to improve process efficiency by 20% through clear progress updates and automation.
    • Contributed to live projects involving machine learning pipelines and data visualization dashboards.
  • Data Analyst Associate @ Excelerate (Aug 2025 - Sep 2025) :
    • Extracted, cleaned, and analyzed large-scale datasets using PostgreSQL, pgAdmin4, and Excel to identify key performance trends.
    • Developed dynamic dashboards in Google Looker Studio, improving stakeholder decision visibility by 30%.
    • Produced performance reports that clarified business KPIs, supporting more data-informed planning.

Popular repositories

  1. yt_vd_downloader Public

    A simple Tkinter-based YouTube downloader built with Python and yt-dlp. Supports multiple resolutions and audio formats with a user-friendly interface.

    Python

  2. telecom_subscriptions_churn_prediction_sys Public

    "A machine learning solution for predicting telecom subscriber churn in India (2009–2025). Features data cleaning, engineered metrics, and accurate models (XGBoost, LightGBM), plus a dashboard for …

    Jupyter Notebook

  3. MSNS-1927 Public

  4. innomatics_tasks Public

    Practiced Programs

    Jupyter Notebook

  5. Multi-Label-Toxic-Comment-Detection Public

    “Built a multi-label toxic comment classification system using TF-IDF features and a One-Vs-Rest Logistic Regression model, evaluated with F1-score and ROC-AUC to handle class imbalance.”

    Jupyter Notebook