sntk-76/README.md

Hi there, I'm Sina Tavakoli

Data Engineer | Data Scientist | Python Developer


Welcome to My GitHub Profile

I'm a Python developer working at the intersection of data science, data engineering, and machine learning. I hold a Bachelor's in Electrical Engineering and am currently pursuing a Master's in ICT. I specialize in building scalable, reliable data solutions that support decision-making and automation.

With over 4 years of hands-on Python experience, I've contributed to projects ranging from cloud-native data pipelines to deep learning-based sentiment analysis. I work best where I can combine analytical thinking with engineering rigor to deliver end-to-end solutions.


Core Competencies

  • Programming & Libraries: Python, SQL, Bash, Pandas, NumPy, Scikit-learn, TensorFlow
  • Data Engineering: Apache Airflow, Docker, Terraform, dbt, ETL/ELT pipelines, cloud storage (GCP & AWS)
  • Data Science: Machine Learning, Data Cleaning, Feature Engineering, Predictive Modeling
  • Tools: GitHub, Jupyter, VS Code, Tableau, Power BI, Google BigQuery
  • Platforms: Linux, GCP (BigQuery, Cloud Storage), AWS (S3, Lambda), WSL

Selected Projects

| Category | Project | Description |
| --- | --- | --- |
| Data Engineering | energy_forecast_pipeline | A cloud-native batch pipeline built with Terraform, Airflow, Spark, BigQuery, and Prophet to ingest, clean, forecast, and visualize Germany's energy consumption. Includes dbt modeling and Power BI dashboards. |
| Data Engineering | Retail Data Pipeline | A scalable pipeline built with Docker, Airflow, and BigQuery for ingestion, transformation, and visualization of retail data. Demonstrates CI/CD and modular ETL. |
| Deep Learning | Data Mining | LSTM-based Reddit post classifier for popularity prediction. |
| NLP | Abstract-Based Sentiment Analysis | Extracted sentiment from research abstracts using pretrained NLP models. |
| ML | Breast Cancer Detection | Developed SVM and Random Forest models for early-stage cancer detection. |
| Data Analysis | Customer Segmentation | Applied clustering (KMeans, DBSCAN) to segment customer behaviors. |
| Game Development | Snake Game | A modular Python implementation of the classic Snake game using Pygame. |
| Data Analysis | Google Play Store Analysis | Cleaned and visualized app metadata to explore rating and monetization trends. |
| ML | Classification Project 1 | Trained and tuned multiple classifiers (Logistic Regression, Random Forest, XGBoost). |
| ML | Car Price Estimation | Used regression techniques to estimate car prices based on key features. |
| Data Analysis | Market Analysis | Built dashboards and reports to visualize market trends and demand shifts. |
| Data Analysis | Analyzing Ukraine War | Analyzed public datasets to identify patterns in geopolitical event data. |
| Computer Vision | Corrupted Images & Patches | Detected and corrected corrupted image regions using OpenCV and CNNs. |
| Computer Vision | Road Sign Detection | Trained a YOLOv3-based object detector for road sign classification. |
| Computer Vision | Line Detection | Used edge detection and Hough Transform for identifying linear features. |
| Data Analysis | Olympics Game Network | Visualized network relationships among Olympic sports and athletes. |

Currently Mastering

  • Data Engineering Tools: Terraform, Airflow, Docker, BigQuery, Kafka
  • Scalable Architectures: Modular, production-grade pipelines for batch/stream processing
  • MLOps & Deployment: CI/CD, model versioning, container orchestration
  • Cloud Platforms: GCP and AWS with a focus on data infrastructure and cost-efficiency

Areas of Interest

  • Designing end-to-end data platforms for analytics and ML
  • Large-scale data cleaning and transformation
  • Data pipeline orchestration using Airflow and dbt
  • ML model deployment and lifecycle management
  • Cloud-native architectures for data-intensive systems

Let's Connect

LinkedIn
Kaggle
Email

