
CS491‑2 Senior Project Design Logbook

Emre Akgül
September 16, 2024 – May 2, 2025

Week of October 1, 2024

Week of October 7, 2024

Week of October 14, 2024

Week of October 21, 2024

Week of November 1, 2024

Week of November 4, 2024

Week of November 11, 2024

Week of November 18, 2024

Week of December 1, 2024

Week of December 8, 2024

Week of December 15, 2024

Week of December 22, 2024

Week of December 30, 2024

Week of January 1, 2025

  • Finals period and spring break; we paused work on the project during this time.

Week of February 1, 2025

  • Realized the dataset was not rich or complex enough for good model performance; began considering strategies for dataset enhancement.
  • Continued standard machine learning work to better understand model limitations.
  • Participated in weekly meeting with Prof. Altay Güvenir.

Week of February 8, 2025

  • Began collaborating with Alara to improve data acquisition using the SEC EDGAR API, significantly enriching the dataset.
  • Downloaded financial reports from 2001 to 2025 to improve dataset quality.
  • Participated in weekly meeting with Prof. Altay Güvenir.
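
A minimal sketch of how a request to the SEC EDGAR API could be prepared. The log does not record which endpoint we used, so the XBRL "company facts" endpoint below is an assumption, and the helper names `company_facts_url` and `build_request` are hypothetical. The sketch only builds the request; it does not send it.

```python
from urllib.request import Request

# Assumption: the XBRL "company facts" endpoint on data.sec.gov.
# The logbook does not say which EDGAR endpoint was actually used.
BASE_URL = "https://data.sec.gov/api/xbrl/companyfacts"


def company_facts_url(cik: int) -> str:
    """Return the company-facts URL for a CIK, zero-padded to 10 digits."""
    return f"{BASE_URL}/CIK{cik:010d}.json"


def build_request(cik: int, contact: str) -> Request:
    """Build (but do not send) a request that identifies the client.

    The SEC asks automated clients to declare themselves in the
    User-Agent header; `contact` is a placeholder for that string.
    """
    return Request(company_facts_url(cik), headers={"User-Agent": contact})
```

For example, `build_request(320193, "Name name@example.com")` targets the record for CIK 320193 (Apple Inc.); the resulting object could then be passed to `urllib.request.urlopen` to download the JSON.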

Week of February 15, 2025

  • Started a part-time role as an LLM researcher, which temporarily limited project progress.

Week of February 22, 2025

  • Continued balancing new role responsibilities with minimal project work.

Week of March 1, 2025

  • Began integrating LLM and chunking strategies into the data pipeline to extract structured financial data directly from textual reports.
  • Participated in weekly meeting with Prof. Altay Güvenir.
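
A minimal sketch of one possible chunking strategy: fixed-size word windows with overlap, so that a figure split across a boundary still appears whole in one chunk. The log does not record the actual chunk size, overlap, or splitting unit, so the function name `chunk_words` and its defaults are assumptions.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most `chunk_size` words.

    Hypothetical helper: sizes are illustrative, not the values used
    in the project.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be sent to the LLM separately, and the structured fields extracted from the chunks merged per report.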

Week of March 8, 2025

  • Successfully enriched dataset with chunked text data, extending dataset coverage back to 2001.
  • Encountered missing data points; began a manual correction and data-filling process.

Week of March 15, 2025

  • Completed painful manual dataset enrichment; initiated training with updated dataset.
  • Noted performance improvement in predictive models over baseline heuristics (latest quarter, latest year).
  • Participated in weekly meeting with Prof. Altay Güvenir.
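
A minimal sketch of the two baseline heuristics, assuming "latest quarter" means predicting the previous quarter's value and "latest year" means predicting the value of the same quarter one year earlier (four quarters back). That reading of "latest year" is an assumption, and `lag_baseline` is a hypothetical helper.

```python
def lag_baseline(values: list[float], lag: int) -> list[tuple[float, float]]:
    """Return (actual, predicted) pairs where the prediction for
    quarter t is simply the observed value at quarter t - lag."""
    return [(values[t], values[t - lag]) for t in range(lag, len(values))]


# Quarterly values for one company, oldest first (made-up numbers).
quarterly = [10, 12, 11, 13, 14, 15]

latest_quarter = lag_baseline(quarterly, lag=1)  # previous quarter
latest_year = lag_baseline(quarterly, lag=4)     # same quarter, prior year
```

A model is only useful if its error on these pairs' actual values is lower than the error of the lagged predictions.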

Week of March 22, 2025

  • Implemented extensive hyperparameter optimization for predictive models; faced difficulties due to financial data characteristics.
  • Experimented extensively with various validation schemes to find an effective evaluation strategy.

Week of April 1, 2025

  • Selected a validation strategy: train on quarterly data and validate on 2023 data. Validation results showed good predictive correlation with 2024 performance.
  • Participated in weekly meeting with Prof. Altay Güvenir.
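
A minimal sketch of a time-based split matching this strategy, assuming each record carries a `year` field, that everything before 2023 is used for training, and that 2024 is kept as a later held-out year. The field name and the helper `split_by_year` are hypothetical.

```python
def split_by_year(rows: list[dict], val_year: int = 2023, test_year: int = 2024):
    """Split quarterly records chronologically so that no future
    information leaks into training."""
    train = [r for r in rows if r["year"] < val_year]
    val = [r for r in rows if r["year"] == val_year]
    test = [r for r in rows if r["year"] == test_year]
    return train, val, test


# Made-up records: one entry per (year, quarter).
rows = [{"year": y, "quarter": q} for y in (2021, 2022, 2023, 2024) for q in (1, 2, 3, 4)]
train, val, test = split_by_year(rows)
```

Unlike a random split, this keeps the evaluation honest for forecasting: the model is always scored on quarters that come after everything it was trained on.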

Week of April 8, 2025

  • Evaluated models: Random Forest, XGBoost, LightGBM, and CatBoost. Found that Random Forest consistently outperformed the others.
  • Identified overfitting issues with boosting methods despite high regularization.

Week of April 15, 2025

  • Finalized the Random Forest model, achieving a 15% lower mean absolute error (net income) and a 30% lower RMSE (earnings per share) than the baselines.
  • Participated in weekly meeting with Prof. Altay Güvenir.
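
A minimal sketch of how these comparisons can be computed, assuming the percentages are relative error reductions against the baseline, i.e. (baseline error - model error) / baseline error. The function names are hypothetical and the numbers in the usage note are illustrative, not project results.

```python
import math


def mae(actual: list[float], predicted: list[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def relative_improvement(baseline_error: float, model_error: float) -> float:
    """Fraction by which the model reduces the baseline's error."""
    return (baseline_error - model_error) / baseline_error
```

For instance, a baseline error of 2.0 and a model error of 1.7 give `relative_improvement(2.0, 1.7)` of about 0.15, which would be reported as a 15% improvement.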

Week of April 22, 2025

  • Prepared final model documentation and started compiling results for project presentation.

Week of April 28, 2025

  • Reviewed final model and performance metrics, ensuring readiness for final presentation and report submission.
  • Participated in final project review meeting with Prof. Altay Güvenir.