Week of October 1, 2024
- Started initial research on financial forecasting and reviewed related literature.
- Began exploring SEC EDGAR API for fetching financial datasets.
- Participated in project meeting with Atakan Erdem.
- Attended seminar: "Introduction to CS491-2 Senior Design Project" by Selim Aksoy.
Week of October 7, 2024
- Ran into problems with the SEC EDGAR API caused by inconsistent data and documentation.
- Started searching for alternative sources and datasets.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "Atakan Erdem and Mert Bıçakçı Introduction" by Atakan Erdem and Mert Bıçakçı.
Week of October 14, 2024
- Discovered a Kaggle dataset covering NASDAQ financial data from 2008–2023.
- Performed initial analysis to assess data completeness and quality.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "Innovation Life Cycle" by Atakan Erdem.
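
A completeness check like the one mentioned this week can be sketched as follows. This is only an illustration: the row format (a list of dicts) and the rule that `None` or an empty string counts as missing are assumptions, not the project's actual analysis code.

```python
from collections import Counter


def missing_share(rows, columns):
    """Return, per column, the fraction of rows where the value is missing.

    Assumption: a value is "missing" if the key is absent, None, or "".
    """
    if not rows:
        return {col: 0.0 for col in columns}
    missing = Counter()
    for row in rows:
        for col in columns:
            if row.get(col) in (None, ""):
                missing[col] += 1
    return {col: missing[col] / len(rows) for col in columns}
```

Columns with a high missing share are candidates for dropping or for later manual filling.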
Week of October 21, 2024
- Identified major data gaps and inconsistencies in the Kaggle dataset and traced them back to the original SEC EDGAR API issues.
- Began extensive data cleaning and preprocessing.
- Attended seminar: "The Role of Documentation in (OO) Software Development" by Uğur Doğrusöz.
Week of November 1, 2024
- Continued intensive data cleaning and validation to create a robust dataset for net income (loss) prediction.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "La Chanson de Shannon: Shannon's Song" by Fazlı Can.
Week of November 4, 2024
- Established a reasonably reliable dataset structure for net income prediction.
- Started investigating Kaggle competitions related to financial forecasting.
Week of November 11, 2024
- Joined a Kaggle competition focused on financial predictions to gain insights into effective modeling techniques.
- Experimented with initial feature engineering approaches.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "Tools and Processes in Software Development Lifecycle" by Murat Ergun.
Week of November 18, 2024
- Explored various tabular data models including Random Forest, XGBoost, LightGBM, and CatBoost.
- Evaluated their preliminary performance on financial forecasting tasks.
- Attended seminar: "Technology Entrepreneurship and Investment Ecosystem" by Numan Numan.
Week of December 1, 2024
- Started studying multimodal neural network architectures for potential integration of textual data from reports with tabular data.
- Investigated relevant literature and existing implementations.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "AI-Driven Mobile Apps: Building Profitable Solutions on a Budget" by Melih Gurgah.
Week of December 8, 2024
- Explored text-tabular and image-tabular multimodal architectures, focusing particularly on text-tabular designs.
- Began drafting initial architecture designs suitable for our data.
- Attended seminar: "Computational Law" by Dilek Küçük.
Week of December 15, 2024
- Implemented a basic multimodal text-tabular neural network.
- Identified issues with handling multi-chunk report data.
- Participated in weekly meeting with Prof. Altay Güvenir.
- Attended seminar: "Bitcoin's First ZK Rollup" by Murat Karademir and Ömer Talip Akalın.
Week of December 22, 2024
- Revised dataset structure and prepared documentation for smoother onboarding and future work.
- Participated in project retrospectives and planning sessions for upcoming semester.
- Participated in weekly meeting with Prof. Altay Güvenir and project meeting with Atakan Erdem; attended senior design seminar.
Week of December 30, 2024
- Finalized semester work; conducted dataset quality assurance checks.
- Prepared initial presentations and reports summarizing semester progress.
- Participated in weekly meeting with Prof. Altay Güvenir and project meeting with Atakan Erdem; attended senior design seminar.
Week of January 1, 2025
- Final exams and spring break; paused project work for this period.
Week of February 1, 2025
- Realized dataset was insufficiently rich and complex for optimal modeling performance; began considering strategies for dataset enhancement.
- Continued standard machine learning work to better understand model limitations.
- Participated in weekly meeting with Prof. Altay Güvenir.
Week of February 8, 2025
- Began collaborating with Alara to improve data acquisition using SEC EDGAR API, significantly enriching the dataset.
- Downloaded financial reports from 2001 to 2025 to improve dataset quality.
- Participated in weekly meeting with Prof. Altay Güvenir.
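
For reference, a minimal sketch of pulling net income values out of an EDGAR company-facts payload. The log does not say which endpoint was used; the URL layout and the JSON nesting below reflect my understanding of the `data.sec.gov` XBRL company-facts API and should be checked against the official documentation. The function names are hypothetical.

```python
def company_facts_url(cik: int) -> str:
    """Build the company-facts URL; EDGAR expects a 10-digit zero-padded CIK."""
    return f"https://data.sec.gov/api/xbrl/companyfacts/CIK{cik:010d}.json"


def extract_net_income(facts: dict) -> list:
    """Return sorted (period_end, value) pairs for the NetIncomeLoss tag.

    Assumption: values are reported in USD under the us-gaap taxonomy,
    and only 10-K / 10-Q filings are of interest.
    """
    entries = (
        facts.get("facts", {})
        .get("us-gaap", {})
        .get("NetIncomeLoss", {})
        .get("units", {})
        .get("USD", [])
    )
    return sorted(
        (e["end"], e["val"]) for e in entries if e.get("form") in ("10-K", "10-Q")
    )
```

The payload would be fetched separately (the SEC asks clients to send a descriptive `User-Agent` header) and passed in as a parsed dict, which keeps the extraction logic testable offline.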
Week of February 15, 2025
- Started part-time role as an LLM researcher, limiting project progress temporarily due to simultaneous work commitments.
Week of February 22, 2025
- Continued balancing new role responsibilities with minimal project work.
Week of March 1, 2025
- Began integrating LLM and chunking strategies into the data pipeline to extract structured financial data directly from textual reports.
- Participated in weekly meeting with Prof. Altay Güvenir.
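
The log does not record which chunking strategy was used. As an illustration only, a simple overlapping word-window chunker might look like this; the window size and overlap defaults are arbitrary assumptions.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 50) -> list:
    """Split text into windows of `size` words, each sharing `overlap` words
    with the previous window so that facts spanning a boundary are not cut."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Each chunk can then be sent to the LLM on its own, with the extracted fields merged per report afterwards.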
Week of March 8, 2025
- Successfully enriched dataset with chunked text data, extending dataset coverage back to 2001.
- Encountered challenges with missing data points; began manual correction and data filling.
Week of March 15, 2025
- Completed the labor-intensive manual dataset enrichment; started training on the updated dataset.
- Noted performance improvement in predictive models over baseline heuristics (latest quarter, latest year).
- Participated in weekly meeting with Prof. Altay Güvenir.
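
One reading of the two baseline heuristics is "predict the value from the previous quarter" and "predict the value from the same quarter one year earlier". The sketch below follows that reading, which is an assumption; the log does not define the baselines precisely.

```python
def lag_baseline(series, lag):
    """Pair each value with the value `lag` steps earlier as its prediction.

    lag=1 -> latest-quarter baseline; lag=4 -> latest-year baseline
    (assuming `series` holds one value per quarter in time order).
    """
    if lag <= 0:
        raise ValueError("lag must be positive")
    return [(series[i - lag], series[i]) for i in range(lag, len(series))]


def mean_absolute_error(pairs):
    """Mean absolute difference over (prediction, actual) pairs."""
    if not pairs:
        raise ValueError("no pairs to score")
    return sum(abs(pred - actual) for pred, actual in pairs) / len(pairs)
```

A model is only worth keeping if its error on the same periods is lower than both of these.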
Week of March 22, 2025
- Implemented extensive hyperparameter optimization for predictive models; faced difficulties due to the characteristics of financial data.
- Experimented with various validation schemes to find an effective evaluation strategy.
Week of April 1, 2025
- Selected a validation strategy: train on quarterly data and validate on 2023, since 2023 validation results correlated well with 2024 performance.
- Participated in weekly meeting with Prof. Altay Güvenir.
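
A time-based split along these lines could be sketched as below. The `fiscal_year` field name is hypothetical, and treating everything before 2023 as training data is an assumption about what "quarterly training" means here.

```python
def split_by_year(rows, valid_year=2023, test_year=2024, year_key="fiscal_year"):
    """Split rows chronologically so validation and test never leak into training."""
    train = [r for r in rows if r[year_key] < valid_year]
    valid = [r for r in rows if r[year_key] == valid_year]
    test = [r for r in rows if r[year_key] == test_year]
    return train, valid, test
```

Splitting by time rather than at random matters for financial data, because a random split would let the model see later quarters of the same company while predicting earlier ones.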
Week of April 8, 2025
- Evaluated models: Random Forest, XGBoost, LightGBM, and CatBoost. Found that Random Forest consistently outperformed the others.
- Identified overfitting issues with boosting methods despite strong regularization.
Week of April 15, 2025
- Finalized the Random Forest model, achieving a 15% lower mean absolute error (net income) and a 30% lower RMSE (earnings per share) than the baselines.
- Participated in weekly meeting with Prof. Altay Güvenir.
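
The relative improvements reported above are presumably computed as the reduction in error relative to the baseline error. A minimal sketch under that assumption (the numbers in the usage note are made up for illustration and are not project results):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - model_error) / baseline_error
```

For example, a baseline error of 2.0 and a model error of 1.7 give a relative improvement of about 0.15, i.e. 15%.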
Week of April 22, 2025
- Prepared final model documentation and started compiling results for project presentation.
Week of April 28, 2025
- Reviewed final model and performance metrics, ensuring readiness for final presentation and report submission.
- Participated in final project review meeting with Prof. Altay Güvenir.