With the globalization of business and education and the development of technology, demand for learning a new language online has surged over the past decade. Beyond extra time, more people are willing to invest money in learning in pursuit of...
[Read More]
US Stock Risk and Return Prediction
Applying Risk Factors to Machine Learning Models to Predict Medium-term Return Rate and Volatility
Stock market data have been studied extensively to explore trends in securities’ returns and risk. Factor models are the canonical, most widely used models for asset pricing and for selecting securities for portfolios. In this project, we aim to utilize various factors and...
[Read More]
Impact Analysis of Hosting the Olympic Games on Real Estate Prices by Location and Timestamp
A Project Submitted to Citadel Data Open 2020
We analyzed real estate market data in order to better understand the impact of the Olympic Games on the economy of host cities, taking the 2012 London Olympics and the 2000 Sydney Olympics as case studies. We chose these two cities because they represent different...
[Read More]
An Achromatic Approach of Compressing CNN Filters by Clustering Pattern-Specific Receptive Fields
Abstract: Transfer learning with CNNs has been widely used in computer vision tasks such as image classification and pattern detection. Such models are based on specific architectures that have solved similar problems, and pretrained weights are loaded for faster training as well as better performance. However, a...
[Read More]
Mini Project: Human-generated and Machine-generated Language Classification
A Basic Sequence Classification Problem using LSTM
Sequence classification is a basic problem in natural language processing. This mini project illustrates basic methods for sequence classification using an LSTM model. Such an algorithm can be used to detect spam comments or reviews on the internet.
[Read More]
Additive Manufacturing Melt Pool Physics Prediction Using Physical Simulation Data
A Brown Datathon First Prize Winning Project
Additive Manufacturing (AM), widely known as 3D printing, normally relies on physical simulation processes based on numerical PDEs and the associated thermal mathematical model. Microstructure simulations, however, can be difficult to scale to the macro level for part-level prediction. In this project, we used machine learning algorithms, specifically...
[Read More]
Handwritten Bengali Grapheme Classification
A Word Recognition Problem
Bengali is the 5th most spoken language in the world, with hundreds of millions of speakers. Given this, there is significant business and educational interest in developing AI that can optically recognize images of handwritten Bengali. In this project, we used transfer learning based...
[Read More]
NBA Data Visualization and Virtual Match Simulation
Using a Dash User Interface to Visualize Player/Team Performance Based on 2019-2020 NBA Player Data
In this data science project, an entire data engineering pipeline was built from scratch: obtaining raw data from the website, storing it in a database (MongoDB), retrieving and processing it from the database, and visualizing it based on users’ queries. Specifically, the data we are...
[Read More]
Breast Cancer Cell Classification
Prediction Accuracy and Sensitivity Analysis on Different Models for Breast Cancer Wisconsin Dataset
Forecasting breast cancer can significantly increase patients’ survival rate, and classifying sample tumor cells as malignant or benign is one of the best and most direct ways to make accurate predictions. The Breast Cancer Wisconsin dataset from the UCI Machine Learning Repository was chosen as...
[Read More]