The Sparks Foundation-GRIP-Intern
Jan21 batch intern @ The Sparks Foundation
Directory Tree
├── Student prediction T1 TSF.ipynb
├── Cluster visulization T2 TSF.ipynb
├── EDASampleSuperstore T3 TSF.ipynb
├── EDAGlobal Terrorism T4 TSF.ipynb
├── Indian Premier League T5 TSF.ipynb
├── Decision Tree Algorithm T6 TSF.ipynb
└── Stock Market T7 TSF.ipynb
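Table of Contents
- Task-1 Prediction using Supervised ML
- Task-2 Prediction using Unsupervised ML, Visualization
- Task-3 EDA
- Task-4 EDA
- Task-5 EDA
- Task-6 Prediction using Decision Tree Algorithm
- Task-7 Stock Market Prediction using Numerical and Textual Analysis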
Task-1 Prediction using Supervised ML
Problem statement
● Predict the score of a student who studies for 9.25 hrs/day.
● dataset: http://bit.ly/w-data
Solution
The predicted score is : [92.80850057]
.ipynb: https://karanmehra7107.github.io/TSF-GRIP-Intern-Tasks/Student%20prediction%20T1%20TSF.html
Task-2 Prediction using Unsupervised ML, Visualization
Problem statement
● Predict the optimum number of clusters and represent it visually.
● dataset: https://bit.ly/3kXTdox
Solution
Domain knowledge
.ipynb: https://karanmehra7107.github.io/TSF-GRIP-Intern-Tasks/Cluster%20visulization%20T2%20TSF.html
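One common way to choose the number of clusters is the elbow method: run k-means for several values of k and look for the point where the within-cluster sum of squares (WCSS) stops dropping sharply. The source does not say which method the notebook uses, so the following is only a minimal sketch of that technique. The `points` list is invented for the example, and the farthest-point initialization is a simplification chosen here to keep the run deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic start: take the first point, then repeatedly add the
    point that is farthest from every center chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans_wcss(points, k, max_iter=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(sq_dist(p, centers[i]) for i, members in enumerate(clusters) for p in members)


# Made-up 2D points forming three well-separated groups (NOT the task dataset).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]

wcss = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k, value in wcss.items():
    print(f"k={k}: WCSS={value:.2f}")
```

For this toy data the WCSS falls steeply up to k=3 and only slightly after that, which is the "elbow". Plotting `wcss` against k gives the usual visual form of the same check.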
Task-3 EDA
Problem statement
● Perform ‘Exploratory Data Analysis’ on the ‘SampleSuperstore’ dataset.
● As a business manager, try to find the weak areas where you can work to make more profit.
● What business problems can you derive by exploring the data?
● dataset: https://bit.ly/3i4rbWl
Solution
.ipynb: https://karanmehra7107.github.io/TSF-GRIP-Intern-Tasks/SampleSuperstore%20T3.html
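A typical first step toward finding "weak areas" is to total profit per group and sort the result so that loss-making groups surface first. The sketch below shows that aggregation with the standard library only. The rows, the field names (`sub_category`, `sales`, `profit`) and the values are hypothetical stand-ins, not taken from the actual dataset.

```python
from collections import defaultdict

# Hypothetical rows; the real dataset's column names and values may differ.
orders = [
    {"sub_category": "Tables", "sales": 500.0, "profit": -80.0},
    {"sub_category": "Tables", "sales": 300.0, "profit": -20.0},
    {"sub_category": "Chairs", "sales": 400.0, "profit": 50.0},
    {"sub_category": "Phones", "sales": 900.0, "profit": 120.0},
    {"sub_category": "Phones", "sales": 200.0, "profit": 30.0},
    {"sub_category": "Binders", "sales": 60.0, "profit": -10.0},
    {"sub_category": "Binders", "sales": 90.0, "profit": 25.0},
]


def profit_by(rows, key):
    """Total profit per distinct value of `key`, sorted from worst to best."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["profit"]
    return sorted(totals.items(), key=lambda item: item[1])


ranking = profit_by(orders, "sub_category")
loss_makers = [name for name, profit in ranking if profit < 0]

for name, profit in ranking:
    print(f"{name:10s} {profit:8.2f}")
print("Loss-making groups:", loss_makers)
```

The same function can be pointed at any other grouping field (for example a region or customer segment, if the data has one) by changing the `key` argument.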
Task-4 EDA
Problem statement
● Perform ‘Exploratory Data Analysis’ on the ‘Global Terrorism’ dataset.
● As a security/defense analyst, try to find the hot zones of terrorism.
● What security issues and insights can you derive from the EDA?
● dataset: https://bit.ly/2TK5Xn5
Solution
Task-5 EDA
Problem statement
● Perform ‘Exploratory Data Analysis’ on the ‘Indian Premier League’ dataset.
● As a sports analyst, find the most successful teams and players, and the factors contributing to a team's win or loss.
● Suggest teams or players a company should endorse for its products.
● dataset: https://bit.ly/34SRn3b
Solution
Task-6 Prediction using Decision Tree Algorithm
Problem statement
● Create the Decision Tree classifier and visualize it graphically.
● The goal is that when new data is fed to this classifier, it predicts the right class.
● dataset: https://bit.ly/3kXTdox
Solution
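To make the idea concrete, here is a minimal sketch of a decision tree classifier built from scratch with Gini impurity, plus a plain-text rendering of the fitted tree. It is not the notebook's solution (which is not shown here) and it is not a graphical plot. The sample rows, the feature names `length` and `width`, and the class labels are invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini the most,
    or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively grow the tree; leaves are plain class labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [lab for _, lab in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [lab for _, lab in right], depth + 1, max_depth),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, dict):
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree


def render(tree, names, indent=""):
    """Return an indented text view of the tree."""
    if not isinstance(tree, dict):
        return f"{indent}predict {tree}\n"
    name, t = names[tree["feature"]], tree["threshold"]
    return (f"{indent}if {name} <= {t}:\n" + render(tree["left"], names, indent + "    ")
            + f"{indent}else:\n" + render(tree["right"], names, indent + "    "))


# Invented samples (NOT the task dataset): (length, width) -> class label.
rows = [(1.4, 0.2), (1.5, 0.3), (1.3, 0.2),
        (4.5, 1.4), (4.7, 1.5), (4.4, 1.3),
        (5.8, 2.2), (6.0, 2.4), (5.9, 2.1)]
labels = ["small"] * 3 + ["medium"] * 3 + ["large"] * 3

tree = build_tree(rows, labels)
print(render(tree, ["length", "width"]))
print(predict(tree, (4.6, 1.4)))
```

Once the tree is built, `predict` can be called on any new `(length, width)` pair, which is the "feed new data" step the problem statement describes.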
Task-7 Stock Market Prediction using Numerical and Textual Analysis
Problem statement
● Objective: Create a hybrid model for stock price/performance prediction using numerical analysis of historical stock prices and sentiment analysis of news headlines.
● Stock to analyze and predict: SENSEX (S&P BSE SENSEX)
● datasets:
  - Historical stock prices: https://finance.yahoo.com/?guccounter=1
  - Textual (news) data: https://bit.ly/36fFPI6
Solution
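The "hybrid" part of this task comes down to putting price-based and text-based features in the same row for each trading day. The sketch below shows one possible way to do that alignment. It is an assumption-laden illustration, not the author's pipeline: the tiny word lists, the dates, prices, and headlines are all made up, and a real solution would likely use a proper sentiment library and a regression or time-series model on top of these rows.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical mini-lexicon; a real project would use a full sentiment tool.
POSITIVE = {"gain", "surge", "record", "growth"}
NEGATIVE = {"fall", "loss", "crash", "slump"}


def headline_sentiment(headline: str) -> int:
    """Count of positive words minus count of negative words."""
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def daily_sentiment(headlines):
    """Average headline score per date, from (date, text) pairs."""
    by_date = defaultdict(list)
    for date, text in headlines:
        by_date[date].append(headline_sentiment(text))
    return {date: mean(scores) for date, scores in by_date.items()}


def build_features(closes, sentiment):
    """Pair each day's close with the previous close and that day's sentiment.
    `closes` must be (date, close) pairs sorted by date; days without
    headlines get a neutral sentiment of 0.0."""
    rows = []
    for (_, prev_close), (date, close) in zip(closes, closes[1:]):
        rows.append({
            "date": date,
            "prev_close": prev_close,                # numerical feature
            "sentiment": sentiment.get(date, 0.0),   # textual feature
            "close": close,                          # prediction target
        })
    return rows


# Made-up sample data (NOT real SENSEX prices or real headlines).
closes = [("2020-01-01", 100.0), ("2020-01-02", 102.0), ("2020-01-03", 99.0)]
headlines = [
    ("2020-01-02", "Markets surge to record high"),
    ("2020-01-02", "Banks report loss"),
    ("2020-01-03", "Index in slump after crash"),
]

features = build_features(closes, daily_sentiment(headlines))
for row in features:
    print(row)
```

Each resulting row holds one numerical input, one textual input, and the target, so it can be handed to whichever model is chosen for the prediction step. Whether same-day or previous-day headlines should be used is a design decision this sketch does not settle.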
Team
License
Copyright 2020 Karan Mehra
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.