Data Science Advanced Assignment


Question: Data Science Advanced Assignment


Hi there,

1) Combine data from 3 datasets: an API, a CSV file, and web scraping.

2) Data wrangling: filling missing values, generating new features, imputation, and user-defined functions. Use cluster analysis to explore the data. Produce 15 medium-to-complex graphs and plots using groupby, pivot tables, and cross-tabulation. Build a prediction model using machine learning techniques, e.g. regression, Naive Bayes, KNN.

3) Combine the results from 4 models and make a prediction.
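
For step 3, one common way to combine several classifiers is majority (hard) voting. The following is only a minimal sketch, assuming each of the four models has already produced a list of class labels for the same samples. The function name `majority_vote`, the fourth model (a decision tree), and all label values are made up for illustration and are not part of the assignment.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine class labels from several models by hard (majority) voting.

    predictions_per_model: list of equal-length label lists, one per model.
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*...) regroups the labels so each tuple holds every model's vote
    # for a single sample.
    for sample_votes in zip(*predictions_per_model):
        # On equal counts, most_common keeps first-seen order, so a tie
        # goes to the label voted by the earliest model in the list.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of four already-trained classifiers on three samples.
logreg_preds = ["yes", "no", "yes"]
naive_bayes_preds = ["yes", "no", "no"]
knn_preds = ["no", "no", "yes"]
tree_preds = ["yes", "yes", "yes"]

final = majority_vote([logreg_preds, naive_bayes_preds, knn_preds, tree_preds])
print(final)
```

With four voters, ties are possible, so the tie rule matters; weighting each model by its validation accuracy is one alternative worth considering.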


Cheers!

More Instructions
Project 3

Deadline: Submit by midnight Monday, 20th of October 2019.
Evaluation: 35% of your final course grade.
Late Submission: No late submissions accepted since this is the last week of the semester.

Work: This assignment is to be done in groups of up to three students. You will need to fill out and submit a form (to be provided) indicating your contribution to the project. You will be asked to evaluate your group members' contributions as well as your own. Identical grades are not guaranteed for each student in a group.

Purpose: To work in a group setting and to apply all machine learning and data mining skills learned so far on a real-world problem. Build a software package that demonstrates the application of your work and present this to the class. Learning outcomes 1 - 5 from the course outline.

Project outline: Create a data science artefact/deliverable which could consist of a notebook or a standalone application. This artefact will apply machine learning and data mining techniques on a chosen real-world problem domain. Investigate what kind of ongoing research is taking place on the chosen domain, and either replicate parts of this research or attempt something novel by modifying or extending the algorithms, or by augmenting the applications with a richer set of features.

Some possible domains and ideas are:
1. Financial and market data analysis: time-series analysis, time-series forecasting, stock market prediction.
2. Recommender engine: create an application for making recommendations based on user preferences.
3. Fitness data: analysis of your personal or some group's FitBit data.
4. Twitter: sentiment analysis, text classification, semantic analysis, network visualization, geospatial visualization, data storage etc.
5. Facebook: network visualization, geospatial visualisation, network analysis, natural language processing, data storage etc.
6. Data journalism: data visualization, infographics, text summarisation and classification, natural language semantic analysis.
7. Interesting real-time correlations: Twitter discussions about financial instruments and their shifts in price index etc.
8. A project related to a Kaggle dataset.
9. Process mining.
10. ...or something entirely different.

This project makes up a considerable proportion of your total mark. Therefore, your final work must be substantial. Form your groups early and come up with topics for your group at the earliest possible stage so that you can commence work on development. You are required to register your project and your team composition on the class Google Doc.

You are encouraged to use Python; however, this is not an absolute prerequisite for all parts of your project. If you are building a GUI-based application, Python does possess libraries that facilitate this; however, you can use Qt or technologies like .NET which can call your Python methods that implement your application.

Project Requirements:

Project details:
- Submit all your application code and experimental code in a mixture of .py and Notebook files as is appropriate for each project. Each project should submit at least one Notebook that contains all the key findings and summaries.
- Present and demonstrate your project to the class in a 15-minute presentation.
- Submit a document outlining the contribution that each person has made to the project. List in detail what each person has done and the percentage of the total contribution. Not all team members will necessarily receive the same mark.

Marking criteria: Marks will be awarded for different components of the project using the following rubric:

Component: Marks
Project presentation and demonstration: 20%
Originality: 15%
Project Python code, Notebooks, application of data science, substance and difficulty of the work undertaken: 35%

If you have any questions or concerns about this assignment, please ask the lecturer sooner rather than closer to the submission deadline.

Answers 0

No answers posted
