SAS Final Data Mining Project Due 5/3/2021

Posted Under: Information Technology

DESCRIPTION
I need my SAS final data mining project finished. I need screenshots of each step, with explanations, in a Word document before 5:30 pm EST on 5/3/2021. The instruction file is attached. If you have any questions, my number is 678-614-4380.


Attachments
Data Mining Final Project

The data mining project involves applying the data mining techniques discussed in class to one data set. The goal of the project is to go through the full data mining cycle for that data set, including the specification of the business problem to be solved, the specification of the data mining tasks to be performed, preprocessing and transformation of the data, application of several data mining methods and the discovery of patterns, evaluation of patterns, and recommendation of specific actions based on the relevant findings.

Project Dataset:

Please download the Bank Marketing data set (bank.csv) from http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip and import the data set into SAS Enterprise Miner (please see Appendix A for instructions on how to download the data set and import the data). This data set was collected from a Portuguese bank that used its own contact center to run direct marketing campaigns to attract term deposit clients. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the client would (“yes”) or would not (“no”) subscribe to the bank term deposit. The data set contains 4521 instances and 17 variables (16 input variables and 1 target variable). Please refer to the bank-names.txt file (downloadable at http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip) for a detailed description of the data set.

Project Deliverable: Data Mining Final Report (Submitted through the Assignment Submission Folder on CourseDen)

This is a comprehensive description of your project, which should fully describe the work done for the whole data mining analysis, not just the end results. Often, the whole analysis iterates through the data mining processes several times rather than in a single pass.
The process may include data preparation, data exploration and preprocessing, data mining methodologies, results analysis, conclusions, lessons learned, and so on. Therefore, a report that presents only the SAS Enterprise Miner output will receive a very low score. You should demonstrate your work using not only textual descriptions but also detailed screen shots in your report.

The project final report should include the following:

1. Cover page: your name, project name, and source of the dataset.

2. Objectives: A clear statement of the objectives of the data mining project: the problem that you are investigating and a summary of your goals for this project.

3. Data preparation: Discussion of the structure and characteristics of the data. After importing the data set into SAS Enterprise Miner, please set the appropriate roles (e.g., Input, Target, Rejected, etc.) and levels (e.g., Interval, Nominal, Binary, Ordinal, etc.) for the variables in the data set. Please provide both detailed textual descriptions and relevant screen shots of data importation and the roles and levels of the variables in the data set.

4. Data exploration and preprocessing: Discussion of the processes and results of any exploratory data analysis and data visualization performed on the data. Examination of different data preparation and transformation approaches to improve results for the given analysis tasks. What data exploration steps were performed? What are the results of data exploration? What preprocessing and transformation were done to make the data amenable to data mining? Describe your reasoning behind the data exploration, preprocessing, and transformation you performed. Please provide both detailed textual descriptions and relevant screen shots of data exploration, preprocessing, and transformation.
For example, you should perform data exploration to determine whether there are any unusual values, whether there is any missing data, and whether data transformation is required, and then perform data preprocessing and/or transformation (such as data replacement and/or filtering, data imputation, variable transformation, etc.) when necessary.

More specifically, you should examine the distributions of the variables by creating histograms for the variables in your data set (right click the “File Import” node, select “Edit Variables”, select the variable(s) you want to explore, and click the “Explore” button). Please provide both detailed textual descriptions and relevant screen shots of the histograms of the variables.

According to the histograms, are there any unusual values in any variables? Do you need to change the unusual values using the replacement node or remove the unusual cases using the filter node? If you decide to use the replacement node and/or the filter node, please provide both detailed textual descriptions and relevant screen shots of data replacement and/or filtering.

After that, please perform data partition. Again, please provide both detailed textual descriptions and relevant screen shots of data partition.

Are there any missing values? Please provide both detailed textual descriptions and relevant screen shots to show whether there are missing values or not. Do you need to impute any missing values? Please provide both detailed textual descriptions and relevant screen shots if data imputation is done.

According to the histograms, do you find any skewed distributions? Do you need to transform any variables due to skewed distributions? Please provide both detailed textual descriptions and relevant screen shots if skewed distributions are discovered and variable transformations are done.

5. Data mining process: The exploration of multiple data mining methods on the targeted data set.
Experimentation with different parameters to optimize the results of the chosen data mining techniques. The use of a variety of relevant techniques to determine the best approach to accomplish the data analysis tasks. Please provide both detailed textual descriptions and relevant screen shots of the data mining process.

For example, please choose certain data mining techniques, such as Decision Tree, Regression, Neural Networks, etc., to develop multiple data mining models on the data set and experiment with different parameters to optimize the model results. Please provide both detailed textual descriptions and relevant screen shots of model development. After that, please perform model comparison to select the best performing model using the model comparison node. Please also provide both detailed textual descriptions and relevant screen shots of model comparison.

6. Results and conclusions: Thorough discussion and analysis of data mining results, including an analysis of how well the approaches used accomplished the project objectives. Draw conclusions from your results. Please provide both detailed textual descriptions and relevant screen shots of the data mining results and conclusions.

For example, you should explain the results of the data mining models (e.g., the validation ASEs of the models, the number of leaves in the optimal tree for the decision tree model, the variables used for the splits in the decision tree model, the significant variables included in the regression model, etc.) as well as the results of model comparison (e.g., which model is selected as the best performing model, and based on which criterion?). Please provide both detailed textual descriptions and relevant screen shots of the model results and the model comparison results.
After that, please draw conclusions from the results by determining which factors are the best predictors of bank term deposit subscription and discussing their implications for successful bank marketing strategies.

7. (Graduate Students Only) Ethical issues in data mining: Explore the impact that data mining could have on privacy and the laws surrounding the privacy of personal data (4-5 pages, double-spaced).

8. Lessons learned and future work: Discuss what you have learned through the project and which concepts and techniques you learned in class are used in the project. Discuss potential extensions and future work.

9. References, if any.

Grading Criteria:

1. The writing quality of the report, such as the completeness of the contents with regard to the above requirements, and the coherence and correctness of the writing.
2. The effort made in data exploration and preprocessing.
3. Data mining skills and strategies.
4. Comprehensiveness of data analysis results and explanations. Discussions that draw on business background and go into depth are encouraged.
5. Completeness and timeliness of the report.

Issues You Should Tackle During Project Accomplishment:

1. How to conduct data exploration and preprocessing.
2. How to select the right variables for the models.
3. How to combine different data mining skills for the project, such as using stepwise regression to select variables for a neural network.
4. How to explain the data mining results.

Appendix A – How to Import the Bank Marketing Data Set into SAS Enterprise Miner

Please go to http://archive.ics.uci.edu/ml/machine-learning-databases/00222/ and click the “bank.zip” link to download and save the “bank.zip” zip folder on your computer. Then, please extract the “bank.zip” zip folder into a regular folder, which contains three files: the bank.csv file, the bank-full.csv file, and the bank-names.txt file. The bank-full.csv file will not be used for this data mining project and can be deleted from your computer.
Please open the bank-names.txt file to read the description of the data set and develop an understanding of the objectives of this project. After that, please follow the steps below to import the data set from the bank.csv file into SAS Enterprise Miner for this project.

1. Double click the bank.csv file to open it in Excel.
2. Click letter “A” on top of the first column to select the entire column A in Excel, then click the “Data” tab, and click “Text to Columns”.
3. Select the “Delimited” radio button, and click the “Next” button.
4. Uncheck the “Tab” checkbox, check the “Semicolon” checkbox, select “ " ” as the Text Qualifier, and click the “Next” button.
5. Select the “General” radio button, enter $A$1 as the Destination, and click the “Finish” button.
6. Click the “File” menu, select “Save As”, click “Browse”, navigate to the folder where you want to save the file, select “Excel Workbook (*.xlsx)” in the “Save as type:” box, enter a file name in the “File name:” box, and click the “Save” button.
7. Start SAS Enterprise Miner, create a new project, and then create a new diagram.
8. Click the “Sample” tab and add a “File Import” node to your new diagram.
9. Select the “File Import” node in the diagram workspace, go to the property panel, and click the button of the "Import File" property in the "Train" section.
10. Select “My Computer” and click on “Browse...”.
11. Select the Excel file you want to import, click “Preview” to make sure the data set will be properly imported into SAS Enterprise Miner, and click “OK”.
12. Right click the “File Import” node, and click “Run”.
13. Right click the “File Import” node in the diagram workspace and choose “Edit Variables” to set the roles and the levels for the variables in the dataset.
14. To view the dataset after the “File Import” node is run, go to the Property panel, click the button of the “Exported Data” property, select the “Train” data set, and click the “Browse” button to view the data.
15. To explore the distribution(s) of particular variable(s) in the data set, right click the “File Import” node, select “Edit Variables”, select the variable(s) you want to explore, click the “Explore” button, and you will be able to view the variable histogram(s).
16. After you make sure the data set has been properly imported, you can proceed with data analysis by using the “File Import” node as the Data Source node.
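
The “Text to Columns” steps in Appendix A are needed because bank.csv is semicolon-delimited, with text fields wrapped in double quotes. As a cross-check outside SAS Enterprise Miner, the same parsing, plus the missing-value and skewness questions from the data exploration item, can be sketched in plain Python. This is only an illustrative sketch, not part of the assignment: the three-row sample, its column subset, the function names, and the choice of the population skewness formula are all assumptions made for the example.

```python
import csv
import io

# Hypothetical excerpt in the same layout as bank.csv: semicolon-delimited,
# text fields in double quotes. Values and column subset are illustrative only.
SAMPLE = '''"age";"job";"balance";"y"
30;"unemployed";1787;"no"
33;"services";4789;"no"
35;"management";1350;"yes"
'''


def read_semicolon_csv(text):
    """Parse semicolon-delimited, double-quoted text into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return list(reader)


def count_missing(rows):
    """Count empty or absent cells per column (keyed by the header names)."""
    counts = {name: 0 for name in rows[0]}
    for row in rows:
        for name in counts:
            value = row.get(name)
            if value is None or value.strip() == "":
                counts[name] += 1
    return counts


def skewness(values):
    """Population skewness m3 / m2**1.5; positive means a long right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0


if __name__ == "__main__":
    rows = read_semicolon_csv(SAMPLE)
    balances = [float(r["balance"]) for r in rows]
    print(len(rows), list(rows[0]))       # row count and column names
    print(count_missing(rows))            # empty cells per column
    print(round(skewness(balances), 3))   # skewness of the numeric column
```

To run this against the real file, the text of bank.csv could be read from disk and passed to `read_semicolon_csv`; the row and column counts should then match the 4521 instances and 17 variables stated in the project description. A clearly positive skewness for an interval variable is one hint that a transformation might be worth trying, though the histograms in SAS Enterprise Miner remain the deliverable the report asks for.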
Explanations and Answers 0

No answers posted
