Hands-on assignment #3, due Monday 3/4/2019 @ 8am

Credit: The data and ideas behind these exercises and homeworks are from the NIH LINCS DCIC Crowdsourcing Portal and Ma’ayan Lab @ Mt Sinai, New York. http://www.maayanlab.net/crowdsourcing/megatask1.php

The overarching goal is to predict adverse drug reactions.

This assignment builds on the in-class examples on ADR (adverse drug reaction) prediction and on hands-on HW2.

This is a group assignment. You can work in a group of 1, 2, 3 or 4 members. Each group will make one submission via Canvas. Please state the names of all your team members in your submission.

This assignment focuses on classification and feature selection methods, and will be graded out of 10 points.

Upload 3 files for this assignment:

A Jupyter notebook file named hw3.ipynb containing the R code and answers for the 5 questions.

Please use “#” (comment lines) and markdown cells in your notebook to indicate the question number and to extensively document your code.

A spreadsheet in tab-delimited text format representing the cross validation results of your methods for each side effect.

Use the data files “gene_expression_n438x978.txt” and “ADRs_HLGT_n438x232.txt” to answer all questions in this assignment. You can assume that both files are in your working directory.

In class, we discussed many techniques for classification and feature selection in the context of personalized medicine. We illustrated how to apply these methods to the breast cancer data in class.

Feature selection methods	Classification methods
none	k-nearest neighbor (k-NN)
t-test	Support vector machine (SVM)
Signal-to-noise (S2N)	Bayesian Model Averaging (BMA)
BSS/WSS	Decision trees
Correlation with the class vector	Boosting, bagging and other ensemble methods
	Golub’s method on the AML/ALL data
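
As an illustration of one feature selection method from the table, here is a minimal sketch of a signal-to-noise (S2N) score, assuming the common definition (difference of class means divided by the sum of class standard deviations). The function name `s2n`, the 0/1 label coding, and the synthetic matrix are assumptions made for this example and are not part of the assignment data.

```r
# Signal-to-noise score per feature: (mean1 - mean0) / (sd1 + sd0).
# x: numeric matrix (samples in rows, features in columns); y: 0/1 labels.
s2n <- function(x, y) {
  pos <- y == 1
  mu1 <- colMeans(x[pos, , drop = FALSE])
  mu0 <- colMeans(x[!pos, , drop = FALSE])
  sd1 <- apply(x[pos, , drop = FALSE], 2, sd)
  sd0 <- apply(x[!pos, , drop = FALSE], 2, sd)
  (mu1 - mu0) / (sd1 + sd0)
}

# Synthetic stand-in data (assumption): 40 samples, 6 features
set.seed(1)
x <- matrix(rnorm(40 * 6), nrow = 40)
y <- rep(c(0, 1), each = 20)
x[y == 1, 2] <- x[y == 1, 2] + 3   # make feature 2 clearly informative

scores <- s2n(x, y)
# Rank features by absolute score; the top-ranked ones would be kept
top_features <- order(abs(scores), decreasing = TRUE)[1:3]
```

Features are usually ranked by the absolute value of the score, and the number of features to keep is a parameter you choose.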

(8 points) Experiment with combinations of the above feature selection and classification methods and apply them to the ADR data to predict side effects. Evaluate the performance using 10-fold cross validation, repeated 3 times. Note that you need to perform feature selection in each fold and each run of your cross validation. In other words, you will perform feature selection and classification a total of 30 times.
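
The sketch below shows one way to keep feature selection inside each fold of a repeated 10-fold cross validation, using a t-test filter followed by k-NN. It runs on synthetic data; the helper names `select_by_ttest` and `cv_accuracy` are hypothetical, and the fallback to the single best feature when nothing passes the cutoff is an assumption of this example. On real side effects with very few positive cases, `t.test` may fail in some folds and would need extra handling.

```r
library(class)  # provides knn(); a recommended package shipped with R

set.seed(1)
# Synthetic stand-in data (assumption): 60 drugs x 50 genes, one binary side effect
x <- matrix(rnorm(60 * 50), nrow = 60)
y <- factor(rep(c(0, 1), each = 30))
x[y == 1, 1:5] <- x[y == 1, 1:5] + 1.5   # make the first 5 genes informative

# Keep features whose two-sample t-test p-value is below the cutoff
select_by_ttest <- function(x, y, cutoff = 0.01) {
  pvals <- apply(x, 2, function(g) t.test(g ~ y)$p.value)
  keep <- which(pvals < cutoff)
  if (length(keep) == 0) keep <- which.min(pvals)  # fall back to the best feature
  keep
}

cv_accuracy <- function(x, y, n_folds = 10, n_repeats = 3, k = 10, cutoff = 0.01) {
  acc <- numeric(0)
  for (r in seq_len(n_repeats)) {
    # Fresh random fold assignment for every repeat
    folds <- sample(rep(seq_len(n_folds), length.out = nrow(x)))
    for (f in seq_len(n_folds)) {
      test <- folds == f
      # Feature selection sees the training part of this fold only
      keep <- select_by_ttest(x[!test, , drop = FALSE], y[!test], cutoff)
      pred <- knn(x[!test, keep, drop = FALSE],
                  x[test, keep, drop = FALSE],
                  y[!test], k = k)
      acc <- c(acc, mean(pred == y[test]))
    }
  }
  acc
}

acc <- cv_accuracy(x, y)
n_runs <- length(acc)   # 10 folds x 3 repeats = 30 selection + classification runs
mean_acc <- mean(acc)   # average accuracy for this side effect and combination
```

Selecting features on the full data set before splitting would let information from the test samples leak into the model, which is why the filter is applied to the training rows of each fold.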

Each combination of feature selection + classification is worth 1 point. For example,

t-test with p-value < 0.01 as feature selection and k-NN with k=10 as classification method will earn you 1 point.
t-test with p-value < 0.001 as feature selection and k-NN with k=10 as classification method will earn you another 1 point.
No feature selection and k-NN with k=12 will earn you an additional 1 point.

So, your group can try 8 combinations to earn up to 8 points.

Different input parameter settings count as different combinations.

Submit a spreadsheet in tab-delimited text format representing a table that consists of 232 rows and 8 columns. Each column represents a combination of the methods you tried. Each row represents a side effect. Each entry in this table is the average prediction accuracy from 10-fold cross validation, repeated 3 times.
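
A minimal sketch of writing such a table as tab-delimited text follows. The matrix is filled with placeholder random numbers, and the row names, column names, and output file name are all hypothetical; in practice the entries would be the averaged cross validation accuracies and the row names would be the side effect names.

```r
# Placeholder results (assumption): 232 side effects x 8 method combinations
set.seed(1)
results <- matrix(runif(232 * 8), nrow = 232, ncol = 8)
rownames(results) <- paste0("side_effect_", 1:232)
colnames(results) <- paste0("combo_", 1:8)

# Written to a temporary directory here; the file name is hypothetical
out_file <- file.path(tempdir(), "hw3_cv_results.txt")

# col.names = NA adds an empty header cell above the row-name column
write.table(results, file = out_file, sep = "\t", quote = FALSE, col.names = NA)

# Read the file back to confirm its shape
check <- read.delim(out_file, row.names = 1, check.names = FALSE)
dim(check)
```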

(2 points) Compare the prediction accuracy of the methods you tried in your report. In particular, address the following questions:

Which side effects can you predict with the highest accuracy in each combination?
Which side effects do you predict with the lowest accuracy in each combination?
Some side effects have unbalanced class sizes. Did you do anything about that? Why? If so, is your method effective?
Which feature selection and/or classification method would you consider as the “winner” in your empirical study? You can include the results from HW2 in your discussion.
Any interesting negative results?
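
On the question about unbalanced class sizes, the toy example below shows why plain accuracy can mislead when one class is rare, and how balanced accuracy (the mean of sensitivity and specificity) exposes it. This is only one possible diagnostic, not a method required by the assignment; the helper `balanced_accuracy` and the toy labels are assumptions of this example.

```r
# Balanced accuracy: mean of sensitivity and specificity for 0/1 labels
balanced_accuracy <- function(truth, pred) {
  sens <- mean(pred[truth == 1] == 1)  # fraction of true positives recovered
  spec <- mean(pred[truth == 0] == 0)  # fraction of true negatives recovered
  (sens + spec) / 2
}

truth <- c(rep(1, 5), rep(0, 95))  # toy labels: 5 positives, 95 negatives
pred  <- rep(0, 100)               # classifier that always predicts "no side effect"

plain_acc <- mean(pred == truth)                 # 0.95
bal_acc   <- balanced_accuracy(truth, pred)      # 0.5
```

A classifier that never predicts the side effect scores 95% accuracy here while being no better than chance on the balanced measure.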