Seminar 1 Introduction to Data Science

Slide 1: Seminar 1. Introduction to Data Science

Mikhail Kamrotov
Data Analysis in R

Slide 2: Grades
50% - home assignments, 50% - group project
Grading scale: 96-100% - 10, 90-95% - 9, 80-89% - 8, 75-79% - 7, 65-74% - 6, 55-64% - 5, 45-54% - 4, 35-44% - 3, 25-34% - 2, 0-24% - 1
You can work in pairs
The best solutions may be presented in class (a 5-minute talk) for extra points
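The grading rule can be sketched in a few lines of Python. This is only an illustration: the function name `final_grade` is hypothetical, and the slide does not say how fractional percentages are handled, so the sketch assumes a score falls into the highest band whose lower bound it reaches.

```python
# Lower bound of each band (in percent) and the grade it maps to, from the slide.
GRADE_BANDS = [(96, 10), (90, 9), (80, 8), (75, 7), (65, 6),
               (55, 5), (45, 4), (35, 3), (25, 2), (0, 1)]


def final_grade(homework_pct, project_pct):
    """Combine the two components 50/50 and map the result to a 1-10 grade."""
    for pct in (homework_pct, project_pct):
        if not 0 <= pct <= 100:
            raise ValueError("percentages must be between 0 and 100")
    total = 0.5 * homework_pct + 0.5 * project_pct
    # Assumption: no rounding; a score belongs to the highest band it reaches.
    for lower_bound, grade in GRADE_BANDS:
        if total >= lower_bound:
            return grade


print(final_grade(80, 70))  # total of 75% falls into the 75-79% band
```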

Slide 3: Definition
Data analysis is the process of transforming raw data into usable information, often presented in the form of a published analytical article, in order to add value to the statistical output. (OECD)
Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. (Wikipedia)
Both definitions miss one important step: collecting data.
Most theory is about modeling, but a data scientist spends about 80% of the time on data collection and cleansing.

Slide 4: Data analysis techniques
Data mining: automatic discovery of useful information in large data repositories
Descriptive statistics: summarizing features of data
Exploratory data analysis: finding new features in data
Confirmatory data analysis: hypothesis testing
Predictive analytics: deriving predictions from data
Text analytics: extracting information from textual (i.e. unstructured) data
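As a small illustration of the descriptive statistics item above, the sketch below summarizes a made-up sample with Python's standard `statistics` module. The data (`hours_per_week`) and the helper name `summarize` are hypothetical and not taken from the course.

```python
import statistics

# Hypothetical sample: hours per week six students spend on home assignments.
hours_per_week = [2, 4, 4, 5, 7, 8]


def summarize(values):
    """Return a few basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = summarize(hours_per_week)
for name, value in summary.items():
    print(name, value)
```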

Slide 5: Two cultures of data analysis
Data is generated by a black box
Input variables x (independent variables) go in on one side (e.g. the time you spend on your home assignments)
Response variables y come out on the other side (e.g. your grades)
Two main goals: prediction and information
Two approaches: the data modeling culture and the algorithmic modeling culture

Slide 6: Data modeling culture
Starts by assuming a data model for the inside of the black box
The values of the parameters are estimated from the data, and the model is then used for information and/or prediction
Model validation: goodness-of-fit tests
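A minimal sketch of this workflow, assuming the simplest possible data model: a straight line y = a + b*x fitted by ordinary least squares. The data are invented for illustration (hours spent on assignments vs. score), and R² is used here as a goodness-of-fit measure rather than a formal test.

```python
# Hypothetical data: hours spent on assignments (x) and score in percent (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 79]


def fit_line(x, y):
    """Estimate intercept and slope of y = a + b*x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


intercept, slope = fit_line(hours, scores)
r2 = r_squared(hours, scores, intercept, slope)
print(f"score = {intercept:.1f} + {slope:.1f} * hours, R^2 = {r2:.3f}")
```

The estimated slope is the "information" part (how much the score changes per extra hour), while plugging a new x into the fitted line is the "prediction" part.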


Slide 7: Algorithmic modeling culture
Considers the inside of the box complex and unknown
Tries to find a function f(x), i.e. an algorithm that operates on x to predict the responses y
Model validation: predictive accuracy
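For contrast, here is a sketch of the algorithmic approach. It uses a 1-nearest-neighbour rule as f(x), which is just one possible choice of algorithm, and judges it only by accuracy on held-out data. All data points and names are hypothetical.

```python
# Hypothetical data: (hours per week, passed the course? 1 = yes, 0 = no).
train = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]
test = [(2.5, 0), (4, 1), (5.5, 1), (9, 1)]


def predict(x, training_data):
    """f(x): return the label of the closest training example (1-NN)."""
    _, nearest_label = min(training_data, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def accuracy(test_data, training_data):
    """Fraction of held-out examples whose label is predicted correctly."""
    hits = sum(predict(x, training_data) == y for x, y in test_data)
    return hits / len(test_data)


test_accuracy = accuracy(test, train)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

No assumption is made about how the labels are generated; the only question asked is how well f(x) predicts examples it has not seen.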

Slide 8: Why do you need to learn data analysis
It is a valuable and well-paid skill
Things are sometimes not as obvious as they seem at first sight
You gain the ability to verify results produced by your colleagues
It is the only way to make a scientific contribution and verify theories, especially in the social sciences

Slide 9: Data manipulation by Tim Cook
https://www.statschat.org.nz/2013/09/11/cumulative-totals-tend-to-increase/
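As the title of the linked post says, cumulative totals tend to increase. A short sketch with invented quarterly figures (not the numbers from the original chart) shows why a cumulative plot can hide a decline:

```python
from itertools import accumulate

# Hypothetical quarterly sales that fall every quarter.
quarterly_sales = [30, 25, 20, 15, 10]

# Running total of the same figures.
cumulative_sales = list(accumulate(quarterly_sales))


def is_nondecreasing(values):
    """True if no value is smaller than the one before it."""
    return all(earlier <= later for earlier, later in zip(values, values[1:]))


print("quarterly: ", quarterly_sales, is_nondecreasing(quarterly_sales))
print("cumulative:", cumulative_sales, is_nondecreasing(cumulative_sales))
```

As long as sales are not negative, the cumulative series can never go down, so its upward shape says nothing about the trend.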

Slide 10: Even academic superstars may be wrong
http://theconversation.com/the-reinhart-rogoff-error-or-how-not-to-excel-at-economics-13646

Slide 11: A lot of fraud in science (especially in the social sciences)
https://www.financial-math.org/blog/2015/10/is-research-in-finance-and-economics-reproducible/

Slide 12: Random chance plays a huge role in the social sciences
http://www.tylervigen.com/spurious-correlations
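One way to see how chance alone produces strong correlations is to generate many unrelated series and look for the best-matching pair. The sketch below is an illustration under assumptions: it uses independent random walks as stand-ins for unrelated time series, and the sizes and seed are arbitrary choices.

```python
import random
from itertools import combinations


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def random_walk(rng, steps):
    """Cumulative sum of independent Gaussian steps."""
    position = 0.0
    path = []
    for _ in range(steps):
        position += rng.gauss(0, 1)
        path.append(position)
    return path


def strongest_pair(n_series=40, steps=20, seed=1):
    """Return (|r|, i, j) for the most strongly correlated pair of series."""
    rng = random.Random(seed)
    series = [random_walk(rng, steps) for _ in range(n_series)]
    return max(
        (abs(pearson(series[i], series[j])), i, j)
        for i, j in combinations(range(n_series), 2)
    )


best_r, first, second = strongest_pair()
print(f"series {first} and {second} are unrelated, yet |r| = {best_r:.2f}")
```

With 40 series there are 780 pairs to choose from, so finding a striking correlation among them is expected rather than surprising.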

Slide 13: Intuition might be wrong
Simpson’s paradox: graduate admissions to UCB

Slide 14: Intuition might be wrong
Simpson’s paradox: graduate admissions to UCB
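The paradox can be reproduced numerically. The counts below are intended to match R's built-in `UCBAdmissions` dataset (the six largest departments, 1973); they are typed in by hand for this sketch, so it is worth checking them against the dataset before relying on them. The helper names are hypothetical.

```python
# (admitted, rejected) applicants by department and sex.
ADMISSIONS = {
    "A": {"male": (512, 313), "female": (89, 19)},
    "B": {"male": (353, 207), "female": (17, 8)},
    "C": {"male": (120, 205), "female": (202, 391)},
    "D": {"male": (138, 279), "female": (131, 244)},
    "E": {"male": (53, 138), "female": (94, 299)},
    "F": {"male": (22, 351), "female": (24, 317)},
}


def rate(admitted, rejected):
    """Admission rate as a fraction of all applicants."""
    return admitted / (admitted + rejected)


def overall_rate(sex):
    """Admission rate for one sex, pooled over all departments."""
    admitted = sum(dept[sex][0] for dept in ADMISSIONS.values())
    rejected = sum(dept[sex][1] for dept in ADMISSIONS.values())
    return rate(admitted, rejected)


def departments_favoring_women():
    """Departments where the female admission rate exceeds the male one."""
    return [name for name, dept in ADMISSIONS.items()
            if rate(*dept["female"]) > rate(*dept["male"])]


print(f"overall: men {overall_rate('male'):.1%}, women {overall_rate('female'):.1%}")
print("women admitted at a higher rate in:", departments_favoring_women())
```

Pooled over departments, men appear to be admitted far more often, yet within most departments women have the higher rate. The reversal arises because women applied more often to departments with low admission rates.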

Slide 15: Intuition might be wrong, part 2
Monty Hall problem
https://en.wikipedia.org/wiki/Monty_Hall_problem
Humans vs birds: birds win (Herbranson, 2010)
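If the result feels wrong, a quick simulation helps. The sketch below assumes the standard rules of the problem (three doors, the host always opens a goat door that the player did not pick); the function names, number of trials, and seed are arbitrary choices.

```python
import random


def play(rng, switch):
    """Play one game and return True if the player wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=10000, seed=0):
    """Share of games won with a fixed strategy (always or never switch)."""
    rng = random.Random(seed)
    wins = sum(play(rng, switch) for _ in range(trials))
    return wins / trials


print(f"stay:   {win_rate(switch=False):.3f}")
print(f"switch: {win_rate(switch=True):.3f}")
```

Staying should win about one game in three and switching about two in three, which matches the theoretical probabilities of 1/3 and 2/3.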

Slide 16: R
R is a language for statistical computing
The modern social sciences mostly speak this language (and Python as well)
R download link: https://cran.r-project.org
RStudio download: https://www.rstudio.com/products/rstudio/download/#download


Slide 17: P.S.
Calling Bullshit is a highly recommended online course at the University of Washington: http://callingbullshit.org/syllabus.html#Introduction

