H100 - Statistical Literacy, Honors Prerequisites: MATH M014 or equivalent. Permission of the Honors College is required. Credits: 3 Semester: Description: How to be an informed consumer of statistical analysis. Experiments and observational studies, summarizing and displaying data, relationships between variables, quantifying uncertainty, drawing statistical inferences. S100 cannot be taken for credit if credit has already been received for any statistics course (in any department) numbered 300 or higher.
|
|
K310 - Statistical Techniques Prerequisites: N & M P: MATH M119 or equivalent. Credits: 3 Semester: Spring 2010 Description: Introduction to probability and statistics. Elementary probability theory, conditional probability, independence, random variables, discrete and continuous probability distributions, measures of central tendency and dispersion. Concepts of statistical inference and decision: estimation, hypothesis testing, Bayesian inference, statistical decision theory. Special topics discussed may include regression and correlation, time series, analysis of variance, nonparametric methods. Credit given for only one of K310 or S300, ANTH A306, CJUS K300, ECON E370 or S370, MATH K300 or K310, POLS Y395, PSY K300 or K310, SOC S371, or SPEA K300.
|
|
S100 - Statistical Literacy Prerequisites: MATH M014 or equivalent Credits: 3 Semester: Fall 2009 Description: How to be an informed consumer of statistical analysis. Experiments
and observational studies, summarizing and displaying data, relationships between variables,
quantifying uncertainty, drawing statistical inferences. S100 cannot be taken for credit if credit has
already been received for any statistics course (in any department) numbered 300 or higher.
|
|
S100 - Statistical Literacy Prerequisites: MATH M014 or equivalent Credits: 3 Semester: Summer 2009 Description: How to be an informed consumer of statistical analysis. Experiments
and observational studies, summarizing and displaying data, relationships between variables,
quantifying uncertainty, drawing statistical inferences. S100 cannot be taken for credit if credit has
already been received for any statistics course (in any department) numbered 300 or higher.
|
|
S100 - Statistical Literacy Prerequisites: MATH M014 or equivalent Credits: 3 Semester: Spring 2010 Description: How to be an informed consumer of statistical analysis. Experiments and observational studies, summarizing and displaying data, relationships between variables, quantifying uncertainty, drawing statistical inferences. S100 cannot be taken for credit if credit has already been received for any statistics course (in any department) numbered 300 or higher.
|
|
S300 - Introduction to Applied Statistical Methods Prerequisites: MATH M014 or equivalent Credits: 4 Semester: Fall 2009 Description: Introduction to methods for analyzing quantitative data.
Graphical and numerical descriptions of data, probability models of data, inferences about populations from random samples. Regression and analysis of variance. Lecture and laboratory.
Credit given for only one of the following: S300, CJUS K300, ECON E370 or S370, LAMP L316, MATH K300 or K310, PSY K300 or K310, SOC S371, SPEA K300.
|
|
S300 - Introduction to Applied Statistical Methods Prerequisites: MATH M014 or equivalent Credits: 4 Semester: Spring 2010 Description: Introduction to methods for analyzing quantitative data.
Graphical and numerical descriptions of data, probability models of data, inferences about populations from random samples. Regression and analysis of variance. Lecture and laboratory. Credit given for only one of the following: S300, CJUS K300, ECON E370 or S370, LAMP L316, MATH K300 or K310, PSY K300 or K310, SOC S371, SPEA K300.
|
|
S301 - Applied Statistical Methods for Business Prerequisites: MATH M118 or equivalent. Credits: 3 Semester: Spring 2010 Description: Introduction to methods for analyzing data arising in business, designed to prepare business students for the Kelley School’s Integrative Core. Graphical and numerical descriptions of data, probability models, fundamental principles of estimation and hypothesis testing, applications to linear regression and quality control. Microsoft Excel is used to perform analyses.
3 hours lecture. Credit given for only one of the following: STAT S300 or S301 or S310, CJUS K300, ECON E370 or S370, LAMP L316,
MATH K300 or K310, PSY K300 or K310, SOC S371, SPEA K300.
|
|
S320 - Introduction to Statistics Prerequisites: MATH M212 or M301 or M303 Credits: 3 Semester: Spring 2010 Description: S320 introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics
include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual data sets.
Prerequisites: No previous knowledge of probability is assumed; S320 is recommended for students who wish to take a single, self-contained semester of statistics that emphasizes analyzing data. We will use several basic concepts from calculus; hence, S320 has a prerequisite of MATH M212.
Who Should Take This Course?
As reflected by the large number of introductory statistics courses at IU, there are a great many different ways to begin the study of statistics. The best way to have a positive experience with statistics is to take a course that provides the kind of experience that you want to have.
The Department of Statistics offers three introductory statistics courses. STAT S100 emphasizes quantitative reasoning skills and statistical literacy. It should make you a more critical consumer of the quantitative information
that you encounter in newspapers, magazines, etc.; however, it is not the purpose of S100 to introduce you to a variety of methods for analyzing experimental data.
Both STAT S300 and STAT S320 emphasize using statistical methods to analyze data. Such "methods" courses come in a variety of flavors. Most describe recipes for analyzing data and use a statistical software package in which these recipes have been implemented. S300 is a good example of such courses. Many other departments at IU offer an introductory statistics course of this type.
S320 provides greater emphasis on understanding fundamental principles of statistical inference than does S300. S320 differs from typical methods courses in the following respects:
* Greater emphasis on why a method works. Many courses explain how, but provide little explanation of why.
* Greater depth, less breadth. Many courses provide superficial coverage of a great many topics; S320 covers fewer topics, but in considerably more detail. Students desiring knowledge of procedures not covered in this course are strongly encouraged to take additional statistics courses. S320 is the gateway to majoring in statistics.
* More math. S320 is not a theoretical course (like STAT S420) and it does not use sophisticated mathematics. However, S320 does introduce a good deal of mathematical notation and it does assume that students are comfortable plugging numbers into formulas.
* Interactive computing. Rather than use a statistical computer package as a "black box," S320 relies on computer tools that simplify the computational burden but which require the student to understand how the analysis is to be performed.
In a nutshell: Students in the empirical sciences collect and analyze data, often using computer software that they don't understand. S320 was designed for students who really want to *understand* what they're doing when they perform such analyses.
|
|
S420 - Introduction to Statistical Theory Prerequisites: STAT S320 and MATH M463, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Fundamental concepts and principles of data reduction and statistical inference, including the method of maximum likelihood, the method of least squares, and Bayesian inference. Theoretical justification of statistical procedures introduced in S320.
|
|
S425 - Nonparametric Theory and Data Analysis Prerequisites: STAT S420 and S432, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Survey of methods for statistical inference that do not rely on parametric probability models. Statistical functionals, bootstrapping, empirical likelihood. Nonparametric density and curve estimation. Rank and permutation tests.
|
|
S426 - Bayesian Theory and Data Analysis Prerequisites: Consent of instructor Credits: 3 Semester: Fall 2009 Description: Introduction to the theory and practice of Bayesian inference. Prior and posterior probability distributions. Data collection, model formulation, computation, model checking, sensitivity analysis.
|
|
S431 - Applied Linear Models I Prerequisites: STAT S320 and MATH M301, or consent of instructor Credits: 3 Semester: Description: Part I of a 2-semester sequence on linear models, emphasizing linear regression and the analysis of variance, including topics from the design of experiments and culminating in the general linear model.
|
|
S432 - Applied Linear Models II Prerequisites: STAT S431, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Part II of a two-semester sequence on linear models, emphasizing linear regression and the analysis of variance, including topics from the design of experiments and culminating in the general linear model.
|
|
S437 - Categorical Data Analysis Prerequisites: Consent of instructor Credits: 3 Semester: Description: The analysis of cross-classified categorical data. Loglinear models; regression models in which the response variable is binary, ordinal, nominal, or discrete. Logit, probit, multinomial logit models; logistic and Poisson regression.
|
|
S439 - Multilevel Models Prerequisites: STAT S420 and S432, or consent of instructor Credits: 3 Semester: Description: Introduction to the general multilevel model with an emphasis on applications. Discussion of hierarchical linear models and their generalizations to nonlinear models. How such models are conceptualized, and how their parameters are estimated and interpreted. Fitting models with software. Throughout, the course emphasizes how to choose an appropriate model and appropriate computational techniques.
|
|
S440 - Multivariate Data Analysis Prerequisites: STAT S420 and STAT S432, or consent of instructor Credits: 3 Semester: Description: Elementary treatment of multivariate normal distributions, classical inferential techniques for multivariate normal data, including Hotelling’s $T^2$ and MANOVA. Discussion of analytic techniques such as principal component analysis, canonical correlation analysis, discriminant analysis, and factor analysis.
|
|
S445 - Covariance Structure Analysis Prerequisites: STAT S420 and S440, or consent of instructor Credits: 3 Semester: Description:
Path analysis. Introduction to multivariate multiple regression, confirmatory factor analysis, and latent variables. Structural equation models with and without latent variables. Mean-structure and multi-group analysis.
|
|
S450 - Time Series Analysis Prerequisites: Consent of instructor Credits: 3 Semester: Spring 2010 Description: Techniques for analyzing data collected at different points in time. Probability models, forecasting methods, analysis in both time and frequency domains, linear systems, state-space models, intervention analysis, transfer function models and the Kalman filter. Topics also include: Stationary processes, autocorrelations, partial autocorrelations, autoregressive, moving average, and ARMA processes, spectral density of stationary processes, periodograms and estimation of spectral density.
|
|
S455 - Longitudinal Data Analysis Prerequisites: STAT S420 and S432, or consent of instructor Credits: 3 Semester: Description: Introduction to methods for longitudinal data analysis; repeated measures data. The analysis of change - models for one or more response variables, possibly censored. Association of measurements across time for both continuous and discrete responses.
|
|
S460 - Sampling Prerequisites: STAT S420 and S432, or consent of instructor Credits: 3 Semester: Description: Design of surveys and analysis of sample survey data. Simple random sampling, ratio and regression estimation, stratified and cluster sampling, complex surveys, nonresponse bias.
|
|
S470 - Exploratory Data Analysis Prerequisites: STAT S420 and S432, or consent of instructor Credits: 3 Semester: Description: How do you analyze data? When faced with data from various sources, of various types, what questions should one ask, and what clues can we find in the data to further our understanding?
Statistics, broadly defined, is the science and art of analyzing data. Many statistical procedures require formal probability model structures with parameters, and statistical methods offer tools for estimating those model parameters. Sometimes the assumptions governing those models hold, but often they do not. What analyses can provide insight into the data and the underlying mechanisms while being insensitive to model assumptions? Nonparametric methods are distribution-free, but some prior analysis is needed to understand the data.
Exploratory data analysis is a philosophy of analyzing data. The ubiquity of data and the emergence of "data mining" make this course essential for anyone who wants to analyze data. In this course, we will learn many different tools for data analysis as well as the commands and programs in R (free statistical software) for conducting these analyses. Some prior familiarity with statistical methods is assumed. Those who have had formal statistics courses can take the course at a higher level, where connections between EDA tools and mathematical statistical methods will be developed. This course is valuable to anyone who has data to analyze. It is also a lot of fun; students learn a lot.
Course objectives: Introduce philosophy of exploratory data analysis; Teach tools for the analysis of data; Provide opportunities for analyzing data (R/S-Plus); Demonstrate the value of oral/written communication skills; Offer experience in preparing oral and written reports of data analyses.
Topics:
* The philosophy of exploratory versus confirmatory data analysis
* Summarizing batches of data: stem-and-leaf diagrams, boxplots, qq plots
* Data transformations (ladder of re-expressions)
* Jackknife and bootstrap
* Two-way and three-way analyses (median polish)
* Standardization
* Fitting robust-resistant lines (least absolute deviations)
* Analyzing count data
|
|
S475 - Statistical Learning and High-Dimensional Data Analysis Prerequisites: STAT S440 or consent of instructor Credits: 3 Semester: Description: "Data-analytic methods for exploring the structure of high-dimensional data. Graphical methods, linear and nonlinear dimension reduction techniques, manifold learning. Supervised, semisupervised, and unsupervised learning."
This course surveys various data-analytic approaches to detecting structure in multivariate data sets. Many of the topics covered are active areas of research in multivariate statistics and machine learning. High-dimensional data sets arise in many applications, e.g., gene expression levels from a microarray experiment. Techniques for high-dimensional data are useful in a wide variety of disciplines; I plan to emphasize applications to bioinformatics and text mining.
Here is a rough outline of the topics that I expect to cover:
1. Multivariate Data. Data matrices, proximity matrices and graphs. Labeled and unlabeled data.
2. Graphical Methods for Exploring Multivariate Data. Scatterplots in two and three dimensions, grand tours, projection pursuit. Parallel coordinates. Brushing.
3. Dimension Reduction. Linear techniques: principal component analysis, biplots and $h$-plots, principal coordinate analysis. Spectral techniques for manifold learning: Isomap, Locally Linear Embedding, Laplacian eigenmaps, diffusion maps. Nonspectral embedding techniques and their application to dimension reduction.
4. Supervised Learning. Linear/quadratic discriminant analysis, nearest neighbor methods, distance/metric learning, support vector machines. Multiple kernel learning.
5. Unsupervised Learning. K-means clustering, self-organizing maps, iterative denoising.
Text: I will rely on my own lecture notes and various talks, technical reports, and papers from the literature.
The essential prerequisite for this course is some familiarity with linear algebra (vectors, matrices, eigenvalues, etc.). We will use a high-level statistical programming language (R), so some previous experience with a computer programming language would be helpful. Previous exposure to classical multivariate statistical methods is helpful, but not essential.
For more information, please visit the course web page:
http://mypage.iu.edu/~mtrosset/675.html
If you are uncertain whether or not you have the background to take this course, please contact me.
Computer Science graduate students may count STAT S675 as an Area 5 (Artificial Intelligence) course for the purpose of fulfilling their area distribution requirements. Any student who intends to do so should notify Amr Sabry.
|
|
S481 - Functional Data Analysis Prerequisites: Consent of instructor Credits: 3 Semester: Fall 2009 Description: This course will introduce students to methods for analyzing functional data, i.e., data that are entire curves rather than single observations or vectors of several measurements. Such data objects involve numerous highly related points per object, and the methods for analyzing them make explicit use of data objects as functions. In contrast to multivariate analysis (multiple values per object that have different degrees of association) and longitudinal analysis or "panel studies" (where measurements are repeated on an individual at only a few time points), the methods here treat the object as a function. We discuss these methods, which include graphical displays, summaries, and analogs of conventional analyses (analysis of variance, principal components), with an emphasis on applications. The two required books will address both these goals (methodology and applications). The course is valuable both for researchers whose data are entire functions (gait, spectra, etc.) and for students interested in participating in a relatively new and important area of statistical research.
Required textbooks:
James Ramsay and Bernard Silverman, Functional Data Analysis (FDA), Springer.
James Ramsay and Bernard Silverman, Applied Functional Data Analysis (AFDA), Springer.
|
|
S481 - Topics in Applied Statistics Prerequisites: Consent of instructor Credits: 3 Semester: Description: Network Science
Network science is concerned with the relationships between individuals, organizations, groups, and other "social" entities. This methodological and theoretical approach to the social world has gained interest in fields across the social, behavioral and political sciences - and shares much in terms of methods with network studies in the natural sciences. At the core of the field is attention to the interconnected nature of actors and their relationships. This type of approach requires a different set of assumptions and analytical tools than standard statistical methods.
This course will primarily focus on statistical methodology for relational data measured on groups of social actors. Topics to be discussed include an introduction to graph theory and the use of directed graphs to study structural theories of actor interrelations; structural and locational properties of actors, such as centrality, prestige, and prominence; subgroups and cliques; equivalence of actors, including structural equivalence, blockmodels, and an introduction to role algebras; an introduction to local analyses, including dyadic and triad analysis; and statistical global analyses, using models such as p1, p*, and their relatives. The course will also introduce data collection and harvesting methods, egocentric analysis, and the use of popular network analysis and visualization software packages.
This is not a course in network theory; it is a course in methodology, with emphasis on statistical approaches.
Students are expected to attend lectures and register for lab sessions. Assignments will include regular lab exercises and a final network project. Students should have completed at least two upper-level statistics courses or should contact the instructors for permission to enroll in the course.
|
|
S481 - Multivariate Methods II Prerequisites: Credits: Semester: Spring 2010 Description: This course requires knowledge of basic statistical theory, in particular the basics of multivariate statistical analysis. The actual content of this course depends on the audience, and the teaching method depends on the number of students in the class. The course material will be published on the web; thus, no textbook is required. The topics will, as a rule, be taught in depth and detail. The content will be chosen from the following list of topics/subtopics:
(I) Advanced multivariate models: (i) Conditional models, (ii) Covariance analysis, (iii) Growth curve models, (iv) Symmetry normal models, (v) Graphical normal/discrete models, (vi) Missing data normal models, (vii) Mixture models.
(II) Eigenvalues: (i) Canonical correlations, (ii) Principal component analysis, (iii) Factor analysis models, (iv) Discriminant/classification analysis, (v) Testing and eigenvalues in multivariate normal models.
(III) Exponential families: (i) Generalized linear models, (ii) Multivariate logistic regression, (iii) Rasch measurement models, (iv) Conjugate priors, (v) EM/scoring algorithms, (vi) Testing in exponential families, (vii) Models in the Wishart distributions.
References: A.J. Izenman (2008) Modern Multivariate Statistical Techniques, Springer.
Brian S. Everitt (2005) An R and S-Plus Companion to Multivariate Analysis, Springer.
|
|
S481 - Topics in Applied Statistics - Time Series II Prerequisites: Consent of instructor Credits: Semester: Description: Time Series II
In Time Series I (Fall 2008), we learned about time series from a dynamical systems and discrete-time perspective. In this course, we will build on these skills by approaching time series analysis from a primarily spectral, or frequency-based, perspective.
Topics will include:
Basic calculus review, complex numbers and variables, Fourier analysis, digital filters, spectral estimation, linear filtering in the frequency domain, noise models, sampling, aliasing, the discrete and fast Fourier transforms (DFT & FFT), Gibbs phenomenon, signal quantization, and the impact of these concepts on estimation and signal detection.
Advanced topics may include:
Wavelets, independent component analysis, image analysis (2D Fourier analysis), multivariate time series, nonlinear processes.
|
|
S482 - Statistical Model Selection Prerequisites: Consent of instructor Credits: 3 Semester: Spring 2010 Description: M-estimates are a broad class of statistical estimates obtained as the solution to an empirical optimization process. Typically, the population parameter is defined as the minimizer of a population risk function and its estimate is defined as the minimizer of the empirical risk. While M-estimates are known to enjoy many desirable properties, goodness of fit alone is not an adequate method for selecting the best among models of different "complexity." On the one hand, "simpler" models can be more revealing of the structure in the data. On the other hand, they are often restricted versions of more "complex" models and hence will never be preferred based on goodness of fit alone. In this course, we cover model selection techniques with an emphasis on variable selection in generalized linear models. We review classical variable selection methods such as AIC, BIC, and Mallows' Cp and discuss some of the computational issues involved. In addition, we introduce some alternative measures of the complexity of a model and review how they can be used for model selection purposes. Finally, we briefly review some of the issues specific to high-dimensional data sets and how they can be addressed.
|
|
S490 - Statistical Consulting Prerequisites: Consent of instructor Credits: 4 Semester: Spring 2010 Description: This class will cover the skills necessary for effective statistical consulting and focus on applications derived from real consulting situations. Students will have the opportunity to engage with clients and present the results of their consultations in class. Along the way, students will learn practical methods of data analysis, questions that need to be considered for future studies, and ways of presenting data for different purposes. An important part of the course will be attention to a real problem from Google, Inc. Sources of data will include:
(1) IU's Statistics Consulting Center (Stephanie Dickinson, Michael Trosset)
(2) Case studies (see textbook)
(3) Designing experiments for evaluating marketing strategies under consideration at Google, Inc. (Dr. James Koehler, Senior Statistician)
|
|
S495 - Readings in Statistics Prerequisites: Consent of instructor Credits: 1-3 Semester: Description: Supervised reading of a topic in statistics. May be repeated with different topics for a maximum of 12 credit hours.