| |
|
S501 - Statistical Methods I: Introduction to Statistics Prerequisites: One undergraduate course in statistics. Credits: 3 Semester: Spring 2010 Description: This course takes a systematic approach to the exposition of the general linear model — focusing on correlation, simple linear and multiple regression. Students are introduced to the use of statistical analysis software. The first third of the course consists of a review of statistics, data analysis tools, significance tests, and confidence intervals. Students learn how to think creatively about the use of statistical methods in their own research.
|
|
S501 - Statistical Methods I: Introduction to Statistics Prerequisites: One undergraduate course in statistics. Credits: 3 Semester: Fall 2009 Description: This course takes a systematic approach to the exposition of the general linear model — focusing on correlation, simple linear and multiple regression. Students are introduced to the use of statistical analysis software. The first third of the course consists of a review of statistics, data analysis tools, significance tests, and confidence intervals. Students learn how to think creatively about the use of statistical methods in their own research.
|
|
S503 - Statistical Methods IIb: Generalized Linear Models and Categorical Data Prerequisites: STAT S501, or one graduate course in statistics Credits: 3 Semester: Fall 2009 Description: This course takes a systematic approach to the exposition of the generalized linear model, focusing on categorical data. Of primary concern will be models for which the response variable is categorical. Such models include probit, logit, ordered logit, and Poisson regression, among others. Students learn how to think creatively about the use of statistical methods in their own research.
|
|
S503 - Statistical Methods IIb: Generalized Linear Models and Categorical Data Prerequisites: STAT S501, or one graduate course in statistics Credits: 3 Semester: Spring 2010 Description: This course introduces techniques for categorical data analysis, focusing on models in which the dependent variable is binary, ordinal, nominal, or a count. Such models include probit, logit, ordered logit and probit, multinomial logit, Poisson regression, negative binomial regression, and zero-inflated count models. Students learn how to apply these techniques in their own research.
|
|
S520 - Introduction to Statistics Prerequisites: MATH M212, M301, M303, or the equivalent. Credits: 3 Semester: Description: Basic concepts of data analysis and statistical inference, applied to 1-sample and 2-sample location problems, the analysis of variance, and linear regression. Probability models and statistical methods applied to practical situations and actual data sets from various disciplines. Elementary statistical theory, including the plug-in principle, maximum likelihood, and the method of least squares.
|
|
S520 - Introduction to Statistics Prerequisites: MATH M212, M301, M303, or the equivalent. Credits: 3 Semester: Spring 2010 Description: Basic concepts of data analysis and statistical inference, applied to 1-sample and 2-sample location problems, the analysis of variance, and linear regression. Probability models and statistical methods applied to practical situations and actual data sets from various disciplines. Elementary statistical theory, including the plug-in principle, maximum likelihood, and the method of least squares.
S520 provides a strong introduction to elementary statistical methodology and a gentle introduction to elementary statistical theory. It meets concurrently with S320, but includes supplementary material not covered in that course. S520 introduces material that is covered in greater depth in S620 (Introduction to Statistical Theory), but less mathematically and in the context of actual experiments and data. It fulfills the theory requirement for the M.S. degree in Applied Statistics (currently under review).
|
|
S620 - Introduction to Statistical Theory Prerequisites: STAT S320 and MATH M463, or consent of instructor Credits: 3 Semester: Description: Fundamental concepts and principles of data reduction and statistical inference, including the method of maximum likelihood, the method of least squares, and Bayesian inference. Theoretical justification of statistical procedures introduced in S320.
|
|
S625 - Nonparametric Theory and Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Survey of methods for statistical inference that do not rely on parametric probability models. Statistical functionals, bootstrapping, empirical likelihood. Nonparametric density and curve estimation. Rank and permutation tests.
|
|
S626 - Bayesian Theory and Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Fall 2009 Description: Introduction to the theory and practice of Bayesian inference. Prior and posterior probability distributions. Data collection, model formulation, computation, model checking, sensitivity analysis.
|
|
S631 - Applied Linear Models I Prerequisites: STAT S320 and MATH M301 or M303 or S303, or consent of instructor Credits: 3 Semester: Fall 2009 Description: Part I of a 2-semester sequence on linear models, emphasizing linear regression and the analysis of variance, including topics from the design of experiments and culminating in the general linear model.
|
|
S632 - Applied Linear Models II Prerequisites: STAT S631, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Part II of a 2-semester sequence on linear models, emphasizing linear regression and the analysis of variance, including topics from the design of experiments and culminating in the general linear model.
|
|
S637 - Categorical Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Description: The analysis of cross-classified categorical data. Loglinear models; regression models in which the response variable is binary, ordinal, nominal, or discrete. Logit, probit, multinomial logit models; logistic and Poisson regression.
|
|
S640 - Multivariate Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Fall 2009 Description: Elementary treatment of multivariate normal distributions, classical inferential techniques for multivariate normal data, including Hotelling's T² and MANOVA. Discussion of analytic techniques such as principal component analysis, canonical correlation analysis, discriminant analysis, and factor analysis.
|
|
S645 - Covariance Structure Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Description: Path analysis. Introduction to multivariate multiple regression, confirmatory factor analysis, and latent variables. Structural equation models with and without latent variables. Mean-structure and multi-group analysis.
Course is equivalent to EDUC Y645.
|
|
S650 - Time Series Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Spring 2010 Description: Techniques for analyzing data collected at different points in time. Probability models, forecasting methods, analysis in both time and frequency domains, linear systems, state-space models, intervention analysis, transfer function models and the Kalman filter. Topics also include: Stationary processes, autocorrelations, partial autocorrelations, autoregressive, moving average, and ARMA processes, spectral density of stationary processes, periodograms and estimation of spectral density.
Course is equivalent to MATH M568.
|
|
S655 - Longitudinal Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Description: Introduction to methods for longitudinal data analysis; repeated measures data. The analysis of change - models for one or more response variables, possibly censored. Association of measurements across time for both continuous and discrete responses.
Course is equivalent to EDUC Y655.
|
|
S660 - Sampling Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Description: Design of surveys and analysis of sample survey data. Simple random sampling, ratio and regression estimation, stratified and cluster sampling, complex surveys, nonresponse bias.
|
|
S670 - Exploratory Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Fall 2009 Description: How do you analyze data? When faced with data from various sources, of various types, what questions should one ask, and what clues can we find in the data to further our understanding?
Statistics, broadly defined, is the science and art of analyzing data. Many statistical procedures require formal probability model structures with parameters, and statistical methods offer tools for estimating those model parameters. Sometimes the assumptions governing those models hold, but often they do not. What analyses can provide insight into the data and the underlying mechanisms while being insensitive to model assumptions? Nonparametric methods are distribution-free, but some prior analysis is needed to understand the data.
Exploratory data analysis is a philosophy of analyzing data. The ubiquity of data and the emergence of "data mining" make this course essential for anyone who wants to analyze data. In this course, we will learn many different tools for data analysis as well as the commands and programs in R (free statistical software) for conducting these analyses. Some prior familiarity with statistical methods is assumed. Those who have had formal statistics courses can take the course at a higher level, where connections between EDA tools and mathematical statistical methods will be developed. This course is valuable to anyone who has data to analyze. It is also a lot of fun; students learn a lot.
Course objectives: Introduce philosophy of exploratory data analysis; Teach tools for the analysis of data; Provide opportunities for analyzing data (R/S-Plus); Demonstrate the value of oral/written communication skills; Offer experience in preparing oral and written reports of data analyses.
Topics:
The philosophy of exploratory versus confirmatory data analysis
Summarizing batches of data: Stem-and-leaf diagrams, boxplots, qq plots, Data Transformations (ladder of re-expressions), Jackknife and bootstrap, Two-way and three-way analyses (median polish), Standardization, Fitting robust-resistant lines (least absolute deviations), Analyzing count data
|
|
S675 - Statistical Learning and High-Dimensional Data Analysis Prerequisites: Two statistics courses at the graduate level, or consent of instructor Credits: 3 Semester: Fall 2009 Description: "Data-analytic methods for exploring the structure of high-dimensional data. Graphical methods, linear and nonlinear dimension reduction techniques, manifold learning. Supervised, semisupervised, and unsupervised learning."
This course surveys various data-analytic approaches to detecting structure in multivariate data sets. Many of the topics covered are active areas of research in multivariate statistics and machine learning. High-dimensional data sets arise in many applications, e.g., gene expression levels from a microarray experiment. Techniques for high-dimensional data are useful in a wide variety of disciplines; I plan to emphasize applications to bioinformatics and text mining.
Here is a rough outline of the topics that I expect to cover:
1. Multivariate Data. Data matrices, proximity matrices and graphs. Labeled and unlabeled data.
2. Graphical Methods for Exploring Multivariate Data. Scatterplots in two and three dimensions, grand tours, projection pursuit. Parallel coordinates. Brushing.
3. Dimension Reduction. Linear techniques: principal component analysis, biplots and h-plots, principal coordinate analysis. Spectral techniques for manifold learning: Isomap, Locally Linear Embedding, Laplacian eigenmaps, diffusion maps. Nonspectral embedding techniques and their application to dimension reduction.
4. Supervised Learning. Linear/quadratic discriminant analysis, nearest neighbor methods, distance/metric learning, support vector machines. Multiple kernel learning.
5. Unsupervised Learning. K-means clustering, self-organizing maps, iterative denoising.
Text: I will rely on my own lecture notes and various talks, technical reports, and papers from the literature.
The essential prerequisite for this course is some familiarity with linear algebra (vectors, matrices, eigenvalues, etc.). We will use a high-level statistical programming language (R), so some previous experience with a computer programming language would be helpful. Previous exposure to classical multivariate statistical methods is helpful, but not essential.
For more information, please visit the course web page:
http://mypage.iu.edu/~mtrosset/675.html
If you are uncertain whether or not you have the background to take this course, please contact me.
Computer Science graduate students may count STAT S675 as an Area 5 (Artificial Intelligence) course for the purpose of fulfilling their area distribution requirements. Any student who intends to do so should notify Amr Sabry.
|
|
S681 - Topics in Applied Statistics - Functional Data Analysis Prerequisites: Consent of instructor Credits: 3 Semester: Fall 2009 Description: This course will introduce students to methods for analyzing functional data, i.e., data that are entire curves rather than single observations or vectors of several measurements. Such data objects involve numerous highly related points per object, and the methods for analyzing them make explicit use of data objects as functions. In contrast to multivariate analysis (multiple values per object that have different degrees of association) and longitudinal analysis or "panel studies" (where measurements are repeated on an individual at only a few time points), the methods here treat the object as a function. We discuss these methods, which include graphical displays, summaries, and analogs of conventional analyses (analysis of variance, principal components), with an emphasis on applications. The two required books will address both these goals (methodology and applications). The course is valuable both for researchers whose data are entire functions (gait, spectra, etc.) and for students interested in participating in a relatively new and important area of statistical research.
Required textbooks:
James Ramsay and Bernard Silverman, Functional Data Analysis (FDA), Springer.
James Ramsay and Bernard Silverman, Applied Functional Data Analysis (AFDA), Springer.
|
|
S681 - Statistical Methodology IIa - Experimental Design and ANOVA Prerequisites: Credits: Semester: Spring 2010 Description:
|
|
S681 - Multivariate Methods II Prerequisites: Two graduate courses, one in multivariate statistical analysis and one in general statistical theory, or consent of instructor. Credits: 3 Semester: Spring 2010 Description: This course requires knowledge of basic statistical theory, in particular the basics of multivariate statistical analysis. The actual content of this course depends on the audience, and the teaching method depends on the number of students in the class. The course material will be published on the web, so no textbook is required. The topics will, as a rule, be taught in depth and detail. The content will be chosen from the following list of topics/subtopics:
(I) Advanced multivariate models: (i) Conditional models, (ii) Covariance analysis, (iii) Growth curve models, (iv) Symmetry normal models, (v) Graphical normal/discrete models, (vi) Missing data normal models, (vii) Mixture models.
(II) Eigenvalues: (i) Canonical correlations, (ii) Principal component analysis, (iii) Factor analysis models, (iv) Discriminant/classification analysis, (v) Testing and eigenvalues in multivariate normal models.
(III) Exponential families: (i) Generalized linear models, (ii) Multivariate logistic regression, (iii) Rasch measurement models, (iv) Conjugate priors, (v) EM/scoring algorithms, (vi) Testing in exponential families, (vii) Models in the Wishart distributions.
References: A.J. Izenman (2008) Modern Multivariate Statistical Techniques, Springer.
Brian S. Everitt (2005) An R and S-Plus Companion to Multivariate Analysis, Springer.
|
|
S681 - Topics in Applied Statistics - Time Series II Prerequisites: Consent of instructor Credits: 3 Semester: Description: Time Series II
This course is cross-listed with PSY-P657.
In the first course, Time Series I (Fall 2008), we learned about time series from a dynamical systems and discrete time perspective. In this course, we will build on these skills by approaching time series analysis from a primarily spectral or frequency-based perspective.
Topics will include:
Basic calculus review, complex numbers and variables, Fourier analysis, digital filters, spectral estimation, linear filtering in the frequency domain, noise models, sampling, aliasing, the discrete and fast Fourier transforms (DFT & FFT), Gibbs phenomenon, signal quantization, and the impact of these concepts on estimation and signal detection.
Advanced topics may include:
Wavelets, independent component analysis, image analysis (2D Fourier analysis), multivariate time series, nonlinear processes.
|
|
S682 - Statistical Model Selection Prerequisites: Credits: Semester: Spring 2010 Description: M-estimates are a broad class of statistical estimates obtained as the solution to an empirical optimization process. Typically, the population parameter is defined as the minimizer of a population risk function and its estimate is defined as the minimizer of the empirical risk. While M-estimates are known to enjoy many desirable properties, goodness of fit alone is not an adequate method for selecting the best among models of different "complexity." On the one hand, "simpler" models can be more revealing of the structure in the data. On the other hand, they are often restricted versions of more "complex" models and hence will never be preferred based on goodness of fit alone. In this course, we cover model selection techniques with an emphasis on variable selection in generalized linear models. We review classical variable selection methods such as AIC, BIC, and Mallows' Cp and discuss some of the computational issues involved. In addition, we introduce some alternative measures of the complexity of a model and review how they can be used for model selection purposes. Finally, we briefly review some of the issues specific to high-dimensional data sets and how they can be addressed.
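As a rough illustration of why goodness of fit alone cannot choose between nested models, the sketch below compares two Gaussian models on a small made-up sample: one with the mean fixed at zero and one with the mean estimated. The data values and the function names (gaussian_max_loglik, aic, bic) are hypothetical and are not taken from the course materials; AIC and BIC are written in their usual forms, 2k - 2 log L and k log n - 2 log L.

```python
import math

def gaussian_max_loglik(residuals):
    """Maximized Gaussian log-likelihood for the given residuals,
    with the variance set to its maximum likelihood estimate RSS / n."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

# Made-up sample, for illustration only.
y = [0.3, -0.5, 0.8, 0.1, -0.2, 0.4, -0.1, 0.6]
n = len(y)

# "Simpler" model: mean fixed at 0, variance estimated (k = 1).
ll_simple = gaussian_max_loglik(y)

# "More complex" model: mean and variance both estimated (k = 2).
ybar = sum(y) / n
ll_complex = gaussian_max_loglik([v - ybar for v in y])

print("log-likelihood:", ll_simple, ll_complex)
print("AIC:", aic(ll_simple, 1), aic(ll_complex, 2))
print("BIC:", bic(ll_simple, 1, n), bic(ll_complex, 2, n))
```

Because the sample mean minimizes the residual sum of squares, the larger model can never have a lower maximized log-likelihood than the restricted one; the penalty terms are what allow the criteria to prefer the smaller model.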
|
|
S682 - Topics in Mathematical Statistics -- Statistical Theory I Prerequisites: S620 and some knowledge of elementary measure theory, and consent of the instructor Credits: 3 Semester: Description: Mathematically rigorous introduction to major areas of statistical theory and practice, including statistical models, sufficiency, likelihood inference, estimation and testing, Bayesian inference, decision theory, equivariance, and optimality of test statistics. The statistical program package "R" will be introduced and used.
|
|
S682 - Topics in Mathematical Statistics - Multivariate Statistical Analysis Prerequisites: STAT S721 and S722, or consent of instructor. Credits: 3 Semester: Description: Multivariate normal distributions. Tensor notation. Multivariate linear normal models (MANOVA), estimation and testing. Wishart distributions and models. Inference for the covariance matrix, including multivariate Bartlett's test, test of block independence, and test of sphericity. Box approximations. Eigenvalues, including canonical correlations and principal components/factor analysis.
|
|
S682 - Topics in Mathematical Statistics - Introduction to Graphical Models Prerequisites: Consent of instructor Credits: Semester: Description: INTRODUCTION TO GRAPHICAL MARKOV MODELS IN MULTIVARIATE STATISTICAL ANALYSIS.
A central aspect of statistical science is the assessment of dependences among a set of stochastic variables. The familiar concepts of correlation, regression, and prediction are manifestations of this idea, and many aspects of causal relationships ultimately rest on representations of multivariate dependence.
Graphical Markov models (GMMs) use graphs, either undirected, directed, or mixed, to represent multivariate dependences in a visual and computationally efficient manner. With each variable represented as a node in a graph, a GMM is usually constructed by specifying local dependences for each node of the graph in terms of its immediate neighbors, parents, or both. A GMM can thus represent a highly varied and complex system of multivariate dependences by means of the global structure of the graph. The local specification permits efficiencies in modeling, inference, and probabilistic calculations.
For a fixed graph model, the classical methods of statistical inference may be utilized. In many applied domains, however, such as expert systems for medical diagnosis, weather forecasting, or the analysis of gene-expression data, the graph is unknown and is itself the first goal of the analysis. This poses numerous challenges, including the following:
• The numbers of possible graphs and models grow superexponentially in the number of variables.
• Distinct graphs may be Markov equivalent, i.e., statistically indistinguishable.
• Conversely, the same graph may possess different Markov interpretations.
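To give a sense of the growth mentioned in the first point above, the following sketch counts labeled directed acyclic graphs (one of the graph families a GMM may use) with Robinson's recurrence. The function name count_dags is hypothetical, and the example covers only the directed acyclic case, not undirected or mixed graphs.

```python
from math import comb

def count_dags(n):
    """Number of labeled directed acyclic graphs on n nodes,
    computed with Robinson's recurrence:
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k(m-k)) * a(m-k)."""
    counts = [1]  # a(0) = 1: the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]

# Print the count for a handful of small node counts.
for n in range(1, 8):
    print(n, count_dags(n))
```

Already with five variables there are 29281 candidate directed acyclic graphs, which is why exhaustive search over graphs quickly becomes impractical.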
Furthermore, in applications, GMMs represent one of the most interdisciplinary topics of contemporary statistical science. Applications arise in a host of areas, e.g., computer science (expert systems, robotics, data-mining, machine learning), electrical engineering (automatic speech recognition systems, error-correcting codes), genetics (modelling gene-expression data), epidemiology (causal models), econometrics (structural equations), and behavioral science (modelling social networks).
References:
Cox, D.R. and Wermuth, N. (1996). Multivariate Dependencies: Models, Analysis, and Interpretation. Chapman and Hall, London.
Edwards, D. (2000). Introduction to Graphical Modeling, 2nd ed. Springer, New York.
Lauritzen, S.L. (1996). Graphical Models. Oxford University Press, Oxford.
Whittaker, J.L. (1990). Graphical Models in Applied Multivariate Statistics. Wiley, New York.
|
|
S690 - Statistical Consulting Prerequisites: Consent of instructor Credits: 4 Semester: Spring 2010 Description: This class will cover necessary skills for effective statistical consulting, and focus on applications derived from real consulting situations. Students will have the opportunity to engage with clients and present the results of their consultations in class. Along the way students will learn practical methods of data analysis, questions that need to be considered for future studies, and ways of presenting data for different purposes. An important part of the course will be work on a real problem from Google, Inc. Sources of data will include:
(1) IU's Statistics Consulting Center (Stephanie Dickinson, Michael Trosset)
(2) Case studies (see textbook)
(3) Designing experiments for evaluating marketing strategies under consideration at Google, Inc. (Dr. James Koehler, Senior Statistician)
|
|
S695 - Readings in Statistics Prerequisites: Consent of instructor. Credits: 1-3 Semester: Description: Supervised reading of a topic in statistics. May be repeated with different topics for a maximum of 12 credit hours.
|
|
S710 - Statistical Computing Prerequisites: STAT S620, or consent of instructor. Credits: 3 Semester: Description: This course will cover two aspects of statistical computing.
The first aspect will cover the use of R, a statistical computing software environment for performing statistical procedures and making graphical displays of data. Some previous exposure to R and to statistical procedures will be assumed (e.g., regression, analysis of variance, basic plotting); in this course we will focus instead on some less familiar but very useful methods (e.g., random number generation for simulations, diagnostic plots for validating model assumptions, robust methods of regression, bootstrapping for standard errors, generalized additive models, visualizing multivariate data). The second aspect will focus on some of the consequences of using the computer's finite arithmetic on statistical results (e.g., periods of random number algorithms, matrix computations, expediting calculations for smoothing algorithms such as loess, etc.). The two books required for this course address these two aspects.
Textbooks:
(1) John Maindonald and John Braun, Data Analysis and Graphics Using R, Cambridge University Press
(2) Ronald Thisted, Elements of Statistical Computing, Chapman and Hall
|
|
S721 - Advanced Statistical Theory I Prerequisites: S620, some knowledge of elementary measure theory, and consent of the instructor. Credits: 3 Semester: Description: Mathematical introduction to major areas of statistical theory and practice, including statistical models, sufficiency, likelihood inference, estimation and testing, Bayesian inference, decision theory, equivariance, and optimality of test statistics.
|
|
S722 - Statistical Theory II Prerequisites: S721 or consent of instructor. Credits: 3 Semester: Description: A continuation of S721. A mathematically rigorous introduction to major areas of statistical theory and practice, including multinomial models, canonical linear models, exponential families, asymptotic theory, and general linear models.
|
|
S730 - Theory of Linear Models Prerequisites: STAT S620, or consent of instructor. Credits: 3 Semester: Description: Theory of the general linear model. Distribution theory, linear hypotheses, the Gauss-Markov theorem, testing and confidence regions. Application to regression and to analysis of variance.
|
|
S740 - Multivariate Statistical Theory Prerequisites: STAT S721 and S722, or consent of the instructor. Credits: 3 Semester: Description: Multivariate normal distributions. Multivariate linear normal models, estimation and testing. Wishart distributions and models. Inference for the covariance matrix. Eigenvalues, including canonical correlations and principal components/factor analysis.
|
|
S781 - Advanced Topics in Applied Statistics Prerequisites: Consent of the instructor. Credits: 3 Semester: Description: Careful study of an advanced statistical topic from an applied perspective. As topics vary, this course may be repeated for credit.
|
|
S782 - Advanced Topics in Mathematical Statistics Prerequisites: Consent of the instructor. Credits: 3 Semester: Description: Careful study of an advanced statistical topic from a mathematical or theoretical perspective. As topics vary, this course may be repeated for credit.
|
|
S799 - Research in Statistics Prerequisites: Consent of the instructor. Credits: Variable 1 Semester: Description:
|
|