Lecture Series 2010 – 2nd Semester

Talks of the Departamento de Métodos Estatísticos - Instituto de Matemática - UFRJ

2nd semester of 2012
The talks are held in room C-116 of the Centro Tecnológico at 15:30, unless otherwise announced.


We introduce a new and general set of identifiability conditions for factor models which handles the ordering problem associated with current common practice. In addition, the new class of parsimonious Bayesian factor analysis leads to a representation of the factor loading matrix that provides an intuitive and easy-to-implement factor selection scheme. We argue that structuring the factor loadings matrix is in concordance with recent trends in applied factor analysis. Our MCMC scheme for posterior inference makes several improvements over the existing alternatives while outlining various strategies for conditional posterior inference in a factor selection scenario. Four applications, two based on synthetic data and two based on well-known real data, are introduced to illustrate the applicability and generality of the new class of parsimonious factor models, as well as to highlight features of the proposed sampling schemes.


The last 10 years have seen a large increase in statistical methodology for diffusions, and computationally intensive Bayesian methods using data augmentation have been particularly prominent. This activity has been fuelled by existing and emerging applications in economics, biology, genetics, chemistry, physics and engineering. However, diffusions have continuous sample paths, so many natural continuous-time phenomena require more general classes of models. Jump-diffusions have considerable appeal as flexible families of stochastic models. Bayesian inference for jump-diffusion models poses new methodological challenges; in particular, it requires the construction of novel simulation schemes for use within data augmentation algorithms and with discretely observed data. In this paper we propose a new methodology for exact simulation of jump-diffusion processes. The method is based on the recently introduced Exact Algorithm for exact simulation of diffusions. We also propose a simulation-based method for likelihood-based inference for discretely observed jump-diffusions in a Bayesian framework. Simulated examples are presented to illustrate the proposed methodology.


We will present some bivariate stochastic volatility models commonly used in the study of financial time series. These models will be applied to the weekly averages of daily maximum ozone levels in Mexico City. The models will be used to analyze the data from pairs of the regions into which the city is divided. These results were obtained jointly with Jorge A. Achcar and Henrique C. Zozolotto.


Multi-gigabyte data sets challenge and frustrate R users even on well-equipped hardware. Use of C/C++ can provide efficiencies, but is cumbersome for interactive data analysis and lacks the flexibility and power of R’s rich statistical programming environment. The package bigmemory and its sister packages biganalytics, synchronicity, and bigalgebra bridge this gap, implementing massive matrices and supporting their basic manipulation and exploration. The data structures may also be file-backed, allowing users to more easily manage and analyze data sets larger than available RAM and potentially share them across nodes of a cluster. These features of the Bigmemory Project open the door for powerful and memory-efficient parallel analyses and data mining of massive data sets.


The Brownian Web (BW) is a family of coalescing Brownian motions starting from every point in space-time R × R. In this work, we consider a set A of fixed points belonging to circles of radius k, k = 1, 2, …, n, where n is a non-negative integer. Then, we consider a family of coalescing random walks starting from A. After diffusive scaling we obtain the convergence in distribution of this family of coalescing random walks to what we call “The Radial Brownian Web”, on a metric space which is the restriction of the metric space where the “usual” BW takes its values.
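To make the notion of coalescing random walks concrete, here is a minimal sketch in Python. It simulates the usual one-dimensional setting only, not the radial construction of the talk; the function name `coalescing_walks`, the even-integer starting sites, and the seed are illustrative assumptions.

```python
import random


def coalescing_walks(starts, n_steps, seed=0):
    """Simulate coalescing simple random walks on the integers.

    Walkers that occupy the same site share every later increment,
    so they move together from the moment they meet.
    Returns one path (list of positions) per starting point.
    """
    rng = random.Random(seed)
    positions = list(starts)
    paths = [[p] for p in positions]
    for _ in range(n_steps):
        # Draw a single +/-1 increment per occupied site, not per walker.
        step_at = {site: rng.choice((-1, 1)) for site in sorted(set(positions))}
        positions = [p + step_at[p] for p in positions]
        for path, p in zip(paths, positions):
            path.append(p)
    return paths


if __name__ == "__main__":
    # Hypothetical starting set: even sites, so that neighbours can meet.
    paths = coalescing_walks(range(0, 20, 2), n_steps=200, seed=1)
    print(len({path[-1] for path in paths}), "distinct walkers remain")
```

Under diffusive scaling, space is divided by the square root of the factor by which time is divided; the number of distinct walkers can only decrease over time.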


Our article deals with Bayesian inference for a general state space model with the simulated likelihood computed by the particle filter. We show empirically that the partially or fully adapted particle filters can be much more efficient than the standard particle filter, especially when the signal-to-noise ratio is high. This is especially important because using the particle filter within Markov chain Monte Carlo sampling is O(T^2), where T is the sample size. We also show that an adaptive independent Metropolis-Hastings proposal for the unknown parameters based on a mixture of normals can be much more efficient than the usual optimal random walk methods, because the simulated likelihood is not continuous in the parameters and the cost of constructing a good adaptive proposal is negligible compared to the cost of evaluating the simulated likelihood. Independent Metropolis-Hastings proposals are also attractive because they are easy to run in parallel on multiple processors. The article also shows that the proposed adaptive independent Metropolis-Hastings sampler converges to the posterior distribution. We also show that the marginal likelihood of any state space model can be obtained in an efficient and unbiased manner by using the particle filter, making model comparison straightforward. Obtaining the marginal likelihood is often difficult using other methods. Finally, we prove that the simulated likelihood obtained by the auxiliary particle filter is unbiased. This result is fundamental to using the particle filter for Markov chain Monte Carlo sampling.
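As a rough illustration of a simulated likelihood, the sketch below uses a plain bootstrap particle filter, not the adapted or auxiliary filters discussed in the talk. The AR(1)-plus-noise model, the function names, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2


def bootstrap_loglik(ys, phi, sd_state, sd_obs, n_particles=500, seed=0):
    """Simulated log-likelihood for the assumed model
    x_t = phi * x_{t-1} + N(0, sd_state^2),  y_t = x_t + N(0, sd_obs^2)."""
    rng = random.Random(seed)
    # Start from the stationary distribution of the state (requires |phi| < 1).
    sd0 = sd_state / math.sqrt(1.0 - phi * phi)
    particles = [rng.gauss(0.0, sd0) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Propagate each particle through the state equation.
        particles = [phi * x + rng.gauss(0.0, sd_state) for x in particles]
        # Weight by the observation density, subtracting the max for stability.
        logw = [normal_logpdf(y, x, sd_obs) for x in particles]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        # The average weight estimates p(y_t | y_1, ..., y_{t-1}).
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik
```

The product of the average weights is the likelihood estimate; its logarithm, which the function returns, is not itself unbiased. Because of the resampling step the estimate is not a continuous function of `phi`, `sd_state`, or `sd_obs`, even with a fixed seed.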


When modeling the effect of a covariate X on a dependent variable Y, in many practical cases it can be natural to assume a monotone relationship between Y and X. In this talk, we will study an estimator that only assumes this monotonicity, and not any other parametric form of the regression function. We will consider the local limit behavior of this non-parametric estimator and present a theorem about its adaptivity and local optimality.
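One classical estimator that assumes only monotonicity is the isotonic least-squares fit, computed by the pool-adjacent-violators algorithm. Whether this is the estimator studied in the talk is an assumption; the sketch below, with the hypothetical function name `pava`, only illustrates the idea.

```python
def pava(y, w=None):
    """Non-decreasing least-squares fit to y (pool adjacent violators).

    y is assumed to be ordered by increasing covariate value;
    w holds optional positive weights.
    """
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while neighbouring blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the blocks back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


if __name__ == "__main__":
    print(pava([1, 3, 2, 4]))  # the violating pair (3, 2) is pooled to 2.5
```

The fit is piecewise constant, with no bandwidth or parametric form to choose.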


In this talk, an M-estimation-based criterion is proposed for carrying out change point analysis and variable selection simultaneously in linear models with a possible change point. Under some weak conditions, this criterion is shown to be strongly consistent in the sense that, with probability one, it chooses the smallest true model for large sample sizes. Its byproducts include consistent estimates of the regression coefficients regardless of whether there is a change point. If there is a change point, its byproducts also include a consistent estimate of the change point parameter. In addition, an algorithm is given which significantly reduces the computation time needed by the proposed criterion for the same precision. Data examples are also presented, which include results from a simulation study and a real data example.
Based on joint work with Prof. Rao and Mr. Shi.

17/11 (exceptionally at 13:30)

The DARPA Grand Challenge is a competition in which autonomous vehicles must complete a 300-mile route full of obstacles. The vehicles must complete the course fully autonomously, without any external influence. Our paper is inspired by this problem but addresses only a small part of it. In this work we propose an efficient algorithm for finding the shortest smooth trajectory between two points while avoiding the placed obstacles. The obstacles are measured through a Markovian mechanism that corrects the sensor using the previous measurement by means of a Kalman filter.
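The idea of correcting a sensor reading with the previous estimate can be sketched with a scalar Kalman filter. This is a generic textbook version under an assumed random-walk state model, not the mechanism of the paper; the function names and the variances `process_var` and `meas_var` are hypothetical.

```python
def kalman_step(mean, var, z, process_var, meas_var):
    """One predict/update cycle of a scalar Kalman filter.

    The state (e.g. an obstacle coordinate) is assumed to follow a random
    walk with variance process_var; the reading z has variance meas_var.
    """
    # Predict: the mean carries over and the uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend the prediction and the new reading via the Kalman gain.
    gain = pred_var / (pred_var + meas_var)
    new_mean = pred_mean + gain * (z - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


def filter_readings(readings, init_mean=0.0, init_var=1e6,
                    process_var=0.01, meas_var=1.0):
    """Run the filter over a sequence of readings, starting from a vague prior."""
    mean, var = init_mean, init_var
    estimates = []
    for z in readings:
        mean, var = kalman_step(mean, var, z, process_var, meas_var)
        estimates.append(mean)
    return estimates, var
```

Each new estimate depends on the past only through the previous mean and variance, which is the Markovian structure referred to above.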


This study presents consistent estimators for the parameters of a vector autoregressive model subject to measurement errors. The asymptotic distribution of the estimators is derived. For data with measurement error, the existing methods in the literature cannot be used, because under the null hypothesis (Granger non-causality) the model becomes non-identifiable. We conduct simulation studies which indicate that measurement error drastically interferes with the conclusions of the hypothesis tests. The method is applied to fMRI (functional magnetic resonance imaging) data to detect information flows between brain regions.
Joint work with João R. Sato (Federal do ABC) and Betsabé G. Blas Achic (UFPE).


We show the stability and ultimate boundedness (in the mean-square sense) of well-known financial models for interest rates. As a consequence we derive the existence of an invariant measure and recurrence properties of these solutions. The main technique involves the use of Lyapunov function methods developed by R. Khasminskii and Y. Miahara.

In this talk, we want to discuss the advantages of some resampling penalties in a general density estimation framework. We will see how they can be calibrated when the data are independent, or at least mixing. We will also discuss the slope heuristic and justify that it can also be used in a not necessarily independent framework. We obtain for all these methods asymptotically optimal oracle inequalities under a few conditions on the collections of models.

Since MCMC algorithms became available there has been an explosion of applications using parametric Bayesian statistical models. These of course need prior probabilistic inputs to run, but various ways of setting default priors have been advanced which at least appear to stabilize the numerical algorithms. But to what extent can we believe that the ensuing inferences are reliable and are not too sensitive to the way we initialize this process? In this talk I will demonstrate that however much data we collect there are some aspects of typical Bayesian models we can never learn about. So inference about these quantities just reflects the prior we choose. On the other hand, provided we are careful, it is straightforward to demonstrate that, under broad assumptions, many types of inference are robust. Indeed, simple bounds can be calculated, based on the values of typical summaries calculated from numerical methods, which reflect how different inferences would be given different candidate priors. The talk will be illustrated throughout by familiar examples. This is joint work with Fabio Rigat and Ali Daneshkhah.